search.overview.html 12 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147
  1. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  2. <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  3. <head>
  4. <meta http-equiv="Content-Type" content="text/html; charset=gbk" />
  5. <meta name="language" content="zh-cn" />
  6. <link rel="stylesheet" type="text/css" href="../api/css/style.css" />
  7. <link rel="stylesheet" type="text/css" href="../api/css/guide.css" />
  8. <link rel="stylesheet" type="text/css" href="../api/css/highlight.css" />
  9. <title>搜索概述</title>
  10. </head>
  11. <body>
  12. <div id="apiPage">
  13. <div id="apiHeader">
  14. <a href="http://www.xunsearch.com" target="_blank">Xunsearch PHP-SDK</a> v1.3.2 权威指南
  15. </div><!-- end of header -->
  16. <div id="content" class="markdown">
  17. <div class="toc"><ol><li><a href="#ch0">如何开始使用搜索?</a></li><li><a href="#ch1">典型搜索做法</a></li><li><a href="#ch2">关于快捷搜索</a></li><li><a href="#ch3">搜索中的串接操作</a></li><li><a href="#ch4">搜索日志</a></li><li><a href="#ch5">如何进行多库搜索?</a></li></ol></div><h1 id="-">搜索概述</h1>
  18. <p>在索引库建立完成后,现在开始学习使用搜索功能,这也是最核心的部分。</p>
  19. <h2 id="ch0">1. 如何开始使用搜索?<a name="ch0" class="anchor">?</a></h2>
  20. <p>在 <code>PHP-SDK</code> 中,搜索功能由类型为 <a href="../api/XSSearch.html">XSSearch</a> 的对象所维护。在 <a href="../api/XS.html">XS</a> 项目中,通过读取
  21. <a href="../api/XS.html#search">XS::search</a> 属性来获取搜索操作对象,然后展开使用,而不是自行创建对象。后面章节中的
  22. 相关测试代码如果没有特别编写,其中的 <code>$search</code> 均为通过类似以下的方式获取的索引对象:</p>
  23. <div class="hl-code"><div class="php-hl-main"><pre><span class="php-hl-reserved">require</span> <span class="php-hl-quotes">'</span><span class="php-hl-string">$prefix/sdk/php/lib/XS.php</span><span class="php-hl-quotes">'</span><span class="php-hl-code">;
  24. </span><span class="php-hl-var">$xs</span><span class="php-hl-code"> = </span><span class="php-hl-reserved">new</span> <span class="php-hl-identifier">XS</span><span class="php-hl-brackets">(</span><span class="php-hl-quotes">'</span><span class="php-hl-string">demo</span><span class="php-hl-quotes">'</span><span class="php-hl-brackets">)</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 建立 XS 对象,项目名称为:demo</span>
  25. <span class="php-hl-var">$search</span><span class="php-hl-code"> = </span><span class="php-hl-var">$xs</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">search</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 获取 搜索对象</span></pre></div></div>
  26. <blockquote class="info">
  27. <p><strong>Info:</strong> 搜索对象内置了字符集智能转换,如果您使用的字符集和项目默认的字符集 <a href="../api/XS.html#defaultCharset">XS::defaultCharset</a>
  28. 不一致,请调用 <a href="../api/XSSearch.html#setCharset">XSSearch::setCharset</a> 在开始其它搜索前设置正确的字符集。</p>
  29. </blockquote>
  30. <h2 id="ch1">2. 典型搜索做法<a name="ch1" class="anchor">?</a></h2>
  31. <p>一个典型的搜索基本流程是把构建好的搜索语句,通过合适的 <code>API</code> 进行必要的修饰,
  32. 再传递给底层的搜索服务器进行处理,然后把匹配的结果返回。具体包括以下步骤:</p>
  33. <ul>
  34. <li>构建搜索查询语句 <code>query</code>,然后调用 <a href="../api/XSSearch.html#setQuery">XSSearch::setQuery</a> 设定它</li>
  35. <li>根据需要设置附加的查询条件:通过 <a href="../api/XSSearch.html#addWeight">XSSearch::addWeight</a> 干扰排名权重,
  36. 通过 <a href="../api/XSSearch.html#addRange">XSSearch::addRange</a> 添加字段搜索区间或范围,
  37. 通过 <a href="../api/XSSearch.html#setFuzzy">XSSearch::setFuzzy</a> 开启模糊匹配,以获取更多搜索结果</li>
  38. <li>进行必要的搜索结果限定:通过 <a href="../api/XSSearch.html#setLimit">XSSearch::setLimit</a> 设置搜索结果数量和偏移,
  39. 通过 <a href="../api/XSSearch.html#setSort">XSSearch::setSort</a> 设置搜索结果的排序方式,等等</li>
  40. <li>执行搜索,并获取搜索结果,关于搜索结果的处理详见后面的章节</li>
  41. </ul>
  42. <p>代码如下:</p>
  43. <div class="hl-code"><div class="php-hl-main"><pre><span class="php-hl-var">$query</span><span class="php-hl-code"> = </span><span class="php-hl-quotes">'</span><span class="php-hl-string">项目测试</span><span class="php-hl-quotes">'</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 这里的搜索语句很简单,就一个短语</span>
  44. <span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">setQuery</span><span class="php-hl-brackets">(</span><span class="php-hl-var">$query</span><span class="php-hl-brackets">)</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 设置搜索语句</span>
  45. <span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">addWeight</span><span class="php-hl-brackets">(</span><span class="php-hl-quotes">'</span><span class="php-hl-string">subject</span><span class="php-hl-quotes">'</span><span class="php-hl-code">, </span><span class="php-hl-quotes">'</span><span class="php-hl-string">xunsearch</span><span class="php-hl-quotes">'</span><span class="php-hl-brackets">)</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 增加附加条件:提升标题中包含 'xunsearch' 的记录的权重</span>
  46. <span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">setLimit</span><span class="php-hl-brackets">(</span><span class="php-hl-number">5</span><span class="php-hl-code">, </span><span class="php-hl-number">10</span><span class="php-hl-brackets">)</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 设置返回结果最多为 5 条,并跳过前 10 条</span>
  47. <span class="php-hl-var">$docs</span><span class="php-hl-code"> = </span><span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">search</span><span class="php-hl-brackets">(</span><span class="php-hl-brackets">)</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 执行搜索,将搜索结果文档保存在 $docs 数组中</span>
  48. <span class="php-hl-var">$count</span><span class="php-hl-code"> = </span><span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">count</span><span class="php-hl-brackets">(</span><span class="php-hl-brackets">)</span><span class="php-hl-code">; </span><span class="php-hl-comment">//</span><span class="php-hl-comment"> 获取搜索结果的匹配总数估算值</span></pre></div></div>
  49. <blockquote class="tip">
  50. <p><strong>Tip:</strong> 除了调用 <a href="../api/XSSearch.html#search">XSSearch::search</a> 获取搜索结果外,在某些情况我们可能只想知道结果的命中数量,
  51. 那么可以直接调用 <a href="../api/XSSearch.html#count">XSSearch::count</a> 来获取。但要指出的是,这个结果计数只是一个估算值。</p>
  52. </blockquote>
  53. <h2 id="ch2">3. 关于快捷搜索<a name="ch2" class="anchor">?</a></h2>
  54. <p>除了上述的典型搜索方式外,我们还提供一种称为快捷搜索的方式。其实就是直接将 <code>query</code> 语句作为参数传递给相应的
  55. <code>API</code> 调用 <a href="../api/XSSearch.html#count">XSSearch::count</a> 和 <a href="../api/XSSSearch.html#search">XSSSearch::search</a>。由于不经过 <code>setQuery</code> 因此有些其它辅助的功能受到
  56. 限制,比如不能进行结果高亮、不能通过 <code>addWeight</code>、<code>addRange</code> 增加辅助搜索条件。</p>
  57. <div class="hl-code"><div class="php-hl-main"><pre><span class="php-hl-var">$count</span><span class="php-hl-code"> = </span><span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">count</span><span class="php-hl-brackets">(</span><span class="php-hl-quotes">'</span><span class="php-hl-string">项目测试</span><span class="php-hl-quotes">'</span><span class="php-hl-brackets">)</span><span class="php-hl-code">;
  58. </span><span class="php-hl-var">$docs</span><span class="php-hl-code"> = </span><span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">search</span><span class="php-hl-brackets">(</span><span class="php-hl-quotes">'</span><span class="php-hl-string">项目测试</span><span class="php-hl-quotes">'</span><span class="php-hl-brackets">)</span><span class="php-hl-code">;</span></pre></div></div>
  59. <h2 id="ch3">4. 搜索中的串接操作<a name="ch3" class="anchor">?</a></h2>
  60. <p>由于 <code>Xunsearch PHP-SDK</code> 全面采用面向对象的编程思想,在搜索对象中对部分搜索语句构建、
  61. 搜索结果修饰加入了串接操作支持。支持串接操作的方法有:</p>
  62. <ul>
  63. <li><code>addDB($name)</code> - 用于多库搜索,添加数据库名称</li>
  64. <li><code>addRange($field, $from, $to)</code> - 添加搜索过滤区间或范围</li>
  65. <li><code>addWeight($field, $term)</code> - 添加权重索引词</li>
  66. <li><code>setCharset($charset)</code> - 设置字符集</li>
  67. <li><code>setCollapse($field, $num = 1)</code> - 设置搜索结果按字段值折叠</li>
  68. <li><code>setDb($name)</code> - 设置搜索库名称,默认则为 <code>db</code></li>
  69. <li><code>setFuzzy()</code> - 设置开启模糊搜索, 传入参数 false 可关闭模糊搜索</li>
  70. <li><code>setLimit($limit, $offset = 0)</code> - 设置搜索结果返回的数量和偏移</li>
  71. <li><code>setQuery($query)</code> - 设置搜索语句</li>
  72. <li><code>setSort($field, $asc = false)</code> - 设置搜索结果按字段值排序</li>
  73. </ul>
  74. <p>如果采用串接操作,那么上面的搜索语句可以改写如下,有种一气呵成的感觉:</p>
  75. <div class="hl-code"><div class="php-hl-main"><pre><span class="php-hl-var">$docs</span><span class="php-hl-code"> = </span><span class="php-hl-var">$search</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">setQuery</span><span class="php-hl-brackets">(</span><span class="php-hl-quotes">'</span><span class="php-hl-string">项目测试</span><span class="php-hl-quotes">'</span><span class="php-hl-brackets">)</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">addWeight</span><span class="php-hl-brackets">(</span><span class="php-hl-quotes">'</span><span class="php-hl-string">subject</span><span class="php-hl-quotes">'</span><span class="php-hl-code">, </span><span class="php-hl-quotes">'</span><span class="php-hl-string">xunsearch</span><span class="php-hl-quotes">'</span><span class="php-hl-brackets">)</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">setLimit</span><span class="php-hl-brackets">(</span><span class="php-hl-number">5</span><span class="php-hl-code">, </span><span class="php-hl-number">10</span><span class="php-hl-brackets">)</span><span class="php-hl-code">-&gt;</span><span class="php-hl-identifier">search</span><span class="php-hl-brackets">(</span><span class="php-hl-brackets">)</span><span class="php-hl-code">;</span></pre></div></div>
  76. <h2 id="ch4">5. 搜索日志<a name="ch4" class="anchor">?</a></h2>
  77. <p>在每一次正常搜索之后,系统内部均对相应的关键词做了记录和一并分析。但这个行为并不是实时的,
  78. 而是积累一定的量后再统一分析和处理。</p>
  79. <p>搜索日志保存在 <code>$prefix/项目名/log_db</code> 中,它是一个独立的索引库,通过它实现了包括相关搜索、
  80. 拼音搜索、纠错建议等功能。</p>
  81. <blockquote class="tip">
  82. <p><strong>Tip:</strong> 如果您需要强制同步搜索日志库,请参见 <a href="util.indexer.html">Indexer 索引管理工具</a> 的 <code>--flush-log</code> 选项。<br />
  83. 此外,只有当您的搜索代码中调用了 <a href="../api/XSSearch.html#setQuery">XSSearch::setQuery</a> 并配合不带参数的 <a href="../api/XSSearch.html#search">XSSearch::search</a> 调用,
  84. 才会记录本次搜索关键词。</p>
  85. </blockquote>
  86. <h2 id="ch5">6. 如何进行多库搜索?<a name="ch5" class="anchor">?</a></h2>
  87. <p>在<a href="index.overview.html">索引概述</a>中我们曾经提到,如果您的索引数据量非常大,那么应当适当
  88. 考虑分割数据,在服务端采用多个库来保存索引数据。您可以调用 <a href="../api/XSSearch.html#addDb">XSSearch::addDb</a> 添加
  89. 其它搜索库。</p>
  90. <p>关于超大数据量的多库搜索及分布式设计,由于涉及的知识和范围比较广。我们提供了专门的商业支持方案,
  91. 在论坛中也会开辟专门的讨论,在此略过不述。</p>
  92. <div class="revision">$Id$</div>
  93. <div class="clear"></div>
  94. </div><!-- end of content -->
  95. <div id="guideNav">
  96. <div class="prev"><a href="index.dict.html">&laquo; 自定义SCWS词库</a></div>
  97. <div class="next"><a href="search.query.html">构建搜索语句 &raquo;</a></div>
  98. <div class="clear"></div>
  99. </div><!-- end of nav -->
  100. <div id="apiFooter">
  101. Copyright &copy; 2008-2011 by <a href="http://www.xunsearch.com" target="_blank">杭州云圣网络科技有限公司</a><br/>
  102. All Rights Reserved.<br/>
  103. </div><!-- end of footer -->
  104. </div><!-- end of page -->
  105. <div style="display:none;">
  106. <img src="../api/css/info.gif" />
  107. <img src="../api/css/tip.gif" />
  108. <img src="../api/css/note.gif" />
  109. </div>
  110. </body>
  111. </html>