...f = log($tf); $idf = log(5000000000/$count); //if ($tf > 13) $idf *= 1.4; return array($tf, $idf); } A few questions: 1. When a Baidu search finds fewer than 1000 articles containing a word, why is count recalculated as "21000 - $count * 18"? Here 21000 is...
...I'll go try it[hr] Adding the --enable-maintainer-zts option doesn't work either. php 5.2.13 [root@zjx phpext]# /usr/local/php/bin/phpize Configuring for: PHP Api Version: 20041225 Zend Module Api No: 20060613 Zend Extension Api No: 220060519 [root@zjx phpext]# ./configure --with-scws=/usr/local/s...
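...m':7 'href':3 'http':4,12 'net':19 'org':15 'pgsql':1 'pgsqldb':6,14 'www':13 '中国':8 '社区':9 '论坛':10 (1 row) Below is the result from the tokenizer written by foreign developers: emails and URLs are merged correctly and the tags are stripped correctly, but the Chinese text is not segmented. postgres=# SELECT to_tsvector('simple','pgsql中国...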
... ["type"]=> int(0) ["vno"]=> int(13) ["tokenizer":"XSFieldMeta":private]=> int(0) ["flag":"XSFieldMeta":private]=> int(0) } ["posttime"]=> object(XSFieldMeta)#32 (7) { ["name"]...
...new test results: Hardware: R410, quad-core Xeon E5620 2.4GHz*2/4G*4 1333MHz/600G*2/SAS-15Krpm Raid1. Index import DEBUG output: [code] 1998001, channelId=11, CostTime=2895.08, UpdateIndexTime=1226.33, UpdateProductTime=2742.26, ProductCount=10000, ProcessCount=2000000.00, MemoryUsage=46063.0...
...[len] => 2 [idf] => 0 [attr] => un ) [2] => Array ( [word] => 腑 [off] => 13 [len] => 2 [idf] => 0 [attr] => un ) [3] => Array ( [word] => 鍥 [off] => 15 [len] => 2 [idf] => 0 [attr] => un ) [4] => Array ( [word] => 戒 [off] => 17 [len] => 2 [idf] => 0 [attr] => un ) [5] => Array ( [word] => 汉 ...
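...power. The main updates are as follows: 1. Upgraded to and integrated the latest xapian-1.2.13 and scws-1.2.1 2. Added support for per-project custom dictionaries, [url=http://www.xunsearch.com/doc/php/guide/index.dict]see the documentation[/url] 3. Improved network IO reads and optimized search memory usage, among other changes, substantially improving performance and stability 4. 净...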
...d... no checking for re2c... no configure: WARNING: You will need re2c 0.13.4 or later if you want to regenerate PHP parsers. checking for gawk... gawk checking for scws support... yes, shared checking for scws.h... yes, found in /usr/local/scws checking for scws_new in -lscws... no configure...