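Guo Chengkun, Chen Guosong, Ruan Huaijun, et al. Research and design of agricultural information vertical search engine based on Heritrix+Solr[J]. Guangdong Agricultural Sciences, 2015, 42(5): 139-144 |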
Research and design of an agricultural information vertical search engine based on Heritrix+Solr |
Keywords: agricultural vertical search engine; Heritrix; Solr; Chinese word segmentation; page ranking |
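Abstract (translated from the Chinese): |
As agriculture becomes increasingly information-driven and intelligent, the volume of agricultural
information is growing explosively. Providing agricultural practitioners and researchers with convenient and
effective information retrieval is a problem that agricultural search engines urgently need to solve. To this end,
this paper proposes a framework for an agricultural information vertical search engine based on Heritrix+Solr,
and designs a hidden Markov Web information extraction module and a dictionary-based mmseg4j Chinese
word segmentation module suited to such an engine. It also improves the page ranking algorithm. The work
offers some reference value for further improving the user experience and efficiency of agricultural vertical
search engines. |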
Abstract (original English version): |
Agricultural information is growing rapidly as agriculture becomes more information-driven and
intelligent, so agricultural researchers, producers and managers need a convenient and effective way to
search it. This paper proposes a search engine framework based on Heritrix and Solr that incorporates
Hidden Markov Model based web information extraction and Chinese word segmentation using mmseg4j with an
agricultural dictionary. In addition, the page ranking algorithm is improved according to the
characteristics of agricultural information search. Finally, the paper offers suggestions for
improving the user experience and efficiency of agricultural vertical search engines. |
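The abstract does not spell out how the dictionary-based segmentation works. As a rough illustration of the general idea only, the sketch below shows greedy forward maximum matching, the simplest form of dictionary-driven Chinese word segmentation and the basis of the MMSEG family of algorithms that mmseg4j belongs to. It is not the paper's module and not mmseg4j's API: the function name `forward_max_match` and the tiny dictionary `AGRI_DICT` are assumptions made up for this example.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy forward maximum matching.

    At each position, take the longest dictionary word that starts there;
    fall back to a single character when nothing matches.
    """
    if max_len is None:
        # Longest dictionary entry bounds the candidate window (1 if empty).
        max_len = max((len(word) for word in dictionary), default=1)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical mini-dictionary of agricultural terms (an assumption, not from the paper).
AGRI_DICT = {"水稻", "病虫害", "防治", "技术", "水稻病虫害"}

print(forward_max_match("水稻病虫害防治技术", AGRI_DICT))
# ['水稻病虫害', '防治', '技术']
```

With a domain dictionary, a compound such as 水稻病虫害 (rice pests and diseases) stays one token instead of being split into general-purpose words, which is the usual motivation for adding an agricultural dictionary to the segmenter.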