
amazonsearchengine's Introduction

A Search Engine for Amazon Product Reviews

SZU Information Retrieval Final Project

TODO

BackEnd

  • Set up the backend service framework and routing
  • API layer

FrontEnd

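  • Search home page: only a search box and a submit button; clicking the button jumps to the results list page
  • Results list page: shows the top n returned results; clicking a result jumps to the detail page
  • Detail page: product information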

Algo

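  • Feature layer
  • Refactor the Redis reads. So far only reading of the JSON-format cache is implemented
  • Sorting at index initialization and each time new data is added. An earlier assignment already implemented merging two sorted inverted indexes, with O(n) time complexity
  • Storage layer
  • Compute layer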
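
The O(n) merge of two sorted inverted-index lists mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual code: the names `merge_postings` and `add_unsorted_batch` are hypothetical, and it assumes each postings list is a list of unique integer document IDs in ascending order.

```python
from typing import Iterable, List


def merge_postings(a: List[int], b: List[int]) -> List[int]:
    """Merge two ascending lists of unique doc IDs into one ascending list.

    Runs in O(len(a) + len(b)) using the classic two-pointer walk.
    IDs present in both inputs appear once in the output.
    """
    merged: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            merged.append(a[i])
            i += 1
        elif a[i] > b[j]:
            merged.append(b[j])
            j += 1
        else:  # same doc ID in both lists: keep a single copy
            merged.append(a[i])
            i += 1
            j += 1
    # At most one of these tails is non-empty.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


def add_unsorted_batch(postings: List[int], new_ids: Iterable[int]) -> List[int]:
    """Hypothetical helper: sort (and deduplicate) incoming IDs, then merge
    them into an already sorted postings list."""
    return merge_postings(postings, sorted(set(new_ids)))
```

With this shape, only the incoming batch is sorted when new data arrives, and the existing index stays sorted without being re-sorted as a whole.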

NOTE

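  • Both arrays unsorted: merge first, then sort.
  • One sorted, one unsorted: sort the unsorted one first, then merge.
  • Multiple reviews that point to the same product can be treated as one long text. In that case, the frequency with which a word (an adjective, for example) recurs is really its co-occurrence across many short texts, and it can be regarded as positively correlated with similarity. Every recurrence is a contribution to relevance. Adjectives fit this assumption especially well, because they represent agreement among the reviewing users (plural) on that description. For example, with the query "shoes durable", if 999 of a product's 1000 reviews contain "durable", then the product's term frequency for "durable" is very high, and the product should indeed match "durable". For nouns the effect may be somewhat weaker, since every review may mention "shoes". But this at least guarantees that reviews about "shoes" link to shoe products, which ensures recall. Adjectives better reflect how precisely the information need is satisfied.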
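
The last note, treating all reviews of one product as a single long text, can be sketched as below. This is only an illustration under stated assumptions: the function names and the `(product_id, review_text)` input shape are hypothetical, tokenization is plain whitespace splitting, and the score is a raw sum of term frequencies rather than whatever weighting the project finally uses.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def product_term_frequencies(
    reviews: Iterable[Tuple[str, str]],
) -> Dict[str, Counter]:
    """Accumulate term counts per product, as if all of a product's
    reviews were concatenated into one long document."""
    tf: Dict[str, Counter] = defaultdict(Counter)
    for product_id, text in reviews:
        # Assumption: lowercase + whitespace split stands in for a real tokenizer.
        tf[product_id].update(text.lower().split())
    return tf


def score(tf: Dict[str, Counter], query: str) -> Dict[str, int]:
    """Score each product by the summed frequency of the query terms.

    A Counter returns 0 for terms it has never seen, so products that
    lack a query term simply get no contribution from it.
    """
    terms = query.lower().split()
    return {pid: sum(counts[t] for t in terms) for pid, counts in tf.items()}


reviews = [
    ("p1", "durable shoes"),
    ("p1", "very durable"),
    ("p2", "nice shoes"),
]
tf = product_term_frequencies(reviews)
scores = score(tf, "shoes durable")
```

Here `p1` outranks `p2` because "durable" recurs across its reviews, while the noun "shoes" still lets both products be recalled.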

amazonsearchengine's People

Contributors

kkchannel-kk
