Serelex -- a lexico-semantic search engine

This system is a kind of "lexico-semantic search engine". Given a text query, it returns a list of related words. For instance, for the word "python" it returns words such as "Ruby", "C++", "Java", "snake", and "boa". A traditional search engine, by contrast, returns a list of related documents. The system also provides a visual interface to systems like word2vec. Originally, back in 2012, the system used a graph of related words derived using the pattern-based semantic similarity measure PatternSim. Later, in 2013, when word2vec was introduced, we added some models based on the Skip-Gram model. In principle, the system can represent the results of any other method for computing similarities, as it takes as input a distributional thesaurus in the form word_i<TAB>word_j<TAB>similarity_ij. If you would like to know more about the system or refer to it in a publication, please refer to the following paper:

Panchenko et al. (2013) Serelex: Search and visualization of semantically related words. In Proceedings of the European Conference on Information Retrieval, ECIR'2013. Springer.

@inproceedings{panchenko2013serelex,
  title={Serelex: Search and visualization of semantically related words},
  author={Panchenko, Alexander and Romanov, Pavel and Morozova, Olga and Naets, Hubert and Philippovich, Andrey and Romanov, Alexey and Fairon, C{\'e}drick},
  booktitle={European Conference on Information Retrieval},
  pages={837--840},
  year={2013},
  organization={Springer}
}
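
The thesaurus input format mentioned in the introduction (word_i<TAB>word_j<TAB>similarity_ij) can be illustrated with a minimal sketch. The helper names parseThesaurus and relatedWords are hypothetical and not part of lsse, and the sample words and scores are made up for illustration; the actual import scripts may work differently.

```javascript
// Parse a distributional thesaurus given as lines of
// word_i<TAB>word_j<TAB>similarity_ij.
// parseThesaurus and relatedWords are hypothetical helpers, not lsse code.
function parseThesaurus(text) {
  const graph = new Map(); // word -> array of { word, similarity }
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') continue; // skip blank lines
    const [wordI, wordJ, sim] = line.split('\t');
    const similarity = Number.parseFloat(sim);
    // Skip lines that do not have two words and a numeric score.
    if (!wordI || !wordJ || Number.isNaN(similarity)) continue;
    if (!graph.has(wordI)) graph.set(wordI, []);
    graph.get(wordI).push({ word: wordJ, similarity });
  }
  return graph;
}

// Return up to `limit` neighbours of `query`, most similar first.
function relatedWords(graph, query, limit = 10) {
  const neighbours = graph.get(query) || [];
  return [...neighbours]
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}

// Made-up sample data, for illustration only.
const sample = [
  'python\tsnake\t0.41',
  'python\tRuby\t0.87',
  'python\tJava\t0.79',
  'java\tcoffee\t0.35',
].join('\n');

const graph = parseThesaurus(sample);
console.log(relatedWords(graph, 'python', 2).map((r) => r.word));
```

Keeping the neighbours in a map keyed by the query word makes a lookup a single step, which matches the "query in, related words out" behaviour described above.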

API

All models can be accessed using the RESTful API.
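
This README does not list the API routes, so the following is only a sketch of how a client might build a request URL. The route shape "/find/<model>/<word>", the host, and the model name "example-model" are assumptions for illustration; check the application's route definitions for the real paths.

```javascript
// Build a request URL for a HYPOTHETICAL lookup route "/find/<model>/<word>".
// The route, host, and model name are assumptions, not documented lsse API.
function buildQueryUrl(baseUrl, model, word) {
  const url = new URL(baseUrl);
  // Encode each path segment so that words like "C++" stay valid in a URL.
  url.pathname = `/find/${encodeURIComponent(model)}/${encodeURIComponent(word)}`;
  return url.toString();
}

console.log(buildQueryUrl('http://localhost:8080', 'example-model', 'python'));
```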

How to install

  1. Install Node.JS (https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager).
  2. Install MySQL.
  3. Clone this repository (git clone ...).
  4. Go to the directory with lsse and type "npm install" to install all Node.JS dependencies of the system.
  5. Configure database access in the config.js file.
  6. Use lsse.sql script to create tables.
  7. Use the PORT environment variable to set the port (e.g. "export PORT=8080" on Linux, "set PORT=8080" on Windows). The default port is 80.
  8. Start the application: "node app".
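
The PORT behaviour from step 7 can be sketched as follows. This is a minimal illustration of reading the variable with a fallback to 80; resolvePort is a hypothetical helper, and the actual startup code in app.js may differ.

```javascript
// resolvePort is a hypothetical helper; lsse's own startup code may differ.
function resolvePort(env, fallback = 80) {
  const port = Number.parseInt(env.PORT, 10);
  // Fall back when PORT is unset, non-numeric, or outside the valid TCP range.
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback;
}

console.log(resolvePort(process.env));
```

Note that binding to port 80 usually requires elevated privileges on Linux, which is one reason to set PORT explicitly during development.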

Additional:

  1. Use "node import_v2" to import into MongoDB all CSV files with semantic relations described in data_models.js.
  2. Use "node generate_access_log [count] [file name]" to generate an access log with random data for JMeter.

Example installation for Ubuntu 16.04

# install the database
sudo apt install mysql-server mysql-client
wget http://panchenko.me/data/serelex/lsse-backup-28-12-2016.sql.gz
gunzip lsse-backup-28-12-2016.sql.gz 
mysql -u lsse -p -h localhost < lsse-backup-28-12-2016.sql
# MySQL privileges for the lsse user (run in the mysql shell as root, before the import above):
GRANT ALL ON *.* TO 'lsse'@'localhost' IDENTIFIED BY '';

# install the application 
curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
sudo apt-get install -y nodejs
git clone https://github.com/PomanoB/lsse.git
cd lsse
sudo npm install

# run in a screen
cd ..
wget http://panchenko.me/data/serelex/serelex-restart.sh
sudo bash serelex-restart.sh

# run using supervisord
sudo apt-get install supervisor
sudo vim /etc/supervisor/conf.d/serelex.conf
# enter the following:
[program:serelex]
command=/bin/bash -c "killall -s 9 node; cd /home/ubuntu/lsse; exec node ./app.js"
autostart=true
autorestart=true
stderr_logfile=/home/ubuntu/serelex.err.log
stdout_logfile=/home/ubuntu/serelex.out.log

sudo supervisorctl reread
sudo supervisorctl update
