Elasticsearch Analysis Synonym

Overview

Elasticsearch Analysis Synonym Plugin provides NGramSynonymTokenizer. For more details, see LUCENE-5252.

Version

Versions in Maven Repository

Issues/Questions

Please file an issue. (Japanese forum is here.)

Installation

For 5.x

$ $ES_HOME/bin/elasticsearch-plugin install org.codelibs:elasticsearch-analysis-synonym:5.3.0

For 2.x

$ $ES_HOME/bin/plugin install org.codelibs/elasticsearch-analysis-synonym/2.4.0

Getting Started

Create synonym.txt File

First, create a synonym dictionary file, synonym.txt, in $ES_CONF (e.g. /etc/elasticsearch). The content below is only a sample.

$ cat /etc/elasticsearch/synonym.txt
あ,かき,さしす,たちつて,なにぬねの

Create Index

NGramSynonymTokenizer is registered as the "ngram_synonym" tokenizer type. The following request creates an index that uses it:

$ curl -XPUT "localhost:9200/sample?pretty" -d '
{
  "settings":{
    "index":{
      "analysis":{
        "tokenizer":{
          "2gram_synonym":{
            "type":"ngram_synonym",
            "n":"2",
            "synonyms_path":"synonym.txt"
          }
        },
        "analyzer":{
          "2gram_synonym_analyzer":{
            "type":"custom",
            "tokenizer":"2gram_synonym"
          }
        }
      }
    }
  },
  "mappings":{
    "item":{
      "properties":{
        "id":{
          "type":"string",
          "index":"not_analyzed"
        },
        "msg":{
          "type":"string",
          "analyzer":"2gram_synonym_analyzer"
        }
      }
    }
  }
}'
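
To check which tokens the new analyzer emits, the standard Elasticsearch _analyze API can be called against the index. The following is only a minimal sketch and is not taken from the plugin's documentation: the helper names analyze_body and analyze_text and the ES_URL variable are assumptions introduced for illustration.

```shell
# Hypothetical helpers (not provided by the plugin) for inspecting analyzer output.
# ES_URL is an assumed variable; it defaults to the host used elsewhere in this README.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for the _analyze API.
# Assumes the analyzer name and the text contain no double quotes or backslashes.
analyze_body() {
  printf '{"analyzer":"%s","text":"%s"}' "$1" "$2"
}

# Send the text through the named analyzer of the "sample" index.
analyze_text() {
  curl -s -XPOST "$ES_URL/sample/_analyze?pretty" -d "$(analyze_body "$1" "$2")"
}
```

For example, `analyze_text 2gram_synonym_analyzer あいうえお` should list the tokens produced for the sample text. The exact output depends on the plugin version and the synonym file, so it is not shown here.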

Then index a sample document:

$ curl -XPOST localhost:9200/sample/item/1 -d '
{
  "id":"1",
  "msg":"あいうえお"
}'

Check Search Results

Run the following phrase queries against the msg field:

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "あ"
      }
   }
}'

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "あい"
      }
   }
}'

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "かき"
      }
   }
}'

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "かきい"
      }
   }
}'
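
The four requests above differ only in the phrase, so they can be wrapped in a small loop. This is a sketch under assumptions: phrase_query, search_phrases and ES_URL are hypothetical names introduced here, not part of the plugin or of Elasticsearch.

```shell
# Hypothetical helpers for repeating the match_phrase requests shown above.
# ES_URL is an assumed variable; it defaults to the host used elsewhere in this README.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a match_phrase query body for the "msg" field.
# Assumes the phrase contains no double quotes or backslashes.
phrase_query() {
  printf '{"query":{"match_phrase":{"msg":"%s"}}}' "$1"
}

# Run every phrase given as an argument against the "sample" index.
search_phrases() {
  for phrase in "$@"; do
    echo "== $phrase"
    curl -s -XPOST "$ES_URL/sample/_search?pretty" -d "$(phrase_query "$phrase")"
    echo
  done
}
```

Usage: `search_phrases あ あい かき かきい` sends the same four queries in order and prints each response under its phrase.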

Reload synonyms_path File Dynamically

When the "dynamic_reload" property is set to true, NGramSynonymTokenizer reloads the synonyms_path file on the fly (more precisely, the reload happens when the reset() method is called). To change the interval at which the file timestamp is checked, add "reload_interval".

$ curl -XPUT "localhost:9200/sample?pretty" -d '
{
  "settings":{
    "index":{
      "analysis":{
        "tokenizer":{
          "2gram_synonym":{
            "type":"ngram_synonym",
            "n":"2",
            "synonyms_path":"synonym.txt",
            "dynamic_reload":true,
            "reload_interval":"10s"
          }
        },
...
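
With dynamic reload enabled, changing the synonym file on disk should be enough to trigger a reload after the next timestamp check. A minimal sketch of such an update follows; add_synonym_group is a hypothetical helper, and it assumes that each line of the file is one comma-separated group of terms, as in the sample synonym.txt above.

```shell
# Hypothetical helper: append one synonym group (a comma-separated line)
# to a synonym file. Appending updates the file's modification time, which
# the timestamp check described above is assumed to rely on.
add_synonym_group() {
  file="$1"
  group="$2"
  printf '%s\n' "$group" >> "$file"
}
```

Usage: `add_synonym_group /etc/elasticsearch/synonym.txt 'はひ,ふへほ'` (the group here is made-up sample data). Tokens already written to the index are not rewritten by a reload, so documents indexed earlier may need to be reindexed to reflect new synonyms.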
