Solr River Plugin for ElasticSearch

The Solr River plugin lets you import data from Apache Solr into elasticsearch.

To install the plugin, run:

bin/plugin -install javanna/elasticsearch-river-solr/1.0.0

Versions

Solr River Plugin	ElasticSearch
master	0.19.3 -> master
1.0.0	0.19.3 -> master

You might be able to use the river with older versions of elasticsearch, but the tests included with the project only pass with version 0.19.3 or higher. Version 0.19.3 is the first one that uses Lucene 3.6.

Getting Started

The Solr River lets you query a running Solr instance and index the returned documents. It uses the SolrJ library to communicate with Solr; the SolrJ version distributed with the plugin is 3.6.1. Although it is recommended to send queries with the same SolrJ version as the one installed on the Solr server, other Solr versions can be queried as well. The default response format is javabin, but you can work around compatibility issues by switching to the xml format with the wt parameter.
All the common query parameters are supported.

Installation

To create the river and index data from Solr, you only need to provide the Solr url and the query to execute:

curl -XPUT localhost:9200/_river/solr_river/_meta -d '
{
    "type" : "solr",
    "solr" : {
        "url" : "http://localhost:8080/solr/",
        "q" : "*:*"
    }
}'
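The same registration can be scripted. The following is a minimal Python sketch that uses only the standard library. The helper names build_river_request and register_river are hypothetical and not part of the plugin, and the Content-Type header is an addition that the curl command above does not send.

```python
import json
import urllib.request


def build_river_request(es_base, river_name, solr_url, query="*:*"):
    """Build (but do not send) the PUT request that registers a Solr river."""
    definition = {
        "type": "solr",
        "solr": {"url": solr_url, "q": query},
    }
    return urllib.request.Request(
        url=f"{es_base.rstrip('/')}/_river/{river_name}/_meta",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def register_river(request):
    """Send the request and return the decoded JSON response from elasticsearch."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With elasticsearch running locally, register_river(build_river_request("http://localhost:9200", "solr_river", "http://localhost:8080/solr/")) should be equivalent to the curl call. Keeping the request construction separate from the network call makes it possible to inspect the request before anything is sent.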

All parameters are optional. The following example request shows every available parameter together with its default value.

{
    "type" : "solr",
    "solr" : {
        "url" : "http://localhost:8983/solr/",
        "q" : "*:*",
        "fq" : "",
        "fl" : "",
        "wt" : "javabin",
        "qt" : "",
        "uniqueKey" : "id",
        "rows" : 10
    },
    "index" : {
        "index" : "solr",
        "type" : "import",
        "bulk_size" : 100,
        "max_concurrent_bulk" : 10,
        "mapping" : "",
        "settings": ""
    }
}
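Since every parameter has a default, a river definition only needs to state what differs. The Python sketch below overlays a few overrides on the defaults listed above, purely to make the effective configuration explicit. The function river_definition, the index name "books", and the Solr field names (type, inStock, title) are hypothetical and depend on your own schema.

```python
import copy
import json

# Default values copied from the example request above.
DEFAULTS = {
    "type": "solr",
    "solr": {
        "url": "http://localhost:8983/solr/",
        "q": "*:*",
        "fq": "",
        "fl": "",
        "wt": "javabin",
        "qt": "",
        "uniqueKey": "id",
        "rows": 10,
    },
    "index": {
        "index": "solr",
        "type": "import",
        "bulk_size": 100,
        "max_concurrent_bulk": 10,
        "mapping": "",
        "settings": "",
    },
}


def river_definition(solr=None, index=None):
    """Return a full river definition: the defaults overlaid with the given values."""
    definition = copy.deepcopy(DEFAULTS)
    definition["solr"].update(solr or {})
    definition["index"].update(index or {})
    return definition


# Hypothetical overrides: xml response format, fq and fl as arrays, larger bulks.
example = river_definition(
    solr={"wt": "xml", "fq": ["type:book", "inStock:true"], "fl": ["id", "title"]},
    index={"index": "books", "bulk_size": 500},
)
print(json.dumps(example, indent=4))
```

The printed JSON can be used as the body of the PUT request that creates the river.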

The fq and fl parameters can each be provided as either an array or a single value. You can provide your own mapping and index settings while creating the river; they are used if the index has to be created. The index is created if it does not already exist; otherwise the documents are added to the existing one. The documents are indexed using the bulk api. You can control the size of each bulk (default is 100) and the maximum number of concurrent bulk operations (default is 10). Once that limit is reached, indexing slows down and waits for one of the running bulk operations to finish; no documents are lost.
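To illustrate how bulk_size and max_concurrent_bulk interact, here is a conceptual Python sketch of bounded concurrent bulk indexing. It is not the plugin's implementation, only one common way to get the described behaviour with a semaphore. The names index_in_bulks and send_bulk are hypothetical, and send_bulk stands in for whatever actually sends one bulk to elasticsearch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def chunks(documents, bulk_size):
    """Yield consecutive lists of at most bulk_size documents."""
    for start in range(0, len(documents), bulk_size):
        yield documents[start:start + bulk_size]


def index_in_bulks(documents, send_bulk, bulk_size=100, max_concurrent_bulk=10):
    """Send documents in bulks, never running more than max_concurrent_bulk at once.

    send_bulk is a caller-supplied (hypothetical) function that indexes one bulk.
    Returns the number of documents handed to send_bulk.
    """
    slots = threading.BoundedSemaphore(max_concurrent_bulk)
    sent = 0

    def run(bulk):
        try:
            send_bulk(bulk)
        finally:
            slots.release()  # free the slot so the producer can continue

    with ThreadPoolExecutor(max_workers=max_concurrent_bulk) as pool:
        futures = []
        for bulk in chunks(documents, bulk_size):
            slots.acquire()  # blocks while the limit is reached; nothing is dropped
            futures.append(pool.submit(run, bulk))
            sent += len(bulk)
        for future in futures:
            future.result()  # re-raise any error raised while sending a bulk
    return sent
```

The producer blocks on the semaphore rather than discarding work. This mirrors the statement that indexing slows down when the limit is reached but no documents are lost.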

Limitations

Only stored fields can be retrieved from Solr, and therefore only stored fields can be indexed in elasticsearch.
The river is not meant to keep elasticsearch in sync with Solr, but only to import data once. It is possible, though, to register the river multiple times in order to import different sets of documents, even from different Solr instances.
It is recommended to create the mapping based on the existing Solr schema, so that the correct text analysis is applied while importing the documents. In the future there might be an option to generate the mapping automatically from the Solr schema.
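Registering the river multiple times amounts to creating several rivers under different names, each with its own Solr url and query. The Python sketch below only builds and prints the definitions. The river names, hosts, queries, and target index names are all hypothetical.

```python
import json


def river_definitions(imports):
    """Map each river name to its definition.

    imports maps a river name to a (solr_url, query, target_index) tuple.
    """
    return {
        name: {
            "type": "solr",
            "solr": {"url": solr_url, "q": query},
            "index": {"index": target_index},
        }
        for name, (solr_url, query, target_index) in imports.items()
    }


# Hypothetical river names, Solr instances, and queries.
rivers = river_definitions({
    "solr_books": ("http://localhost:8983/solr/", "type:book", "books"),
    "solr_authors": ("http://otherhost:8983/solr/", "type:author", "authors"),
})

for name, definition in rivers.items():
    # Each definition would be PUT to /_river/<name>/_meta, as in the curl example.
    print(f"PUT /_river/{name}/_meta")
    print(json.dumps(definition, indent=4))
```

Each import is still a one-off: running it again re-imports the matching documents rather than synchronising changes.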

License

This software is licensed under the Apache 2 license, quoted below.

Copyright 2012 Luca Cavanna

Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.
