rethinkdb-elasticsearch-stream's Introduction

rethinkdb-elasticsearch-stream

🔄 sync RethinkDB tables to Elasticsearch using changefeeds

A JavaScript-based replacement for the deprecated Elasticsearch RethinkDB River plugin. This can populate your Elasticsearch instance using data from a RethinkDB instance, keep it up to date using changefeeds, and allow you to modify the documents before they're copied.

✨ Features:

  • Simple: specify connections and tables to copy as-is to Elasticsearch
  • Flexible: accepts a transform function for each table to modify what's copied
  • Tested

Usage

Simple example:

import rethinkdbElasticsearchStream from 'rethinkdb-elasticsearch-stream'

await rethinkdbElasticsearchStream({
  backfill: true,
  elasticsearch: { host: '127.0.0.1', port: 9200 },
  rethinkdb: { host: '127.0.0.1', port: 28015 },
  tables: [{ db: 'megacorp', table: 'users' }],
  watch: true
});

Everything:

import rethinkdbElasticsearchStream from 'rethinkdb-elasticsearch-stream'

await rethinkdbElasticsearchStream({
  // If the Elasticsearch instance should be populated with existing RethinkDB data
  backfill: true,

  // Connection details for an Elasticsearch instance
  elasticsearch: {
    host: '127.0.0.1',
    port: 9200,
    // (optional) protocol for connection (`http` or `https`).  Defaults to `http`.
    protocol: 'http'
  },

  // Connection details for the RethinkDB instance to be copied
  // See `rethinkdbdash` (https://github.com/neumino/rethinkdbdash) for all possible options.
  rethinkdb: {
    host: '127.0.0.1',
    port: 28015,
    // (optional) protocol for connection (`http` or `https`).  Defaults to `http`.
    protocol: 'http'
  },

  // Tables to duplicate and watch for changes
  tables: [
    {
      // Database containing table
      db: 'megacorp',
      // (optional) Handle when a document is deleted in Rethink
      // This is detected when the new value for a document is null
      // If this is not specified, a DELETE is sent to Elasticsearch for the
      // id of the old value
      deleteTransform: async ({ db, document, oldDocument, table }) => {
        if (await someImportantCheck()) {
          return oldDocument;
        }

        // this is the default behavior for a delete
        return {
          // import { _delete } from 'rethinkdb-elasticsearch-stream';
          //
          // this is a special Symbol that tells the library that this should
          // be a DELETE. It can also be used in the regular transform function
          _delete,
          id: oldDocument.id,
        }
      },
      // (optional) Type field for Elasticsearch.  This is similar to a "table" in
      // RethinkDB, and is the second portion of the URL path (index/db is the first).
      esType: 'webUsers',
      // (optional) ID field.  If specified, changes are upserted into Elasticsearch
      // Note: Elasticsearch-specific field names cannot be used (e.g. `_id`)
      // If that's important to you, open an issue.
      idKey: 'id',
      // Table to copy
      table: 'users',
      // (optional) Modify what will be saved in Elasticsearch.
      // This can be a synchronous function or one that returns a Promise.
      // If `null` or `undefined` is returned, the document is not saved.
      // `db` and `table` are specified for convenience
      transform: async ({ db, document, oldDocument, table }) => {
        await doSomethingImportant()
        return document;
      }
    }
  ],

  // If the Elasticsearch instance should be updated when RethinkDB emits a changefeed event
  watch: true
});
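
As a minimal sketch of the `transform` option described above, the function below drops a sensitive field before a document is copied and skips documents that should not be indexed. The field names `passwordHash` and `internal` are hypothetical and chosen only for illustration; the `{ db, document, oldDocument, table }` argument and the "return `null` to skip" behavior come from the comments in the example above.

```javascript
// Hypothetical transform: the field names `passwordHash` and `internal`
// are assumptions for illustration, not part of the library.
function publicUserTransform({ document }) {
  // Guard against a missing document, and skip documents flagged as internal.
  // Returning null means the document is not saved.
  if (!document || document.internal === true) {
    return null;
  }

  // Copy every field except the sensitive one.
  const { passwordHash, ...rest } = document;
  return rest;
}

// Possible usage in the `tables` option:
// tables: [{ db: 'megacorp', table: 'users', transform: publicUserTransform }]
```

Keeping the transform a plain function with no side effects makes it easy to unit test without a running RethinkDB or Elasticsearch instance.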

Install

With Yarn or npm installed, run:

yarn add rethinkdb-elasticsearch-stream

# ...or, if using `npm`
npm install rethinkdb-elasticsearch-stream

See Also

rethinkdb-elasticsearch-stream was inspired by:

License

MIT

rethinkdb-elasticsearch-stream's People

Contributors

blakek, meyer-mcmains, greenkeeper[bot], dependabot[bot], ruebel, zstickles-gsandf

rethinkdb-elasticsearch-stream's Issues

final mapping would have more than 1 type

I am trying to do full text search on rethinkdb. I took this package for a spin. I am getting the following error.

Rejecting mapping update to [ta3] as the final mapping would have more than 1 type

rethinkdbElasticsearchStream({
  backfill: true,
  elasticsearch: { host: '127.0.0.1', port: 9200 },
  rethinkdb: { host: '127.0.0.1', port: 28015 },
  tables: [
    { db: 'db1', table: 'lectures' },
    { db: 'db1', table: 'speakers' },
    { db: 'db2', table: 'clipnshare' }
  ],
  watch: true
});

I am pretty new to Elasticsearch, but I would guess that this is related to this ES update.

Do you have any plans on updating this repo?

What do you suggest?
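
Elasticsearch 6.x rejects a mapping update when an index would end up with more than one mapping type, which is what the error message reports. The sketch below checks a `tables` configuration for that situation before starting a sync. It assumes, based on the README comments, that each `db` becomes an index and each table (or its `esType`, when set) becomes a type; the helper name `findMultiTypeIndexes` is hypothetical and not part of the library.

```javascript
// Assumption: `db` maps to the Elasticsearch index, and `esType` (falling
// back to the table name) maps to the type within that index.
function findMultiTypeIndexes(tables) {
  const typesByIndex = new Map();

  for (const { db, table, esType } of tables) {
    if (!typesByIndex.has(db)) {
      typesByIndex.set(db, new Set());
    }
    typesByIndex.get(db).add(esType || table);
  }

  // Keep only the indexes that would receive more than one type.
  return [...typesByIndex]
    .filter(([, types]) => types.size > 1)
    .map(([index, types]) => ({ index, types: [...types] }));
}

const conflicts = findMultiTypeIndexes([
  { db: 'db1', table: 'lectures' },
  { db: 'db1', table: 'speakers' },
  { db: 'db2', table: 'clipnshare' }
]);

console.log(conflicts);
// [ { index: 'db1', types: [ 'lectures', 'speakers' ] } ]
```

The check only reports which indexes would collide. How to resolve a collision depends on the Elasticsearch version in use and on how the library builds its requests.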

error while transfer

UnhandledPromiseRejectionWarning: Error: Could not connect to Elasticsearch server

I get the above error even though I am able to use curl to query my data from Elasticsearch on localhost.
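
The `UnhandledPromiseRejectionWarning` prefix means the promise returned by the library was rejected and nothing caught it. The sketch below does not fix the connection itself, but it catches the rejection and logs the Elasticsearch address that was configured, which can help spot a wrong host, port, or protocol. The helper name `runWithErrorHandling` is hypothetical, and `startSync` stands in for the library's default export so the sketch does not depend on the package being installed.

```javascript
// Hypothetical wrapper: `startSync` is expected to be the function imported
// from 'rethinkdb-elasticsearch-stream' (or any function returning a Promise).
async function runWithErrorHandling(startSync, options) {
  try {
    await startSync(options);
    return { ok: true };
  } catch (error) {
    // Report which address was configured so it can be compared with the
    // one that works from curl. `protocol` defaults to `http` per the README.
    const { host, port, protocol = 'http' } = options.elasticsearch || {};
    console.error(`Sync failed for ${protocol}://${host}:${port}: ${error.message}`);
    return { ok: false, error };
  }
}

// Possible usage:
// const result = await runWithErrorHandling(rethinkdbElasticsearchStream, {
//   backfill: true,
//   elasticsearch: { host: '127.0.0.1', port: 9200 },
//   rethinkdb: { host: '127.0.0.1', port: 28015 },
//   tables: [{ db: 'megacorp', table: 'users' }],
//   watch: true
// });
```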

Add a way to delete documents from Elasticsearch

Currently, the transform can only return a document, which will update Elasticsearch, or null, which does nothing. There needs to be a way to return something from the transform that signals the document needs to be removed from Elasticsearch.

Add .npmignore

Need to add a .npmignore and ignore things that aren't useful when the package is used as a library.

Action required: Greenkeeper could not be activated 🚨

🚨 You need to enable Continuous Integration on all branches of this repository. 🚨

To enable Greenkeeper, you need to make sure that a commit status is reported on all branches. This is required by Greenkeeper because we are using your CI build statuses to figure out when to notify you about breaking changes.

Since we did not receive a CI status on the greenkeeper/initial branch, we assume that you still need to configure it.

If you have already set up a CI for this repository, you might need to check your configuration. Make sure it will run on all new branches. If you don’t want it to run on every branch, you can whitelist branches starting with greenkeeper/.

We recommend using Travis CI, but Greenkeeper will work with every other CI service as well.

Once you have installed CI on this repository, you’ll need to re-trigger Greenkeeper’s initial Pull Request. To do this, please delete the greenkeeper/initial branch in this repository, and then remove and re-add this repository to the Greenkeeper integration’s white list on GitHub. You'll find this list on your repo or organization’s settings page, under Installed GitHub Apps.
