
embulk-input-mongodb's Introduction

MongoDB input plugin for Embulk

Build Status

MongoDB input plugin for Embulk loads records from MongoDB. This plugin loads documents as single-column records (the column name is "record"). You can use filter plugins such as embulk-filter-expand_json or embulk-filter-add_time to convert the JSON column into typed columns. The rename filter is also useful for renaming the typed columns.

Overview

This plugin only works with embulk >= 0.8.8.

  • Plugin type: input
  • Guess supported: no

Configuration

  • Connection parameters (one of the following is required):

    • use MongoDB connection string URI
    • use separated URI parameters
      • hosts: list of hosts; each host is a pair of host (string, required) and port (integer, optional, default: 27017)
      • auth_method: auth method. One of scram-sha-1, mongodb-cr, auto (string, optional, default: null)
      • auth_source: auth source, i.e. the database name where the user is defined (string, optional, default: null)
      • user: (string, optional)
      • password: (string, optional)
      • database: (string, required)
      • tls: true to use TLS to connect to the host (boolean, optional, default: false)
      • tls_insecure: true to disable various certificate validations (boolean, optional, default: false)
  • collection: source collection name (string, required)

  • fields: (deprecated) array of hashes, each with the following two fields (array, required)
    • name: name of the column
    • type: column type, one of boolean, long, double, string, timestamp

  • id_field_name: name of the ObjectId field. Set this if you want to change the default name _id (string, optional, default: "_id")

  • query: a JSON document used for querying the source collection. Documents are loaded from the collection if they match this condition. (string, optional)

  • projection: a JSON document used for projecting query results. Only the fields selected by the projection are loaded. (string, optional)

  • sort: Ordering of results (string, optional)

  • aggregation: Aggregation query (string, optional) See Aggregation query for more detail.

  • batch_size: Limits the number of objects returned in one batch (integer, optional, default: 10000)

  • incremental_field: list of field names (list, optional; cannot be used with the sort option)

  • last_record: last loaded record, used for incremental loading (hash, optional)

  • stop_on_invalid_record: stop the bulk load transaction if a document includes an invalid record (such as an unsupported object type) (boolean, optional, default: false)

  • json_column_name: column name used in outputs (string, optional, default: "record")

Example

Authentication

Use separated URI parameters

in:
  type: mongodb
  hosts:
  - {host: localhost, port: 27017}
  user: myuser
  password: mypassword
  database: my_database
  auth_method: scram-sha-1
  auth_source: auth_db
  collection: "my_collection"

If you set auth_method: auto, the client negotiates the best mechanism based on the version of the server it is authenticating to.

If the server version is 3.0 or higher, the driver will authenticate using the SCRAM-SHA-1 mechanism.

Otherwise, the driver will authenticate using the MONGODB_CR mechanism.

Use URI String

in:
  type: mongodb
  uri: mongodb://myuser:mypassword@localhost:27017/my_database?authMechanism=SCRAM-SHA-1&authSource=another_database

Exporting all objects

Specify with a MongoDB connection string URI.

in:
  type: mongodb
  uri: mongodb://myuser:mypassword@localhost:27017/my_database
  collection: "my_collection"

Specify with separated URI parameters.

in:
  type: mongodb
  hosts:
  - {host: localhost, port: 27017}
  - {host: example.com, port: 27017}
  user: myuser
  password: mypassword
  database: my_database
  collection: "my_collection"

Filtering documents by query and projection

in:
  type: mongodb
  uri: mongodb://myuser:mypassword@localhost:27017/my_database
  collection: "my_collection"
  query: '{ field1: { $gte: 3 } }'
  projection: '{ "_id": 1, "field1": 1, "field2": 0 }'
  sort: '{ "field1": 1 }'

Incremental loading

in:
  type: mongodb
  uri: mongodb://myuser:mypassword@localhost:27017/my_database
  collection: "my_collection"
  query: '{ field1: { $gt: 3 } }'
  projection: '{ "_id": 1, "field1": 1, "field2": 1 }'
  incremental_field:
    - "field2"
  last_record: {"field2": 13215}

The plugin will generate a new query and sort value from these options. You can't use the incremental_field option together with the sort option.

query { field1: { $gt: 3 }, field2: { $gt: 13215 } }
sort  { "field2": 1 }  # field2 ascending
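The merge can be sketched in Python. This is a hypothetical helper, not the plugin's actual implementation (the real logic lives in MongodbInputPlugin.buildIncrementalCondition), and it assumes the query is strict JSON:

```python
import json

def build_incremental_query(query, incremental_fields, last_record):
    # Merge the user query with a {"$gt": last_value} clause per
    # incremental field, and sort ascending on those fields.
    # Hypothetical sketch of the plugin's behavior, not its real code.
    merged = json.loads(query) if query else {}
    for field in incremental_fields:
        merged[field] = {"$gt": last_record[field]}
    sort = {field: 1 for field in incremental_fields}  # 1 = ascending
    return merged, sort

query, sort = build_incremental_query(
    '{"field1": {"$gt": 3}}', ["field2"], {"field2": 13215})
print(query)  # {'field1': {'$gt': 3}, 'field2': {'$gt': 13215}}
print(sort)   # {'field2': 1}
```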

You have to specify last_record in the extended JSON form when the field type is ObjectId or DateTime.

# ObjectId field
in:
  type: mongodb
  incremental_field:
    - "_id"
  last_record: {"_id": {"$oid": "5739b2261c21e58edfe39716"}}

# DateTime field
in:
  type: mongodb
  incremental_field:
    - "time_field"
  last_record: {"time_field": {"$date": "2015-01-25T13:23:15.000Z"}}
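The $oid and $date wrappers above follow MongoDB's extended JSON form. A minimal Python sketch of producing them, assuming values are either datetime objects or 24-character ObjectId hex strings (the helper name is made up for illustration):

```python
from datetime import datetime, timezone

def to_extended_json_value(value):
    # Wrap datetimes as {"$date": ...} and ObjectId-like hex strings
    # as {"$oid": ...}; anything else passes through unchanged.
    # Hypothetical helper for building last_record values.
    if isinstance(value, datetime):
        return {"$date": value.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"}
    if isinstance(value, str) and len(value) == 24 \
            and all(c in "0123456789abcdef" for c in value.lower()):
        return {"$oid": value}
    return value

print(to_extended_json_value(datetime(2015, 1, 25, 13, 23, 15, tzinfo=timezone.utc)))
# {'$date': '2015-01-25T13:23:15.000Z'}
print(to_extended_json_value("5739b2261c21e58edfe39716"))
# {'$oid': '5739b2261c21e58edfe39716'}
```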

Run incremental load

$ embulk run /path/to/config.yml -c config-diff.yml

Aggregation query

This plugin supports aggregation queries. You can write a complex query like the one below.

The aggregation option can't be used with the sort, limit, skip, or query options. Incremental loading also doesn't work with aggregation queries.

in:
  type: mongodb
  aggregation: { $match: {"int32_field": {"$gte": 5}} }

See also Aggregation and Aggregation Pipeline Stages in the MongoDB Manual.

Advanced usage with filter plugins

in:
  type: mongodb
  uri: mongodb://myuser:mypassword@localhost:27017/my_database
  collection: "my_collection"
  query: '{ "age": { $gte: 3 } }'
  projection: '{ "_id": 1, "age": 1, "ts": 1, "firstName": 1, "lastName": 1 }'

filters:
  # convert json column into typed columns
  - type: expand_json
    json_column_name: record
    expanded_columns:
      - {name: _id, type: long}
      - {name: ts, type: string}
      - {name: firstName, type: string}
      - {name: lastName, type: string}

  # rename column names
  - type: rename
    columns:
      _id: id
      firstName: first_name
      lastName: last_name

  # convert string "ts" column into timestamp "time" column
  - type: add_time
    from_column:
      name: ts
      timestamp_format: "%Y-%m-%dT%H:%M:%S.%N%z"
    to_column:
      name: time
      type: timestamp

Build

$ ./gradlew gem

Test

First install Docker and Docker Compose, then run docker-compose up -d so that a MongoDB server is launched locally. You can then run the tests with ./gradlew test.

$ docker-compose up -d
Creating embulk-input-mongodb_server ... done
Creating mongo-express               ... done
Creating mongoClientTemp             ... done

$ docker-compose ps
           Name                          Command                 State                            Ports
------------------------------------------------------------------------------------------------------------------------------
embulk-input-mongodb_server   docker-entrypoint.sh mongod      Up           0.0.0.0:27017->27017/tcp, 0.0.0.0:27018->27018/tcp
mongo-express                 tini -- /docker-entrypoint ...   Up           0.0.0.0:8081->8081/tcp
mongoClientTemp               docker-entrypoint.sh mongo ...   Restarting

$ ./gradlew test  # -t to watch change of files and rebuild continuously

embulk-input-mongodb's People

Contributors

amaya382, dmikurube, hakobera, kamatama41, legiangthanh, sakama, shoken0x


embulk-input-mongodb's Issues

Incremental loading resets last_record field in the config file if no new record is uploaded

Hi,

When I use incremental loading as specified in the README, and there have been new records (as specified in incremental_field) since the last run, the plugin correctly uploads only the new records and writes the last record to the config file. However, if there are no new records since the last run, the plugin correctly uploads nothing, but it also empties the last_record field of the config file. This causes problems, as the next run of Embulk will upload everything again. Just wondering if this is a bug or intended behavior, and whether there are any workarounds. Thanks! @friendofasquid

error when using query with incremental load

An error occurs when using the incremental_field option together with query.

org.embulk.exec.PartialExecutionException: org.embulk.config.ConfigException: Could not generate new query for incremental load.
        at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
        at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:566)
        at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:35)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
        at org.embulk.spi.Exec.doWith(Exec.java:22)
        at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
        at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:242)
        at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:291)
        at org.embulk.EmbulkRunner.run(EmbulkRunner.java:155)
        at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:431)
        at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:90)
        at org.embulk.cli.Main.main(Main.java:64)
        Suppressed: java.lang.NullPointerException
                at org.embulk.exec.BulkLoader.doCleanup(BulkLoader.java:463)
                at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:397)
                at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:394)
                at org.embulk.spi.Exec.doWith(Exec.java:22)
                at org.embulk.exec.BulkLoader.cleanup(BulkLoader.java:394)
                at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:245)
                ... 5 more
Caused by: org.embulk.config.ConfigException: Could not generate new query for incremental load.
        at org.embulk.input.mongodb.MongodbInputPlugin.buildIncrementalCondition(MongodbInputPlugin.java:394)
        at org.embulk.input.mongodb.MongodbInputPlugin.transaction(MongodbInputPlugin.java:86)
        at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:507)
        ... 11 more

Error: org.embulk.config.ConfigException: Could not generate new query for incremental load.

how to query _id with timestamp

I want to query _id field with timestamp, but hit 'query' parameter error.

following is my yaml:

in:
  type: mongodb
  hosts:
  - {host: 192.168.99.100, port: 32768}
  database: test
  collection: "restaurants"
  query: '{ "_id": { $gt: ObjectId(Math.floor((new Date("2015/4/4"))/1000).toString(16) + "0000000000000000") } }'
Error: Invalid JSON string was given for 'query' parameter. [{ "_id": { $gt: ObjectId(Math.floor((new Date("2015/4/4"))/1000).toString(16) + "0000000000000000") } }]

Thanks for the help.
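The query option takes plain JSON, so shell JavaScript such as ObjectId(...) is rejected. One workaround is to precompute the ObjectId boundary outside MongoDB and pass it in extended JSON form. A Python sketch (the helper name is made up for illustration):

```python
from datetime import datetime, timezone

def objectid_lower_bound(dt):
    # The first 4 bytes of an ObjectId are the creation time in epoch
    # seconds, so the smallest ObjectId for a given time is that
    # timestamp in hex followed by 16 zero hex digits.
    return format(int(dt.timestamp()), "08x") + "0" * 16

oid = objectid_lower_bound(datetime(2015, 4, 4, tzinfo=timezone.utc))
print('{"_id": {"$gt": {"$oid": "%s"}}}' % oid)
# {"_id": {"$gt": {"$oid": "551f29800000000000000000"}}}
```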

Run an aggregate query

Perhaps not in the spirit of Embulk, but I would like to send an aggregate query like the one below. Is this currently possible?

db.getCollection('users').aggregate([
    { $match: {
        "active_from":{"$lte":new Date() },
        $or: [
            { "active_until":{"$gt":new Date() }},
            { "active_until": null }
        ]
    } },
    { $group: {_id: "$account_id", total: {$sum: 1} } }
]);

Issue decoding UUID

Hello,

I've been trying to use this source without success for UUID fields.

Managed to make it work by trying to decode the binary as a UUID before decoding it as plain binary:

[screenshot of the workaround]

Not a mongo expert, but maybe there is a better way to actually identify the type as UUID and use the codec as codec from the codec registry.

Anyone had any similar issue?

Thank you!
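One possible direction, sketched in Python: check the BSON binary subtype and decode 16-byte payloads of the UUID subtypes with the standard uuid module. This is an approximation only; subtype 3 is the legacy UUID subtype, and its byte order depends on the driver that wrote it:

```python
import uuid

def binary_to_uuid(data, subtype):
    # Subtype 4 is the standard UUID binary subtype; subtype 3 is the
    # legacy one (byte order is driver-dependent, assumed big-endian
    # here). Other subtypes fall back to a hex string.
    if subtype in (3, 4) and len(data) == 16:
        return str(uuid.UUID(bytes=data))
    return data.hex()

print(binary_to_uuid(bytes(range(16)), 4))
# 00010203-0405-0607-0809-0a0b0c0d0e0f
```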

Support JSON type

This is a BREAKING CHANGE.
Works only with embulk >= 0.8.7.

For the first release, just support the JSON column type.

Source data

{ "_id": "55eae883689a08361045d64a", "name": "obj1", "rank": 1, "value": 1.1, "created_at": { "$date" : 1441533998786 }, "embeded": { "key": "value1" } }
{ "_id": "55eae883689a08361045d64b", "name": "obj2", "rank": 2, "value": 2.2, "created_at": { "$date" : 1441533988786 }, "embeded": { "key": "value2" } }
{ "_id": "55eae883689a08361045d64c", "name": "obj3", "rank": 3, "value": 3.3, "created_at": { "$date" : 1441533978786 }, "embeded": { "key": "value3", "key2": "value3-2", "key3": ["v1", "v2"]} }
{ "_id": "55eae883689a08361045d64d", "name": "obj4", "rank": 4, "value": 4.4, "created_at": { "$date" : 1441533968786 }, "embeded": { "key": "value4" } }
{ "_id": "55eae883689a08361045d64e", "name": "obj5", "rank": 5, "value": 5.5, "created_at": { "$date" : 1441533958786 }, "embeded": { "key": "value5" } }
{ "_id": "55eae883689a08361045d64f", "name": "obj6", "rank": 6, "value": 6.6, "created_at": { "$date" : 1441533948786 }, "embeded": { "key": "value6" } }
{ "_id": "55eae883689a08361045d650", "name": "obj7", "rank": 7, "value": 7.7, "created_at": { "$date" : 1441533938786 }, "embeded": { "key": "value7" } }
{ "_id": "55eae883689a08361045d651", "name": "obj8", "rank": 8, "value": 8.8, "created_at": { "$date" : 1441533928786 }, "embeded": { "key": "value8" } }
{ "_id": "55eae883689a08361045d652", "name": "obj9", "rank": 9, "value": 9.9, "created_at": { "$date" : 1441533918786 }, "embeded": { "key": "value9" } }

embulk config

in:
  type: mongodb
  uri: mongodb://localhost:27017/my_database
  collection: "my_collection"
  fields:
    - { name: id, type: string }
    - { name: name, type: string }
    - { name: rank, type: long }
    - { name: value, type: double }
    - { name: created_at, type: timestamp }
    - { name: embeded, type: json }
  query: '{ rank: { $gte: 3 } }'
  sort: '{ rank: -1 }'
  batch_size: 100
out:
  type: file
  path_prefix: ./tmp/full
  file_ext: csv
  formatter:
    type: csv
    header_line: true
    charset: UTF-8
    newline: CRLF

Output

id,name,rank,value,created_at,embeded
55eae883689a08361045d652,obj9,9,9.9,2015-09-06 10:05:18.786000 +0000,"{""key"":""value9""}"
55eae883689a08361045d651,obj8,8,8.8,2015-09-06 10:05:28.786000 +0000,"{""key"":""value8""}"
55eae883689a08361045d650,obj7,7,7.7,2015-09-06 10:05:38.786000 +0000,"{""key"":""value7""}"
55eae883689a08361045d64f,obj6,6,6.6,2015-09-06 10:05:48.786000 +0000,"{""key"":""value6""}"
55eae883689a08361045d64e,obj5,5,5.5,2015-09-06 10:05:58.786000 +0000,"{""key"":""value5""}"
55eae883689a08361045d64d,obj4,4,4.4,2015-09-06 10:06:08.786000 +0000,"{""key"":""value4""}"
55eae883689a08361045d64c,obj3,3,3.3,2015-09-06 10:06:18.786000 +0000,"{""key2"":""value3-2"",""key3"":[""v1"",""v2""],""key"":""value3""}"

Allow limiting resultset returned

I would like to use limit() to reduce the number of results returned in a single query. Is that possible now? I don't think so.

Would be happy to help contribute, if I could have some help on where to start!

readName can only be called when State is NAME, not when State is VALUE.

I got this error.

org.embulk.exec.PartialExecutionException: org.bson.BsonInvalidOperationException: readName can only be called when State is NAME, not when State is VALUE.
    at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(org/embulk/exec/BulkLoader.java:363)
    at org.embulk.exec.BulkLoader.doRun(org/embulk/exec/BulkLoader.java:572)
    at org.embulk.exec.BulkLoader.access$000(org/embulk/exec/BulkLoader.java:33)
    at org.embulk.exec.BulkLoader$1.run(org/embulk/exec/BulkLoader.java:374)
    at org.embulk.exec.BulkLoader$1.run(org/embulk/exec/BulkLoader.java:370)
    at org.embulk.spi.Exec.doWith(org/embulk/spi/Exec.java:25)
    at org.embulk.exec.BulkLoader.run(org/embulk/exec/BulkLoader.java:370)
    at org.embulk.EmbulkEmbed.run(org/embulk/EmbulkEmbed.java:180)
    at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
    at RUBY.run(uri:classloader:/embulk/runner.rb:84)
    at RUBY.run(uri:classloader:/embulk/command/embulk_run.rb:306)
    at RUBY.<top>(uri:classloader:/embulk/command/embulk_main.rb:2)
    at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:937)
    at usr.local.bin.embulk.embulk.command.embulk_bundle.<top>(file:/usr/local/bin/embulk!/embulk/command/embulk_bundle.rb:30)
    at java.lang.invoke.MethodHandle.invokeWithArguments(java/lang/invoke/MethodHandle.java:599)
    at org.embulk.cli.Main.main(org/embulk/cli/Main.java:23)
Caused by: org.bson.BsonInvalidOperationException: readName can only be called when State is NAME, not when State is VALUE.
    at org.bson.AbstractBsonReader.throwInvalidState(org/bson/AbstractBsonReader.java:634)
    at org.bson.AbstractBsonReader.readName(org/bson/AbstractBsonReader.java:546)
    at org.embulk.input.mongodb.ValueCodec.decode(org/embulk/input/mongodb/ValueCodec.java:52)
    at org.embulk.input.mongodb.ValueCodec.decode(org/embulk/input/mongodb/ValueCodec.java:24)
    at com.mongodb.connection.ReplyMessage.<init>(com/mongodb/connection/ReplyMessage.java:57)
    at com.mongodb.connection.QueryProtocol.execute(com/mongodb/connection/QueryProtocol.java:305)
    at com.mongodb.connection.QueryProtocol.execute(com/mongodb/connection/QueryProtocol.java:54)
    at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(com/mongodb/connection/DefaultServer.java:159)
    at com.mongodb.connection.DefaultServerConnection.executeProtocol(com/mongodb/connection/DefaultServerConnection.java:286)
    at com.mongodb.connection.DefaultServerConnection.query(com/mongodb/connection/DefaultServerConnection.java:209)
    at com.mongodb.operation.FindOperation$1.call(com/mongodb/operation/FindOperation.java:496)
    at com.mongodb.operation.FindOperation$1.call(com/mongodb/operation/FindOperation.java:482)
    at com.mongodb.operation.OperationHelper.withConnectionSource(com/mongodb/operation/OperationHelper.java:239)
    at com.mongodb.operation.OperationHelper.withConnection(com/mongodb/operation/OperationHelper.java:212)
    at com.mongodb.operation.FindOperation.execute(com/mongodb/operation/FindOperation.java:482)
    at com.mongodb.operation.FindOperation.execute(com/mongodb/operation/FindOperation.java:79)
    at com.mongodb.Mongo.execute(com/mongodb/Mongo.java:772)
    at com.mongodb.Mongo$2.execute(com/mongodb/Mongo.java:759)
    at com.mongodb.OperationIterable.iterator(com/mongodb/OperationIterable.java:47)
    at com.mongodb.FindIterableImpl.iterator(com/mongodb/FindIterableImpl.java:143)
    at org.embulk.input.mongodb.MongodbInputPlugin.run(org/embulk/input/mongodb/MongodbInputPlugin.java:173)
    at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor.runInputTask(org/embulk/exec/LocalExecutorPlugin.java:294)
    at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor.access$000(org/embulk/exec/LocalExecutorPlugin.java:212)
    at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor$1.call(org/embulk/exec/LocalExecutorPlugin.java:257)
    at org.embulk.exec.LocalExecutorPlugin$ScatterExecutor$1.call(org/embulk/exec/LocalExecutorPlugin.java:253)
    at java.util.concurrent.FutureTask.run(java/util/concurrent/FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java/util/concurrent/ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java/util/concurrent/ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(java/lang/Thread.java:745)

Error: org.bson.BsonInvalidOperationException: readName can only be called when State is NAME, not when State is VALUE.

Support specifying authSource and authMechanism

I'd like to migrate to using separated URI parameters, but I am missing the ability to specify some required settings.

Currently I can specify:

uri: mongodb://myuser:mypassword@localhost:27017/my_database?authMechanism=SCRAM-SHA-1&authSource=another_database

However, I can't do this with the separated parameters. Could you add the ability to specify uri_querystring or the specific parameters that I need?
