mongo-es's People

Contributors

dependabot[bot], renzholy

mongo-es's Issues

tailOpLog Error: should not complete

I was trying to sync with an Alicloud MongoDB service (replica set).
When I ran the example, I got the error below:

run 2018-08-17T02:46:30.317Z
put mapping banner_v1 banner
from checkpoint books.books___banner_v1.banner CheckPoint { phase: 'tail', time: 2017-08-16T10:55:24.474Z }
(node:15569) DeprecationWarning: current URL string parser is deprecated, and will be removed in a future version. To use the new parser, pass option { useNewUrlParser: true } to MongoClient.connect.
tail books.books___banner_v1.banner from 2017-08-16T10:55:24.474Z
tail books.books___banner_v1.banner Error: should not complete
at AnonymousObserver.tail.bufferWithTimeOrCount.subscribe [as _onCompleted] (/home/michael/another-connector/node_modules/mongo-es/dist/src/processor.js:302:33)
at AnonymousObserver.Rx.AnonymousObserver.AnonymousObserver.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:1843:12)
at AnonymousObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/home/michael/another-connector/node_modules/rx/dist/rx.js:1782:14)
at AnonymousObserver.tryCatcher (/home/michael/another-connector/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:5897:56)
at AutoDetachObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/home/michael/another-connector/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:3751:37)
at MergeAllObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/home/michael/another-connector/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.tryCatcher (/home/michael/another-connector/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:5897:56)
tailOpLog Error: should not complete
at AnonymousObserver.tail.bufferWithTimeOrCount.subscribe [as _onCompleted] (/home/michael/another-connector/node_modules/mongo-es/dist/src/processor.js:302:33)
at AnonymousObserver.Rx.AnonymousObserver.AnonymousObserver.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:1843:12)
at AnonymousObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/home/michael/another-connector/node_modules/rx/dist/rx.js:1782:14)
at AnonymousObserver.tryCatcher (/home/michael/another-connector/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:5897:56)
at AutoDetachObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/home/michael/another-connector/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:3751:37)
at MergeAllObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/home/michael/another-connector/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.tryCatcher (/home/michael/another-connector/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/home/michael/another-connector/node_modules/rx/dist/rx.js:5897:56)

Here is my config.json:

{
  "controls": {
    "mongodbReadCapacity": 10000,
    "elasticsearchBulkSize": 5000,
    "elasticsearchBulkInterval": 5000,
    "indexNameSuffix": "_v1"
  },
  "mongodb": {
    "url": "mongodb://michael:[email protected]:3717,dds-xxxxxx.mongodb.rds.aliyuncs.com:3717/books?replicaSet=mgset-xxxxx",
    "options": {
      "authSource": "admin",
      "readPreference": "secondaryPreferred"
    }
  },
  "elasticsearch": {
    "options": {
      "host": "http://localhost:9200",
      "apiVersion": "6.3"
    },
    "indices": [
      {
        "index": "banner",
        "body": {
          "settings": {
            "index": {
              "number_of_shards": 3,
              "number_of_replicas": 1,
              "mapper.dynamic": false
            }
          }
        }
      }
    ]
  },
  "tasks": [
    {
      "from": {
        "phase": "tail",
        "time": "2017-08-16T10:55:24.474Z"
      },
      "extract": {
        "db": "books",
        "collection": "books",
        "projection": {
          "name": 1
        }
      },
      "transform": {
        "mapping": {
          "name": "name"
        }
      },
      "load": {
        "index": "banner",
        "type": "banner",
        "body": {
          "dynamic": false,
          "properties": {
            "name": {
              "type": "text",
              "fields": {
                "exact": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    }
  ]
}

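Processing has a 2-second delay

Can it sync in real time? I tried it out today and it seems quite good, but syncing is slower than mongo-connector, and sometimes the lag is large. Is this caused by my configuration?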

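Usage feedback

I ran the command mongo-es ./config.json, but what gets synced is not the data in my database. Instead it is:
[image]

What runs on Linux is:
[image]

How can I sync the real data?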

Data sync problem

The data seems to have been synced, but this is what I see in the ES UI:
[image]
Why are the fields not displayed?

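How to replace values?

Hello, and thank you for your work. I am currently trying out this plugin and find it very practical.
I have one question: is there currently a way to replace values in the extracted data, that is, to replace values that match a condition with other values? For example, replacing values in a time field that do not match the time format with null, or dropping that field altogether. If there is, how do I use it? Any pointers would be much appreciated. Thank you!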
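The kind of replacement described here can be sketched as a small clean-up step applied to each document before it is indexed. This is only an illustration: `isValidTime` and `sanitizeDateField` are hypothetical names, not part of mongo-es, and the sketch assumes a value counts as a valid time when it is a `Date` or a string that `Date.parse` accepts.

```typescript
type Doc = Record<string, unknown>

// Hypothetical helper (not a mongo-es API): decides whether a value looks like a time.
function isValidTime(value: unknown): boolean {
  if (value instanceof Date) return !isNaN(value.getTime())
  if (typeof value === 'string') return !isNaN(Date.parse(value))
  return false
}

// Hypothetical helper: returns a copy of the document in which an invalid time
// field is either set to null ('null' mode) or removed ('drop' mode).
// Documents whose field is missing or already valid are returned unchanged.
function sanitizeDateField(doc: Doc, field: string, mode: 'null' | 'drop'): Doc {
  if (!(field in doc) || isValidTime(doc[field])) return doc
  const copy: Doc = { ...doc }
  if (mode === 'drop') {
    delete copy[field]
  } else {
    copy[field] = null
  }
  return copy
}
```

Returning a copy rather than mutating the input keeps the original document available for logging or retries.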

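Drop operations in MongoDB

When a drop operation is performed in MongoDB, mongo-es does not sync the deletion. How can this be solved? Any pointers would be appreciated!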

This error appears on Windows 10. What is going on?

D:\mongo-es>mongo-es config.json
run 2017-06-20T02:43:49.685Z
run { Error: [mapper_parsing_exception] analyzer [ik_max_word] not found for field [property0]
at respond (C:\Users\tangniyuqi\AppData\Roaming\npm\node_modules\mongo-es\node_modules\elasticsearch\src\lib\transport.js:295:15)
at checkRespForFailure (C:\Users\tangniyuqi\AppData\Roaming\npm\node_modules\mongo-es\node_modules\elasticsearch\src\lib\transport.js:254:7)
at HttpConnector.<anonymous> (C:\Users\tangniyuqi\AppData\Roaming\npm\node_modules\mongo-es\node_modules\elasticsearch\src\lib\connectors\http.js:159:7)
at IncomingMessage.bound (C:\Users\tangniyuqi\AppData\Roaming\npm\node_modules\mongo-es\node_modules\elasticsearch\node_modules\lodash\dist\lodash.js:729:21)
at emitNone (events.js:110:20)
at IncomingMessage.emit (events.js:207:7)
at endReadableNT (_stream_readable.js:1047:12)
at _combinedTickCallback (internal/process/next_tick.js:102:11)
at process._tickCallback (internal/process/next_tick.js:161:9)
status: 400,
displayName: 'BadRequest',
message: '[mapper_parsing_exception] analyzer [ik_max_word] not found for field [property0]',
path: '/index0_v1/_mapping/type0',
query: {},
body: '{"dynamic":false,"_parent":{"type":"type1"},"properties":{"property0":{"type":"text","norms":false,"analyzer":"ik_max_word","search_analyzer":"ik_smart"},"property1":{"type":"keyword"}}}',
statusCode: 400,
response: '{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"analyzer [ik_max_word] not found for field [property0]"}],"type":"mapper_parsing_exception","reason":"analyzer [ik_max_word] not found for field [property0]"},"status":400}',
toString: [Function],
toJSON: [Function] }

running fine but can't see any data in Elasticsearch

I was running in scan mode, and it kept logging lines like these:

scan db.collection -> index.type 5000 59840e8edcbfc715cd9380b7
scan db.collection -> index.type 5000 59840e87dcbfc715cd936d2f
scan db.collection -> index.type 5000 59840e87dcbfc715cd9359a7
scan db.collection -> index.type 5000 59840e80dcbfc715cd93461f
scan db.collection -> index.type 5000 59840e80dcbfc715cd933297

But when I run in tail mode, it logs this:

tail db.collection -> index.type start from xxxx

I never see an error message, so I think my config is correct and mongo-es is running fine. Why can't I see any data written to Elasticsearch?

FYI, I previously used mongo-connector to import data into the same Elasticsearch index. Now I am using mongo-es to import data into a different type of that index; the two types have a parent-child relationship.

Debug mode does not run

Debug mode does not run: NODE_ENV=dev mongo-es ./config.json
Non-debug mode runs normally. Is there something I need to configure?

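Syncing a large, frequently updated database

Main question

I have a very large database that is updated frequently, and one scan sync takes more than 2 days. However, the MongoDB oplog is retained for 24 hours by default, so how can I avoid losing data during the scan sync? (For example, a document is updated right after it has been synced.)

  • Should tail and scan both be enabled for syncing at the same time?
  • Is it better to run tail and scan in a single instance, or in two separate instances?
  • After the scan finishes, it switches to tail automatically. How should the two tails be handled at that point? (The finish time of the scan cannot be predicted, so even if one is shut down manually there will be a period with two tails running.)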
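The constraint behind this question can be stated as simple arithmetic: if tailing is to resume from the moment the scan began, the oplog must still hold entries from that moment when the scan ends. The sketch below only illustrates that reasoning. The helper names are hypothetical and not part of mongo-es, and it assumes the oplog window is known as a number of hours.

```typescript
// Hypothetical helper: true if oplog entries written at scan start
// would still be available when the scan finishes.
function oplogCoversScan(scanDurationHours: number, oplogWindowHours: number): boolean {
  return scanDurationHours < oplogWindowHours
}

// Hypothetical helper: the earliest point in time a tail could start from,
// given the current time and the oplog window.
function earliestTailStart(now: Date, oplogWindowHours: number): Date {
  const windowMs = oplogWindowHours * 60 * 60 * 1000
  return new Date(now.getTime() - windowMs)
}
```

With the numbers from the question (a scan of about 48 hours against a 24-hour window), `oplogCoversScan(48, 24)` is false, which is exactly the gap being asked about.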

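Everything looks normal, but there is no data in ES

I configured everything according to the documentation and everything looks normal, but there is no data in ES. Why?

# Started mongo-es, then added five records to the collection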
➜  ~ mongo-es ./mongo_es/config.json
run 2018-08-13T10:01:11.672Z
put mapping foo anyong

tail test.anyong___foo.anyong 1 2018-08-13T10:18:31.000Z
tail test.anyong___foo.anyong 1 2018-08-13T10:24:04.000Z
tail test.anyong___foo.anyong 1 2018-08-13T10:28:15.000Z
tail test.anyong___foo.anyong 1 2018-08-13T10:29:31.000Z
tail test.anyong___foo.anyong 1 2018-08-13T10:41:42.000Z

# The MongoDB collection has five records
rs0:PRIMARY> db.anyong.count()
5

Why is the count 0 when I query ES?

➜  ~ curl -XGET '192.168.0.25:9200/foo/anyong/_count?pretty&pretty'
{
  "count" : 0,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  }
}

How to limit the rate when tailing from 3 hours ago

Tailing data from 3 hours ago (a large volume of data) fails with the following error:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

<--- Last few GCs --->

764250 ms: Mark-sweep 1274.1 (1434.2) -> 1274.0 (1434.2) MB, 2383.4 / 0 ms [allocation failure] [GC in old space requested].
766655 ms: Mark-sweep 1274.0 (1434.2) -> 1274.0 (1434.2) MB, 2404.9 / 0 ms [allocation failure] [GC in old space requested].
769073 ms: Mark-sweep 1274.0 (1434.2) -> 1274.0 (1434.2) MB, 2418.3 / 0 ms [last resort gc].
771554 ms: Mark-sweep 1274.0 (1434.2) -> 1273.9 (1434.2) MB, 2480.8 / 0 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x19ec2fcc9fa9
1: /* anonymous */(aka /* anonymous */) [/root/.nvm/v6.0.0/lib/node_modules/mongo-es/dist/src/processor.js:6] [pc=0x62a4bb3a8d4] (this=0x19ec2fc04189 ,resolve=0x156d24938489 <JS Function CreateResolvingFunctions.value (SharedFunctionInfo 0x35c5ec268111)>)
2: arguments adaptor frame: 2->1
3: new Promise [native promise.js:53] [pc=0x62a4ad90b45] (this=0x19ec2fc041e9 <the ho...

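About a permissions problem

Hello,
When I configure config.json and run it, I always get a permissions error.
I am already using the admin account, so I do not know why.

Error message is hard to understand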

···
run 2018-11-16T15:38:59.190Z
put mapping douban_v1 movie
from checkpoint douban.movie___douban_v1.movie CheckPoint {
phase: 'scan',
id: 000000000000000000000000,
time: 2018-11-16T15:38:59.189Z }
scan douban.movie___douban_v1.movie from 000000000000000000000000
scan douban.movie___douban_v1.movie end
tail douban.movie___douban_v1.movie from 2018-11-16T15:38:59.189Z
tail douban.movie___douban_v1.movie Error: should not complete
at AnonymousObserver.tail.bufferWithTimeOrCount.subscribe [as _onCompleted] (/usr/local/lib/node_modules/mongo-es/dist/src/processor.js:302:33)
at AnonymousObserver.Rx.AnonymousObserver.AnonymousObserver.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1843:12)
at AnonymousObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1782:14)
at AnonymousObserver.tryCatcher (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:5897:56)
at AutoDetachObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:3751:37)
at MergeAllObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.tryCatcher (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:5897:56)
tailOpLog Error: should not complete
at AnonymousObserver.tail.bufferWithTimeOrCount.subscribe [as _onCompleted] (/usr/local/lib/node_modules/mongo-es/dist/src/processor.js:302:33)
at AnonymousObserver.Rx.AnonymousObserver.AnonymousObserver.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1843:12)
at AnonymousObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1782:14)
at AnonymousObserver.tryCatcher (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:5897:56)
at AutoDetachObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:3751:37)
at MergeAllObserver.Rx.internals.AbstractObserver.AbstractObserver.onCompleted (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:1782:14)
at MergeAllObserver.tryCatcher (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:63:31)
at AutoDetachObserverPrototype.completed (/usr/local/lib/node_modules/mongo-es/node_modules/rx/dist/rx.js:5897:56)
···
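This is my error log. After some searching, I found that the cause seems to be that the oplog is not enabled on my MongoDB. I do not know JavaScript, and from this error log alone I had no way of telling that this was the cause. I hope better error messages can be provided to make debugging and troubleshooting easier.

Many "Request Timeout after 30000ms" errors: how can they be avoided?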

tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data 500
tail dna.dna_data -> dna_v1.dna_data 500
tail dna.dna_data -> dna_v1.dna_data 500
tail dna.dna_data -> dna_v1.dna_data 500
tail dna.dna_data -> dna_v1.dna_data 500
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
tail dna.dna_data -> dna_v1.dna_data Request Timeout after 30000ms
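The 30000 ms in these messages matches the default `requestTimeout` of the legacy `elasticsearch` JavaScript client. One thing to try, sketched below, is raising that timeout and sending smaller bulk requests. This rests on an unverified assumption: that mongo-es forwards `elasticsearch.options` to the client constructor as-is. The `controls.elasticsearchBulkSize` field name is taken from the config.json posted in another issue on this page; `SyncConfig` and `relaxForSlowCluster` are hypothetical names used only for illustration.

```typescript
// Minimal shape of the parts of config.json this sketch touches.
interface SyncConfig {
  controls: { elasticsearchBulkSize: number; elasticsearchBulkInterval: number }
  elasticsearch: { options: Record<string, unknown> }
}

// Hypothetical helper: returns a copy of the config with a longer client
// timeout and a smaller bulk size. The input config is not modified.
function relaxForSlowCluster(
  config: SyncConfig,
  requestTimeoutMs: number,
  bulkSize: number,
): SyncConfig {
  return {
    ...config,
    controls: { ...config.controls, elasticsearchBulkSize: bulkSize },
    elasticsearch: {
      ...config.elasticsearch,
      // requestTimeout is an option of the legacy elasticsearch client;
      // whether mongo-es passes it through is an assumption.
      options: { ...config.elasticsearch.options, requestTimeout: requestTimeoutMs },
    },
  }
}
```

The 500 responses mixed in with the timeouts suggest the cluster itself is struggling, so the Elasticsearch server logs are worth checking as well.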

About the task.mapping field

If a nested MongoDB field is not listed in the task.mapping configuration, it is not synced.
For example:

{"data": {"number": 1, "geo": [0, 0]}}

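(The task.mapping section of the config already contains the mapping "data": "data".)
For this MongoDB document, when only the value of data.geo is modified, the program performs ignoreUpdate. Real-time sync only works after "data.geo": "data.geo" is added to the task.mapping manually.

I hope that when the ignoreUpdate check is performed, key detection can start from the first element of the key path, rather than only matching the complete key.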
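The requested behavior can be sketched as a prefix check on the dotted key path. This is not the actual mongo-es implementation; `isCoveredByMapping` is a hypothetical name, and the sketch assumes that updated fields arrive as dotted paths such as `data.geo`.

```typescript
// Hypothetical helper: true if the changed key, or any of its parent paths,
// appears among the mapped keys.
function isCoveredByMapping(changedKey: string, mappedKeys: string[]): boolean {
  const parts = changedKey.split('.')
  // Check 'data', then 'data.geo', and so on, so that a mapping on a parent
  // field also covers updates to its nested fields.
  for (let i = 1; i <= parts.length; i++) {
    const prefix = parts.slice(0, i).join('.')
    if (mappedKeys.indexOf(prefix) !== -1) return true
  }
  return false
}
```

Comparing whole path segments rather than raw string prefixes avoids false matches, for example treating a `database` field as covered by a mapping on `data`.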
