Comments (21)
An index can contain a dot. The whole thing is the index. You can try it.
from monstache.
You probably don’t need to worry too much about that comment. Monstache has a feature that deletes the index when the collection or db is dropped. In your case you will just need to delete the index manually. Dropping collections and databases is not a very common event.
ObjectIds are converted to strings before your function is called.
from monstache.
Hi, I don't see a reason yet why it wouldn't already support this (with some work on your part). Monstache will not create the index mapping that contains the join field. You would need to do that yourself upfront before indexing. Then after that join field is in place you would need a Javascript or Golang plugin to alter the source document appropriately to include that join field AND ensure that routing is set such that parents and children are routed to the same shard.
See https://rwynn.github.io/monstache-site/advanced/#middleware and https://rwynn.github.io/monstache-site/advanced/#routing
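As a rough sketch of the "do that yourself upfront" step, the index creation body with a join field could be built like this. The index name "family", the field name "relation", and the parent/child relation names are hypothetical and only for illustration. The body would be sent to Elasticsearch (for example with PUT /family) before monstache starts indexing.

```javascript
// Hypothetical names: join field "relation", relation parent -> child.
// Builds the request body for creating an index whose mapping has a join field.
// Pass a type name (e.g. "_doc") for Elasticsearch 6.x, where mappings are
// nested under the type; omit it for the typeless 7.x form.
function buildJoinMapping(typeName) {
  const properties = {
    relation: {
      type: "join",
      relations: { parent: "child" }
    }
  };
  const mappings = typeName ? { [typeName]: { properties } } : { properties };
  return { mappings };
}

// Print the 6.x style body so it can be pasted into a PUT request.
console.log(JSON.stringify(buildJoinMapping("_doc"), null, 2));
```

Monstache does not create this mapping, so the index has to exist with the join field before the first document arrives.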
from monstache.
Thank you! I'll give it a try!
from monstache.
Sorry to bother, but my "parent" documents and my "child" documents are in two different MongoDB collections.
Can Monstache bring them both under the same Elasticsearch namespace (i.e. both index and type) without conflict?
Thank you
from monstache.
@aguyinmontreal I've just added some docs around this at https://rwynn.github.io/monstache-site/advanced/#joins
To do what you want to accomplish you would have 2 mapping functions in your TOML config file, one for each collection. Then on the _meta_monstache attribute, in addition to a routing property, set the index and type properties to what you want. You will be overriding the defaults, where index is ${db}.${collection} and type is ${collection}.
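A minimal sketch of what those two mapping functions might look like, written as plain JavaScript. Only _meta_monstache and its index, type and routing properties come from the description above. The index name "family", the join field "relation", and the child field "parentId" are assumptions made up for this example. In a real config each function body would live in its own script entry for its collection.

```javascript
// Hypothetical names: index "family", join field "relation", child field "parentId".

// Mapping function for the parent collection.
function mapParent(doc) {
  doc.relation = { name: "parent" };
  doc._meta_monstache = {
    index: "family",   // override the default ${db}.${collection}
    type: "_doc",      // override the default ${collection}
    routing: doc._id   // a parent is routed by its own id
  };
  return doc;
}

// Mapping function for the child collection.
function mapChild(doc) {
  doc.relation = { name: "child", parent: doc.parentId };
  doc._meta_monstache = {
    index: "family",
    type: "_doc",
    routing: doc.parentId // a child is routed by its parent's id, so both land on the same shard
  };
  return doc;
}
```

Both functions point at the same index and type, and both use the parent id as the routing value, which is what keeps parents and children on the same shard.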
from monstache.
This link shows more about overriding the index and type. https://rwynn.github.io/monstache-site/advanced/#indexing-metadata
from monstache.
And it will not conflict on stateful resume?
from monstache.
The only conflict I can think of when you merge 2 collections into 1 index like this is if document ids collide across those collections. MongoDB only ensures that _id is unique within a single collection. So it's possible, though unlikely if you are using auto-generated _ids, that they can collide and write to the same document in the converged target Elasticsearch index.
I don't think there is any problem with resuming related to this. What case are you thinking about? The stateful resume only saves timestamps of processed documents. When resuming it begins tailing after the last processed timestamp instead of fast-forwarding to the end of the oplog. You will notice if you don't replay, then it starts tailing from the end. If you replay it starts processing from the beginning. If you resume it starts processing from the last timestamp it has saved. But regardless of where it starts, it always evaluates these functions before indexing.
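The three starting points described above can be sketched as a small decision function. This is an illustration only, not monstache code. The function name, the option names and the returned labels are hypothetical, and the precedence of replay over resume, as well as the fallback when no timestamp has been saved, are assumptions.

```javascript
// Hypothetical helper: decides where oplog processing would begin.
function startingPoint(opts) {
  if (opts.replay) {
    // Replay: process from the beginning of the oplog.
    return { from: "oplog-start" };
  }
  if (opts.resume && opts.savedTimestamp != null) {
    // Resume: continue after the last processed timestamp that was saved.
    return { from: "saved-timestamp", timestamp: opts.savedTimestamp };
  }
  // Neither: start tailing from the end of the oplog.
  return { from: "oplog-end" };
}

console.log(startingPoint({ resume: true, savedTimestamp: 1528000000 }));
```

Whichever branch is taken, the mapping functions still run on every document before it is indexed, so the index, type and routing overrides are applied the same way.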
You will notice that when you do custom routing like this you get additional documents in MongoDB in the collection monstache.meta. These documents are there so that if you perform a delete in MongoDB the routing attribute is set correctly on the delete request to Elasticsearch. Monstache only needs to preserve the routing information for possible deletes when the default routing is overridden. The default is to route by Elasticsearch _id (which is done by Elasticsearch automatically). The _id is in the oplog so no need to save it additionally.
from monstache.
Alright!
I just wanted to double-check before I start importing all my mongo-connector stuff to Monstache!
Thank you so much for your guidance!
from monstache.
@aguyinmontreal let me know how your transition to Monstache ends up going. I'm very interested in any gaps that exist between it and other solutions.
from monstache.
@rwynn Small comment, if I may.
When reading the documentation, I noticed that you use "test.test" as the example Elasticsearch namespace.
Since version 6.2.1, Elasticsearch recommends that we name new types "_doc" (see here).
So it should be "test._doc", I guess. (But does Monstache allow types starting with "_"? I know mongo-connector doesn't.)
from monstache.
You can name the index and type whatever you want them to be. I don’t think monstache imposes restrictions other than elastic/elasticsearch#6736
from monstache.
So actually, no: monstache would not allow the type to be _doc, because it used to be a rule that a type could not start with an underscore, and monstache is still following that rule.
from monstache.
Would you consider allowing it? (See Elasticsearch 6.x and Elasticsearch 7.x.)
from monstache.
Yeah I’ll take a look at it. I may just remove the code that checks any of this and just let ES determine what is valid and not.
You can actually use _doc as the type if you set it from the javascript function. I think that is a “bug” that works in your favor here.
from monstache.
@aguyinmontreal this should be fixed now and the docs updated. Use the 4.0.0 release. Thanks and let me know if you run into any problems.
from monstache.
@rwynn Alright! I'll take a look at it!
So the default index name for Elastic 6.2+ becomes "mongodb database name . mongodb collection name"? So the full default namespace (index + type) becomes "mongodb database name . mongodb collection name . _doc"?
Are you sure two dots are allowed in an Elasticsearch namespace?
from monstache.
The ES index is ‘db.collection’ and I’ve used dots with no problem. I don’t think Elasticsearch has a namespace, just an index and a type. The type is _doc.
from monstache.
Sorry for the confusion.
For me:
1- index + type equals namespace
2- index is the part before the dot
3- type is the part after the dot
from monstache.
Hi rwynn!
I saw this note here in the docs, and I'm a little confused:
In my case, as we discussed, my parent and my child documents are not coming from the same namespace. So even if I can correctly prefix the dynamic index of either the parent or the child, the remaining one will be prefixed with the wrong namespace.
What is best to do?
Another roadblock I hit is that my parent document IDs (i.e. the routing info) are stored as ObjectId in my child documents.
Can I .toString() them somehow inside the TOML JavaScript script? Or does Monstache automatically .toString() the doc._meta_monstache.routing? Or do I absolutely need to store my parent document IDs as strings in MongoDB?
Thanks!
from monstache.