Comments (14)

rwynn commented on August 17, 2024

When you customize routing, you cannot do a get without the routing info. Do you need deletes in MongoDB to propagate to ES? If not, then you don't need meta. Inserts and updates are fine because your JS sets the routing. On a delete, all we have from MongoDB is the document _id, which doesn't give us the routing.
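
To illustrate why the routing value matters, here is a minimal sketch of routing-based shard selection. It assumes the simplified rule shard = hash(routing) % number_of_primary_shards. Elasticsearch uses its own hash function, so `zlib.crc32` is only a stand-in, and `pick_shard`, the shard count, and the sample values are hypothetical.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # hypothetical index setting


def pick_shard(routing: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Simplified stand-in for shard selection: hash(routing) % num_shards."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


# Index time: the JS middleware supplies a custom routing value.
doc_id = "5a1f0c9e"           # hypothetical MongoDB _id
custom_routing = "tenant-42"  # hypothetical routing value set at index time
shard_at_index_time = pick_shard(custom_routing)

# Delete time: only the _id is known. Without stored routing, the lookup
# would fall back to routing by _id, which may point at a different shard.
shard_guess_at_delete = pick_shard(doc_id)

print(shard_at_index_time, shard_guess_at_delete)
```

If the two values differ, a get or delete by _id alone looks on the wrong shard, which is why the routing has to be stored somewhere (the meta) for deletes to propagate.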

from monstache.

rwynn commented on August 17, 2024

See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html#_making_a_routing_value_required

benan789 commented on August 17, 2024

What about an ids query? It's probably not as fast as a get, but it shouldn't be that much slower.

rwynn commented on August 17, 2024

Why do you think meta is an issue? Do you see errors?

benan789 commented on August 17, 2024

Are the meta upserts to MongoDB bulked? I'm not sure if that's a bottleneck, but syncing is very slow. I have yet to fully sync a DB of 10 million docs without it breaking. I think the fix you did last night helped: it was able to sync up to 4 million docs, whereas before it could only do 2 million. Also, the meta takes up a lot of space in the DB.

rwynn commented on August 17, 2024

Got you. It definitely could be bulked. But is the indexing count going up slowly?

benan789 commented on August 17, 2024

Yeah, I think it gets slower as the count goes higher. It's at 2.7 million right now, and I started syncing about 6 hours ago.

rwynn commented on August 17, 2024

Can you try with direct-read-limit set really high? That means fewer queries. Read up on the direct-* options. Also, I noticed from your comment yesterday about the error that the direct read query errored with a timeout. The query actually sorts the entire collection by _id, then seeks to the offset and applies the limit. That is why I suggest a really high limit. The default is 5000, I think. For 10 million docs that's still 2000 queries, and as the offset gets higher each query has to seek past more documents, so it gets slower.
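
As a rough back-of-the-envelope check of that reasoning, the sketch below counts the queries and the total number of documents that offset-based paging has to seek past. It assumes the 10 million documents mentioned earlier in the thread and the default limit of 5000 as recalled above; `skip_cost` and the alternative limit of 100,000 are hypothetical.

```python
def skip_cost(total_docs: int, limit: int) -> tuple:
    """Return (number of queries, total documents skipped) for offset paging."""
    queries = -(-total_docs // limit)  # ceiling division
    # Page k starts at offset k * limit, so it seeks past k * limit documents.
    skipped = sum(page * limit for page in range(queries))
    return queries, skipped


for limit in (5_000, 100_000):
    queries, skipped = skip_cost(10_000_000, limit)
    print(f"limit={limit}: {queries} queries, {skipped} documents skipped in total")
```

With a limit of 5000 this gives 2000 queries and about 10 billion skipped documents overall; raising the limit to 100,000 cuts that to 100 queries and roughly 495 million skipped documents.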

rwynn commented on August 17, 2024

Also did you up the thread pool on the ES side?
https://rwynn.github.io/monstache-site/start/
thread_pool:
  bulk:
    queue_size: 200

And consider setting the refresh interval to -1?

benan789 commented on August 17, 2024

Are you using skip?

The cursor.skip() method is often expensive because it requires the server to walk from the beginning of the collection or index to get the offset or skip position before beginning to return results. As the offset (e.g. pageNumber above) increases, cursor.skip() will become slower and more CPU intensive. With larger collections, cursor.skip() may become IO bound.

Consider using range-based pagination for these kinds of tasks. That is, query for a range of objects, using logic within the application to determine the pagination rather than the database itself. This approach features better index utilization, if you do not need to easily jump to a specific page.

Querying with $gt on the last seen _id instead of skipping should fix this.
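
A minimal sketch of the difference, using a sorted Python list as a stand-in for the _id index. This is not monstache's code; `page_with_skip`, `page_after`, and `read_all` are hypothetical names, and `bisect_right` plays the role of the index seek that a `{"_id": {"$gt": last_id}}` filter allows.

```python
from bisect import bisect_right


def page_with_skip(ids, offset, limit):
    """Offset paging: the server conceptually walks past `offset` entries first."""
    return ids[offset:offset + limit]


def page_after(ids, last_id, limit):
    """Range paging: seek directly to the first _id greater than last_id."""
    start = 0 if last_id is None else bisect_right(ids, last_id)
    return ids[start:start + limit]


def read_all(ids, limit):
    """Read every _id in order, one batch at a time, using range paging."""
    last_id, out = None, []
    while True:
        batch = page_after(ids, last_id, limit)
        if not batch:
            return out
        out.extend(batch)
        last_id = batch[-1]  # next filter: {"_id": {"$gt": last_id}}
```

Against a real collection, the equivalent pymongo call would look something like `collection.find({"_id": {"$gt": last_id}}).sort("_id", 1).limit(limit)`, where `collection`, `last_id`, and `limit` are placeholders.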

rwynn commented on August 17, 2024

Yes, skip is used. And $gt is a good idea. I wonder if $gt would work if someone used a strange _id, like an object? The query would be like { _id: { $gt: { x: 1 } } }. I would have to try it because it needs to work in the general case. I guess if we're sorting by _id, it must work for any value of _id.

rwynn commented on August 17, 2024

I think using the range selector instead of skip is a huge performance gain! I’ll fix it and publish a new release on Monday. Thanks for your help!

rwynn commented on August 17, 2024

@benan789 give it another try with the latest release when you get a chance. I'm seeing collections with millions of documents getting synced pretty quickly now.

benan789 commented on August 17, 2024

Much better! Thank you for fixing it so fast!
