Comments (17)

rwynn commented on July 18, 2024

I think that would work generally. You would obviously have more indexing load on your ES cluster because each monstache process would be indexing the same data. Basically you would have 3 processes competing to replay an ordered set of events. For a single document, a relatively lagging monstache process might undo an update that a faster monstache process had written (go backwards in that document's history). The problem would be that if the lagging process went down before it was able to write the next update (catch up), you could see stale data.
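To make the stale-data scenario concrete, here is a minimal sketch in Go. It is a simulation only: the `index` type and `replay` function are hypothetical stand-ins, not monstache or Elasticsearch APIs, and the interleaving of writes is fixed by hand to keep the run deterministic.

```go
package main

import "fmt"

// index stands in for a single Elasticsearch document: the last write wins.
type index struct{ version int }

func (ix *index) write(v int) { ix.version = v }

// replay applies a fixed interleaving of writes and records the document
// version that is visible after each write.
func replay(ix *index, writes []int) []int {
	seen := make([]int, 0, len(writes))
	for _, v := range writes {
		ix.write(v)
		seen = append(seen, ix.version)
	}
	return seen
}

func main() {
	// The fast process writes versions 1, 2, 3. The lagging process then
	// replays the same history, writes 1 and 2, and dies before writing 3.
	writes := []int{1, 2, 3, 1, 2}
	ix := &index{}
	fmt.Println("visible versions:", replay(ix, writes))
	fmt.Println("final version:", ix.version) // stale: 2 instead of 3
}
```

The document ends at version 2 even though version 3 was already indexed, which is the "go backwards" case described above.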

from monstache.

rwynn commented on July 18, 2024

Another possible option would be to have 1 monstache instance with resume set to true. If you could set up your cloud provider to detect when that instance goes down and spin up another instance with the same configuration (resume true with the same resume name), then the new process should resume indexing where the last one left off (by MongoDB oplog timestamp).

You might be able to combine https://cloud.google.com/compute/docs/instances/setting-instance-scheduling-options with some sort of process restart monitoring (monit, supervisord, systemd, etc.) to ensure a process is always alive.

Crispy1975 commented on July 18, 2024

Understood, thanks for the ideas. I think I might do something like a Node.js process on each box where they talk to each other and work out who's running. So there would only be one monstache instance running; that way I can use Node.js (or whatever) to make sure that only one process is running and that it's a master ES node too.

rwynn commented on July 18, 2024

@Crispy1975

I made an experimental version of monstache that attempts to solve your problem. I introduced a new configuration option (and flag) cluster-name. When you set cluster-name to the same value for a set of monstache processes, the processes coordinate. They do this by racing to insert a document in a special collection monstache.cluster.

Only one process will be able to successfully write to monstache.cluster at a time (due to unique id constraint). The one that successfully writes to this collection will run as usual. The other ones will pause reading from the oplog.

The monstache.cluster collection has a TTL index on it, a MongoDB feature that makes the document expire after a period of time. There is also a timer in each process that periodically attempts to write to this collection, even while oplog tailing is paused.

So when a process that is currently master (identified in the monstache.cluster collection) dies, then eventually mongodb will automatically time out the document and other monstache processes will then be able to successfully write to that collection and become the new master.

As long as at least 1 monstache process is running there should always be a master process that is currently tailing and syncing from the oplog.

I've attached a binary that you can test this with. You should start all 3 with this minimum config:

monstache -cluster-name 3_NODE -verbose true

Then you can look into the collection monstache.cluster to see which one is currently master. You can then bring this process down by stopping it and shortly you should notice that another one takes its place.

If this works out for you please let me know. I'll then incorporate into a release.

rwynn commented on July 18, 2024

monstache.zip

Crispy1975 commented on July 18, 2024

Wow, awesome. I will try this out first thing tomorrow morning (late here). Thanks!

rwynn commented on July 18, 2024

Forgot to mention that when a process that was waiting to become master becomes the master, it reads the last timestamp synced by the old master. So it doesn't lose documents written between the time the old master goes down and the time the new one resumes.
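A minimal sketch of that handoff in Go, assuming a simplified oplog where each entry carries an integer timestamp. The `event` type and `resumeFrom` function are hypothetical and only illustrate the idea of resuming from the last saved timestamp.

```go
package main

import "fmt"

// event is a simplified oplog entry; ts stands in for the oplog timestamp.
type event struct {
	ts  int64
	doc string
}

// resumeFrom returns the events strictly newer than lastSynced, in order.
func resumeFrom(oplog []event, lastSynced int64) []event {
	var pending []event
	for _, e := range oplog {
		if e.ts > lastSynced {
			pending = append(pending, e)
		}
	}
	return pending
}

func main() {
	oplog := []event{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	// The old master saved timestamp 2 before it went down. Events 3 and 4
	// reached MongoDB while no process was tailing the oplog.
	for _, e := range resumeFrom(oplog, 2) {
		fmt.Println(e.ts, e.doc)
	}
}
```

Because the new master starts from the saved timestamp rather than from "now", the events written during the gap are still indexed.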

Crispy1975 commented on July 18, 2024

Right now having some issues with monstache running on one ES node. Using the following toml config (minus script stuff).

mongo-url = "mongodb://MONGODB_USER:[email protected]:27017,10.128.0.8:27017,10.128.0.9:27017/?authMechanism=SCRAM-SHA-1&authSource=admin&replicaSet=RS-0"
mongo-pem-file = "/mongodb-rs.pem"
elasticsearch-url = "https://es0:9200"
elasticsearch-pem-file = "/etc/elasticsearch/root-ca/root-ca.pem"
cluster-name = "beamery-preview"
replay = false
resume = true
resume-name = "testing"
namespace-regex = "^testing.coll$"
gtm-channel-size = 200
verbose = true

I have confirmed I can connect to both MongoDB and ES from the node using the PEM files. I've also tried with the cluster-name param removed and the last release. I suspect this is down to SSL with MongoDB (perhaps). I have a non-SSL MongoDB replicating to a TLS aware ES cluster and all seems to be ok there.

Edit: To be more descriptive, I start monstache and there are no errors; it just sits there with no output (verbose on). If I look at the MongoDB clients I can see a connection being made from the correct IP address where monstache is running.

Crispy1975 commented on July 18, 2024

Definitely seems to be MongoDB based with this repeated over and over in the logs...

2017-01-10T01:56:17.457+0000 I NETWORK  [initandlisten] connection accepted from 10.128.3.1:41082 #22029 (77 connections now open)
2017-01-10T01:56:17.459+0000 I NETWORK  [conn22029] end connection 10.128.3.1:41082 (76 connections now open)

We're using self-signed certs for this RS, perhaps this might be the issue?

Crispy1975 commented on July 18, 2024

@rwynn ok, ran a test: from the test ES cluster I tunnelled back from a node to my local machine and used the MongoDB install there as the source for replication. It connected fine and things seem to be doing what they should. This should mean that using SSL with a self-signed certificate is the problem... I figure it's the lines here:

https://github.com/rwynn/monstache/blob/master/monstache.go#L609

The mgo client driver has an option from what I can see to disable the checking of cert validity with InsecureSkipVerify bool, so adding something like...

tlsConfig.InsecureSkipVerify = true

...after line 609 should do the trick, of course it makes sense that it would be a config option exposed to give users the choice?
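As a rough sketch of what such an option could look like, the Go snippet below builds a `tls.Config` from PEM data with certificate verification optionally disabled. The function name `newTLSConfig` and the `skipVerify` parameter are hypothetical and are not taken from monstache's source; only the standard library calls are real.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newTLSConfig trusts the CA certificates found in pemBytes. When skipVerify
// is true, the server certificate chain and host name are not checked, which
// removes protection against man-in-the-middle attacks.
func newTLSConfig(pemBytes []byte, skipVerify bool) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil, errors.New("no certificates found in PEM data")
	}
	return &tls.Config{
		RootCAs:            pool,
		InsecureSkipVerify: skipVerify, // hypothetical user-facing option
	}, nil
}

func main() {
	// Malformed PEM input is rejected before any connection is attempted.
	_, err := newTLSConfig([]byte("not a pem file"), false)
	fmt.Println("rejected bad PEM:", err != nil)
}
```

Exposing the flag as an explicit, off-by-default setting keeps verification on unless a user opts out deliberately.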

Crispy1975 commented on July 18, 2024

This PR solves the certificate validation: #17

rwynn commented on July 18, 2024

@Crispy1975

Just released a new version with your PR and the clustering mode option. It's strange that you had to turn on InsecureSkipVerify for it to work. That TLS code for mongodb was shaped by this article https://www.compose.com/articles/connect-to-mongo-3-2-on-compose-from-golang/. Is it possible that each of your mongo servers has a unique cert or that the pem was incomplete?

Anyway, let me know if you get a chance to test out the clustering option. I've just been running multiple processes in the same VM, stopping the active one, and watching the others take over. So hopefully you'll see the same result.

rwynn commented on July 18, 2024

@Crispy1975

I did more debugging on the mongo TLS connection issue. I think the problem may be related to the Common Name set on the certificate not matching the connection host name. You may be able to fix the issue by regenerating the PEM file making sure that the Common Name set when creating the certs is the host name you will be connecting to. Also it looks like you have multiple mongo servers in the connection string - so you may need to cat all of the certs together into one PEM.

I found this by generating a cert and accepting all the defaults. I was not able to connect with this, but I was once I regenerated the certs and set localhost for Common name. I will add more logging so that the error message I initially received is printed - x509: certificate is valid for , not localhost.

Crispy1975 commented on July 18, 2024

I am pretty sure it would be the CN causing it, as our connection string uses IP addresses. I can make use of /etc/hosts to get that working ok.

Clustering seemed to work ok for me earlier when I tunneled to my local non-SSL MongoDB, just have to try them all together and all good. :)

Crispy1975 commented on July 18, 2024

All working nicely here. Great stuff, thanks! :)

rwynn commented on July 18, 2024

@Crispy1975 Great to hear it's working for you. I just pushed a release to fix an edge case around cluster mode where a process resuming work for the cluster would remain paused. This would happen if the cluster had never saved a timestamp and the primary went down.

Crispy1975 commented on July 18, 2024

@rwynn Awesome, will grab that down. Thanks!
