Comments (2)
I'm able to replicate it on AWS ECS (any container environment should do):
- Set up a configuration with 1 leader and 2 followers.
- When setting up the followers, use a DNS record to locate the leader (this is the important part).
- Start up the servers and begin writing keys and generally futzing with it. Make sure you write at least one OBJECT and one STRING, and issue SET, DEL and DROP so that they appear in the AOF. Before the next step, make sure a STRING and an OBJECT type exist.
- Kill the leader with `kill -9 <pid>`.
- ECS/Docker should see that the task died and reboot the container. If it doesn't, do so manually. Since there is no data volume attached, it should reboot with a clean `/data` directory and reinitialize based on your config.
- The leader should come online and reattach to the followers, but with an empty AOF/DB.
- The followers report that everything is all good even though they are no longer in sync with the leader: the leader is empty and the followers still have records.
- Begin writing records and notice that the followers attempt to stay in sync but never actually converge, because the old keys are never cleared.
- Repeat the kill on one of the followers and notice that it now comes back in sync with the leader properly, as it downloads a new AOF.
- At this point it is in the state I described above: the leader and one of the followers are in sync, but the last remaining follower is holding onto old records.
So what really happened here is that the leader was not cleanly killed, and when it comes back online it's empty. The followers don't notice this change and carry on thinking everything is good. The underlying issue is that the leader's IP address changed during the reboot, and the followers didn't re-verify that the leader they reconnected to has an AOF (or even a server id) matching the one they were following.
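To make the missing re-verification concrete, here is a minimal Go sketch of the kind of check a follower could run each time it reconnects. It is not tile38's code: `leaderIdentity`, its fields, and `needsReset` are hypothetical names invented for illustration, and the rule it encodes (reset when the server id changes or the leader's AOF is shorter than what the follower already replicated) is an assumption based on the behaviour described above.

```go
package main

import "fmt"

// leaderIdentity is a hypothetical snapshot of what a follower remembers
// about the leader it is replicating from. It is not a tile38 type.
type leaderIdentity struct {
	ServerID string // the leader's reported server id
	AOFSize  int64  // the leader's AOF size in bytes
}

// needsReset reports whether the follower should discard its local AOF and
// download a fresh copy. known is the identity recorded before the
// connection dropped; seen is what the leader reports after reconnecting.
func needsReset(known, seen leaderIdentity) bool {
	// A different server id means this is not the same leader instance,
	// even if it is reachable under the same DNS name.
	if seen.ServerID != known.ServerID {
		return true
	}
	// A leader AOF shorter than what was already replicated cannot be a
	// continuation of the follower's history.
	return seen.AOFSize < known.AOFSize
}

func main() {
	before := leaderIdentity{ServerID: "leader-a", AOFSize: 4096}
	afterRestart := leaderIdentity{ServerID: "leader-b", AOFSize: 0}
	fmt.Println("reset after leader restart:", needsReset(before, afterRestart))
	fmt.Println("reset on plain reconnect:", needsReset(before, before))
}
```

The point of the sketch is that the check keys off what the leader reports about itself rather than off its address, so a container that comes back under the same DNS record with a wiped data directory would still be detected.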
In my case the leader was dying because AOFSHRINK was not running properly, so it ran out of drive space. It's a solvable problem, but it still reveals that there is an issue.
from tile38.
I can confirm this on my side. Normally, immediately after connecting to a leader, a follower will issue some md5 checks to the leader to determine if they share the same AOF, and if not, the follower's AOF should be reset to match the leader's.
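As a rough illustration of that kind of md5 check, the Go sketch below compares a checksum of the follower's AOF bytes against the same byte range of the leader's AOF. This is only a simplified model under stated assumptions, not the actual tile38 implementation: `checksum` and `sharesPrefix` are hypothetical helpers, both AOFs are held in memory as byte slices, and the whole follower AOF is checked in one range.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// checksum returns the MD5 of aof[pos:pos+size]. The boolean is false when
// the requested range falls outside the AOF.
func checksum(aof []byte, pos, size int) ([md5.Size]byte, bool) {
	if pos < 0 || size < 0 || pos+size > len(aof) {
		return [md5.Size]byte{}, false
	}
	return md5.Sum(aof[pos : pos+size]), true
}

// sharesPrefix reports whether the follower's AOF matches the leading bytes
// of the leader's AOF, judged only by comparing checksums of that range.
// A false result is the signal that the follower should reset its AOF.
func sharesPrefix(leader, follower []byte) bool {
	want, _ := checksum(follower, 0, len(follower)) // range is always valid
	got, ok := checksum(leader, 0, len(follower))
	return ok && got == want
}

func main() {
	follower := []byte("SET fleet truck1 POINT 33 -115\n")
	healthyLeader := []byte("SET fleet truck1 POINT 33 -115\nDEL fleet truck1\n")
	var restartedLeader []byte // leader came back with an empty AOF

	fmt.Println("healthy leader matches:", sharesPrefix(healthyLeader, follower))
	fmt.Println("restarted leader matches:", sharesPrefix(restartedLeader, follower))
}
```

In this model an empty leader AOF can never cover the follower's range, so the comparison fails and the follower would know to reset, which is the outcome the followers in the reproduction above never reach.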
I'll need to dig a little deeper, but I suspect the hard reset of the leader, which changes the server_id and empties the AOF, may be confusing the followers.