Comments (3)
The solution I am exploring is to do uncommitted reads for a few minutes after the first event comes in after startup. Consider the following sequence of events:
Server 1 requests next auto inc number, gets 20
Server 2 requests next auto inc number, gets 21
Server 2 commits row
Server 3 reads list of events at start up
Server 1 writes row to transaction buffer
Server 3 would miss event 20. On the next cache refresh it would do an uncommitted read, see event 20, and then wait for it to commit. If server 1 never writes the row, it wouldn't show up in the transaction buffer. The time between when the auto-increment number is requested and the row gets written will be small, as both happen as part of the same statement, so we won't need to check for long for a missed event to come in.
from spire.
Faisal asked me to write up his design.
TLDR: Improve tracking of skipped database events by detecting them directly, instead of inferring them from gaps in the sequence of seen elements.
The changes in this effort will include reading the database using a special READ_UNCOMMITTED option in the read transaction. This presents the committed and uncommitted data combined, with uncommitted rows taking precedence (hiding any older committed row). A second read will also be performed, covering only the committed data. By comparing the two, we can determine which Event IDs are not yet committed, and add those as tracked elements without potentially skipping any.
The prior solution could skip the first uncommitted elements if they had an EventID that preceded the first committed EventID. It relies heavily on the "last seen EventID" as the marker from which the next scan for database events picks up. While the work in #5071 would notice skipped EventIDs and periodically rescan for them each polling loop, it could not detect items skipped before the "last seen EventID" was first set.
By reading uncommitted EventIDs, the "skipped" list of EventIDs will no longer be inferred, but read directly from the database. This presents a small issue: the only means of determining which elements are uncommitted is to difference them against the same read performed without the READ_UNCOMMITTED option set.
The overall algorithm to determine which elements need periodic polling is roughly:
- Read the events with READ_UNCOMMITTED
- Read the events without READ_UNCOMMITTED
- Difference the two to discover which EventIDs have been issued by the database's AUTOINCREMENT generator to events that are not yet committed.
- Add those items to the "uncommitted event id list" which replaces the skipped list.
- Act on the uncommitted event id list in the following scenarios:
A. If the item disappears from the READ_UNCOMMITTED query and doesn't appear in the non-READ_UNCOMMITTED query, the transaction holding it was cancelled. Drop it from the uncommitted event id list.
B. If the item disappears from the READ_UNCOMMITTED query and appears in the non-READ_UNCOMMITTED query, process it as a new database event, before processing events beyond the last processed event id.
Items that appear in the non-READ_UNCOMMITTED query without ever appearing in the READ_UNCOMMITTED query need no special processing, as they aren't part of a long-lived transaction. The existing algorithm already handles such items properly.
- We currently have a scanning system for all events after the last seen event id.
- We currently have a scanning system for all events skipped between the first seen event id and the last seen event id.
- To close the gap, we need to scan all event ids prior to the first seen event id, but we don't need to scan for them longer than a transaction can remain uncommitted.
For MySQL that is 8 hours; PostgreSQL (since version 9.6) defaults to 24 hours; SQLite doesn't support a timeout (and is unsupported in a shared-database setup anyway). For this reason, the maximum effective scanning time should be 24 hours.
- Upon detecting the first database event, that event id will be stored as the "first event".
- Upon each "new event" scanning cycle, if that cycle occurs less than 24 hours after the server start time, a query selecting all database event ids below the "first event" id is executed. This query should be relatively fast, as the database event id is indexed (it is the table's primary key), and it is expected to return zero entries on most scans.
- If a response contains database event entries:
A. The "first event" id is updated.
B. If gaps between the new "first event" and the prior "first event" are detected, those are added to the skipped id list and processed as skipped events.
C. Scanning in future cycles uses the new "first event" value.
- After 24 hours have passed since server start, the query for ids "prior to the first event" can be disabled to conserve database resources, or can stay active based on developer/maintainer preference (it would be impossible for it to return new records, as any potential "in flight" transaction would have either completed or timed out by then).