Comments (36)
From an implementation perspective this could be a cleaner and easier design:
- Every time an invalidation happens, it is stored in a dedicated `invalidations` table in the datastore, in the following format:

  INSERT INTO invalidations (id_to_invalidate, type_to_invalidate, created_at) VALUES (263af28e-72b7-402f-c0f1-506a70c420e6, 'plugins', now())

- Asynchronously, every node checks the invalidations table and stores the time of the check in memory in a variable like `last_check_at`. Every `n` seconds it checks again for all the invalidations that have been created in the meantime, where `created_at > last_check_at`, executes those invalidations, and updates the `last_check_at` value again.
- When a new node starts, `last_check_at` is set to the time the node was started, so that new nodes only execute the newer invalidations and not the older ones.
- The table can have a `TTL` set to an appropriate value, like one hour or one day, so that the table doesn't grow too much.
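A rough sketch of the loop each node would run (plain Python, in-memory; the table layout mirrors the CQL example above, everything else is illustrative):

```python
class InvalidationPoller:
    """One Kong node's view of the shared `invalidations` table."""

    def __init__(self, table, started_at):
        self.table = table
        # New nodes start from their boot time, so they skip older entries.
        self.last_check_at = started_at
        self.applied = []

    def poll(self, now):
        # Fetch rows where created_at > last_check_at, apply them,
        # then advance the marker.
        fresh = [r for r in self.table if r["created_at"] > self.last_check_at]
        for r in fresh:
            self.applied.append((r["type_to_invalidate"], r["id_to_invalidate"]))
        self.last_check_at = now
        return fresh

table = []
node = InvalidationPoller(table, started_at=100)
# Another node performs the INSERT from the example above:
table.append({"id_to_invalidate": "263af28e-72b7-402f-c0f1-506a70c420e6",
              "type_to_invalidate": "plugins", "created_at": 105})
node.poll(now=110)   # picks up the new invalidation
node.poll(now=120)   # nothing new: already past last_check_at
```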
from kong.
The first step for building invalidations has been merged with #42. It implements a mechanism for time-based cache expiration.
To fully implement invalidations, the time-based expiration should be removed in favor of application-driven invalidation.
from kong.
Following up on this - do you see any problems with the proposed solution (the invalidations table)? Invalidations could be stored in the table for a week and then expire automatically, leveraging Cassandra TTLs.
Advantages:
- Not having to introduce yet another layer (like Serfdom or Consul).
- Not introducing cluster awareness to Kong now would also mean keeping it very simple to scale up and down (cluster awareness would mean extending the CLI with `cluster-join`, `cluster-remove` and `cluster-info` commands, and also executing those operations every time a machine is added or removed).
Disadvantages:
- Kong needs to have a job that queries for the latest invalidations every second. Not necessarily expensive, but not super elegant either.
- The invalidations need to expire at some point, because we don't want that table to keep growing indefinitely. A TTL of a week or a few days could be set, with the assumption that if a machine has a network problem lasting longer than the TTL, that machine is dead and should be restarted. This will avoid data inconsistencies.
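Cassandra can expire these rows server-side (for example with `USING TTL` on the insert); what that replaces is bookkeeping along these lines (an illustrative sketch, not part of the proposal):

```python
ONE_WEEK = 7 * 24 * 3600

def prune_expired(rows, now, ttl=ONE_WEEK):
    """Drop invalidation rows older than the TTL, which is what a
    Cassandra TTL on the table would do automatically."""
    rows[:] = [r for r in rows if now - r["created_at"] <= ttl]
    return rows

rows = [{"id": "old", "created_at": 0},
        {"id": "recent", "created_at": 600_000}]
prune_expired(rows, now=700_000)  # "old" exceeds the one-week TTL
```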
from kong.
Really against this. It looks very clumsy and uses technology that was not built for this kind of job (Cassandra, and querying it every second... This feels like manually doing something clustered databases are supposed to deal with, which is not a good sign...). Sadly, I think our only real option is to use something actually built for that kind of job, and step away from Cassandra.
from kong.
If you're leaning towards clustering Kong nodes, then we might as well ditch the database layer, rely on the local memory and local storage of each node, and implement a data-syncing algorithm...
Otherwise, the low-hanging fruit is to keep the database-provided clustering and do selective caching based on entity type (`api` vs `consumer` vs `plugin`).
Another approach that might be simpler and more robust is a separate process that talks to the database (on a recurring interval) on behalf of Kong, and just updates in-memory objects that Kong can access (a shared memory space).
from kong.
as well ditch the database layer and just rely on local memory and local storage of each node, and implement data syncing algorithm
👍. Which is why I suggest using already existing solutions like service directories (@thefosk quoted Consul).
from kong.
another approach that might be simpler and more robust, is a separate process that talks to the database (on a recurring intervals) on behalf of Kong, and just updates in-memory objects that Kong can access.
This would be the solution proposed above.
The only problem with having Kong query Cassandra every `n` seconds is that, as the Kong cluster grows, more and more connections are sent to Cassandra. This becomes a problem when the number of Kong machines grows very large (100+ nodes, though at that point Cassandra can be scaled too).
To effectively solve the problem we need a good implementation of a gossip protocol, so that, for example, when one machine receives the HTTP request to update an API, we can communicate this change to every other node. So we don't need to replace our database; we need a new feature on top of it. If we decide to go down this route, then something like Serfdom that can live on the machine where Kong is running would be ideal, because from a user's perspective there wouldn't be anything else to set up, scale or maintain.
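The event flow being described, stripped of the actual gossip transport (a real gossip protocol propagates peer-to-peer; this naive broadcast only illustrates push-based invalidation replacing the polling job, and all names are illustrative):

```python
class KongNode:
    def __init__(self):
        self.cache = {}
        self.peers = []

    def handle_admin_update(self, entity_type, entity_id):
        # The node that receives the admin API request pushes the
        # invalidation to every node, instead of each node polling.
        for node in [self] + self.peers:
            node.invalidate(entity_type, entity_id)

    def invalidate(self, entity_type, entity_id):
        self.cache.pop((entity_type, entity_id), None)

a, b = KongNode(), KongNode()
a.peers, b.peers = [b], [a]
a.cache[("apis", "api-1")] = {"name": "stale"}
b.cache[("apis", "api-1")] = {"name": "stale"}
a.handle_admin_update("apis", "api-1")  # both caches are now clean
```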
from kong.
The only problem with having Kong query Cassandra every `n` seconds is that, as the Kong cluster grows, more and more connections are sent to Cassandra. This becomes a problem when the number of Kong machines grows very large (100+ nodes, though at that point Cassandra can be scaled too).
Correct, however you won't be querying with every request; you'll just be querying to update the Kong shared memory space, meaning you can make one big query to get all the data every time (in theory) and just update what's needed.
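One way to read "one big query ... just update what's needed": the side process does a bulk fetch per interval and rewrites only the keys that changed in the shared space (a sketch with hypothetical names; `fetch_all` stands in for the bulk database query):

```python
def refresh_shared_memory(shared, fetch_all):
    """One bulk fetch per interval; only changed keys are rewritten,
    and keys deleted upstream are evicted."""
    latest = fetch_all()
    changed = []
    for key, value in latest.items():
        if shared.get(key) != value:
            shared[key] = value
            changed.append(key)
    for key in set(shared) - set(latest):
        del shared[key]
    return changed

shared = {"api:1": {"path": "/v1"}, "api:2": {"path": "/old"}}
refresh_shared_memory(shared, lambda: {"api:1": {"path": "/v1"},
                                       "api:2": {"path": "/new"}})
```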
from kong.
However, loading all the records from Cassandra into nginx's Lua memory zone could be an issue if there are too many records in Cassandra. This approach also feels like reinventing the wheel (an "implement data syncing algorithm", like @ahmadnassri said): a complex problem (we don't have solid knowledge of distribution algorithms at Mashape, or am I overthinking this?) that is already solved.
from kong.
Not only that, nginx's memory zone is a simple key=value store with no indexes or support for complex queries. We would need to drop any query other than the `*_by_id` family (find, delete, update by id).
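To illustrate the constraint: in a flat key=value zone, anything beyond a by-id lookup needs a hand-maintained index key (a sketch; the key naming scheme and the plain dict standing in for the memory zone are hypothetical):

```python
store = {}  # stands in for a flat key=value memory zone

def insert_plugin(plugin_id, api_id, config):
    store["plugins:" + plugin_id] = config  # find_by_id still works
    # "all plugins for an API" needs its own hand-rolled index entry:
    store.setdefault("plugins_by_api:" + api_id, []).append(plugin_id)

def find_plugins_by_api(api_id):
    ids = store.get("plugins_by_api:" + api_id, [])
    return [store["plugins:" + pid] for pid in ids]

insert_plugin("p1", "api-1", {"name": "key-auth"})
insert_plugin("p2", "api-1", {"name": "rate-limiting"})
```

Every extra query shape (by API, by consumer, ...) multiplies the index keys that must be kept consistent by hand, which is exactly the complexity a real database hides.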
from kong.
As well ditch the database layer and just rely on local memory and local storage of each node, and implement data syncing algorithm.
The idea is nice, not sure how feasible it is though, as @thibaultcha said.
Let's say we have multiple Kong nodes connected through a gossip protocol, and each node has a local datastore (say Cassandra, since we already have the DAO for it). We could have a detached Cassandra node on each Kong node in a clustered Kong environment, where Cassandra only stores the data and the gossip protocol takes care of the propagation (effectively replacing Cassandra's clustering and propagation).
Kong would ship with an embedded nginx + Cassandra + Serfdom configuration, no external DBs. Each node would have its own Cassandra (or any other datastore). We would still be able to make complex queries, since the underlying datastore would be Cassandra.
from kong.
I don't understand the role of Cassandra in your latest comment @thefosk.
from kong.
Also, does Serfdom implement a storage solution? I think we are talking about two different approaches here:
- the one you described: a database, Kong, and a new component responsible for telling each Kong node when to reload data.
- the one I suggest: leave the gossiping to a tool that already does it (a service directory, like etcd or Consul), and use Cassandra (or anything else) only for cold storage. Then our DAO becomes a serializer that loads data from Cassandra, or anything else (even a configuration file like #528), into the service directory, leaving the job of distribution to tools that already solve this problem.
from kong.
@thibaultcha Cassandra would only be the storage medium, living standalone on the machine. Basically you could replace Cassandra with SQLite and the role would be the same. Serfdom and its gossip protocol implementation would tell each node what data to store locally into Cassandra, without having a centralized datastore.
The reason why I said Cassandra is just convenience: we already have the DAO for it.
from kong.
the one you described: a database, Kong, and a new component responsible for telling each Kong node when to reload data.
There is a variant of this, and it's the last option I was suggesting.
Kong, Serfdom and Cassandra all living on the same machine. We have 4 nodes? Then 4 Kong, 4 Cassandra and 4 Serfdom instances.
Each Cassandra node lives on its own, without knowing anything about the other Cassandra instances. It's only used as a powerful local storage medium (like SQLite was in the beginning), leaving all the propagation work to Serfdom.
Cassandra would only be used for convenience here, since we already have everything working for it. We could use another datastore, but we can't use a simple key=value store with no indexes, like Consul, etcd, or the in-memory nginx cache.
from kong.
leave the gossiping to a tool that already does it (a service directory, like etcd or Consul), and use Cassandra (or anything else) only for cold storage. Then our DAO becomes a serializer that loads data from Cassandra, or anything else (even a configuration file like #528), into the service directory
Maybe we are suggesting the same thing with different terminology.
from kong.
(we don't have solid knowledge on distribution algorithms at Mashape, or am I overthinking this?)
Yeah, and I don't think it's a viable approach anyway; we're not in the business of building databases.
from kong.
Yeah that's my point
from kong.
Each Cassandra node lives on it's own, without knowing anything about the other Cassandra instances. It's only being used as a powerful local storage medium (like SQLite was in the beginning), leaving all the propagation job to serfdom.
Why not simply couple Cassandra within the Kong builds (exactly like nginx) and just hide it away from the end user? The Kong <-> Cassandra connection would become local, with little latency, and the clustering logic would be outsourced to Cassandra's internals.
No need for caching, no need for invalidation.
from kong.
Okay, we do agree that we need some sort of cluster awareness, and that we need something capable of supporting a gossip protocol.
I would like to better understand one thing. @thibaultcha, as you know, tools like Consul or etcd support only a very simple key-value store. I don't think we will be able to get rid of Cassandra, because we still want to be able to do complex queries (and increments). I +1 the idea of introducing a new tool, but I don't believe we can get rid of a real database.
@thibaultcha @ahmadnassri - thoughts?
from kong.
Why not simply couple Cassandra within the Kong builds (exactly like nginx) and just hide it away from the end user? The Kong <-> Cassandra connection would become local, with little latency, and the clustering logic would be outsourced to Cassandra's internals.
We can't have Cassandra run on each node and use nginx's resources, nor expect users to have the resources for that.
@thefosk My idea was just to put API routing and plugins in it, and keep Cassandra. Data is retrieved from Cassandra and serialized on boot, so that we have different families of keys (for APIs, for plugins, etc.). But we wouldn't have a cache for any other entities, like the ones used by plugins (API keys, etc...). I agree it is not ideal either. We are pretty much stuck on this issue, tbh.
from kong.
We can't have Cassandra run on each node and use nginx's resources, nor expect users to have the resources for that.
I agree.
So, since we can't make too many changes all at once, I want to proceed step by step and see where this leads us. My idea is to:
- Integrate Kong with Serfdom
- Every time the admin API is invoked in a way that changes something, the node that processes the HTTP request will also tell Serfdom to send an invalidation event to the other nodes. The other nodes will invalidate the data only when they receive such an event (as opposed to using time-based expiration).
- Introducing Serfdom means introducing cluster awareness (otherwise Serfdom doesn't know where to send the events). In this first version, the CLI needs to be updated with functions to:
- Join a node to a cluster
- Remove a node from a cluster
- Check the cluster status
- (something else that I am missing?)
This will not change the foundations much (where the datastore is located, for example), but will achieve the invalidation goal on top of Serfdom's gossip protocol.
Serfdom will be shipped in the distributions (like dnsmasq), so nothing will change in the installation procedure. Serfdom opens a couple of ports, which the user will need to make sure are properly firewalled.
Thoughts?
from kong.
I like the idea, couple of thoughts:
- License: need to verify we're able to package and ship Serfdom alongside Kong: Mozilla Public License, version 2.0
- Footprint: how many more resources would Serfdom require?
- Debugging and Support: since this is a completely separate entity (and much bigger than dnsmasq), I worry about supporting it, so we also have to think of ways to incorporate debug dumps and logging into Kong's own.
from kong.
- Need to evaluate any alternative to Serfdom and consider them.
- Need to figure out how to make Kong and Serfdom communicate between each other.
from kong.
Need to evaluate any alternative to Serfdom and consider them.
This should probably be the priority, to make sure we have picked the right tool, not just based on hype.
from kong.
License: need to verify we're able to package and ship Serfdom alongside Kong: Mozilla Public License, version 2.0
Can you investigate this? @ahmadnassri
Footprint: how many more resources would Serfdom require?
Not much. A couple of ports open, and that would be pretty much it.
Debugging and Support: since this is a completely separate entity (and much bigger than dnsmasq), I worry about supporting it, so we also have to think of ways to incorporate debug dumps and logging into Kong's own.
It's okay, I am not too worried about this - we are using only a small, basic subset of the functionality it provides, nothing crazy.
Need to evaluate any alternative to Serfdom and consider them.
Yes - to date Serfdom is my best option. Happy to discuss other options. When I was looking for a solution to this problem, Serfdom was my pick because it is decentralized (no centralized servers to set up, like Apache Zookeeper requires) and very straightforward to use. It also supports every platform. (Consul itself is built on top of Serfdom for cluster awareness.)
Need to figure out how to make Kong and Serfdom communicate between each other.
Serfdom will trigger a script every time an event is received. The script can then do whatever we want: make an HTTP request (my first idea), open a TCP connection, etc. Invalidation at our scale could happen over HTTP in my opinion, because it's not going to be too intensive.
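A sketch of such a handler script (Serf sets `SERF_EVENT` and `SERF_USER_EVENT` in the handler's environment and passes the payload on stdin; the `invalidate` event name and JSON payload are our own hypothetical convention, and the HTTP call back into Kong is left out):

```python
import json

def parse_invalidation(environ, payload):
    """Turn a Serf user event into the invalidation to apply locally;
    returns None for events we don't care about."""
    if environ.get("SERF_EVENT") != "user":
        return None
    if environ.get("SERF_USER_EVENT") != "invalidate":
        return None
    return json.loads(payload)  # e.g. {"type": "plugins", "id": "..."}

event = parse_invalidation(
    {"SERF_EVENT": "user", "SERF_USER_EVENT": "invalidate"},
    '{"type": "plugins", "id": "some-id"}')
# a real handler would now hit Kong's local HTTP endpoint with `event`
```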
from kong.
Can you investigate this? @ahmadnassri
best suited for a lawyer ... my go-to source for understanding licenses is: http://choosealicense.com/licenses/mpl-2.0/
seems permissive, but I would still consult an expert.
Not much. A couple of ports open, and that would be pretty much it.
I meant in terms of memory, hard disk and CPU usage :)
from kong.
I meant in terms of memory, hard disk and CPU usage
Negligible.
from kong.
@thefosk ping Glaser for legal. Can Apache 2.0 wrap an MPL-2.0 dependency?
from kong.
Just did. No red flags in the licenses.
from kong.
Since we reached an agreement on Serfdom for now and there are no licensing issues, closing in favor of #651, which covers the technical integration aspects.
from kong.
@thefosk have you looked at this? https://github.com/airbnb/synapse
from kong.
That serves a different purpose @sinzone
from kong.
If we're already using Serf, why not go all the way and make Consul a prerequisite for Kong? It would allow us to rely on the local memory and local storage of each node without reinventing a wheel already invented elsewhere.
from kong.
@hutchic because Consul is a centralized dependency and we don't want to introduce any more dependencies to Kong besides the datastore.
from kong.
Sure, I mean Consul is no more centralized than Serf (https://www.consul.io/intro/vs/serf.html), but I can understand wanting to avoid adding more dependencies. Forever growing the number of queries to psql doesn't seem like it'll scale past a certain point, though, so we might need distributed in-memory storage some day regardless.
from kong.