Comments (1)
Connection Pooling is now implemented in the C# driver.
The design notes are outlined here:
High Level Design Ideas
After thinking about the issue for a while, an important fundamental idea behind connection pooling is metrics. Metrics play an important role in connection pooling because a measured connection can provide useful information and give us the ability to make important decisions. At a very basic level, detecting when a connection goes UP or DOWN can be turned into a discrete kind of metric. So, I've kept this this in mind as a guiding principal.
Also, the design must take into account different kinds of connection pooling strategies. The implementation should allow a developer to swap out a basic Round Robin node selection strategy for a more advanced Epsilon Greedy algorithm for host/node selection.
The overall implementation should be lock-free in a multi-threaded environment when node selection takes place for super-fast performance.
Lastly, the driver connection pool should not retry a failed query. The callers to the driver should be notified via Exception that a failure has occurred. The caller can then decide if they query warrants retry or not.
Implementation
The underlying Connection
class was very good already. So, it was just a matter of bundling connections together. A minimal interface called IConnection
was extracted from Connection
:
public interface IConnection
{
Task<dynamic> RunAsync<T>(ReqlAst term, object globalOpts);
Task<Cursor<T>> RunCursorAsync<T>(ReqlAst term, object globalOpts);
Task<T> RunAtomAsync<T>(ReqlAst term, object globalOpts);
void RunNoReply(ReqlAst term, object globalOpts);
}
These methods are all minimal Run methods necessary for ReqlAst
to execute a query on a connection. Mostly, expr.run()
or expr.runHelper()
all, in one shape or another, use a conn.Run*()
method. In order to avoid exposing the Connection.Run
publically and polluting the Connection
API, the IConnection
methods are implemented explicitly.
ReqlAst
run methods were changed to accept an IConnection
instead of Connection
.
Next, a ConnectionPool
class was created and implements IConnection
. However, it is not the responsibility of the ConnectionPool
to execute the query (more on this later). The main responsibility of the ConnectionPool
is to 1) Supervise Connections and 2) Discover New Hosts. Typically, the ConnectionPool
is seeded with well-known nodes in a cluster and then attempts to connect to the seeded hosts. The developer can choose to enable discovery of new nodes when nodes are added to the cluster.
Supervisor
The supervisor is a dedicated thread that looks for failed connections and tries to reconnect them.
Discoverer
The discoverer is a dedicated thread that sets up a change feed on a system table. Initially, all nodes from a system table are examined to discover pre-existing nodes. When a new node is added to a cluster, the discovery thread will add the new node to the pool. _Important note: Nodes are never removed from the pool. Only marked as dead._ The rationale for this design decision was mainly performance. In order to maintain very fast lookup and node selection, object-locking must be kept to a minimum. If node removal from the host pool was allowed in a multi-threaded environment locking on the node/host[] array would have been a huge issue impacting performance. In the case of a monotonically increasing host[] array, iterative for
loops would never get out of bounds on a dirty/late read of the host[] array as long as the array continued increasing in size.
The ConnectionPool
simply forwards queries for execution down to the IPoolingStrategy
interface. The ConnectionPool
does not select nodes, it only forwards queries to the IPoolingStrategy
.
IPoolingStrategy
The interface has 3 contract methods:
public interface IPoolingStrategy : IConnection
{
HostEntry[] HostList { get; }
void AddHost(string host, Connection conn);
void Shutdown();
}
Notice, the IPoolingStrategy
also forces implementers to implement IConnection
(aka the conn.Run*
methods mentioned earlier). Briefly, AddHost
is called by the Discoverer to notify the host pool that a new node has been added to the cluster (and the host pool should consider it as part of a node selection). Shutdown
provides a signal to shut down and clean up resources and HostEntry[] HostList
requires IPoolingStrategy
to return a list of HostEntry
to be examined by the Supervisor
for inspecting dead hosts and retry/reconnecting them.
RoundRobinHostPool
IPoolingStrategy
The RoundRobin host pool has a variable that is Interlocked.Incremented()
on query request as to obtain a value that's mod-ed with the length of the host[] array. More importantly is the execution of the query:
public override Task<Cursor<T>> RunCursorAsync<T>(ReqlAst term, object globalOpts)
{
HostEntry host = GetRoundRobin();
try
{
return host.conn.RunCursorAsync<T>(term, globalOpts);
}
catch
{
host.MarkFailed();
throw;
}
}
Notice, incident threads will be immediately be notified via try/catch
when a host has failed. This is the main metric we are interested in (the failure of a node), in the RoundRobin host pool.
Lastly, if there is no incident thread inertia that can immediately fault the connection, we still take steps to mark the down node as dead in the pool. (In other words, "If a tree falls in a forest and no one is around to hear it, does it make a sound?") In this context, yes, the ResponsePump
thread (which is dedicated to each connection) will error and the Supervisor (in about 1 second interval) will inspect all connections for any errors and mark errored connections as dead. We could probably improve this by having the ResponsePump
actually invoke a callback to the host pool that an error has occurred, but it would add too much complexity with how Connection -> ConnectionInstant -> SocketWrapper
are architected. We're already 3 layers deep in abstraction.
I think 1 second detection is enough since we don't guarantee that a query will succeed anyway.
EpsilonGreedy
IPoolingStrategy
Another connection pooling strategy is Epsilon Greedy. Epsilon Greedy is an algorithm that allows the connection pool to learn about "better" options based on speed and how well they perform. This algorithm gives a weighted request rate to better performing nodes in the cluster, while still distributing requests to all nodes (proportionate to their performance).
The host pooling code for Epsilon Greedy mostly came from Go Host Pool which GoRethink uses under the hood. One major issue I found with the implementation was the underlying Go Host Pool extensively used object locks. Object locking has significant performance consequences. After initially porting over the code in C# 1-for-1, it took EpsilonGreedy about 4 seconds for processing 600K node selections. Wow. Insanely slow. The implementation was refactored by removing all locking constructs in critical path of node selection and got the data structure usage down to a single host[] array (and no dictionary lookup). Also removed response structs that could potentially eat up a lot of memory. Also, the calculation of EpsilonValue
for every host on _every selection_, was moved to the EpsilonDecay
timer. It was way too expensive to calculate EpsilonValue
every time a node needed to be selected. Additionally, the accumulation of counts & measured responses times use CPU Interlocked.Add
OP-codes to accumulate EpsilonCount
and EpsilonValue
without locking. Along with these changes, among others, 600K node selections now execute under 200 milliseconds. Hell yeah.
public override async Task<dynamic> RunAsync<T>(ReqlAst term, object globalOpts)
{
HostEntry host = GetEpsilonGreedy();
try
{
var start = DateTime.Now.Ticks;
var result = await host.conn.RunAsync<T>(term, globalOpts).ConfigureAwait(false);
var end = DateTime.Now.Ticks;
MarkSuccess(host, start, end);
return result;
}
catch
{
host.MarkFailed();
throw;
}
}
Now it should be clear, following with the metrics playing a central role: Having each host pool handle the direct execution of a query on a connection is beneficial because the host pool has locality of variables to measure the response of a node. When host pools are directly involved in the execution of a query, different host pooling implementations benefit from 1) before execute / after execution contexts as well as 2) variable locality. Both help avoid high memory usage and extra stress on the GC managed which otherwise would require callbacks or HostPoolResponse objects.
from rethinkdb.driver.
Related Issues (20)
- Create a method to return result as raw json HOT 4
- Resolution function on OptArg insert conflict HOT 1
- ConnectionPool.Builder.ConnectAsync does not respect InitialTimeout() HOT 4
- Add Error codes to Exceptions HOT 2
- Additional exceptions thrown when trying to cancel Cursor.MoveNextAsync() HOT 6
- Object's primary key and key of RunAtom() is different (Client-side generation). HOT 4
- Support for System.Text.Json.JsonDocument
- Cursor<T> should implement IAsyncEnumerable<T> in .net core 3
- RethinkDB 2.4 Release
- Best options: RunResult, RunAtom or RunCursor HOT 3
- Get list from Table that contains 'A' in the 'message' field HOT 1
- ASL 2.0 HOT 1
- Implement lambda functions as a possible argument in Insert().OptArgs("conflict",...) HOT 1
- System.NullReferenceException HOT 1
- Add EntityFramework support to Linq provider HOT 5
- Polymorphism HOT 1
- Various exceptions when updating a document HOT 1
- System.Configuration.ConfigurationErrorsException: HOT 1
- Wrong OptArg serialisation after OrderBy
- Insert not working with RunNoReply & Can't use runOpts is always null -- solved
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from rethinkdb.driver.