There is a lot of fragmentation in this space, which is an especially big problem for something like buffering, where we should be aiming for all libraries to be able to share their buffers to reduce copying. However, traits for IO in general are still not really a solved problem. Most approaches seem to come down to a difference between (Async)(Buf)Read/Write-based and Stream/Sink-based solutions. We need to evaluate the pros and cons of each and come to a decision, so that we can start using the same generic traits and patterns for IO across different crates and achieve consistency. I'm going to briefly go through the current solutions, and some that are being developed (please correct me if any of the code below is wrong), then we can compare and contrast them to find the best option.
Options
AsyncRead / AsyncWrite
```rust
trait AsyncRead {
    fn poll_read(
        &mut self,
        cx: &mut Context,
        buf: &mut [u8]
    ) -> Poll<Result<usize, Error>>;

    // ... initializer and vectored reading
}

trait AsyncWrite {
    fn poll_write(
        &mut self,
        cx: &mut Context,
        buf: &[u8]
    ) -> Poll<Result<usize, Error>>;

    fn poll_flush(&mut self, cx: &mut Context) -> Poll<Result<(), Error>>;
    fn poll_close(&mut self, cx: &mut Context) -> Poll<Result<(), Error>>;

    // ... vectored writing
}
```
Very minimal. Mirrors the standard library, so also very familiar. Can only work with bytes, which limits it to relatively low level operations. Can work without allocation, but does require copying of bytes.
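To make the poll-based contract concrete, here is a hedged sketch of the trait as written above, implemented for a hypothetical in-memory `SliceReader` (the names and the no-op waker are invented scaffolding so `poll_read` can be exercised without a real executor):

```rust
use std::io::Error;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// The AsyncRead trait as sketched above.
trait AsyncRead {
    fn poll_read(
        &mut self,
        cx: &mut Context<'_>,
        buf: &mut [u8],
    ) -> Poll<Result<usize, Error>>;
}

// Toy reader over an in-memory slice; a real reader would return
// Poll::Pending and register the waker when no bytes are available yet.
struct SliceReader<'a> {
    data: &'a [u8],
}

impl<'a> AsyncRead for SliceReader<'a> {
    fn poll_read(
        &mut self,
        _cx: &mut Context<'_>,
        buf: &mut [u8],
    ) -> Poll<Result<usize, Error>> {
        // Copy at most buf.len() bytes out: the caller picks the upper
        // limit on how much is read per call.
        let n = self.data.len().min(buf.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Poll::Ready(Ok(n))
    }
}

// Minimal no-op waker so the example can poll outside an executor.
fn noop_waker() -> Waker {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe { Waker::from_raw(raw()) }
}

fn read_three_bytes() -> (usize, Vec<u8>) {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut reader = SliceReader { data: b"hello" };
    let mut buf = [0u8; 3];
    match reader.poll_read(&mut cx, &mut buf) {
        Poll::Ready(Ok(n)) => (n, buf[..n].to_vec()),
        _ => unreachable!(),
    }
}
```

Note that the caller supplies the buffer, which is why no allocation is forced on the trait itself, but every read is a copy into that buffer.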
Stream / Sink
```rust
trait Stream {
    type Item;

    fn poll_next(
        self: PinMut<Self>,
        cx: &mut Context
    ) -> Poll<Option<Self::Item>>;
}

trait Sink {
    type SinkItem;
    type SinkError;

    fn poll_ready(
        self: PinMut<Self>,
        cx: &mut Context
    ) -> Poll<Result<(), Self::SinkError>>;

    fn start_send(
        self: PinMut<Self>,
        item: Self::SinkItem
    ) -> Result<(), Self::SinkError>;

    fn poll_flush(
        self: PinMut<Self>,
        cx: &mut Context
    ) -> Poll<Result<(), Self::SinkError>>;

    fn poll_close(
        self: PinMut<Self>,
        cx: &mut Context
    ) -> Poll<Result<(), Self::SinkError>>;
}
```
Generic over more than just IO, so it can work with things that aren't bytes. Needs ownership of the data being sent through (for IO applications), which will typically mean allocations are required. IO types don't directly implement these traits; you'd need to create wrappers such as Framed.
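To make the ownership point concrete, here is a hedged sketch (with today's `std::pin::Pin<&mut Self>` standing in for the then-proposed `PinMut<Self>`, and the `Chunks` type invented for illustration) of a Stream whose items are owned `Vec<u8>` buffers, each of which costs an allocation:

```rust
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Stream as above, with Pin<&mut Self> in place of PinMut<Self>.
trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// A source that yields its data as owned chunks: every item handed to
// the consumer is a freshly allocated Vec<u8>, which is the allocation
// cost mentioned above.
struct Chunks {
    data: Vec<u8>,
    chunk_size: usize,
}

impl Stream for Chunks {
    type Item = Vec<u8>;

    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Vec<u8>>> {
        let this = self.get_mut(); // fine: Chunks is Unpin
        if this.data.is_empty() {
            return Poll::Ready(None);
        }
        // Split off at most chunk_size bytes and hand them over as an
        // owned buffer.
        let n = this.chunk_size.min(this.data.len());
        let rest = this.data.split_off(n);
        let chunk = std::mem::replace(&mut this.data, rest);
        Poll::Ready(Some(chunk))
    }
}

// Minimal no-op waker so the example can poll outside an executor.
fn noop_waker() -> Waker {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe { Waker::from_raw(raw()) }
}

fn collect_chunks() -> Vec<Vec<u8>> {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut stream = Chunks { data: b"abcdef".to_vec(), chunk_size: 4 };
    let mut out = Vec::new();
    while let Poll::Ready(Some(chunk)) = Pin::new(&mut stream).poll_next(&mut cx) {
        out.push(chunk);
    }
    out
}
```

The consumer receives ownership of each item, which is exactly why an IO type backed by a single internal buffer can't implement this directly without copying or allocating per item.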
AsyncBufRead
```rust
trait AsyncBufRead {
    fn poll_fill_buf(&mut self, cx: &mut Context) -> Poll<Result<&[u8], Error>>;
    fn consume(&mut self, size: usize) -> Result<(), Error>;
}
```
This isn't a fully fledged idea yet, but I found myself using AsyncRead and a BytesMut to roughly this effect a lot in my HTTP parsing crate. It allows for all the benefits of AsyncRead, as well as the advantages of buffering: increased performance and not needing to worry too much about over-reading.
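To show why the consume half matters, here is a hedged sketch of the trait above, with `SliceBufReader` and `read_one_line` invented for illustration: it reads exactly one delimited message and leaves everything after the delimiter untouched in the reader.

```rust
use std::io::Error;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// AsyncBufRead as sketched above.
trait AsyncBufRead {
    fn poll_fill_buf(&mut self, cx: &mut Context<'_>) -> Poll<Result<&[u8], Error>>;
    fn consume(&mut self, size: usize) -> Result<(), Error>;
}

// Toy implementation whose "buffer" is just a slice of in-memory data.
struct SliceBufReader<'a> {
    data: &'a [u8],
}

impl<'a> AsyncBufRead for SliceBufReader<'a> {
    fn poll_fill_buf(&mut self, _cx: &mut Context<'_>) -> Poll<Result<&[u8], Error>> {
        Poll::Ready(Ok(self.data))
    }

    fn consume(&mut self, size: usize) -> Result<(), Error> {
        self.data = &self.data[size..];
        Ok(())
    }
}

// Read exactly one b'\n'-terminated message: peek at the buffer, then
// consume only up to and including the delimiter, so any later bytes
// stay in the reader for whatever parser runs next.
fn read_one_line<R: AsyncBufRead>(reader: &mut R, cx: &mut Context<'_>) -> Option<Vec<u8>> {
    if let Poll::Ready(Ok(buf)) = reader.poll_fill_buf(cx) {
        if let Some(pos) = buf.iter().position(|&b| b == b'\n') {
            let line = buf[..pos].to_vec();
            let _ = reader.consume(pos + 1);
            return Some(line);
        }
    }
    None
}

// Minimal no-op waker so the example can poll outside an executor.
fn noop_waker() -> Waker {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe { Waker::from_raw(raw()) }
}

fn demo() -> (Vec<u8>, Vec<u8>) {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut reader = SliceBufReader { data: b"PING\nEXTRA" };
    let line = read_one_line(&mut reader, &mut cx).unwrap();
    (line, reader.data.to_vec())
}
```

The peek-then-consume split is what makes the "read exactly one message" case below work: nothing is lost if the buffer happens to contain the start of the next message.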
BufStream
```rust
trait BufStream {
    type Item: Buf;
    type Error;

    fn poll(
        self: PinMut<Self>,
        cx: &mut Context
    ) -> Poll<Result<Option<Self::Item>, Self::Error>>;
}
```
More IO-focused than Stream on its own. It still looks like it will require ownership of the bytes being sent in most cases, and the caller cannot choose how many bytes are read in each call.
Comparison
Firstly, BufStream and Stream appear very similar; BufStream is just more specialised for IO than Stream. As we are trying to find a good trait for generic IO functions to operate with, I think we can consider only BufStream for reading, and perhaps a similar BufSink equivalent for writing.
For our Read APIs, we are then looking at AsyncRead vs AsyncBufRead vs BufStream.
From an API consumer's perspective, the main difference between each of these is who chooses how much reading is done and when.
- Users of AsyncRead can choose an upper limit on how many bytes they receive per call, but must carefully set that upper limit so that the next attempt to read from the reader does not have the beginning of the message it is attempting to read cut off.
- Users of AsyncBufRead can choose an upper limit on how many bytes they receive per call, and if they accidentally read too many bytes they can simply choose not to consume that many.
- Users of BufStream have no control over how many bytes get sent through with each poll, and must adapt their code to be able to handle receiving extra bytes that are not part of the message they are trying to parse. This is especially difficult, as we would need a generic interface for passing these extra unwanted bytes either back into the stream or on to the next function that tries to read from the BufStream.
While an API consumer doing something simple could make all three APIs work, in more complex cases AsyncBufRead has definite advantages. Consider the case where a server is attempting to listen for two different types of messages on a single port, e.g. HTTP requests and websocket connections. It is necessary to read exactly one HTTP request from the reader, and then immediately afterwards begin reading either further HTTP requests or websocket packets. It is therefore necessary that no excess bytes are consumed from the reader while parsing the HTTP request, as they would then be missing from the start of the next message, and it is not known in advance which parser will be used to parse that message.
From a library designer's perspective, the difference between these is how closely each mirrors the read APIs provided by the operating system, and therefore how much overhead, in both performance and complexity, is necessary to emulate the given API on top of operating-system read sources.
- Users of AsyncRead can mirror the OS API exactly, and have no issues at all.
- Users of AsyncBufRead need to extend AsyncRead with a buffer implementation, but can do so without too much complexity by using crates such as bytes.
- Users of BufStream would need to wrap an AsyncRead-like API with something that allocates buffers and then emits them. This would not have high costs.
All three of these cases are reasonably straightforward and have limited performance costs, so there is no real disadvantage to any solution from a library designer's perspective.
From a performance perspective, the main issues are how many read calls are performed to parse a message, how much allocation is needed, and how much copying of data occurs.
AsyncRead will require lots of read calls, and requires that data is copied out of the reader once, into the buffer provided. No allocation is needed for AsyncRead.

AsyncBufRead will require minimal read calls, and requires that data is copied into a buffer once, out of the inner reader. More reallocations than allocations would be necessary for AsyncBufRead.

BufStream will require minimal read calls*, and requires that data is read into owned buffers (some optimisations may be possible that prevent copying of memory here, apparently). Several small allocations are probably necessary for BufStream.

*The caller of the BufStream API has no control over how many bytes come in per read call, meaning that while it may be possible to read in fewer calls, it is not possible to prevent excess reading from occurring.
For callers of Write APIs, we are looking at AsyncWrite vs BufSink.
The Sink and AsyncWrite APIs are very similar, with the only difference being whether attempting a write is one step or two (check that it is ready, then do the write). I think the extra complexity of Sink makes it potentially harder to misuse, but as the APIs are so similar, I think we should base our decision on keeping consistency with whichever read API we choose.
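For illustration, here is a hedged sketch of that two-step protocol (with `Pin<&mut Self>` in place of `PinMut<Self>`, `poll_close` omitted for brevity, and the `VecSink` type invented), showing the ready-then-send-then-flush sequence:

```rust
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Sink as above, with Pin<&mut Self> in place of PinMut<Self>.
// poll_close is omitted for brevity.
trait Sink {
    type SinkItem;
    type SinkError;
    fn poll_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::SinkError>>;
    fn start_send(self: Pin<&mut Self>, item: Self::SinkItem) -> Result<(), Self::SinkError>;
    fn poll_flush(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::SinkError>>;
}

// Toy sink that buffers owned items; a real sink would return Pending
// from poll_ready under backpressure and push bytes to the OS in
// poll_flush.
struct VecSink {
    items: Vec<Vec<u8>>,
}

impl Sink for VecSink {
    type SinkItem = Vec<u8>;
    type SinkError = ();

    fn poll_ready(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), ()>> {
        Poll::Ready(Ok(())) // this toy always has room
    }

    fn start_send(self: Pin<&mut Self>, item: Vec<u8>) -> Result<(), ()> {
        self.get_mut().items.push(item);
        Ok(())
    }

    fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), ()>> {
        Poll::Ready(Ok(()))
    }
}

// Minimal no-op waker so the example can poll outside an executor.
fn noop_waker() -> Waker {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe { Waker::from_raw(raw()) }
}

fn send_one() -> usize {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut sink = VecSink { items: Vec::new() };
    // Step 1: ask whether the sink can accept an item.
    if let Poll::Ready(Ok(())) = Pin::new(&mut sink).poll_ready(&mut cx) {
        // Step 2: hand over ownership of the item.
        Pin::new(&mut sink).start_send(b"hello".to_vec()).unwrap();
    }
    // Finally, flush buffered items towards the underlying IO.
    let _ = Pin::new(&mut sink).poll_flush(&mut cx);
    sink.items.len()
}
```

The split means an item is only constructed and handed over once the sink has confirmed capacity, whereas AsyncWrite's poll_write combines the capacity check and the write into one call.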
Summary
In most ways, all three APIs could be used to achieve the same results. However, in the case of reading exactly up to the end of a message (and no further), AsyncBufRead is the only viable solution so far.

Therefore, I'm currently leaning towards recommending that we use AsyncBufRead (or in some cases AsyncRead, with an impl provided to bridge the two) and AsyncWrite (with a buffered alternative, similar to the standard library) for IO work, and standardise on making crates generic over these traits.