shadowjonathan / dustls

Pure-Rust DTLS

License: Apache License 2.0

Rust 100.00%

dtls rust rustls security tls

dustls's Introduction

`dustls`, a pure-rust DTLS implementation

A DTLS implementation in Rust, reusing rustls for cryptographic primitives and most message payload formats.

Currently targeting a PoC for DTLSv1.2; v1.3 will come after that. No plans to support DTLSv1.0.

Note: This library is a work in progress, and (possibly) not yet tuned to the ecosystem.

Note: This library works directly with TLS records, logic, and cipher suites, and hasn't had a security audit (yet). Use at your own risk.

dustls's Issues

Code duplication and dependency to rustls

Looking through the rustls interface, I realised that, to effectively make dtls work, I'd have to copy a lot of rustls interfaces, or effectively patch their behaviour.

I'm not sure how warranted that is, and/or if such a thing is even necessary.

The upside to copying the code is that it effectively would make this library easier to read/work with, as it's largely contained, other than the cipher suites exported from rustls.

The downside is that this would result in logic divergence if the rustls crate updates its own logic, and that could go unnoticed in this library. Consequently it could also result in errors, which could in turn result in vulnerabilities.

Rustls has a lot of TLS-specific behaviour, and much of it is locked behind pub(crate).

I'd rather let this crate be some sort of consumer to the rustls crate, rather than them having to completely accommodate and promise API stability on interfaces they'd rather not.

I also don't know if this means that this library is better off being integrated as a rustls submodule, but that's to be seen.

no_std

In rustls/rustls#40 (comment), @lachlansneff expressed interest in an embeddable DTLS implementation, which I believe (together with #1) could work with smoltcp and friends.

However, rustls does not support no_std. While I think that both crates would need alloc regardless, no_std does not seem to be a priority or a possibility as of yet (see rustls/rustls#157, cc @gurry).

An alternative would be the ability to switch to different cryptographic backends. However, rustls was chosen as the backend because it can expose much of its innards, and I don't even know if using openssl's cipher suites directly is possible.

DTLS v1.3

As of Draft 43, I don't foresee any issues implementing both versions transiently through the api design in #1, which should be generic enough to accommodate them.

I've already said I'll add support after the draft has been ratified, though I'll try to take some notes from the existing draft when implementing v1.2.

CloseNotify on Drop / explicit close

#7 describes some wanted features with the following derivative text;

TCP has the concept of FIN packets, which let the client know that the connection they currently are using is dead and should be terminated. UDP has no equivalent. For this reason, restarting the server will take up to the connection timeout value for clients to detect a dead connection. This can be shortened by modifying DTLS to send CloseNotify alerts when the UDP socket receives unrecognised data.

Tackling this from the other side, there should be a possibility to somehow send a "last gasp" CloseNotify on drop, or allow an explicit close function.

However, the design of #1 separates the transport layer from the record layer, so the connection object can be dropped independently of the socket, which makes this hard to do.

API Design

This is a draft for the API design I currently have in mind. I'm currently thinking of 5 main objects;

DClientAgent, a handshaker state machine that performs in much the same way rustls::ClientCommon does, my current idea is as follows;
- An application supplies it with datagram packets, then polls it, and in a black-box fashion, it'll only give one of three answers;
  - "Await until this deadline, then poll me again"
  - "I am ready to write to the socket"
  - "I have encountered an error"
    (todo: how would handling an alert + error look like? "write this pls", poll(), "i got an error", in quick succession, writing the fatal but buffering the error until next call?)
- This object handles retransmission timers internally
- When complete, the poll() will return with "success, here's your connection object"
DClientSession, an IO object to which the application can supply datagram packets, as well as chunks of bytes it wants to encode.
- This will likely function exactly the same as rustls::ClientCommon, only that it supplies exact datagrams, and no streams of data.
  - "function exactly the same" wrt write_tls, read_tls, wants_write, etc. I don't believe wants_read is really needed, as all records are supposed to fit into one datagram, though I'll have to consult the RFC for that again.
- The output will be datagram-based, so again, possibly no Read + Write traits
DServerListener, an object that handles all incoming "leftover" packets that do not belong to a connection, handles ClientHellos with HelloVerifyRequests, verifies ClientHellos with cookies, and if successful, then produces a handshaker object for the application to continue verifying with.
- This object needs some sort of "cookie generator", which needs to be cryptographically secure and hard to guess, else DoS amplification attacks can be performed with this listener.
DServerAgent, pretty much the same as DClientAgent, only on the different side of the transaction.
DServerSession, ditto.

These objects basically give an entirely hands-off, buffer-based, black-box approach to DTLS connection handling, with the application responsible for handling the underlying UdpSocket multiplexing, routing, buffering, and more.

This is to ensure this encryption layer can work over any kind of channel that is unreliable, unordered, but framed. This last guarantee is the most important one, and I'm not sure how to encode it into the IO: Rust doesn't provide standard traits or libraries that "understand" a connection's "framed-ness". Luckily, objects like UdpSocket don't provide Write + Read at all, and instead require manual polling of the recv and send functions, so I think it's perfectly fine to leave that functionality to a wrapping connection manager (such as tokio-rustls is to rustls).
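The three poll answers listed for DClientAgent, plus the final "here's your connection object", map naturally onto a Rust enum. The following is only a sketch under assumptions: `AgentPoll`, `ToyAgent`, `supply_datagram`, and the `"flight-1"` payload are hypothetical names that do not exist in dustls or rustls, and the toy agent stands in for real handshake logic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical result of polling a handshaker (names are illustrative only).
#[derive(Debug, PartialEq)]
pub enum AgentPoll<C> {
    /// "Await until this deadline, then poll me again."
    WaitUntil(Instant),
    /// "I am ready to write to the socket": exactly one datagram to send.
    Write(Vec<u8>),
    /// "I have encountered an error."
    Error(String),
    /// Handshake complete: "success, here's your connection object".
    Done(C),
}

/// Toy stand-in for a handshaker: it emits one datagram, then asks to be
/// polled again at a deadline, and finishes once any reply was supplied.
pub struct ToyAgent {
    sent: bool,
    got_reply: bool,
    retransmit_after: Duration,
}

impl ToyAgent {
    pub fn new(retransmit_after: Duration) -> Self {
        ToyAgent { sent: false, got_reply: false, retransmit_after }
    }

    /// The application hands in one received datagram at a time.
    pub fn supply_datagram(&mut self, datagram: &[u8]) {
        if !datagram.is_empty() {
            self.got_reply = true;
        }
    }

    /// Black-box poll: the caller only acts on the returned variant.
    pub fn poll(&mut self, now: Instant) -> AgentPoll<&'static str> {
        if self.got_reply {
            return AgentPoll::Done("connection");
        }
        if !self.sent {
            self.sent = true;
            return AgentPoll::Write(b"flight-1".to_vec());
        }
        AgentPoll::WaitUntil(now + self.retransmit_after)
    }
}
```

An application loop would match on the returned variant: send the bytes on `Write`, sleep or select until the deadline on `WaitUntil`, and swap the agent for the connection object on `Done`.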

License

I currently chose the EUPL-1.2 license because I believe it could gain some recognition and interest as a government-created and legalised document, which would give projects using it more assured standing.

However, in the wider context of usability, compatibility, and other licensing, I don't know if this is the right choice. I'd like to hear some opinions on licensing, and on whether I should change it to something like MIT or Apache-2.

(Of course, before doing that, I'll get consensus from all authors up until that point, so it's easier to change earlier than to do it later.

Also, I'm only interested if the current license is too restrictive for use, I am not going to change to a viral copyleft license like AGPLv3, as it would hinder library use, which is the primary motivation for this library for me.)

Requirements for use with WebRTC

The WebRTC spec defines some requirements for the used DTLS implementation:

At least the TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 cipher suite needs to be supported, together with the P-256 curve.
Both of these requirements are met in rustls (TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 & P-256).

Firefox additionally has support for P-384 and x25519 as well as TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256, and TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256.
Chrome adds TLS_RSA_WITH_AES_128_GCM_SHA256 on top. Except for the one added by Chrome, all of these are supported by rustls, so this is fine. Both Chrome and Firefox also support the CBC suites, but I guess we can ignore them (at least for now).

Furthermore, "Implementations MUST NOT implement DTLS renegotiation and MUST reject it with a "no_renegotiation" alert if offered."
So I guess it would be nice if dtls-rs allowed configuring in which cases an alert should be created, in addition to configuring things like renegotiation.

The spec has some more API requirements I will not copy here. Could you have a look at them too?

Optional CloseNotify on unrecognised data

As a reason for this library's creation is to support ruma/lb, one other goal is to be marginally compatible with matrix-org/lb, which is a reference implementation of the "MSC", which contains the following bit in the proposal text;

DTLS operates over UDP which is "connectionless". This makes restarting connections after restarting the server difficult.
TCP has the concept of FIN packets, which let the client know that the connection they currently are using is dead and should be terminated. UDP has no equivalent. For this reason, restarting the server will take up to the connection timeout value for clients to detect a dead connection. This can be shortened by modifying DTLS to send CloseNotify alerts when the UDP socket receives unrecognised data.

While one problem with this section is that, if this was taken at "face value", this would allow a DoS attack, I'd like to also add in the additional functionality;

Allow an optional configuration setting in the server listener, which will respond to unrecognised DTLS packets with CloseNotify, and only every X amount of times for connections in a timeout.

This would mitigate DoS amplification attacks, while eventually edging off existing DTLS connections after such a server restart.

Of course, the default behaviour would be to drop these packets; this is opt-in behaviour.
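One way to read "only every X amount of times" is a per-peer rate limit: the listener answers an unrecognised packet with CloseNotify at most once per interval and stays silent otherwise. The sketch below assumes that interpretation; `CloseNotifyLimiter` and its methods are hypothetical names, not part of dustls.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// Hypothetical opt-in limiter deciding when an unrecognised packet
/// may be answered with a CloseNotify alert.
pub struct CloseNotifyLimiter {
    /// Off by default in the proposal: packets are simply dropped.
    enabled: bool,
    /// Minimum time between two CloseNotify replies to the same peer.
    min_interval: Duration,
    /// When each peer was last answered. A real implementation would
    /// also have to bound this map, or it becomes a DoS vector itself.
    last_sent: HashMap<SocketAddr, Instant>,
}

impl CloseNotifyLimiter {
    pub fn new(enabled: bool, min_interval: Duration) -> Self {
        CloseNotifyLimiter { enabled, min_interval, last_sent: HashMap::new() }
    }

    /// Returns true if a CloseNotify may be sent to `peer` at time `now`.
    pub fn should_respond(&mut self, peer: SocketAddr, now: Instant) -> bool {
        if !self.enabled {
            return false;
        }
        let answered_recently = self
            .last_sent
            .get(&peer)
            .map_or(false, |&last| now.duration_since(last) < self.min_interval);
        if answered_recently {
            return false;
        }
        self.last_sent.insert(peer, now);
        true
    }
}
```

Because at most one small alert goes out per peer per interval, the reply traffic stays bounded regardless of how many unrecognised packets arrive.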

Server `Handshaker` will not transition to `Connection` until client sends data

The DTLS 1.2 handshake looks like this (handshake diagram not reproduced here).
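For reference, the flight structure of a full DTLS 1.2 handshake, as laid out in Figure 1 of RFC 6347, can be written down as a small Rust enum. The type and method names here are illustrative and not part of dustls; messages marked `*` are optional, and the bracketed ChangeCipherSpec is not itself a handshake message.

```rust
/// Which side sends a given flight.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Sender {
    Client,
    Server,
}

/// The six flights of a full (non-resumed) DTLS 1.2 handshake.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Flight {
    One,
    Two,
    Three,
    Four,
    Five,
    Six,
}

impl Flight {
    /// Odd flights come from the client, even flights from the server.
    pub fn sender(self) -> Sender {
        match self {
            Flight::One | Flight::Three | Flight::Five => Sender::Client,
            Flight::Two | Flight::Four | Flight::Six => Sender::Server,
        }
    }

    /// Messages carried by each flight, in order.
    pub fn messages(self) -> &'static [&'static str] {
        match self {
            Flight::One => &["ClientHello"],
            Flight::Two => &["HelloVerifyRequest"],
            Flight::Three => &["ClientHello (with cookie)"],
            Flight::Four => &[
                "ServerHello",
                "Certificate*",
                "ServerKeyExchange*",
                "CertificateRequest*",
                "ServerHelloDone",
            ],
            Flight::Five => &[
                "Certificate*",
                "ClientKeyExchange",
                "CertificateVerify*",
                "[ChangeCipherSpec]",
                "Finished",
            ],
            Flight::Six => &["[ChangeCipherSpec]", "Finished"],
        }
    }
}
```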

All of this is able to be encapsulated in Handshaker objects, which handle all of this (+ retransmission) for the clients.

Because of this, handshakes have a special API which makes them able to alert the program to wait until a certain moment to retransmit the packets they'd want to send.

However, part of the 6th flight could get lost, either the CCS or the Finished packet, which means the client can end up in a deadlock. See the following passage from [RFC6347];

   In addition, for at least twice the default MSL defined for [TCP],
   when in the FINISHED state, the node that transmits the last flight
   (the server in an ordinary handshake or the client in a resumed
   handshake) MUST respond to a retransmit of the peer's last flight
   with a retransmit of the last flight.  This avoids deadlock
   conditions if the last flight gets lost.

However, this would block the server handshaker from transitioning into a connection object until the client has sent any application data.

It would be more "correct" to wait until a client sends data, but depending on the application, the server might want to immediately start to send data.

This could be especially problematic with resuming sessions, or #9, which might require this kind of "last state" to be certain before either side starts sending buffered data again.

For now, I think it's cleaner to wait for application data. In the future I could add a "server handshaker mode" which transitions to a connection object immediately after finishing the last flight, but keeps that flight sealed and ready to be sent again if the client were to retransmit flight 5. This would be a ton of extra logic to keep around, though.

Published name

I'm still undecided as to how to publish the crate on crates.io. I have already tried to get in contact with the owner of the dtls crate, and if that comes through, I'll use that name.

However, if it doesn't come through, I'd like an alternative name that can be instantly used. I've thought of rustls-dtls, but that would pledge and use the rustls namespace, which I am not 100% sure on yet.

Funny or clever names are appreciated though :D

Connection ID extension

As per Draft 13 of the Connection ID extension, DTLS wants to ensure that connections can persist over different server-client address+port tuples. However, it does not yet specify a method by which to securely update these peer addresses. This could be seen as something analogous to QUIC's persistence over multiple peer addresses, and/or TLS resumption.

Regardless, this is a problem for the design proposed in #1, as v1.3 requires and efficiently integrates these connection IDs, and so this is required to support v1.3.

The problem for #1, however, is that right now the application multiplexes the underlying transport to the DTLS objects, and there is an implicit expectation that peer addresses are immutable after they've been established.

The connection ID proposal effectively says "no, everything is multiplexed as per the connection ID", with an asterisk that, to prevent attacks, the previous peer address should be considered canonical until a renegotiation has taken place.

This effectively requires a more complex back and forth between the application and DTLS layer, for which a good design would be necessary.

Use of CHACHA20 alongside GCM

RFC 6347 Section 3.1 mentions the following;

DTLS solves the first problem by banning stream ciphers.

From this, I concluded that the use of CHACHA20 (a stream cipher) would not work in DTLS.

However, from looking at the source code in rustls, at the IANA canonical list of cipher suites, and at RFC 7905, using CHACHA20 looks to be possible, as rustls plugs sequence numbers into both algorithms in just the same way.

`DIO` and `SelectiveDIO` traits, and `DeMux`

#13 raised the requirement for something like Connection objects to become fall-through. However, I don't think this works well with the black-box approach.

Furthermore, together with #10, I'd like to make something reusable and generic, so I'll settle on a set of traits, and an object/trait;

DIO, "Datagram In Out", would also immediately standardise a Datagram IO.
This'll only be on connection objects for now, but I could possibly look into whether I can implement this for things like UdpSocket.
This is currently only meant to signal "upwards" IO: inputs to this will go up an abstraction layer, outputs will go down towards more physical layers.
SelectiveDIO, basically DIO, but can reject a packet.
DeMux, I'm not entirely sure if this'll be an object or a trait, but effectively this'll store many SelectiveDIO objects, and try each one in sequence until one of them accepts. This way, a connection object (wrapped in a dynamic selector), or an SRTP sink/buffer, can be multiplexed over the same "path" that a router would select such an object for.

I don't know how well this would work/clash with #9, as partially this is also supposed to solve that bit, where multiple connections can exist over the same peering, but this could be a generic enough solution for now.
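The relationship between the three pieces can be sketched with two traits and a struct. This is only one possible shape under assumptions: the method names (`input`, `output`, `accepts`, `try_input`, `route`) and the `FirstByteSink` example type are hypothetical, and `DeMux` is shown as an object although the text leaves open whether it should be a trait.

```rust
use std::ops::RangeInclusive;

/// "Datagram In Out": inputs travel up a layer, outputs travel down.
pub trait Dio {
    fn input(&mut self, datagram: &[u8]);
    fn output(&mut self) -> Option<Vec<u8>>;
}

/// Like `Dio`, but allowed to reject a datagram.
pub trait SelectiveDio: Dio {
    fn accepts(&self, datagram: &[u8]) -> bool;

    /// Consumes the datagram and returns true only if it was accepted.
    fn try_input(&mut self, datagram: &[u8]) -> bool {
        if self.accepts(datagram) {
            self.input(datagram);
            true
        } else {
            false
        }
    }
}

/// Tries each sink in sequence until one of them accepts.
pub struct DeMux {
    sinks: Vec<Box<dyn SelectiveDio>>,
}

impl DeMux {
    pub fn new() -> Self {
        DeMux { sinks: Vec::new() }
    }

    pub fn push(&mut self, sink: Box<dyn SelectiveDio>) {
        self.sinks.push(sink);
    }

    /// Returns the index of the first sink that accepted the datagram.
    pub fn route(&mut self, datagram: &[u8]) -> Option<usize> {
        self.sinks.iter_mut().position(|s| s.try_input(datagram))
    }
}

/// Example sink that selects on the first byte and only counts what it receives.
pub struct FirstByteSink {
    range: RangeInclusive<u8>,
    pub received: usize,
}

impl FirstByteSink {
    pub fn new(range: RangeInclusive<u8>) -> Self {
        FirstByteSink { range, received: 0 }
    }
}

impl Dio for FirstByteSink {
    fn input(&mut self, _datagram: &[u8]) {
        self.received += 1;
    }

    fn output(&mut self) -> Option<Vec<u8>> {
        None // this toy sink never produces outgoing datagrams
    }
}

impl SelectiveDio for FirstByteSink {
    fn accepts(&self, datagram: &[u8]) -> bool {
        datagram.first().map_or(false, |b| self.range.contains(b))
    }
}
```

Since `position` stops at the first sink that returns true, the order in which sinks are pushed determines priority.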

For #13, this could be implemented by providing a selector which only checks the first byte B of the datagram, and moves on if it doesn't match;
127 < B < 192: forward to RTP
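The first-byte check comes from the demultiplexing rule in RFC 5764, Section 5.1.2, which also assigns ranges to STUN and DTLS. A minimal classifier along those lines might look as follows; `PacketKind` and `classify` are illustrative names, and only the RTP range is taken from the text above.

```rust
/// What a datagram appears to be, judged by its first byte only.
#[derive(Debug, PartialEq, Eq)]
pub enum PacketKind {
    Stun,
    Dtls,
    Rtp,
    Unknown,
}

/// First-byte demultiplexing as described in RFC 5764, Section 5.1.2.
pub fn classify(datagram: &[u8]) -> PacketKind {
    match datagram.first().copied() {
        Some(0..=1) => PacketKind::Stun,    // B < 2
        Some(20..=63) => PacketKind::Dtls,  // 19 < B < 64
        Some(128..=191) => PacketKind::Rtp, // 127 < B < 192
        _ => PacketKind::Unknown,           // empty datagram or any other value
    }
}
```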

Possibly, in the future, this could allow for a QUIC+DTLS dual endpoint and router.

PMTU discovery at time of handshake

One thing I'd like to implement is a "smart" discovery mechanism for the path MTU at handshake time.

While the normal behaviour for the client would be to back off and retransmit with smaller packets if the server hasn't replied yet, I'd like to add in an optional opt-in piggy-back mechanism which would have to be implemented on both the client and server level.

Basically, when the server receives flight 1, it derives the client-discovered MTU, and then performs a little experiment; while it responds with the ServerHello with packets in a similar MTU, it duplicates the first sequence of bytes it sends with a slightly increased MTU.

The client would then have to be configured to recognise overlapping packets, one of them at a higher MTU, and respond at that higher MTU, again copying its first sequence of bytes at a slightly higher bound.

This could be bounced back and forth a few times, until the bigger-MTU packets go MIA, after which the increasing is stopped. This would spend a few more datagram packets in exchange for MTU discovery that is secured by, and happens at the same time as, the handshake.

Then, the server and the client can give the discovered MTU to the connection objects they spawn, as they do normally.

This would have to be optional and opt-in on both sides. The RFC indicates that this behaviour would be allowed, and that it doesn't trigger any fatal alerts on either side.
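The bookkeeping for this probing could be as small as the sketch below: keep the largest size known to work, propose a slightly larger probe, and stop as soon as a probe goes missing. `MtuProbe`, its methods, and the step and ceiling parameters are all hypothetical; how a lost probe is detected (for example through the retransmission timer) is left out.

```rust
/// Hypothetical bookkeeping for piggy-backed MTU probing during the handshake.
pub struct MtuProbe {
    /// Largest datagram size known to have reached the peer.
    confirmed: usize,
    /// How much larger each probe is than the confirmed size.
    step: usize,
    /// Upper bound that is never exceeded.
    ceiling: usize,
    /// Set once a probe was lost; probing stops for good.
    done: bool,
}

impl MtuProbe {
    pub fn new(initial: usize, step: usize, ceiling: usize) -> Self {
        MtuProbe { confirmed: initial, step, ceiling, done: false }
    }

    /// Size of the next duplicated, slightly larger packet, if any.
    pub fn next_probe(&self) -> Option<usize> {
        if self.done || self.confirmed >= self.ceiling {
            None
        } else {
            Some((self.confirmed + self.step).min(self.ceiling))
        }
    }

    /// The peer answered at the probed size, so that size works.
    pub fn on_probe_echoed(&mut self, size: usize) {
        if !self.done && size > self.confirmed {
            self.confirmed = size.min(self.ceiling);
        }
    }

    /// The bigger packet went missing: stop increasing.
    pub fn on_probe_lost(&mut self) {
        self.done = true;
    }

    /// The value handed to the spawned connection object.
    pub fn mtu(&self) -> usize {
        self.confirmed
    }
}
```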

Buffer application data when expecting flight 5 or 6

DTLS implementations are likely to send application data alongside flight 5 or 6, without waiting for confirmation. We should buffer this application data and hand it to the connection object, so that it can be decrypted immediately.

Replay Protection

Brainstormed some ideas for anti-replay, and they're as follows;

Have a VecDeque with epoch and sequence information on the packets, plus their arrival time.
Have a "max age" parameter by which all older packets are discarded.

The first would be done with a 24-byte structure as;

pub struct PacketToken {
  epoch_seq: u64, // epoch and sequence in one
  at: Instant, // u128 on linux, u64 on macos and windows
}

This would discard anything older than 2 minutes, or FIFO at a max amount (configurable).

The oldest packet will update the "max age" sequence counter when it is popped from the queue, but only if it is younger than the age parameter.

Searching the VecDeque for a match when a packet comes in would introduce a little overhead, but that is the cost of replay protection.

It would also cause a little memory overhead. (On Linux, approximately 2.4 MB for 100k packets indexed, which corresponds to 150 MB of traffic at a maximum MTU of 1500 bytes, so 2 minutes of downloading at roughly 10 Mbps.)
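Put together, the idea could look roughly like the following. It is a sketch under assumptions, not the planned implementation: `ReplayLog`, `pack`, and `check_and_insert` are hypothetical names, the packing follows the 16-bit epoch and 48-bit sequence number of the DTLS 1.2 record header, and as a simplification every evicted entry raises the cut-off, whether it left because of age or because of the FIFO limit.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy)]
pub struct PacketToken {
    epoch_seq: u64, // epoch and sequence in one
    at: Instant,    // arrival time
}

/// Packs a 16-bit epoch and a 48-bit sequence number into one u64.
pub fn pack(epoch: u16, seq: u64) -> u64 {
    ((epoch as u64) << 48) | (seq & 0x0000_FFFF_FFFF_FFFF)
}

pub struct ReplayLog {
    seen: VecDeque<PacketToken>,
    max_age: Duration,
    max_len: usize,
    /// Highest epoch_seq evicted so far; anything at or below it is rejected.
    floor: Option<u64>,
}

impl ReplayLog {
    pub fn new(max_age: Duration, max_len: usize) -> Self {
        ReplayLog { seen: VecDeque::new(), max_age, max_len, floor: None }
    }

    /// Returns true if the packet is new (and records it), false if it is
    /// a replay or older than what the log can still vouch for.
    pub fn check_and_insert(&mut self, epoch_seq: u64, now: Instant) -> bool {
        // Drop entries that exceeded the maximum age.
        while let Some(front) = self.seen.front().copied() {
            if now.duration_since(front.at) <= self.max_age {
                break;
            }
            self.evict_front();
        }
        if self.floor.map_or(false, |f| epoch_seq <= f) {
            return false;
        }
        // Linear search: the "little overhead" mentioned above.
        if self.seen.iter().any(|t| t.epoch_seq == epoch_seq) {
            return false;
        }
        self.seen.push_back(PacketToken { epoch_seq, at: now });
        // FIFO eviction once the configured maximum is exceeded.
        while self.seen.len() > self.max_len {
            self.evict_front();
        }
        true
    }

    fn evict_front(&mut self) {
        if let Some(t) = self.seen.pop_front() {
            self.floor = Some(self.floor.map_or(t.epoch_seq, |f| f.max(t.epoch_seq)));
        }
    }
}
```

Raising the cut-off on eviction means a legitimate but very late packet with a lower sequence number is dropped too; that trade-off is inherent to forgetting old entries.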

`UseSRTP` Extension

Implement the use_srtp extension as per RFC 5764

Robust router for UDP

While #1 provides a general API that is agnostic across all transports, I think it would be helpful for applications to have a "hit the ground running" UDP router which takes on the task of polling and multiplexing the transport layer in its own way, via std UdpSocket objects.

Because of this, I'll probably want to make this a separate crate (because of #5).

I also want to mention https://github.com/shadowjonathan/exit-left here, which would allow a thread-based blocking polling mechanism that only allows one thread to read the socket at any time, routing the results to the right waiting thread.

I could possibly supply an async solution in the future as well.

The main motivation for this is to prevent mistakes in applications, as otherwise they'd have to re-implement the same router for every application, which could introduce data loss or vulnerabilities/problems.