Comments (4)
Here is a demonstration of using an encapsulated session/reliability protocol to persist a session across multiple TCP connections.
turbo-tunnel-reconnection-demo.zip
There are two implementations, reconnecting-kcp and reconnecting-quic. The client reads from the keyboard and writes to the server, then outputs whatever it receives from the server. The server is an echo server, except it swaps uppercase to lowercase and vice versa, and it sends a "[heartbeat]" line every 10 seconds (just so that there's some server-initiated traffic).
$ server 127.0.0.1:4000
$ client 127.0.0.1:4000
2019/10/16 19:40:05 begin KCP session a01140b7
2019/10/16 19:40:05 begin TCP connection 127.0.0.1:37738 -> 127.0.0.1:4000
Hello World.
hELLO wORLD.
test
TEST
[heartbeat]
abababa
ABABABA
[heartbeat]
It gets interesting when you interpose something that terminates TCP connections. The included lilbastard program is a TCP proxy that terminates connections after a fixed timeout, a technique that has reportedly been used to disrupt long-lived tunnels. (You may remember I identified this as one of the problems that the Turbo Tunnel idea can help solve in the original post.) Here you see a client–server session persisting despite the carrier TCP connections being terminated every 10 seconds.
$ server 127.0.0.1:4000
$ lilbastard -w 10 127.0.0.1:3000 127.0.0.1:4000
$ client 127.0.0.1:3000
2019/10/16 19:56:11 begin KCP session f814d839
2019/10/16 19:56:11 begin TCP connection 127.0.0.1:52762 -> 127.0.0.1:3000
test
TEST
[heartbeat]
hello again
2019/10/16 19:56:29 end TCP connection 127.0.0.1:52762 -> 127.0.0.1:3000
2019/10/16 19:56:29 begin TCP connection 127.0.0.1:52766 -> 127.0.0.1:3000
HELLO AGAIN
[heartbeat]
2019/10/16 19:56:41 end TCP connection 127.0.0.1:52766 -> 127.0.0.1:3000
2019/10/16 19:56:41 begin TCP connection 127.0.0.1:52770 -> 127.0.0.1:3000
[heartbeat]
This overall paradigm is called "connection migration" in QUIC. However, neither kcp-go nor quic-go supports connection migration natively. (kcp-go uses the client source address, along with the KCP conversation ID, as part of the key that distinguishes conversations; quic-go explicitly does not support the rather complicated QUIC connection migration algorithm.) Therefore we must layer our own connection migration on top. We do it in a way similar to how Mosh (Section 2.2) and WireGuard (Section 2.1) do it. The server accepts multiple simultaneous TCP connections. When it needs to send a packet to a particular client, it sends the packet on whichever TCP connection most recently received a packet from that client. Connection migration is the purpose of the connMap data type in the server.
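The "most recently received" routing rule can be sketched roughly as follows. This is only an illustration, not the connMap from the demo: the names clientID, Saw, and Route are made up here, and a string stands in for the carrier connection.

```go
package main

import (
	"fmt"
	"sync"
)

// clientID is a hypothetical stand-in for whatever persistent identifier the
// session protocol provides (for example, a KCP conversation ID).
type clientID uint32

// connMap remembers, for each client ID, the carrier connection that most
// recently delivered a packet from that client. A real server would store a
// net.Conn or a send queue rather than a string label.
type connMap struct {
	mu     sync.Mutex
	latest map[clientID]string
}

func newConnMap() *connMap {
	return &connMap{latest: make(map[clientID]string)}
}

// Saw records that a packet from id just arrived on carrier.
func (m *connMap) Saw(id clientID, carrier string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.latest[id] = carrier
}

// Route returns the carrier to use when sending to id, if one is known.
func (m *connMap) Route(id clientID) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	carrier, ok := m.latest[id]
	return carrier, ok
}

func main() {
	m := newConnMap()
	m.Saw(0xa01140b7, "tcp-1")
	m.Saw(0xa01140b7, "tcp-2") // the client reconnected on a new TCP connection
	carrier, _ := m.Route(0xa01140b7)
	fmt.Println(carrier) // prints "tcp-2"
}
```

A real implementation would also need to expire entries when a session ends, which this sketch leaves out.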
In order to make connection migration work, we need a persistent "client ID" that outlives any particular transient TCP connection, lasting as long as the client's session does. With kcp-go, this is easy, as the kcp.UDPSession type has a GetConv method that exposes the 32-bit KCP conversation ID, and the conversation ID is easy to parse out of raw packets (it's just the first 4 bytes). With quic-go it's a little harder, because although QUIC connections natively have a connection ID, quic-go does not expose it; and it's not trivial to parse the connection ID from raw packets. So in the quic-go implementation, the client prefixes its QUIC packets with its own randomly generated client ID. This effectively adds a field to each QUIC packet without breaking any quic-go abstractions, at the cost of some network overhead. When the serverPacketConn does a ReadFrom or WriteTo, the addresses it deals with are these "client IDs," not actual network addresses that would be bound to a particular TCP connection.
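Here is a minimal sketch of both ways of getting a client ID out of a raw packet. Several things are assumptions made for illustration: the 8-byte length of the random client ID, the little-endian byte order of the KCP conversation ID, and the helper names kcpConv, addPrefix, and splitPrefix, none of which come from the demo code.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// clientIDLen is an assumed length; the post does not say how long the
// randomly generated client ID is.
const clientIDLen = 8

type clientID [clientIDLen]byte

// kcpConv reads the conversation ID from the first 4 bytes of a raw KCP
// packet. Little-endian byte order is an assumption of this sketch.
func kcpConv(p []byte) (uint32, error) {
	if len(p) < 4 {
		return 0, errors.New("packet too short for a conversation ID")
	}
	return binary.LittleEndian.Uint32(p[:4]), nil
}

// addPrefix returns a new slice holding id followed by the packet p.
func addPrefix(id clientID, p []byte) []byte {
	out := make([]byte, 0, clientIDLen+len(p))
	out = append(out, id[:]...)
	return append(out, p...)
}

// splitPrefix separates the leading client ID from the rest of the packet.
func splitPrefix(p []byte) (clientID, []byte, error) {
	var id clientID
	if len(p) < clientIDLen {
		return id, nil, errors.New("packet too short for a client ID")
	}
	copy(id[:], p[:clientIDLen])
	return id, p[clientIDLen:], nil
}

func main() {
	conv, _ := kcpConv([]byte{0xb7, 0x40, 0x11, 0xa0, 0x00})
	fmt.Printf("%08x\n", conv) // prints "a01140b7"

	var id clientID
	if _, err := rand.Read(id[:]); err != nil {
		panic(err)
	}
	wire := addPrefix(id, []byte("quic packet"))
	gotID, payload, err := splitPrefix(wire)
	if err != nil {
		panic(err)
	}
	fmt.Println(gotID == id, bytes.Equal(payload, []byte("quic packet"))) // prints "true true"
}
```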
A note about combining kcp-go and smux: earlier I said "The separation of kcp-go and smux into two layers could be useful for efficiency... [If an application makes just one long-lived connection] you could omit smux and only use kcp-go." I tried doing that here, because in the demonstration programs, each client requires only one stream. I eventually decided that you really need smux anyway. This is because KCP alone does not define any kind of connection termination, so after a client disappears, the server would have a kcp.UDPSession in memory that would never go away. smux has an idle timeout that ensures that dead sessions get removed.
from bbs.
Turbo Tunnel in obfs4proxy (survives TCP connection termination)
Recall from my first post one of the problems with existing circumvention designs, that the turbo tunnel idea can help solve: "Censors can disrupt obfs4 by terminating long-lived TCP connections, as Iran did in 2013, killing connections after 60 seconds."
Here are proof-of-concept branches implementing the turbo tunnel idea in obfs4proxy, one using kcp-go/smux and one using quic-go:
- https://dip.torproject.org/dcf/obfs4/tree/reconnecting-kcp
- https://dip.torproject.org/dcf/obfs4/tree/reconnecting-quic
As diffs:
- Changes needed to add kcp-go/smux to plain obfs4proxy
- Changes needed to adapt kcp-go/smux to quic-go
Using either of these branches, your circumvention session is decoupled from any single TCP connection. If a TCP connection is terminated, the obfs4proxy client will establish a new connection and pick up where it left off. An error condition is signaled to the higher-level application only when there's a problem establishing a new connection. Otherwise, transient connection termination is invisible (except as a brief increase in RTT) to Tor and whatever other application layers are being tunnelled.
I did a small experiment showing how a Tor session can persist, despite the obfs4 layer being interrupted every 20 seconds. I configured the "little bastard" connection terminator to forward from a local port to a remote bridge, and terminate connections after 20 seconds.
lilbastard$ cargo run -- -w 20 127.0.0.1:3000 192.81.135.242:4000
On the bridge, I ran tor using either plain obfs4proxy, or one of the two turbo tunnel branches. (I did the experiment once for each of the three configurations.)
DataDirectory datadir.server
SOCKSPort 0
ORPort auto
BridgeRelay 1
AssumeReachable 1
PublishServerDescriptor 0
ExtORPort auto
ServerTransportListenAddr obfs4 0.0.0.0:4000
ServerTransportPlugin obfs4 exec ./obfs4proxy -enableLogging -unsafeLogging -logLevel DEBUG
# ServerTransportPlugin obfs4 exec ./obfs4proxy.kcp -enableLogging -unsafeLogging -logLevel DEBUG
# ServerTransportPlugin obfs4 exec ./obfs4proxy.quic -enableLogging -unsafeLogging -logLevel DEBUG
On the client, I configured tor to use the corresponding obfs4proxy executable, and connect to the bridge through the "little bastard" proxy. (If you do this, your bridge fingerprint and cert will be different.)
DataDirectory datadir.client
SOCKSPort 9250
UseBridges 1
Bridge obfs4 127.0.0.1:3000 94E4D617537C3E3CEA0D1D6D0BC852B5A7613B77 cert=6rB8kVd981U0G2b9nXioB5o0Zu7tDpDkoZyPe2aCmqFzGmfaSiNIfQvkJABakH+DfYwWRw iat-mode=0
ClientTransportPlugin obfs4 exec ./obfs4proxy -enableLogging -unsafeLogging -logLevel DEBUG
# ClientTransportPlugin obfs4 exec ./obfs4proxy.kcp -enableLogging -unsafeLogging -logLevel DEBUG
# ClientTransportPlugin obfs4 exec ./obfs4proxy.quic -enableLogging -unsafeLogging -logLevel DEBUG
Then, I captured traffic for 90 seconds while downloading a video file through the tor proxy.
$ curl -L -x socks5://127.0.0.1:9250/ -o /dev/null https://archive.org/download/ucberkeley_webcast_itunesu_390697355/1.%202007-12-07%20-%20Keynote%20Address%3A%20The%20China%20Sustainable%20Energy%20Renewable%20Energy%20Program.mp4
The graph below depicts the amount of network traffic in each direction over time. In the "plain" chart, see how the download stops after the first connection termination at 20 s. Every 20 s after that, there is a small amount of activity, which is tor reconnecting to the bridge (and the resulting obfs4 handshake). But it doesn't matter, because tor has already signaled the first connection termination to the application layer, which gave up:
curl: (18) transfer closed with 111535615 bytes remaining to read
In comparison, the "kcp" and "quic" charts keep on downloading, being only momentarily delayed by a connection termination. The "kcp" chart is sparser than the "quic" chart, showing a lower overall speed. The "plain" configuration downloaded 3711 KB before giving up at 20 s; "kcp" downloaded only 1359 KB over the full 90 s; and "quic" downloaded 22835 KB over the full 90 s. It should be noted that this wasn't a particularly controlled experiment, and I didn't try experimenting with any performance parameters. I wouldn't conclude from this that KCP is necessarily slower than QUIC.
Notes:
- How this works architecturally: on the client side, we replace the original TCP Dial call with either kcp.NewConn2 or quic.Dial, over an abstract packet-sending interface (clientPacketConn). clientPacketConn runs a loop that repeatedly connects to the same destination and exchanges packets (represented as length-prefixed blobs in a TCP stream) as long as the connection is good, reporting an error only when a connection attempt fails. On the server side, we replace the TCP Listen call with either kcp.ServeConn or quic.Listen, over an abstract serverPacketConn. serverPacketConn opens a single TCP listener, takes length-prefixed packets from all the TCP streams that arrive at the listener, and feeds them into a single KCP or QUIC engine. Whenever we need to send a packet for a particular connection ID, we send it on the TCP stream that most recently sent us a packet for that connection ID.
- There's no need for this functionality to be built into obfs4proxy itself. It could be done as a separate program:
  ------------ client ------------                ------------ bridge ------------
  tor -> turbotunnel -> obfs4proxy -> internet -> obfs4proxy -> turbotunnel -> tor
  But this kind of process layering is cumbersome with pluggable transports.
- I'm passing a blank client IP address to the pt.DialOr call (this information is used for geolocation in Metrics graphs). That's because an OR connection no longer corresponds to a single incoming TCP connection with its single IP address; instead it corresponds to an abstract "connection ID" that remains constant across potentially many TCP connections. In order to make this work, you would have to define some heuristic such as "the client IP address associated with the OR connection is that of the first TCP connection that carried that connection ID."
Thanks for the really great work on this!
Here are some thoughts I have after taking a stab at a simpler version of this for Snowflake.
There's no need for this functionality to be built into obfs4proxy itself.... But this kind of process layering is cumbersome with pluggable transports.
I could see the benefit of making some of these functions more generic and extensible so that Turbo Tunnel can be a separate library. In order to integrate it, PT developers would still have to make source code changes, but according to some well-defined API.
An example of how some of the existing functions on the client side could be made into API calls would be to modify dialAndExchange to take in a Dialer interface:
func (c *clientPacketConn) DialAndExchange(d net.Dialer, network, address string) error {
	addrStr := log.ElideAddr(c.addr)
	conn, err := d.Dial(network, address)
	// ...
}
It's pretty much just the dial functionality that's specific to obfs4 in this case. This would require some refactoring in obfs4 (and Snowflake or any other PT) to implement a Dialer interface in place of what's already there of course.
Perhaps the Dialer interface required by net.Conn isn't expressive enough; in that case it could be a wrapper interface with a Dialer member in addition to the other information or functions we'd need.
I'm passing a blank client IP address to the pt.DialOr call (this information is used for geolocation in Metrics graphs). That's because an OR connection no longer corresponds to a single incoming TCP connection with its single IP address; instead it corresponds to an abstract "connection ID" that remains constant across potentially many TCP connections. In order to make this work, you would have to define some heuristic such as "the client IP address associated with the OR connection is that of the first TCP connection that carried that connection ID."
Another way to handle this is to make a new net.Conn interface on top of the underlying stream net.Conn with its own implementation of RemoteAddr that returns a client address that makes sense to pt.DialOr. In your current implementation, calls to RemoteAddr seem to be used just for logging at the moment. The new interface could also expose the session address with an additional function SessionAddr if needed. This is the route we went with the work-in-progress Snowflake sequencing layer in making a SnowflakeConn interface that wraps an underlying net.Conn: proto.go#L150
There's no need for this functionality to be built into obfs4proxy itself.... But this kind of process layering is cumbersome with pluggable transports.
I could see the benefit of making some of these functions more generic and extensible so that Turbo Tunnel can be a separate library. In order to integrate it, PT developers would still have to make source code changes, but according to some well-defined API.
My feeling is that it's premature to be thinking about a reusable API or library. I want to discourage thinking of "Turbo Tunnel" as a specific implementation or protocol. It's more of an idea or design pattern. Producing a libturbotunnel that builds in design decisions like QUIC vs. KCP is not really on my roadmap. In any case, I feel a requirement for doing something like that is experience gained in implementing the idea a few times not as a reusable library, and not by me only.
I'm passing a blank client IP address to the pt.DialOr call (this information is used for geolocation in Metrics graphs). That's because an OR connection no longer corresponds to a single incoming TCP connection with its single IP address; instead it corresponds to an abstract "connection ID" that remains constant across potentially many TCP connections. In order to make this work, you would have to define some heuristic such as "the client IP address associated with the OR connection is that of the first TCP connection that carried that connection ID."
Another way to handle this is to make a new net.Conn interface on top of the underlying stream net.Conn with its own implementation of RemoteAddr that returns a client address that makes sense to pt.DialOr. In your current implementation, calls to RemoteAddr seem to be used just for logging at the moment. The new interface could also expose the session address with an additional function SessionAddr if needed. This is the route we went with the work-in-progress Snowflake sequencing layer in making a SnowflakeConn interface that wraps an underlying net.Conn: proto.go#L150
There's a type mismatch here though. Protocols like QUIC and KCP are fundamentally not based on an underlying stream. It's all discrete packets; i.e., it's a PacketConn, not a Conn. There's no consistent well-defined remote address for a PacketConn. You can call ReadFrom and it will tell you where that single packet came from, but that remote address may change for every call. And what's more, those packets don't even all necessarily belong to the same QUIC or KCP connection. It happens that in the special case of the obfs4 implementation, there is secretly a Conn underneath the PacketConn, so we can break the abstraction a little bit and adopt a "first remote address wins" heuristic. I actually don't think that's a big deal and I'm not worried about solving it.
The RemoteAddr of the QUIC or KCP connection, which is a Conn built on top of a PacketConn, is actually used internally by the QUIC or KCP library: it's the address that gets passed to WriteTo in the PacketConn. So we can't change the definition of RemoteAddr without really harming semantics. I would rather define this as a separate data field that is explicitly defined as ancillary information peeking through the abstraction, not using the standard Conn interfaces.