
Comments (23)

fortuna avatar fortuna commented on August 28, 2024 7

I keep iterating on this visualization. I found this new tree the best to figure out how to fully evade the block:

[Image: decision_tree (2)]

Now you just need to find a path to the green nodes.

How to bypass the blocking:
With that, it's easy to see that using Digital Ocean and a high port (we used 5555) fully bypasses the blocking on Tele2, Bee Line and MTS, and that adding the POST%20 or %16%03%01%00%C2%A8%01%01 prefix also bypasses the blocking on MegaFon.
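For readers wondering what bytes these prefixes put on the wire, here is a minimal sketch. It assumes the prefix is percent-encoded UTF-8 whose code points (each in the range 0-255) map one-to-one to bytes, which is my understanding of how the prefix parameter is interpreted. `decode_prefix` is a hypothetical helper, not part of any Outline API.

```python
from urllib.parse import unquote

def decode_prefix(url_encoded: str) -> bytes:
    """Turn a URL-encoded prefix into the raw bytes sent first on the wire.

    Assumption: each decoded Unicode code point stands for one byte (0-255).
    """
    text = unquote(url_encoded)    # "%C2%A8" is the UTF-8 encoding of U+00A8
    return text.encode("latin-1")  # code point N -> byte N; raises if N > 255

print(decode_prefix("POST%20"))
print(decode_prefix("%16%03%01%00%C2%A8%01%01").hex(" "))
```

Under that assumption, the second prefix decodes to seven bytes beginning with 0x16 0x03 0x01, which is how a TLS handshake record starts.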

from bbs.

us254 avatar us254 commented on August 28, 2024 5

It seems the censors are using the following pattern to detect and block obfuscated proxy traffic like Shadowsocks and VMESS:

1. The client sends at least 3 packets, each of which is 411 bytes or larger.

2. The server sends packets more frequently than the client (the server's packet sizes don't matter).

The blocking appears to be based on the ratio of the size of the packets sent by the client compared to those received from the server, rather than absolute packet sizes.
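To make the reported pattern concrete, here is a rough sketch of a checker for it. This is only one reading of the two conditions above, not the censor's actual classifier. The `Packet` tuple, the function name, counting "at least 3" over large client packets only, and reading "more frequently" as "more packets" are all assumptions.

```python
from typing import List, NamedTuple

class Packet(NamedTuple):
    from_client: bool  # True if sent by the client, False if sent by the server
    size: int          # TCP payload size in bytes

MIN_LARGE_CLIENT_PACKETS = 3  # condition 1: at least 3 client packets...
LARGE_PACKET_BYTES = 411      # ...each of 411 bytes or more

def matches_reported_pattern(flow: List[Packet]) -> bool:
    client = [p for p in flow if p.from_client]
    server = [p for p in flow if not p.from_client]
    large_client = [p for p in client if p.size >= LARGE_PACKET_BYTES]
    # Condition 2, read here as "the server sent more packets than the client".
    server_more_frequent = len(server) > len(client)
    return len(large_client) >= MIN_LARGE_CLIENT_PACKETS and server_more_frequent

# A flow with a single large client packet (like a plain HTTP request) does not match.
print(matches_reported_pattern([Packet(True, 500), Packet(False, 1400), Packet(False, 1400)]))
```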

Some key points:

  • The blocking was observed on major Russian mobile providers like Tele2, Megafon, MTS, Beeline and Yota in St. Petersburg. It affected traffic to foreign destinations.

  • HTTP traffic may not trigger the block because it typically has only one large client packet (the HTTP request), while the pattern expects at least two (e.g. ClientHello + HTTP request).

  • Adding a prefix to the Shadowsocks traffic to make the first packet look like a TLS ClientHello (over 411 bytes) helped avoid the blocking in some cases.

  • The blocking seems to only apply to TCP traffic. UDP traffic using QUIC did not appear to be blocked.

  • There are indications the blocking stopped in some regions like Bashkortostan and Tatarstan after April 30th, while it persisted in others.

The censors appear to be fingerprinting the traffic pattern of obfuscated proxy protocols, specifically looking for multiple large client packets and more frequent responses from the server. This allows blocking without needing to decrypt the traffic. Varying the traffic pattern, such as by adding dummy packets or varying packet sizes, may help circumvent this detection.


fortuna avatar fortuna commented on August 28, 2024 2

Here is a new dataset I generated, where I put everything in a single table:
https://gist.github.com/fortuna/41848697f0be93b2c2e222cd83096fcb

With the new dataset, I was able to generate this binary tree that characterizes the blocking in Russia:
[Image]

Orange is no blocking ("no error"), blue is blocking ("error").


fortuna avatar fortuna commented on August 28, 2024 1

By the way, the opt-in traffic metrics from Outline Servers show a slow drop in traffic from Russia after April 21, stabilizing after ~April 19:

[Image]


wkrp avatar wkrp commented on August 28, 2024 1

@fortuna has investigated and observed that the blocking of Shadowsocks-like protocols can depend on the remote server address range and the server port number.

https://ntc.party/t/7776/27

I ran some tests with the Outline SDK, which may work differently than other implementations. It looks like the blocking depends on the location of the server. It also depends on the port number. I was also able to confirm that the initial packet size makes a difference, but only in some ISPs.

  • Bee Line seems to be the only one considering the packet size, but only for the Vultr server, not DigitalOcean.
  • MTS blocked Shadowsocks access to a server on DigitalOcean, but not Vultr. They are likely using the IP address.
  • Blocking on DigitalOcean only happened for the key on port 443. No blocking for the key on the same server, but on port 5555.
  • MegaFon blocked Shadowsocks access to a server on Vultr, but not on DigitalOcean. They are likely using the IP address as well.
  • The packet size didn’t make a difference for MegaFon and MTS.
  • Tele2 is not blocking.

There are details of specific tests in the linked NTC post.

Detection only affecting certain server address ranges is similar to what happened with the blocking of fully encrypted protocols in China in 2023:

https://gfw.report/publications/usenixsecurity23/en/#6-2-not-all-subnets-ases-are-affected-equally

6.2 Not All Subnets/ASes are Affected Equally

Of the 5.5 million processed IPs, 98% of them are unaffected by the GFW’s blocking, suggesting that China is fairly conservative in employing this new censorship.

Figure 4 shows the top affected ASes. While this is skewed toward larger ASes (which have more IPs in our scan), it shows both ASes that are heavily affected (e.g., Alibaba US, Constant) and ones that are not (Akamai, Cloudflare). In addition, some ASes have a mix of affected and not affected prefixes (Amazon, Digital Ocean, Linode). All of the affected or partly-affected ASes we see are popular VPS providers that could be used to host proxy servers while large unaffected ASes do not typically sell VPS hosting to individual customers (e.g. CDNs).


fortuna avatar fortuna commented on August 28, 2024 1

@wkrp your interpretation of the tree is correct, but the conclusion for MTS is not 100% correct. There are slight variations in some cases.

I was able to create an optimized decision tree that is a lot easier to understand. It clarifies the classification for MTS:

[Image: decision_tree_comparison_optimized_multiline_no_prefix]

Of note, the EPHEMERAL_PORT, which corresponds to "Hetzner Online | 58987", fails for HTTPS.

You are right that if we use port 80 or 443 (I only had Digital Ocean with those), we can't find a strategy that works for both http and https. But we can find strategies for each of them individually.


Tw-C avatar Tw-C commented on August 28, 2024 1

Hey, guys! Since the beginning of this week I have noticed VPN performance deteriorating, and since yesterday it has not been possible to surf the Internet with the VPN enabled. Outline VPN does not work over Wi-Fi or even through the cellular operator Yota. I tried changing ports and adding prefixes, and that didn't help either. For the record, the location is the Caucasus, Chechen Republic. Usually, Russian authorities test such restrictions in the Caucasus and then roll them out to the rest of Russia. Looks like something big is about to happen.

Out of curiosity, were the prefixes added without enclosing them in angle brackets?

  • I think it should be &prefix=%16%03%01%00%C2%A8%01%01 or &prefix=POST%20
  • Not &prefix=<%16%03%01%00%C2%A8%01%01> nor &prefix=<POST%20>

Unless @fortuna would confirm otherwise.


fortuna avatar fortuna commented on August 28, 2024 1

Outline is just Shadowsocks, but on steroids! Blocking Shadowsocks is easy now, after so many years.

I'd say that after so many years, Shadowsocks is a lot harder to detect! The resistance is heavily based on the implementation, and Outline made numerous improvements with the help of the community over the years.

Shadowsocks is not easy to detect. When Shadowsocks is blocked, it's not because it's detected as Shadowsocks, but because the censor decided to allowlist protocols, and "looks random" is not in the allowlist. The blocking of "looks random" is often restricted to specific cloud providers that are popular with circumvention tools, to reduce collateral damage.


amircybersec avatar amircybersec commented on August 28, 2024 1

Been following this thread and wanted to add my two cents. You can kind of accomplish "look like any protocol of choice" by coming up with your own prefixes in the current implementation.

Basically, prefixes are padding for the initial part of the connection, which typically includes plain-text header data that middleboxes scan to categorize the traffic type. The current implementation comes with some limitations on prefix length. For example, you cannot fit a whole ClientHello message inside it.

The reason is that prefixes use bytes that are allocated for the initialization vector (IV), and fixing too much of the IV to a constant value can expose sessions and make them vulnerable.
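A small sketch of why the prefix length is limited: the prefix occupies the first bytes of the salt (IV), so every prefix byte is one less random byte. The 32-byte salt matches the salt size Shadowsocks AEAD uses with chacha20-ietf-poly1305. The 16-byte cap is an assumption for illustration, and `make_prefixed_salt` is a hypothetical helper, not Outline's code.

```python
import os

def make_prefixed_salt(prefix: bytes, salt_len: int = 32, max_prefix_len: int = 16) -> bytes:
    """Build a salt whose leading bytes are the prefix and whose remainder is random."""
    if len(prefix) > max_prefix_len:
        # Assumed policy: refuse prefixes that leave too little randomness in the salt.
        raise ValueError("prefix too long: too little randomness would remain in the salt")
    return prefix + os.urandom(salt_len - len(prefix))

salt = make_prefixed_salt(b"POST ")
print(len(salt), salt[:5])
```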

But in my experience they are quite effective and leave room for experimentation, since you can inject arbitrary data at the beginning of the connection.

The following literature has studied shadowsocks blocking in China:
https://dl.acm.org/doi/10.1145/3419394.3423644

In a nutshell, detecting whether traffic is fully encrypted is a non-trivial problem, since many connections look similar when judged by the entropy of their zeros and ones. China's approach is to set an acceptable level of collateral damage (the false alarm rate for detecting fully encrypted traffic), which in turn sets the detection threshold for their system. However, they do not apply this detection rule to all outgoing traffic, only to certain data center IP ranges. I believe the paper quotes some numbers on this.

The prefix can easily circumvent this system, as it injects plain text or known header types and fools it. However, a later-stage inspection (is it really TLS?) can still catch it.

Another important aspect to keep in mind is that Shadowsocks does not have any handshake or round-trip behavior (similar to TLS) at the beginning of the connection, so it cannot imitate a handshake. But technically it can be implemented at a higher level.

Regarding data center IP ranges, they are often the issue. I have a Shadowsocks server running on my home network (residential IP) and the IP has never been blocked for incoming traffic from Iran. However, I have multiple servers on DigitalOcean that could be pinged from Iran, and the IP got banned after a few weeks of usage. There were a few cases where my homebrew Shadowsocks server was being flagged at the protocol level BUT I could still connect with plain TCP to the listening socket. I use this website for these kinds of checks (it has a bunch of nodes in Russia too):

https://check-host.net/check-tcp
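A check of this kind can also be run from any machine you control. Below is a minimal sketch using only the Python standard library. It only tells you whether the TCP handshake completes, not whether Shadowsocks traffic passes afterwards. The function name is my own, and the address in the commented usage line is a placeholder.

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and unreachable hosts
        return False

# Example with a placeholder address (192.0.2.0/24 is reserved for documentation):
# tcp_reachable("192.0.2.1", 5555)
```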

In those cases adding ?prefix=HTTP%2F1.1%20 bypassed the blocking. But if the IP is blocked, then there is nothing else to do but get a new IP address. On my home ISP, I can usually get a new IP by power-cycling the router or with a few other private tricks ;)


fortuna avatar fortuna commented on August 28, 2024

I'm assuming you mean 3 packets after the TCP handshake.

The packet size signature depends on the Shadowsocks implementation. It would be helpful to distinguish them.

Many Shadowsocks implementations will send the IV and the connect request before the application data. Those are smaller than 411 bytes. Does it mean they won't be blocked?

With Outline, we merge the IV, the connect request and the initial data in one packet. Fewer packets, but the first one will be larger.

Do they all get blocked?

Also, because it needs 3 packets, does it mean only TLS 1.2 gets blocked, but not TLS 1.3?
I guess it depends on the SS implementation?

Thanks for the reports, but this is still quite confusing, we need some more clarity.


irgfw avatar irgfw commented on August 28, 2024

We have observed the same blockings in Iran, except for the "port" part. IRGFW doesn't care about the Port number in most cases. But the AS whitelisting is happening in Iran. Most protocols on data centers, like Azure or AWS, won't get blocked, but the same configuration will be blocked on famous ones like Hetzner, DigitalOcean, and Linode,...

However, in addition to Shadowsocks and VMESS, all VLESS combinations (with or without TLS) are affected, too.


fortuna avatar fortuna commented on August 28, 2024

I've done further investigation. Please find the results on this Gist, which allows you to filter by client ISP, server network, HTTPS, ...

Each file is a different transport I used. $key.tsv means a direct connection to the Outline Server. $key?prefix=....tsv uses the corresponding prefix. split:..|$key.tsv uses TCP stream splitting at the corresponding position (the number is the length of the first segment).
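To illustrate what a split position means here, a minimal sketch that cuts a byte stream into a first segment of a given length and the remainder. This is not the Outline SDK's API. `split_first_segment` is a hypothetical helper, and the 500-byte payload is made up.

```python
from typing import Tuple

def split_first_segment(payload: bytes, first_len: int) -> Tuple[bytes, bytes]:
    """Split payload so that the first segment has exactly first_len bytes."""
    if not 0 < first_len < len(payload):
        raise ValueError("first_len must fall strictly inside the payload")
    return payload[:first_len], payload[first_len:]

first, rest = split_first_segment(bytes(500), 5)
print(len(first), len(rest))  # 5 495
```

Whether two separate writes actually leave the machine as two TCP segments depends on the operating system. Disabling Nagle's algorithm (TCP_NODELAY) and writing each part separately is the usual way to encourage it.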

There are some remarkable findings:

  • The blocking behaves differently based on the tunneled application layer traffic. Tunneled HTTPS seems targeted.
  • TCP stream splitting affected the blocking differently in different cloud providers, and depending on the split position. Combining 5 and 300 splits did not improve evasion.
  • The POST%20 prefix and a TLS prefix with a message length greater than the record length (TLS Record Fragmentation) bypass almost all blocking. The FOOBAR%20 prefix helped in some cases and made things worse in others. This suggests that the prefix should look like a known protocol, since merely not looking random is not as effective.
  • There seem to be multiple blocking mechanisms, given the different errors and how they react to different strategies. It would be helpful if the community could help characterize them all.
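The TLS prefix used in this thread, %16%03%01%00%C2%A8%01%01, can be read field by field. The sketch below hardcodes the seven bytes it decodes to (assuming each URL-decoded code point maps to one byte) and parses them as a TLS record header followed by the start of a handshake header. Only the first of the three handshake length bytes is present, so the code can only give a lower bound for the message length.

```python
import struct

# Bytes of the prefix under the stated decoding assumption.
prefix = bytes([0x16, 0x03, 0x01, 0x00, 0xA8, 0x01, 0x01])

# TLS record header: content type (1 byte), version (2 bytes), length (2 bytes).
content_type, version, record_len = struct.unpack("!BHH", prefix[:5])

# Handshake header: message type (1 byte), then a 3-byte big-endian length.
handshake_type = prefix[5]
handshake_len_min = prefix[6] << 16  # only the most significant length byte is known

print(content_type == 0x16)            # handshake record
print(hex(version), record_len)        # 0x301 168
print(handshake_type == 0x01)          # ClientHello
print(handshake_len_min > record_len)  # the message claims to be longer than the record
```

So the record announces 168 bytes while the handshake message inside it announces at least 65536, which is the "message length greater than the record length" property described above.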

For now, Outline service providers should use one of the working prefixes.
It can also help to provide servers on different cloud providers, and on a high port number in addition to 443.


fortuna avatar fortuna commented on August 28, 2024

It seems that tree training was putting aside some training data.

Here is a tree with the full dataset and in SVG:
[Image: decision_tree_corrected (1)]

The classes are the curl exit codes.


fortuna avatar fortuna commented on August 28, 2024

Alternative view with only OK, TIMEDOUT and ERROR.
[Image: classifier_tree (1)]


wkrp avatar wkrp commented on August 28, 2024

Alternative view with only OK, TIMEDOUT and ERROR.

Ok, so if I interpret this, the root node has the condition isp_Tele2 Russia ≤ 0.5. So if isp_Tele2 Russia = 1 (the ISP is Tele2), then we go right and hit a leaf with class = OK. In other words, there is no blocking on Tele2, which agrees with the table. If isp_Tele2 Russia = 0 (the ISP is not Tele2), then we go left.

From there, the condition is server_port_5555 ≤ 0.5, so if the server port is 5555, we go right; otherwise we go left, and so on.

It looks like every ISP then has a mini decision tree, something along the lines of Ex1–Ex5 in China:

Allow a connection to continue if the first TCP payload (pkt) sent by the client satisfies any of the following exemptions:

  • Ex1: popcount(pkt)/len(pkt)≤3.4 or popcount(pkt)/len(pkt)≥4.6.
  • Ex2: The first six (or more) bytes of pkt are [0x20,0x7e].
  • Ex3: More than 50% of pkt’s bytes are [0x20,0x7e].
  • Ex4: More than 20 contiguous bytes of pkt are [0x20,0x7e].
  • Ex5: It matches the protocol fingerprint for TLS or HTTP.

Block if none of the above hold.
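For reference, Ex1 through Ex4 as quoted above can be sketched in a few lines of Python. Ex5 (the TLS/HTTP fingerprint) is not modeled, treating an empty payload as exempt is my own assumption, and the function names are hypothetical. This describes the rules reported for China, not whatever the Russian ISPs are doing.

```python
def is_printable(b: int) -> bool:
    """True for bytes in the printable ASCII range [0x20, 0x7e]."""
    return 0x20 <= b <= 0x7E

def longest_printable_run(pkt: bytes) -> int:
    best = run = 0
    for b in pkt:
        run = run + 1 if is_printable(b) else 0
        best = max(best, run)
    return best

def exempt(pkt: bytes) -> bool:
    """Return True if the first payload satisfies any of Ex1-Ex4."""
    if not pkt:
        return True  # assumption: nothing to classify
    bits_per_byte = sum(bin(b).count("1") for b in pkt) / len(pkt)
    ex1 = bits_per_byte <= 3.4 or bits_per_byte >= 4.6
    ex2 = len(pkt) >= 6 and all(is_printable(b) for b in pkt[:6])
    ex3 = sum(is_printable(b) for b in pkt) > len(pkt) / 2
    ex4 = longest_printable_run(pkt) > 20
    return ex1 or ex2 or ex3 or ex4

print(exempt(b"GET / HTTP/1.1\r\n"))  # printable start, so Ex2 applies
print(exempt(bytes([0x0F]) * 100))    # 4 set bits per byte, no printable bytes
```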

For example, by inspection, it looks like the only failure cases for MTS PJSC are when the server is on Digital Ocean and the port is 80 or 443. It's independent of the strategy column. So the tree for MTS PJSC would be:

if (server_net == "Digital Ocean")
    if (port == 80 || port == 443)
        return TIMEOUT;
    else
        return OK;
else
    return OK;
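The same rule as runnable Python, for anyone who wants to check it against the rows of the dataset. The function name and the string labels are my own choices, and the rule itself is only the by-inspection guess described above.

```python
def mts_outcome(server_net: str, port: int) -> str:
    """Predicted result for MTS PJSC according to the by-inspection rule."""
    if server_net == "Digital Ocean" and port in (80, 443):
        return "TIMEOUT"
    return "OK"

print(mts_outcome("Digital Ocean", 443))
print(mts_outcome("Digital Ocean", 5555))
```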


khamsolt avatar khamsolt commented on August 28, 2024

Hey, guys! Since the beginning of this week I have noticed VPN performance deteriorating, and since yesterday it has not been possible to surf the Internet with the VPN enabled. Outline VPN does not work over Wi-Fi or even through the cellular operator Yota. I tried changing ports and adding prefixes, and that didn't help either. For the record, the location is the Caucasus, Chechen Republic. Usually, Russian authorities test such restrictions in the Caucasus and then roll them out to the rest of Russia. Looks like something big is about to happen.


khamsolt avatar khamsolt commented on August 28, 2024

Of course, I didn't add brackets to it, for example:

ss://Y2hhY2hhMjAtaWV0Zi1wb2x5MTMwNTpweXRpYjVyQTlHaUE0THlxbzFOMjNm@255.255.255.255:5555/?outline=1&prefix=POST%20
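To double-check how a key like this is read, here is a minimal sketch that splits it into its parts with the Python standard library. It assumes the userinfo part is base64 of "method:password", as in the key above. `parse_access_key` is a hypothetical helper and does no validation.

```python
import base64
from urllib.parse import urlsplit, parse_qs

def parse_access_key(key: str) -> dict:
    """Split an ss:// access key into method, password, host, port and prefix."""
    parts = urlsplit(key)
    userinfo = parts.username or ""
    padded = userinfo + "=" * (-len(userinfo) % 4)  # restore base64 padding if stripped
    method, _, password = base64.urlsafe_b64decode(padded).decode().partition(":")
    query = parse_qs(parts.query)
    return {
        "method": method,
        "password": password,
        "host": parts.hostname,
        "port": parts.port,
        "prefix": query.get("prefix", [None])[0],  # already percent-decoded by parse_qs
    }

key = ("ss://Y2hhY2hhMjAtaWV0Zi1wb2x5MTMwNTpweXRpYjVyQTlHaUE0THlxbzFOMjNm"
       "@255.255.255.255:5555/?outline=1&prefix=POST%20")
info = parse_access_key(key)
print(info["method"], info["host"], info["port"], repr(info["prefix"]))
```

A prefix that had been wrapped in angle brackets by mistake would show up here with a leading "<".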


Tw-C avatar Tw-C commented on August 28, 2024

I keep iterating on this visualization. I found this new tree the best to figure out how to fully evade the block:

[Image: decision_tree (2)]

Now you just need to find a path to the green nodes.

How to bypass the blocking: With that, it's easy to see that using Digital Ocean and a high port (we used 5555) fully bypasses the blocking on Tele2, Bee Line and MTS, and that adding the POST%20 or %16%03%01%00%C2%A8%01%01 prefix also bypasses the blocking on MegaFon.

@khamsolt Do you think the visualisation here does not apply to your new situation? I would try a different, private IP to see whether the IP ranges of cloud services have been targeted, rather than the protocol being identified.


khamsolt avatar khamsolt commented on August 28, 2024

@Tw-C I have 5 vps servers in Europe, but it is unlikely that all of them will be blocked at once. Also, vpn works on some cell phone carriers, but wifi is blocked on all of them.

I have a Stockholm IP that is blocked by both my cell phone carrier and cable internet provider. But the other IP Vienna is only blocked by my cable provider. And I don't even know what to do.

I've asked friends, some have VLESS running, I don't know how much sense it makes to move the infrastructure to VLESS. I was somehow sure that Outline VPN will be difficult to block, I bought yesterday a new IP London, I will try through it to completely eliminate the blocking of IP addresses.

I'll report back later.


irgfw avatar irgfw commented on August 28, 2024

I was somehow sure that Outline VPN will be difficult to block,

Outline is just Shadowsocks, but on steroids!
Blocking Shadowsocks is easy now, after so many years. Sure, you can bypass some ISP restrictions by using specific prefixes. But you cannot completely hide the true nature of Shadowsocks under the hood.


Tw-C avatar Tw-C commented on August 28, 2024

I've set up some private tests with @khamsolt using custom prefixes on custom ports with positive results, we'll wait for his feedback report in due course.

Personally, I try to avoid mythologizing the capabilities of censors, since that can cloud judgement. A prime example is the discussion at #129 versus the disclosure below:

It looks like every ISP then has a mini decision tree, something along the lines of Ex1–Ex5 in China:

Allow a connection to continue if the first TCP payload (pkt) sent by the client satisfies any of the following exemptions:

  • Ex1: popcount(pkt)/len(pkt)≤3.4 or popcount(pkt)/len(pkt)≥4.6.
  • Ex2: The first six (or more) bytes of pkt are [0x20,0x7e].
  • Ex3: More than 50% of pkt’s bytes are [0x20,0x7e].
  • Ex4: More than 20 contiguous bytes of pkt are [0x20,0x7e].
  • Ex5: It matches the protocol fingerprint for TLS or HTTP.

Block if none of the above hold.

The steroids prove to be an effective countermeasure against protocol allowlisting.


Tw-C avatar Tw-C commented on August 28, 2024

Shadowsocks is not easy to detect. When Shadowsocks is blocked, it's not because it's detected as Shadowsocks, but because the censor decided to allowlist protocols, and "looks random" is not in the allowlist. The blocking of "looks random" is often restricted to specific cloud providers that are popular with circumvention tools, to reduce collateral damage.

That could explain why the tests using my private IP yielded positive results for @khamsolt.

@Tw-C I have 5 vps servers in Europe, but it is unlikely that all of them will be blocked at once. Also, vpn works on some cell phone carriers, but wifi is blocked on all of them.

I have a Stockholm IP that is blocked by both my cell phone carrier and cable internet provider. But the other IP Vienna is only blocked by my cable provider. And I don't even know what to do.

I've asked friends, some have VLESS running, I don't know how much sense it makes to move the infrastructure to VLESS. I was somehow sure that Outline VPN will be difficult to block, I bought yesterday a new IP London, I will try through it to completely eliminate the blocking of IP addresses.

I'll report back later.

@fortuna Would you say the next milestone for Outline is to "look like any protocol of choice" after "custom salt prefix"?


Tw-C avatar Tw-C commented on August 28, 2024

@amircybersec Thanks for such an insightful write-up and the useful tricks up your sleeve, much appreciated. It does appear that private IPs help to mitigate the blocking when cloud IP ranges are targeted.

If only we had as easy a way for private IP owners to set up proxies, similar to what Snowflake does through the simple installation of a browser extension. The closest I've found so far is https://unredacted.org/blog/2024/06/freesocks-is-now-open-source/
