Comments (16)
Yeah... that's not so great.
from netinfo.
Note that switching between cellular and wifi is easily observable via IP address change.
from netinfo.
@jkarlin, how would one observe that within a web page?
from netinfo.
Right, it requires network requests. But the information is still easily available.
Also, this isn't just a downlinkMax issue, connection.type already exposes cellular vs wifi.
from netinfo.
Note that switching between cellular and wifi is easily observable via IP address change
This isn't the case if the user is using a VPN or proxy; we'd end up revealing state that the remote server operator could otherwise not obtain.
This is similar to the privacy issues surrounding WebRTC. "Detect if the user is on a VPN or proxy" is a solution we've explored at great depth; we do not believe it to be a reasonable thing for a UA to accomplish, nor a good solution across platforms.
from netinfo.
Quick recap of where we are today:
- connection.type is already available in Chrome and FF OS and allows you to query the 'coarse' connection type (e.g. bluetooth, wifi, cellular, ethernet, ...).
- connection.downlinkMax exposes an Mbps value that does not distinguish between types but may rely on type+subtype to bootstrap the value (e.g. 10Mbps value could be either via LTE, WiFi, or ...). This value may also be determined by some ~Network Quality Estimator implementation which is based on past performance of the network, signal quality, etc.
Combined, these two signals allow the developer to get information like "the user is on a cellular network, with downlink of ~X Mbps". However, even without downlinkMax you can already get information about user switching coarse network types (e.g. wifi -> cellular transitions).
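As a rough illustration of the point above, the sketch below logs coarse type transitions together with the current downlinkMax. It assumes the connection object fires a "change" event when either value updates; the names `ConnectionLike`, `Transition`, and `watchTransitions` are hypothetical and not part of the spec.

```typescript
// Minimal shape of the NetInfo object as discussed in this thread.
// This is an assumption for illustration, not the full interface from the spec.
interface ConnectionLike {
  type: string;
  downlinkMax: number; // Mbps
  addEventListener(event: "change", listener: () => void): void;
}

// Hypothetical record of one observed transition.
interface Transition {
  from: string;
  to: string;
  downlinkMax: number;
  at: number; // timestamp in milliseconds
}

// Returns a log array that grows each time the coarse type changes.
// The clock is injectable so the function can be exercised without a browser.
function watchTransitions(
  connection: ConnectionLike,
  now: () => number = Date.now,
): Transition[] {
  const log: Transition[] = [];
  let lastType = connection.type;
  connection.addEventListener("change", () => {
    if (connection.type === lastType) {
      return; // only downlinkMax changed, or a duplicate event
    }
    log.push({
      from: lastType,
      to: connection.type,
      downlinkMax: connection.downlinkMax,
      at: now(),
    });
    lastType = connection.type;
  });
  return log;
}
```

In a page, the object passed in would presumably be `navigator.connection` where the UA exposes it; no permission prompt or network request is involved, which is the concern raised in this thread.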
In terms of moving forward, I think there are a couple of separate threads here:
- NetInfo exposes information about the user's network - e.g. combination of type and downlinkMax allows the application to track transitions between network types. As such, it does sound like it should require a privileged context.
  - This is not the case today for accessing connection.type; should we revoke this capability for non-privileged contexts?
- Should NetInfo require explicit user opt-in - e.g. "https://example.com wants to know your network information -- Yes / No"? Any pitfalls here?
from netinfo.
To be clear, what you pose as option 1 is merely a subset of option 2.
If we accept 2, then the only answer for 1 is yes. If we say no to 1, then the only possible answer for 2 is no.
from netinfo.
@sleevi yep. I guess the missing question here is whether there are any other in-between options?
from netinfo.
I've merged https://github.com/w3c/netinfo/pull/31, preview: http://w3c.github.io/netinfo/#privacy.
However, the above text does not address Ryan's earlier point about VPN + connection.type transitions. Should we add an additional warning clause for this? I don't believe there is anything special we can do here, as the UA may not know if it's running over a VPN connection. Further, even if and when it does, changing behavior would leak the fact that the user is on a VPN, which has its own issues.
from netinfo.
Closing due to inactivity. @sleevi feel free to reopen if you think there is more to be done here.
from netinfo.
I think one comment I'd make regarding #31 is that "knowing end-to-end properties reveals information about the first network hop" is not necessarily true, under various scenarios. For example, from an ISP proxy level (whether mobile or transparent), you can only get so much fidelity at the server side - at best, you know the performance metrics to the ISP, but not to the actual user. Now, if you combine that with JS (whether XHR, onload, or Resource Timing), you get more fidelity, but that's if and only if the user has JS involved.
I think it'd be ideal for the privacy section to actually flesh out and spell out some of the attacks and mitigations - both to save discussion in the future and to show the threat model being addressed.
For example, the privacy section makes the claim "knowing end-to-end properties reveals information about the first network hop", but doesn't really address how / under what model. Are you presuming the availability of the Resource Timing API? Are you presuming an 'attacker' sniffing with img onload? The privacy section just sort of says "Yeah, there are privacy issues, but nothing more than existing stuff," which I know is the position you've taken, but it doesn't really elaborate.
Let's say someone wanted to mitigate the privacy concerns. They could turn this API off, but that may be either heavy-handed (has more adverse effects than intended) or it may be the wrong knob (e.g. the privacy issues only arise when coupled with other APIs / behaviours).
One way to frame it would be to examine the bits of unique information being offered by this API, and then show comparatively how this information can be obtained via other means, and why it's a similar privacy risk. Or show how 'new' attacks exist by combining with other aspects of information.
from netinfo.
I think one comment I'd make regarding #31 is that "knowing end-to-end properties reveals information about the first network hop" is not necessarily true, under various scenarios. For example, from an ISP proxy level (whether mobile or transparent), you can only get so much fidelity at the server side - at best, you know the performance metrics to the ISP, but not to the actual user. Now, if you combine that with JS (whether XHR, onload, or Resource Timing), you get more fidelity, but that's if and only if the user has JS involved.
Not sure I follow. NetInfo exposes last hop, not end-to-end. Also, an intermediate proxy can observe timing data in both directions, regardless of where it is in the routing chain?
Are you presuming the availability of the Resource Timing API?
No, just observing the timing of any fetch (e.g. HTML document) reveals a lot of data: RTT, throughput, etc. That's what existing applications are already using to get BW estimates and modify app behavior -- except, they're forced to do this after the fact / after one or more fetches.
from netinfo.
Not sure I follow. NetInfo exposes last hop, not end-to-end. Also, an intermediate proxy can observe timing data in both directions, regardless of where it is in the routing chain?
To be explicit: The threat model I'm presuming here is a hostile end-point server (evil.example.com), wishing to interrogate as much information as possible about the end user. For further sake of discussion, let's consider a browser that explicitly tries to be privacy preserving in all possible ways (such as Tor Browser Bundle). Finally, let's consider a reasonably paranoid user that is using an upstream proxy (such as over VPN) as a further anonymization tool.
With this model, let's think about:
- What information NetInfo exposes
- Whether that information was already available
- What steps can be taken to mitigate that information
In the current "Privacy Considerations", my concern is that it doesn't really enumerate the concerns, and sort of handwaves as "This information's already out there, so no biggy". But ideally, the privacy considerations would talk about that information, and explain how it's available, so that it can be clear that "If you mitigate X, also mitigate Y", and, conversely, "If you're concerned about Y, you should also be concerned about X"
For example, you mentioned the following:
No, just observing the timing of any fetch (e.g. HTML document) reveals a lot of data: RTT, throughput, etc.
But this still feels hand-wavy.
For example, under the above model:
- The server knows the intermediate proxy server's address (via the socket)
- The server knows the timing of any fetch, based on when it sees the request start and stop, and can also compute information such as RTT, throughput, etc.
Alone, this seems to contradict the statement in the spec that
knowing end-to-end properties reveals information about the first network hop
However, I suspect you were presuming, under your threat model, that JS was enabled. This gives us:
- With JS enabled, the server can use JS events such as onload to measure the client's perspective, thus measuring information about the proxy (and thus the first network hop).
- With JS enabled, the server can use the Resource Timing API to measure the client's perspective, thus measuring information about the proxy (and thus the first network hop).
Is that a clearer explanation of the concerns?
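To make the "client's perspective" measurement concrete, here is a minimal sketch of what a script could derive from a single Resource Timing entry. The field names match `PerformanceResourceTiming`; the `ResourceTimingLike` and `ClientSideEstimate` types and the `estimateFromEntry` helper are hypothetical, and the arithmetic is only a rough estimate, not a calibrated measurement.

```typescript
// Subset of PerformanceResourceTiming fields used here (times in milliseconds).
interface ResourceTimingLike {
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  encodedBodySize: number; // bytes
}

// Hypothetical result type for this sketch.
interface ClientSideEstimate {
  firstByteMs: number;    // request sent -> first response byte (roughly one RTT plus server time)
  throughputMbps: number; // body bits divided by the response download window
}

function estimateFromEntry(entry: ResourceTimingLike): ClientSideEstimate | null {
  const downloadMs = entry.responseEnd - entry.responseStart;
  // Cached responses and cross-origin entries without Timing-Allow-Origin
  // report zeros for these fields, so there is nothing to estimate from.
  if (downloadMs <= 0 || entry.encodedBodySize <= 0) {
    return null;
  }
  const bits = entry.encodedBodySize * 8;
  return {
    firstByteMs: entry.responseStart - entry.requestStart,
    throughputMbps: bits / (downloadMs / 1000) / 1_000_000,
  };
}
```

In a page, the entries would presumably come from something like `performance.getEntriesByType("resource")`. Because the timestamps are taken on the client, the estimate covers the path from the user through the proxy, which is the part a server-side measurement alone cannot see.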
Now let's talk about concrete changes to the "Privacy Considerations" sections that might be able to address these concerns, if they're deemed to be founded (I could, after all, be a crazy ranty person)
Privacy Considerations

The Network Information API exposes information about the first network hop between the user agent and the server; specifically, the type of connection and the upper bound of the downlink speed, as well as signals whenever this information changes.

This opens up several privacy-hostile attacks that site operators may wish to mount:
- If it detects that the connection type is other, none, mixed, or unknown, this may be a signal that the user is attempting to access the site through means such as a VPN. A site may use this as a signal to deny access to the user (e.g. due to geofencing policies)
- Through detecting changes to the network type (such as transitions between wifi and cellular), this may be able to serve as means of:
  - Geolocating a user (are they home, at work, or in transit)
  - Fingerprinting a user (User X transitioned networks at 9:01 AM on Monday, User Y transitioned networks at 9:01 AM on Tuesday, and User Z transitioned on Wednesday - with enough signals, it may be possible to determine that X, Y, and Z are the same person and they are taking the train to work each day)
- Through detecting changes to the max downlink speed, they may be able to further refine such information.

However, these considerations are not new, and sufficiently motivated attackers may already obtain and exploit such information using existing technologies.

For example, in a UA that supports the Resource Timing API, the attacker can infer the measured RTT and throughput of the overall network connection, and use that to infer the user's connection type.

Alternatively, an attacker could use the Fetch API to constantly make (small) requests to a service, and look for changes in the IP address of the incoming requests to infer the user's network connection. Further, information such as the source IP may also reveal geolocation or fingerprinting opportunities similar to those exposed here.

Yet another means of obtaining similar information would be through the WebRTC API and the (whatever the list of IP addresses thing is called) to obtain information about the user's IP.

While privacy-conscious users may attempt to mitigate existing techniques through the use of a configured proxy server, if any scripting is allowed at all, then an evil server can leverage existing technologies to obtain the same information offered directly by this API. As such, while this API makes it easier to obtain this information, by avoiding the need for additional network requests, this information is already available to a sufficiently-motivated attacker.
That's probably quite poorly worded, but tries to concretely spell out attacks offered by this API, demonstrate how these attacks are already possible with existing APIs, and thus support the final conclusion that "additional exposure is not substantial".
As the text currently reads, it hand-waves away what threat model the spec is concerned about and how the attacker uses existing techniques. As such, it doesn't really feel like it provides meaningful guidance for implementors and reviewers as to what (some of) the possible privacy considerations are, or why proposed mitigations may not be.
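The polling attack mentioned in the proposed text (small repeated requests, with the server watching the source IP) can be sketched from the server's side roughly as follows. This is a minimal illustration under stated assumptions: the server can tie requests to a session identifier, and the `Observation`, `IpChange`, and `detectIpChanges` names are hypothetical.

```typescript
// One incoming request as seen by the server (hypothetical shape).
interface Observation {
  sessionId: string; // e.g. taken from a cookie; assumed to be available
  ip: string;        // source address of the socket
  at: number;        // arrival time in milliseconds
}

// A detected change of source address within one session.
interface IpChange {
  sessionId: string;
  from: string;
  to: string;
  at: number;
}

// Scans observations in arrival order and reports every point where a
// session's source IP differs from the previous request in that session.
function detectIpChanges(observations: Observation[]): IpChange[] {
  const lastIp = new Map<string, string>();
  const changes: IpChange[] = [];
  for (const o of observations) {
    const prev = lastIp.get(o.sessionId);
    if (prev !== undefined && prev !== o.ip) {
      changes.push({ sessionId: o.sessionId, from: prev, to: o.ip, at: o.at });
    }
    lastIp.set(o.sessionId, o.ip);
  }
  return changes;
}
```

As noted earlier in the thread, a user behind a VPN or proxy presents the proxy's address on every request, so this technique would see no change at all in that case, whereas connection.type would still report the transition.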
from netinfo.
Thanks Ryan, this is definitely a good improvement, reopening while we iterate on the wording.
If it detects that the connection type is other, none, mixed, or unknown, this may be a signal that the user is attempting to access the site through means such as a VPN. A site may use this as a signal to deny access to the user (e.g. due to geofencing policies)
You're implicitly assuming that we report "unknown" when the user is on a VPN. What's the reasoning here? Note that we don't have this as a requirement in the spec today, and it's not entirely clear to me if this is actually enforceable by the browser. As in, do we even know if we're being routed through a VPN tunnel on all platforms?
from netinfo.
@sleevi ptal: https://github.com/w3c/netinfo/pull/35/commits - I took the liberty of rewriting some of the points. Hopefully I didn't botch it too badly :-). Also, I skipped the VPN stuff for now, per my question above.
from netinfo.
Resolved via https://github.com/w3c/netinfo/pull/35.
from netinfo.