googlechrome / ip-protection
License: Apache License 2.0
This proposal appears to harm privacy by effectively routing all of a user's traffic through Google. Am I understanding this correctly?
We are considering using 2 hops for improved privacy. A second proxy would be run by an external CDN, while Google runs the first hop. This ensures that neither proxy can see both the client IP address and the destination. CONNECT & CONNECT-UDP support chaining of proxies.
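The two-hop design above relies on nesting one CONNECT tunnel inside another. A minimal sketch of what each hop's request looks like, assuming plain HTTP/1.1 CONNECT (the hostnames are placeholders, not real IP Protection endpoints):

```python
# Hypothetical sketch of CONNECT-based proxy chaining: the client tunnels
# through proxy A (Google in this proposal) and, inside that tunnel, issues
# a second CONNECT to proxy B (the external CDN). Hostnames are invented.

def build_connect_request(host: str, port: int) -> bytes:
    """Format an HTTP/1.1 CONNECT request for one hop of the tunnel."""
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    ).encode("ascii")

# Hop 1: sent to proxy A. Proxy A learns the client IP and that the client
# is connecting to proxy B, but not the final destination.
hop1 = build_connect_request("proxy-b.example.net", 443)

# Hop 2: sent *inside* the tunnel to proxy B. Proxy B learns the destination
# but only sees proxy A's address, not the client's.
hop2 = build_connect_request("destination.example.com", 443)
```

Because each hop only sees its immediate peers, neither proxy alone can pair the client IP with the destination, which is the privacy property the two-hop design is after.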
Even though the content is encrypted, the network addresses cannot be. Google's proxy then becomes a potential place data can be aggregated, even if the user's authentication tokens are secure.
How is this concern addressed?
Having read the majority of issues/posts in this repo, I understand the following to be true:
Please could you confirm or correct each of the points above so that we are 100% clear.
Please could you also answer the following questions:
I really appreciate you taking the time to clarify my understanding and answer my questions.
Thank you
For example, Chrome on Android, Chrome on iOS, and Chrome on Windows.
Originally filed at spanicker/ip-blindness#21 by @smhendrickson
In addition to the targeting use cases described in spanicker/ip-blindness#20, IP Geolocation may also be used for regulatory and contractual requirements. Will the Geo granularity described create any anticipated difficulties in meeting regulatory needs?
There is some existing conversation in the original thread w/ @dmdabbs and @patmmccann, but let's continue the topic here.
Originally filed at spanicker/ip-blindness#15 by @spanicker.
Services that are embedded in a third-party context will now see distinct IPs for each top-level domain that the user is visiting. This negatively impacts the ability to count the number of distinct users across a set of sites, and makes it easier to inflate impressions and ad clicks by having these same users engage on multiple sites.
Some attributes, such as GeoIP, may allow sites to validate observed regional distributions against what is expected.
We are keen to discuss any suggestions that could improve defensibility within our privacy objective of preventing scaled cross-site tracking.
Note that there's already some conversation w/ @dmdabbs and @etrouton in the old issue, but let's continue discussion here.
Hello everyone,
As a follow-up to the following thread that clarified the behaviour of the allowlist logic: #31, I would like to ask how the IP Protection project will protect users from cooperating first-party origins.
For example, if a user browses to publisher.com, the site will have access to the real IP address of the user, as expected. It could then pass said IP address to tracker.com as a query string parameter or HTTP header of the request to the tracking origin.
If I'm not mistaken, wouldn't this defeat the IP Protection mechanism in a relatively simple way?
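The leak described above can be made concrete with a short sketch (all domains and the parameter name are illustrative, not from the proposal):

```python
# A cooperating first party that knows the visitor's real IP could simply
# forward it to a tracker as a query parameter, sidestepping the proxy.
from urllib.parse import parse_qs, urlencode, urlparse

def build_tracker_url(real_client_ip: str) -> str:
    # publisher.com sees the real IP (first-party traffic is not proxied)
    # and embeds it in a request to the tracking origin.
    params = urlencode({"visitor_ip": real_client_ip, "page": "/article"})
    return f"https://tracker.com/pixel?{params}"

url = build_tracker_url("203.0.113.7")
# Even if the fetch to tracker.com is routed through the IP Protection
# proxy, the real IP arrives intact inside the URL itself.
leaked = parse_qs(urlparse(url).query)["visitor_ip"][0]
```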
Thank you in advance for your answers and insights.
Google owns Chrome. So what's to prevent it from installing its own root certificate and MITMing all of your web browsing? Even if Google is able to restrain itself from acquiring all that juicy data (HAHA! As if...), it still collects oodles of metadata about someone's browsing habits.
This is a privacy and security nightmare, and I can't believe the cynicism of the Google marketing team that calls this a "privacy-enhancing" feature.
You all should be thoroughly ashamed of yourselves. "Don't be evil" is now a sick joke.
I understand the proposal is about IP protection and masking the originating IP address. However, I'd like to understand whether the original HTTP Referer will be unchanged and accessible after traffic passes through the proxy(s).
HTTP Referer is sometimes used instead of the IP address for allowlist/denylist behavior. I realize that using Referer for access-control behavior comes with some risks (Referer can be manipulated in some environments and scenarios), but it is considered "good enough" in some cases (your mileage may vary). This can be particularly important and useful for iframe scenarios.
I've read through the GitHub explainer and https://developer.chrome.com/en/docs/privacy-sandbox/ip-protection/, but didn't find any specific mention of Referer. Will all proxies involved (whether 1-hop, 2-hop, or more) pass along the HTTP Referer unmodified, such that end servers will still be able to know the original Referer?
If the original value is preserved, will it still be available on the traditional Referer header, or on another header (e.g. *-Forwarded-*)?
Originally filed at spanicker/ip-blindness#22 by @jbradl11:
IP addresses serve a variety of use cases beyond anti-fraud. IP can be used for analytics, measurement, regional preferences, tracking paid subscribers, and many more. As Chrome seeks to mask IP addresses, we are working to ensure that valid use cases are preserved while also improving privacy on the web.
How does blocking IP challenge your different use cases? What other ways could actors achieve these use cases without IP addresses used for fingerprinting? What potential replacement signals could we extract from an IP address and pass as a new, standalone (and possibly attested?) signal?
If so, will this option be available for all users or just enterprise?
Public sector networks in some states are required to block tracking of their users by TikTok, including by blocking TikTok at the router. If TikTok tracking scripts are "eligible third-party traffic" for purposes of IP Protection, then could IP Protection have the side effect of circumventing this required block?
related: #2
Our website captures user IP addresses for first-party data purposes.
We utilize ZoomInfo/Clearbit IP enrichment APIs to gather company information based on these IPs.
Questions:
IP Masking: Will there ever be an option to mask the captured first-party IP addresses, or will we always receive the full, unmasked IP?
Company Enrichment: We plan to use ZoomInfo or Clearbit to acquire company details based solely on the captured IP addresses, as an alternative to WHOIS. Could you please confirm whether you have any plans to add these services to the proxy list?
Will there be similar instructions as at https://developer.apple.com/support/prepare-your-network-for-icloud-private-relay/ (pasted below) for institutions that need to enable controls on their network, such as the TikTok issue? Note that the solution for TikTok assumed the enterprise controlled the browser; however, there are many BYOD contexts where the network operators are required to restrict traffic to certain sites.
Some enterprise or school networks might be required to audit all network traffic by policy, and your network can block access to Private Relay in these cases. The user will be alerted that they need to either disable Private Relay for your network or choose another network.
The fastest and most reliable way to alert users is to return either a "no error no answer" response or an NXDOMAIN response from your network’s DNS resolver, preventing DNS resolution for the following hostnames used by Private Relay traffic. Avoid causing DNS resolution timeouts or silently dropping IP packets sent to the Private Relay server, as this can lead to delays on client devices.
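The DNS-level blocking Apple describes can be sketched as a resolver decision rule. The hostnames below are the relay names listed in Apple's Private Relay networking guidance; treat the list as an assumption to be checked against that page.

```python
# Toy resolver decision: answer NXDOMAIN for relay hostnames immediately,
# so clients fail fast and fall back, rather than silently dropping packets
# and causing timeouts (the behaviour Apple warns against).
BLOCKED_RELAY_HOSTNAMES = {"mask.icloud.com", "mask-h2.icloud.com"}

def resolve(hostname: str) -> str:
    """Return 'NXDOMAIN' for blocked relay hosts, 'RESOLVE' otherwise."""
    if hostname.lower() in BLOCKED_RELAY_HOSTNAMES:
        return "NXDOMAIN"
    return "RESOLVE"
```

Whether Chrome's IP Protection will offer an analogous, documented set of hostnames for network operators is exactly what this issue is asking.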
It would be nice to provide more details about what the proposal considers to be "eligible third-party traffic". The README says:
We’ll explore leveraging methods similar to other browsers and existing lists that identify these third parties.
I assume this means using some list like https://github.com/lightswitch05/hosts?
Will client-side code that makes requests to non-advertiser domains and subdomains be impacted by the proxy? For example, if visiting en.wikipedia.org, it's possible for client-side code to call other domains (login.wikimedia.org, commons.wikimedia.org) or subdomains (e.g. another-language Wikipedia like el.wikipedia.org): should we anticipate that all such calls would get routed through the IP Protection proxy, because Chrome would see these requests as third-party traffic?
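Browsers typically judge "third-party" by comparing registrable domains (eTLD+1) rather than full hostnames. A rough sketch of that distinction, assuming a naive last-two-labels rule (a real implementation would consult the Public Suffix List; this toy version is wrong for suffixes like co.uk):

```python
# Naive registrable-domain comparison to illustrate first- vs third-party.

def registrable_domain(host: str) -> str:
    # Assumption: the last two labels form the registrable domain.
    return ".".join(host.split(".")[-2:])

def is_cross_site(top_level: str, subresource: str) -> bool:
    return registrable_domain(top_level) != registrable_domain(subresource)

# From a page on en.wikipedia.org:
is_cross_site("en.wikipedia.org", "el.wikipedia.org")     # same site
is_cross_site("en.wikipedia.org", "login.wikimedia.org")  # cross-site
```

Under this model, el.wikipedia.org would be same-site with the top-level page, while the wikimedia.org calls would count as cross-site, though whether they are proxied would additionally depend on the masked-domain list.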
Today at approximately 6:30pm - one of our home's users was browsing the web and went to "www.harveynorman.co.nz" and had a message pop up saying "Sorry - we don't support sales to the EU - please use the EU version of our website instead".
This stumped us a bit - as we're in NZ and had never seen anything like this before. On running a few "what is my ip" type websites - our IPv4 address was our typical ISP IP and was located from New Zealand as expected.
However - the ones that reported on IPv6 noted that there was an IPv6 address coming from the 2001:4860::/32 range - which was owned by Google LLC and was a "Shared Services Range" noted as coming from Europe.
This stumped us further as we couldn't figure out where the IPv6 address came from, as we actually have IPv6 turned off on the home router.
On testing other PCs in the house using Chrome, they started exhibiting the same issue; however, non-Chrome browsers did not. After around 10 minutes, the behavior reverted and all PCs once again reported "No IPv6 address" when checking the tester sites.
After a lot of digging around Chrome features, proxying, etc., I hit this particular feature in development, which seemed to match exactly what we saw. However, it's in development and shouldn't be active, right? So why did we see this behaviour today?
If this is the case, I would deem it a total security and privacy risk, as I have no idea where our data via Chrome was proxied to for that period, and we were not made aware this was occurring.
I'm aware Chrome is able to remotely enable experimental features; we hit this a few years ago with something that was enabled that impacted some of the customers I was working for at the time, so I suspect the same thing has occurred here.
Currently running Chrome Official/Stable 120.0.06099.225.
I was thinking about this, but Chrome already has a proxy feature: the settings page offers "Open your computer's proxy settings". So why do we need one more proxy option?
Furthermore, will users have an option to use servers other than Google's? Will APIs be available to manage this from a plugin, should someone want to offer a custom server?
Many services in Azure have a built in access control list/service firewall feature that allows access to an instance of the service to be restricted based on the IP address of the client. Many management tasks are performed through the Azure Portal via the local browser and the local internet IP must be added to the allow list on the service in order to access it.
Entra ID has conditional access rules which can be configured to perform different authentication behaviours based on signals. One of those signals is the authenticating user's IP address and allows "safe locations" to be specified.
There are also many other cloud-based SaaS services which whitelist access based on source IP.
How will the implementation of IP Protection impact these security measures? If up to a million users appear to come from a small number of shared IP addresses per geo, that doesn't provide the granularity necessary to enforce these controls.
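The conditional-access pattern described above amounts to a source-IP range check. A minimal sketch using documentation address ranges (RFC 5737), not real corporate or proxy addresses:

```python
# Access rules keyed on source IP ranges, as in Azure service firewalls or
# Entra ID "safe locations". All ranges here are documentation examples.
import ipaddress

TRUSTED_RANGES = [ipaddress.ip_network("203.0.113.0/24")]  # "office" egress

def is_trusted_source(client_ip: str) -> bool:
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in TRUSTED_RANGES)

# An employee behind the office NAT passes the check:
is_trusted_source("203.0.113.40")   # trusted
# Behind a shared IP Protection egress, many unrelated users share one
# address, so allowlisting the proxy range would admit every proxy user,
# and not allowlisting it locks out legitimate staff.
is_trusted_source("198.51.100.9")   # not trusted
```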
Hi,
For the traffic that will go through the proxy servers,
Do we expect to see any other changes/additions in the HTTP request/response headers?
Or will all the headers stay intact, with only the IP address being that of the proxy?
Thanks.
Will a user be given a choice to toggle IP Protection the first time they open Chrome?
Will it be explicitly mentioned that all data of the user from within their browser would be routed through a Google server?
Seems to me that if you publish a list of Google anonymized IPs, or have them under one ASN, you'll effectively make websites blacklist all VPNs but yours.
Hi,
I was wondering, do you have a rough timeline in mind for testing and deployment of the IP Protection mechanism? This will likely require certain adjustments to our infrastructure, and so a high level understanding of the timeline would be very helpful for us.
Best regards,
Jonasz
Originally filed at spanicker/ip-blindness#14 by @spanicker
At some point, a service may have confidence that a given request is associated with an abusive client. Perhaps the request is willfully causing quality of service issues, demonstrates intent to harm another user, or otherwise violates a site’s terms of use.
Historically, services would ban a user by their IP address. This has become less common with the rise of the mobile internet, but IP is still a surprisingly common tool in scaled abuse scenarios.
We would like to provide websites with the ability to request that the proxy no longer send traffic from the user of the proxy that issued the given request. We need to do this without re-introducing the cross-site tracking risk that the proxy is designed to counter.
Are there existing protocols or limitations relevant to your service that we should be mindful of? Would it be acceptable if embedded services had to ban a user once for each top-level context (e.g. a.com on example1.com and a.com on example2.com would need to ban the user separately)?
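One way to picture the per-top-level-context banning model floated above: the proxy keeps bans keyed by (top-level site, opaque per-context token), so banning a user on one site reveals nothing about their activity elsewhere. This is a sketch of the idea, not the proposal's actual mechanism, and the token values are invented.

```python
# Context-scoped bans: the same user carries a different opaque token in
# each top-level context, so bans cannot be joined across sites.
class ContextScopedBans:
    def __init__(self) -> None:
        self._banned: set[tuple[str, str]] = set()

    def ban(self, top_level_site: str, context_token: str) -> None:
        self._banned.add((top_level_site, context_token))

    def is_banned(self, top_level_site: str, context_token: str) -> bool:
        return (top_level_site, context_token) in self._banned

bans = ContextScopedBans()
bans.ban("example1.com", "token-A")        # a.com bans the user here...
bans.is_banned("example1.com", "token-A")  # ...and the ban holds here
bans.is_banned("example2.com", "token-B")  # but not here: same user,
                                           # different context token
```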
Chrome has been used to mimic legitimate user traffic with the use of plugins. Will this type of traffic be opted out when using automation tools and/or plugins, to limit abuse of the proxy network? Or would a header be sent to inform the destination that this particular request should be handled with care?
This appears to be causing an issue for some of our on-prem clients behind our SonicWall firewall. We are a school district and block external email access along with proxy servers and other site categories, applications, etc. Some of our users are getting error messages when attempting to access their Google Workspace email. After much troubleshooting, it seems we've isolated it to Google ChromeVariations being enabled and/or the "IP Protection Proxy" flag being set to Default (we do not know the Active Variation GUID to look for on systems where our users report the issue).
Example of user complaints:
Issues were reported late this afternoon from {redacted} and {redacted} regarding an issue accessing email. {redacted} stated the issue started the week before Thanksgiving and has persisted.. Error message: This site can't be reached ... "The webpage at https://mail.google.com/mail/u/0/?authuser=0 might be temporarily down or it may have moved permanently to a new web address." "ERR_FAILED"
Email can not pull up. Keeps saying there is a firewall or proxy that is keeping it from working. Have reset my connection, cleared cache, and restarted computer.
Resolution for us is to:
Once we do this, we can terminate and relaunch Chrome and then the user can access email again.
Our first discovery of this was November 30, 2023 which reportedly started around November 20-24. This seems to be a growing issue with the above solution resolving each one.
Like the title says, it would be great to have a breakdown of how abuse reports will be handled, from submission (preferably automated) to who will decide valid vs. invalid reports. Will there be an appeal process for both sides, and who will manage it?
In the readme, you stated that you're using "a list-based approach and only domains on the list in a third-party context will be impacted"
Will this list be published to everyone?
Thanks.
I realize we are talking about Chrome and not Chromium, but in your policies do you plan to make the source code of the servers the proxy runs on available to the open-source community?
Thank you for your reply.
We have VPNs and DNS servers to reroute traffic if necessary. Why is this under the guise of "privacy" when it does the exact opposite? I'm totally against this proposal and hope it goes down the "Killed by Google" drain.
Countries like China, Iran, Russia, and others exhibit strict behavior towards censorship and show no regard for internet freedom.
As soon as it becomes available, it will be blocked in these countries.
Is there a strategy or mechanism in place to address these conditions?
We provide fraud detection/prevention for our clients as a third party. We rely heavily on having full accessibility to the originating IP address to evaluate the risk of a respondent. We perform many checks that include determining time zone (many countries support many time zones), postal code proximity, residential proxy use, fast fluxing, etc., etc. How will you support not breaking that business model?
Our use case involves identifying the registered ASN organizations associated with an IP address visiting a website, and processing and reporting these back to the website owner. We currently use a third-party tracker to collect the IP addresses and then a WHOIS service to find the registered owners. My understanding so far is that, because I am using a third-party tracker, the traffic would go through Google's proxy, and I would instead get a Google IP address with almost equivalent geolocation data.
However, I assume all the WHOIS data will now come back with Google as the owner instead of the actual owner. Seeing as WHOIS data is not the user's personal information, what assurances can Google provide that this data will still be accessible, and through what mechanism will we be able to obtain it?
Is the goal to only stop user cross-site identification? What about other types of tracking that uses IP addresses?
Do we have a timeline of when this will be implemented?
Is it correct to assume that the second proxy uses some form of NAT to assign end users IP addresses different from the actual IP of the second proxy itself (i.e. the one that the first proxy would see and use to connect to it)?
If so, does the first proxy ever have visibility into the IP address assigned to a user on any granular connection/session basis?
My reason for asking: if the first proxy knows both the user's original IP address and the one they end up assigned to, that seems like a vector for correlation by the first proxy's operator (Google) without any collusion.
Thanks.
So this proposal raises a few key questions, most of which have clearly been addressed or are being addressed already.
However, currently I'm unclear about how much data is logged and for how long that data is retained.
As this feature brands itself as a proxy, I am assuming that no additional encryption is applied to traffic. This does mean that Google could use this to store information on who has visited what sites.
So, what data is logged, and for how long are those logs stored? What can those logs be used for? And how can we verify this?
And how can we trust Google to provide a feature like this? Your track record is not amazing and I appreciate that this will likely be an uphill struggle to justify, but I want to hear how you can protect users not only from third-parties but from yourselves.
Thank you for taking the time to run this as a proposal openly, and accepting feedback. Doing things this way is a lot more transparent and I do truly appreciate the opportunity to make my voice heard.
Originally filed at spanicker/ip-blindness#20 by @smhendrickson
IETF RFC 8805 allows for country, ‘region’ (state in US), and city level mappings to be advertised. While the IP Protection proposal will not retain all state/city level granularity, we would like to retain enough to keep inferred geographic content relevant to users, and GeoIP based performance optimizations functioning.
To achieve this, Chrome is considering mapping many unique cities/regions to groups of combined cities/regions. We call these Geos. We are considering using a threshold of 1 million estimated human population for each created geo. This geo will then be shared in the public IP Geolocation feed.
Which use cases of yours would 1 million people sufficiently cover and which use cases would not be sufficiently covered?
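The geo-coarsening idea above can be illustrated with a toy grouping routine: merge cities into a combined "Geo" until its estimated population reaches the 1 million threshold. The city data and the greedy strategy are invented for the example; the proposal does not specify an algorithm.

```python
# Greedily group (city, population) pairs into Geos of >= 1M people each.
THRESHOLD = 1_000_000

def build_geos(cities: list[tuple[str, int]]) -> list[list[str]]:
    geos: list[list[str]] = []
    current: list[str] = []
    pop = 0
    for name, population in cities:
        current.append(name)
        pop += population
        if pop >= THRESHOLD:
            geos.append(current)
            current, pop = [], 0
    if current:  # fold any leftover under-threshold cities into the last geo
        if geos:
            geos[-1].extend(current)
        else:
            geos.append(current)
    return geos

geos = build_geos([("A", 700_000), ("B", 400_000), ("C", 1_200_000), ("D", 90_000)])
# -> [["A", "B"], ["C", "D"]]
```

The effect on use cases follows directly: any signal that needs finer resolution than one of these merged Geos (e.g. city-level compliance boundaries) would no longer be recoverable from the published feed.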