ip-protection's People

Contributors: brgoldstein, bslassey, davidschinazi, jensenpaul, miketaylr, spanicker, terryednacot


ip-protection's Issues

Weighing any perceived privacy benefit against the privacy of using Google's proxy

This proposal appears to harm privacy by effectively routing all of a user's traffic through Google. Am I understanding this correctly?

We are considering using 2 hops for improved privacy. A second proxy would be run by an external CDN, while Google runs the first hop. This ensures that neither proxy can see both the client IP address and the destination. CONNECT & CONNECT-UDP support chaining of proxies.
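
For readers unfamiliar with proxy chaining, the two-hop CONNECT flow can be sketched as follows. This is a toy model: hostnames are placeholders, and the real deployment uses authenticated HTTPS/QUIC proxies with per-user tokens, which are omitted here.

```python
# Minimal sketch of CONNECT-based proxy chaining. Hostnames are
# illustrative placeholders, not the proposal's actual endpoints.

def connect_request(host: str, port: int) -> bytes:
    """Build a bare HTTP/1.1 CONNECT request for one tunnel hop."""
    return (f"CONNECT {host}:{port} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n\r\n").encode()

# Hop 1: ask proxy A (run by Google) for a tunnel to proxy B.
# Proxy A sees the client IP, but only proxy B as the destination.
hop1 = connect_request("proxy-b.example", 443)

# Hop 2: sent *inside* the first tunnel, so proxy A never reads it.
# Proxy B sees the destination, but only proxy A as the source.
hop2 = connect_request("destination.example", 443)
```

Because each hop only learns its immediate peer, neither operator alone can pair the client IP with the destination.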

Even though the content is encrypted, the network addresses cannot be. Google's proxy then becomes a potential aggregation point for data, even if the user's authentication tokens are secure.

How is this concern addressed?

We need 100% clarification on what traffic will go through the proxy

Having read the majority of issues/posts in this repo, I understand the following to be true:

  1. First-party traffic will not go through the proxy
    a. i.e. if you visit www.companyurl.com, all of the example files below would receive the originator IP address and not a proxy IP
    i. www.companyurl.com/image1.png
    ii. www.companyurl.com/file.js
    iii. www.companyurl.com/styles.css
  2. Third-party traffic that Google does not list as a cross-site tracking script/tool will not go through the proxy
    a. i.e. if www.companyurl.com loads a JS file from a third party (www.thirdparty.com/calc.js) that Google does not list as a cross-site tracking domain, the third party (www.thirdparty.com) will receive the originator IP address and not a proxy IP
  3. Only domains listed by Google will go through the proxy
    a. There is no plan to send everything through the proxy
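
Taken together, points 1-3 amount to a routing decision that might be sketched like this. This is hypothetical: the real list, its matching rules, and its contents are not public, and the domain names below are placeholders.

```python
# Hypothetical sketch of the list-based routing described in points 1-3.
# MASKED_DOMAINS stands in for Google's (unpublished) masked-domain list.
MASKED_DOMAINS = {"tracker.example"}

def uses_proxy(request_host: str, top_level_site: str) -> bool:
    # Point 1: first-party requests go direct (crude suffix match for brevity).
    if request_host == top_level_site or request_host.endswith("." + top_level_site):
        return False
    # Points 2-3: only third-party hosts on the list are proxied.
    return request_host in MASKED_DOMAINS
```

Under this model, `www.companyurl.com` resources load directly on companyurl.com, an unlisted third party like `www.thirdparty.com` also loads directly, and only a listed domain is proxied.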

Please could you confirm or correct each of the points above so that we are 100% clear.

Please could you also answer the following questions:

  1. Is it just cross-site tracking domains that Google plans to send through the proxy?
  2. Does Google plan to send other website analytics domains through the proxy that don't partake in cross-site tracking?
  3. Does Google have a proposed list of domains to be added to the proxy list that you can share with us?

I really appreciate you taking the time to clarify my understanding and answer my questions.
Thank you

IP Geolocation granularity impacts regulatory and contractual use cases

Originally filed at spanicker/ip-blindness#21 by @smhendrickson

In addition to the targeting use cases described in spanicker/ip-blindness#20, IP Geolocation may also be used for regulatory and contractual requirements. Will the Geo granularity described create any anticipated difficulties in meeting regulatory needs?

There is some existing conversation in the original thread w/ @dmdabbs and @patmmccann, but let's continue the topic here.

Detecting fraudulent engagement

Originally filed at spanicker/ip-blindness#15 by @spanicker.

Services that are embedded in a third-party context will now see distinct IPs for each top-level domain that the user is visiting. This negatively impacts the ability to count the number of distinct users across a set of sites, and makes it easier to inflate impressions and ad clicks by having these same users engage on multiple sites.

Some attributes, such as GeoIP, may allow sites to validate observed regional distributions against what is expected.
We are keen to discuss any suggestions that could improve defensibility within our privacy objective of preventing scaled cross-site tracking.

Note that there's already some conversation w/ @dmdabbs and @etrouton in the old issue, but let's continue discussion here.

Bypassing IP protection through first-party cooperation

Hello everyone,

As a follow-up to the following thread that clarified the behaviour of the allowlist logic: #31, I would like to ask how the IP Protection project will protect users from cooperating first-party origins.

For example, if a user browses to publisher.com, the site will have access to the real IP address of the user, as expected.
It could then pass said IP address to tracker.com as a query string parameter or HTTP header of the request to the tracking origin.

If I'm not mistaken, this would defeat the IP Protection mechanism in a relatively simple way?
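
A sketch of the bypass being described, with placeholder domains and a TEST-NET example address:

```python
from urllib.parse import urlencode, parse_qs, urlparse

# The publisher sees the visitor's real IP first-party and smuggles it
# to the tracker in a query string. Domains and IP are illustrative.
visitor_ip = "203.0.113.7"
beacon_url = "https://tracker.example/pixel?" + urlencode({"ip": visitor_ip})

# Even if Chrome proxies the request to tracker.example, the payload
# inside it still carries the real address:
leaked_ip = parse_qs(urlparse(beacon_url).query)["ip"][0]
```

IP Protection hides the network-layer source address; it cannot prevent a cooperating first party from forwarding that address at the application layer.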

Thank you in advance for your answers and insights.

Cannot believe the chutzpah

Google owns Chrome. So what's to prevent it from installing its own root certificate and MITMing all of your web browsing? Even if Google is able to restrain itself from acquiring all that juicy data (HAHA! As if...), it still collects oodles of metadata about someone's browsing habits.

This is a privacy and security nightmare, and I can't believe the cynicism of the Google marketing team that calls this a "privacy-enhancing" feature.

You should all be thoroughly ashamed of yourselves. "Don't be evil" is now a sick joke.

Will original HTTP Referer be preserved?

I understand the proposal is about IP protection and masking the originating IP address. However, I'd like to understand if the original HTTP Referer will be unchanged and accessible after traffic passes through the proxy(s).

HTTP Referer is sometimes used instead of IP address for allowlist / denylist behavior. I realize that using Referer for access-control behavior comes with some risks (Referer can be manipulated in some environments and scenarios), but it is considered "good enough" in some cases (your mileage may vary). This can be particularly important and useful for iframe scenarios.

I've read through the GitHub explainer and https://developer.chrome.com/en/docs/privacy-sandbox/ip-protection/, but didn't find any specific mention of Referer. Will all proxies involved (whether 1-hop, 2-hop, or more) pass along the HTTP Referer unmodified, such that end servers will still be able to know the original Referer?

If the original value is preserved, will it still be available on the traditional Referer header, or another header (e.g. *-Forwarded-*)?
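
For context, a CONNECT-style proxy relays the (normally TLS-encrypted) byte stream opaquely, which suggests headers inside the tunnel would arrive unmodified — though that is an inference from how CONNECT works, not a statement from the proposal. A toy model:

```python
# Toy model: a blind tunnel forwards bytes unmodified. It cannot parse,
# strip, or append headers such as Referer or X-Forwarded-For, because
# in the real flow the payload is TLS ciphertext it cannot read.
request = (b"GET /widget HTTP/1.1\r\n"
           b"Host: embedded.example\r\n"
           b"Referer: https://publisher.example/page\r\n"
           b"\r\n")

def tunnel(payload: bytes) -> bytes:
    """One CONNECT hop: relay the stream byte-for-byte."""
    return payload

relayed = tunnel(tunnel(request))  # two hops, still byte-identical
```

By the same reasoning, no `*-Forwarded-*` header could be injected by the proxies, since they never see the plaintext request.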

Loss of IP Metadata

Originally filed at spanicker/ip-blindness#22 by @jbradl11:

IP addresses serve a variety of use cases beyond anti-fraud. IP can be used for analytics, measurement, regional preferences, tracking paid subscribers, and many more. As Chrome seeks to mask IP addresses, we are working to ensure that valid use cases are preserved while also improving privacy on the web.

How does blocking IP challenge your different use cases? What other ways could actors achieve these use cases without IP addresses used for fingerprinting? What potential replacement signals could we extract from an IP address and pass as a new, standalone (and possibly attested?) signal?

Possible circumvention of TikTok blocking

Public sector networks in some states are required to block tracking of their users by TikTok, including by blocking TikTok at the router. If TikTok tracking scripts are "eligible third-party traffic" for purposes of IP Protection, then could IP Protection have the side effect of circumventing this required block?

related: #2

Confirmation on first party real ip address

Our website captures user IP addresses for first-party data purposes.
We utilize ZoomInfo/Clearbit IP enrichment APIs to gather company information based on these IPs.

Questions:
IP Masking: Will there ever be an option to mask the captured first party IP addresses, or will we always receive the full, unmasked IP?

Company Enrichment: We plan to use ZoomInfo or Clearbit to acquire company details based solely on the captured IP addresses, as an alternative to WHOIS. Could you please confirm whether there is any plan to add these services to the proxy list?

Network operators control

Will there be similar instructions as at https://developer.apple.com/support/prepare-your-network-for-icloud-private-relay/ (pasted below) for institutions that need to enable controls on their network, such as the TikTok issue? Note that the solution for TikTok assumed the enterprise controlled the browser; however, there are many BYOD contexts where the network operators are required to restrict traffic to certain sites.

Some enterprise or school networks might be required to audit all network traffic by policy, and your network can block access to Private Relay in these cases. The user will be alerted that they need to either disable Private Relay for your network or choose another network.

The fastest and most reliable way to alert users is to return either a "no error no answer" response or an NXDOMAIN response from your network’s DNS resolver, preventing DNS resolution for the following hostnames used by Private Relay traffic. Avoid causing DNS resolution timeouts or silently dropping IP packets sent to the Private Relay server, as this can lead to delays on client devices.
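
As a concrete example, the Apple guidance quoted above can be implemented with a dnsmasq resolver. The hostnames below are Private Relay's published names (per Apple's document); Chrome has not published equivalent hostnames for IP Protection, so no matching entries can be written for it yet.

```
# dnsmasq: answer only from local data for these names, so queries that
# match nothing return NXDOMAIN instead of timing out (per the guidance,
# avoid timeouts or silently dropped packets).
local=/mask.icloud.com/
local=/mask-h2.icloud.com/
```

A comparable mechanism for Chrome would presumably need Google to publish the proxy hostnames, which is what this issue is asking for.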

Clarify what is "eligible third-party traffic"

It would be nice to provide more details about what the proposal considers to be "eligible third-party traffic". The README says:

We’ll explore leveraging methods similar to other browsers and existing lists that identify these third parties.

I assume this means using some list like https://github.com/lightswitch05/hosts?

Will client-side code that makes requests to non-advertiser domains and subdomains be impacted by the proxy? For example, if visiting en.wikipedia.org, it's possible for client-side code to call other domains (login.wikimedia.org, commons.wikimedia.org) or subdomains (e.g. another language Wikipedia like el.wikipedia.org): should we anticipate that all such calls would get routed through the IP Protection proxy, because Chrome would see these requests as third-party traffic?

Made Live today by mistake?

Today at approximately 6:30pm - one of our home's users was browsing the web and went to "www.harveynorman.co.nz" and had a message pop up saying "Sorry - we don't support sales to the EU - please use the EU version of our website instead".

This stumped us a bit - as we're in NZ and had never seen anything like this before. On running a few "what is my ip" type websites - our IPv4 address was our typical ISP IP and was located from New Zealand as expected.

However - the ones that reported on IPv6 noted that there was an IPv6 address coming from the 2001:4860::/32 range - which was owned by Google LLC and was a "Shared Services Range" noted as coming from Europe.

This stumped us further as we couldn't figure out where the IPv6 address came from - as we actually have IPv6 turned off on the home router.

On testing other PCs in the house using Chrome - they started exhibiting the same issue, however non-Chrome browsers did not. After around 10 minutes, the behavior reverted and all PCs once again reported "No IPv6 address" when checking the tester sites.

After a lot of digging around on Chrome features, proxying, etc. - I hit this particular feature in development that seemed to match exactly what we saw - however it's in development and shouldn't be active, right? So why did we see this behaviour today?

If this is the case - I would deem this a total security and privacy risk - as I have no idea where our data via Chrome was proxied to for that period - and was not made aware this was occurring.

I'm aware Chrome is able to remotely enable experimental features - we hit this a few years ago with something that was enabled that impacted some of the customers I was working for at the time - so I suspect the same thing has occurred here.

Currently running Chrome Official/Stable 120.0.06099.225.

So what will happen to the proxy setting in the setting page?

I was thinking about this, but Chrome already has a proxy feature... In the settings page: "Open your computer's proxy settings". So why do we need one more proxy option?

Furthermore, will users have an option to use servers other than Google's? Will APIs be available to manage this from a plugin, should someone want to offer a custom server?

Impact to firewalls, access control lists and whitelisting via source IP

Many services in Azure have a built in access control list/service firewall feature that allows access to an instance of the service to be restricted based on the IP address of the client. Many management tasks are performed through the Azure Portal via the local browser and the local internet IP must be added to the allow list on the service in order to access it.

Entra ID has conditional access rules which can be configured to perform different authentication behaviours based on signals. One of those signals is the authenticating user's IP address and allows "safe locations" to be specified.

There are also many other cloud-based SaaS services which whitelist access based on source IP.

How will the implementation of IP Protection impact these security measures? If up to a million users will appear to come from a small number of shared IP addresses per geo, that doesn't provide the granularity needed to enforce these controls.

What will we see on the server side?

Hi,
For the traffic that goes through the proxy servers,
should we expect any other changes or additions to the HTTP request/response headers?
Or will all the headers stay intact, with only the IP being that of the proxy?

Thanks.

User choices

Will a user be given a choice to toggle IP Protection the first time they open Chrome?

Will it be explicitly mentioned that all data of the user from within their browser would be routed through a Google server?

How will whitelists be handled?

Seems to me that if you publish a list of Google anonymized IPs, or have them under one ASN, you'll effectively make websites blacklist all VPNs but yours.

Timeline

Hi,

I was wondering, do you have a rough timeline in mind for testing and deployment of the IP Protection mechanism? This will likely require certain adjustments to our infrastructure, and so a high level understanding of the timeline would be very helpful for us.

Best regards,
Jonasz

Blocking abusive clients

Originally filed at spanicker/ip-blindness#14 by @spanicker

At some point, a service may have confidence that a given request is associated with an abusive client. Perhaps the request is willfully causing quality of service issues, demonstrates intent to harm another user, or otherwise violates a site’s terms of use.

Historically, services would ban a user by their IP address. This has become less common with the rise of the mobile internet, but IP is still a surprisingly common tool in scaled abuse scenarios.

We would like to provide websites with the ability to request that the proxy no longer send traffic from the user of the proxy that issued the given request. We need to do this without re-introducing the cross-site tracking risk that the proxy is designed to counter.

Are there existing protocols or limitations relevant to your service that we should be mindful of? Would it be acceptable if embedded services had to ban a user once for each top-level context (e.g. a.com on example1.com and a.com on example2.com would need to ban the user separately)?

How will bots using plugins and headless mode be managed with this?

Chrome has been used to mimic legitimate user traffic with the use of plugins. Will this type of traffic be opted out when automation tools and/or plugins are in use, to limit abuse of the proxy network? Or would a header be sent to inform the destination that this particular request should be handled with care?

Conflicting with our firewall policies - Some users cannot access Google Workspace mail

This appears to be causing an issue for some of our on-prem clients behind our SonicWall firewall. We are a school district and block external email access along with proxy servers and other site categories, applications, etc. Some of our users are getting error messages attempting to access their Google Workspace email. After much troubleshooting, it seems we've isolated it to Google ChromeVariations being enabled and/or the "IP Protection Proxy" flag being set to Default (we do not know the Active Variation GUID, so we cannot look for it on the systems where users report the issue).

Example of user complaints:

  • Issues were reported late this afternoon from {redacted} and {redacted} regarding an issue accessing email. {redacted} stated the issue started the week before Thanksgiving and has persisted. Error message: This site can't be reached ... "The webpage at https://mail.google.com/mail/u/0/?authuser=0 might be temporarily down or it may have moved permanently to a new web address." "ERR_FAILED"

  • Email can not pull up. Keeps saying there is a firewall or proxy that is keeping it from working. Have reset my connection, cleared cache, and restarted computer.

Resolution for us is to:

  1. Disable Google ChromeVariations (or set to enable for critical fixes only) and/or
  2. Go into chrome://flags and disable "IP Protection Proxy"

Once we do this, we can terminate and relaunch Chrome and then the user can access email again.

Our first discovery of this was November 30, 2023; it reportedly started around November 20-24. This seems to be a growing issue, with the above solution resolving each case.

How will abuse reports be handled

Like the title says, it would be great to have a breakdown of how abuse reports will be handled, from submission (preferably automated) onward, and who will decide whether a report is valid or invalid. Will there be an appeal process for both sides, and who will manage it?

List-based approach

In the readme, you stated that you're using "a list-based approach and only domains on the list in a third-party context will be impacted"

Will this list be published to everyone?

Thanks.

Is Privacy Proxy source code open source?

I realize we are talking about Chrome and not Chromium, but do you plan to make the sources of the server code it runs on available to the open-source community?
Thank you for your reply.

I never asked for this. Nobody asked for this.

We have VPNs and DNS servers to reroute traffic if necessary. Why is this done under the guise of "privacy" when it does the exact opposite? I'm totally against this proposal and hope it goes down the "Killed by Google" drain.

What's the approach toward countries with internet censorship?

Countries like China, Iran, and Russia enforce strict internet censorship and show no regard for internet freedom.
As soon as this feature becomes available, it will be blocked in those countries.

Is there a strategy or mechanism in place to address these conditions?

Breaking Fraud Prevention technique

We provide fraud detection/prevention for our clients as a third party. We rely heavily on having full accessibility to the originating IP address to evaluate the risk of a respondent. We perform many checks that include determining time zone (many countries support many time zones), postal code proximity, residential proxy use, fast fluxing, etc., etc. How will you support not breaking that business model?

WHOIS information alternative

Our use case involves identifying the registered ASN organizations associated with an IP address visiting a website, and processing and reporting these back to the website owner. We currently use a third-party tracker to collect the IP addresses and then a WHOIS service to find the registered owners. My understanding so far is that because I am using a third-party tracker, the traffic would go through Google's proxy, and I would instead get a Google IP address with roughly equivalent geolocation data.

However, I assume the WHOIS data will now all show Google as the owner instead of the actual owner. Seeing as WHOIS data is not the user's personal information, what assurances can Google provide that this data will still be accessible, and through what mechanism will we be able to obtain it?

Can the first proxy see the IP address assigned to the user by the second proxy?

Is it correct to assume that the second proxy uses some form of NAT to assign end-user IP addresses different from the actual IP of the second proxy itself (i.e. the one that the first proxy would see and use to connect to it)?

If so, does the first proxy ever have visibility into the IP address assigned to a user on any granular connection/session basis?

The reason for asking is that the first proxy knowing both the user's original IP address and the one they end up assigned looks like a vector for correlation by the first-proxy operator (Google) without any collusion.

Thanks.

Log level/retention

So this proposal raises a few key questions, most of which have clearly been addressed or are being addressed already.

However, currently I'm unclear about how much data is logged and for how long that data is retained.

As this feature brands itself as a proxy, I assume no additional encryption is applied to the traffic. This means Google could use it to store information on who has visited which sites.

So, what data is logged, and for how long are those logs stored? What can those logs be used for? And how can we verify this?

And how can we trust Google to provide a feature like this? Your track record is not amazing and I appreciate that this will likely be an uphill struggle to justify, but I want to hear how you can protect users not only from third-parties but from yourselves.

Thank you for taking the time to run this as a proposal openly, and accepting feedback. Doing things this way is a lot more transparent and I do truly appreciate the opportunity to make my voice heard.

IP Geolocation granularity impacts

Originally filed at spanicker/ip-blindness#20 by @smhendrickson

IETF RFC 8805 allows for country, ‘region’ (state in US), and city level mappings to be advertised. While the IP Protection proposal will not retain all state/city level granularity, we would like to retain enough to keep inferred geographic content relevant to users, and GeoIP based performance optimizations functioning.

To achieve this, Chrome is considering mapping many unique cities/regions to groups of combined cities/regions. We call these Geos. We are considering using a threshold of 1 million estimated human population for each created geo. This geo will then be shared in the public IP Geolocation feed.

Which use cases of yours would 1 million people sufficiently cover and which use cases would not be sufficiently covered?
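
The grouping described above might be sketched as follows. This is illustrative only: the actual grouping algorithm has not been published, and the cities and populations are made up.

```python
# Illustrative sketch: merge cities into a shared "geo" until its
# estimated population reaches the 1M threshold described above.
THRESHOLD = 1_000_000

def build_geos(cities):
    """cities: list of (name, population) pairs, pre-ordered (e.g. by
    adjacency). Returns lists of city names, each summing to >= THRESHOLD
    except possibly via the folded-in remainder."""
    geos, current, total = [], [], 0
    for name, population in cities:
        current.append(name)
        total += population
        if total >= THRESHOLD:
            geos.append(current)
            current, total = [], 0
    if current:  # fold any undersized remainder into the last geo
        if geos:
            geos[-1].extend(current)
        else:
            geos.append(current)
    return geos
```

Under this sketch, small cities lose individual visibility: a 200k-person city only ever appears as part of a 1M+ group in the public geolocation feed, which is the trade-off the question probes.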
