Comments (15)
I'll ask ChatGPT about how DNS works on Windows. If we can shadow everything yeah that's probably best cause timeouts wouldn't get cached.
from firezone.
Yeah, this will happen any time a Gateway flaps, the Relays are being restarted, or the Portal is being restarted since those all can cause a timeout when the client tries a lookup.
from firezone.
I think it was done like this because we don't know how many of each IPv4/IPv6 dummy addresses to generate until we do an actual lookup, and also because we don't have a good way to reverse the mapping when it reaches a Gateway.
If we immediately generated `100.96.100.100`, for example, we would still need some way to tell whichever Gateway we land on that `100.96.100.100` corresponds to the original DNS name requested, and then do the actual lookup.
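To make the reverse-mapping problem concrete, here is a minimal sketch of a client-side table that hands out a placeholder address as soon as a name is queried and can later turn that address back into the name to send to a Gateway. `ProxyIpTable`, `assign`, and `name_for` are hypothetical names for illustration only, not connlib's actual types, and the address range is just the example address from above.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Hypothetical table of placeholder IPs, keyed by DNS name and by IP.
/// This is an illustration, not how connlib stores its mappings.
pub struct ProxyIpTable {
    next: u64, // next unassigned address; u64 so the counter cannot overflow
    last: u64, // last usable address (inclusive)
    by_name: HashMap<String, Ipv4Addr>,
    by_ip: HashMap<Ipv4Addr, String>,
}

impl ProxyIpTable {
    /// Creates a table that assigns addresses from `first..=last`.
    pub fn new(first: Ipv4Addr, last: Ipv4Addr) -> Self {
        Self {
            next: u64::from(u32::from(first)),
            last: u64::from(u32::from(last)),
            by_name: HashMap::new(),
            by_ip: HashMap::new(),
        }
    }

    /// Returns the placeholder IP for `name`, allocating one if needed.
    /// Returns `None` once the range is exhausted.
    pub fn assign(&mut self, name: &str) -> Option<Ipv4Addr> {
        // Normalise so "Example.com." and "example.com" share one entry.
        let key = name.trim_end_matches('.').to_ascii_lowercase();
        if let Some(ip) = self.by_name.get(&key) {
            return Some(*ip);
        }
        if self.next > self.last {
            return None;
        }
        let ip = Ipv4Addr::from(self.next as u32);
        self.next += 1;
        self.by_name.insert(key.clone(), ip);
        self.by_ip.insert(ip, key);
        Some(ip)
    }

    /// Reverse lookup: which DNS name does this placeholder IP stand for?
    pub fn name_for(&self, ip: Ipv4Addr) -> Option<&str> {
        self.by_ip.get(&ip).map(String::as_str)
    }
}
```

With something like this, the DNS answer would not have to wait for a Gateway, but the first packet to the placeholder address would still need to carry the result of `name_for` so the Gateway can do the real lookup. It also hands out exactly one IPv4 address per name, which sidesteps rather than solves the question of how many IPv4/IPv6 addresses to generate.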
We should probably review this architecture
cc @conectado
from firezone.
Yeah that would explain some things I've seen lately when trying the Windows Client.
I'll see if there's any other way to handle it. I don't like removing the system DNS servers because then it'll be a mix of the iOS resolver dance and the `/etc/resolv.conf` thing, where we have to remove the servers, but also keep them safe, and revert them, and if we crash the user is in deep trouble, and if DHCP kicks in we have to re-remove them and update our stashed servers.
Could connlib's DNS have a more aggressive timeout and lie and say the domain doesn't exist? Then traffic can't escape the tunnel.
from firezone.
Ah I see, on other platforms the nameservers are reverted when bringing the tunnel back down. Maybe Windows has a similar option?
> Could connlib's DNS have a more aggressive timeout and lie and say the domain doesn't exist?

Yeah maybe that's a better option. If a Gateway is flapping or no Gateways are online, what should connlib do though? I think it's probably most appropriate to time out instead of incorrectly returning an NXDOMAIN, which the app might cache.
from firezone.
I'm not sure exactly how it works on Windows but we want something that shadows the system resolvers without removing them from their own interfaces. That was frustrating about the `/etc/resolv.conf` thing on Linux.
from firezone.
Yeah replicates for me on the dev laptop on c036d1a (tip of main)
Even after I enable the Policy again, it's stuck outside the tunnel.
from firezone.
And there was a reason why we can't assign an IP and respond to the DNS query before the gateway responds to us?
I remember asking this and I think Gabi said there was a good reason, but I can't remember. If we could just always assign an IP before knowing whether the Resource was even reachable, it would avert this: the DNS query would come back in milliseconds and then the `connect()` call would just time out harmlessly.
from firezone.
ChatGPT suggested blocking traffic to the system's other DNS servers 🤔
Maybe we could claim their routes. Would that cause a packet loop when we try to send traffic to an IP address we've already claimed?
from firezone.
Yeah I think it was about the number of IPs we get back.
I could poke around in the DNS code and try things like returning NXDOMAIN or SERVFAIL if we run out of time. ChatGPT thought SERVFAIL might be treated as a temporary error and the resolver would try again, but I haven't found MS docs to prove it yet
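For reference, a rough sketch of what "return NXDOMAIN or SERVFAIL when we run out of time" could look like at the packet level. It assumes a plain single-question query with no name compression in the question section; `error_response` is a hypothetical helper, not something from connlib's DNS code. The RCODE values (2 for SERVFAIL, 3 for NXDOMAIN) and header bit layout are from RFC 1035.

```rust
/// DNS response codes from RFC 1035.
pub const RCODE_SERVFAIL: u8 = 2;
pub const RCODE_NXDOMAIN: u8 = 3;

/// Hypothetical helper: builds a minimal error response for `query`,
/// echoing the transaction ID and the question and dropping everything else.
/// Returns `None` if `query` is not a well-formed single-question message.
pub fn error_response(query: &[u8], rcode: u8) -> Option<Vec<u8>> {
    // The fixed DNS header is 12 bytes; RCODE is a 4-bit field.
    if query.len() < 12 || rcode > 0x0F {
        return None;
    }
    // Assumption: exactly one question (QDCOUNT == 1).
    if u16::from_be_bytes([query[4], query[5]]) != 1 {
        return None;
    }
    // Walk the QNAME labels until the terminating zero-length label.
    let mut pos = 12;
    loop {
        let len = *query.get(pos)? as usize;
        if len & 0xC0 != 0 {
            return None; // compression pointer or reserved label type
        }
        pos += 1;
        if len == 0 {
            break;
        }
        pos += len;
    }
    // QTYPE and QCLASS follow the name, 2 bytes each.
    let end = pos + 4;
    if end > query.len() {
        return None;
    }
    let mut resp = query[..end].to_vec();
    // Byte 2: set QR (response), keep Opcode and RD, clear AA and TC.
    resp[2] = (resp[2] & 0x79) | 0x80;
    // Byte 3: set RA, clear Z/AD/CD, store the response code.
    resp[3] = 0x80 | rcode;
    // ANCOUNT, NSCOUNT and ARCOUNT are all zero in this minimal reply.
    resp[6..12].fill(0);
    Some(resp)
}
```

Calling `error_response(&query, RCODE_SERVFAIL)` once a lookup deadline passes would give the resolver a fast answer instead of silence. Whether Windows retries after SERVFAIL, and how long either code is cached, is still the open question here; this only shows the mechanics.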
from firezone.
Claiming routes or adding firewall rules might be a creative way to solve it.
I don't think it'd cause a loop -- we'd either drop the packet in connlib or if it's a resource it would be sent through a tunnel anyway
from firezone.
I was thinking if we use the system resolvers and we also claim their routes, we won't be able to reach them. Unless we have a way to bypass our own routes
from firezone.
If we add them as routes and ensure our tun interface has metric priority, they should be effectively blackholed by connlib for the duration of the tunnel
from firezone.
Yeah so connlib can't reach them if connlib is blackholing them
Like my 192.168.1.1 DNS cache on my home router will be unreachable because connlib tries to send it a query and it just comes back to connlib
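A toy model of the route selection being discussed, to show why the claimed route also catches connlib's own upstream queries. It uses the usual rule (longest matching prefix wins, ties broken by lowest metric) and is not tied to any Windows API; `Route`, `select_route`, and the interface names are made up for illustration.

```rust
use std::cmp::Reverse;
use std::net::Ipv4Addr;

/// Hypothetical routing-table entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Route {
    pub dest: Ipv4Addr,
    pub prefix_len: u8,
    pub metric: u32,
    pub iface: &'static str,
}

impl Route {
    /// True if `ip` falls inside this route's destination prefix.
    fn matches(&self, ip: Ipv4Addr) -> bool {
        let len = u32::from(self.prefix_len.min(32));
        // A /0 route matches everything; avoid shifting by 32 bits.
        let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
        (u32::from(ip) & mask) == (u32::from(self.dest) & mask)
    }
}

/// Picks the route for `ip`: longest prefix first, then lowest metric.
pub fn select_route(routes: &[Route], ip: Ipv4Addr) -> Option<&Route> {
    routes
        .iter()
        .filter(|r| r.matches(ip))
        .max_by_key(|r| (r.prefix_len, Reverse(r.metric)))
}
```

With a default route and a `192.168.1.0/24` route on the physical interface plus a claimed `192.168.1.1/32` route on the tun interface, `select_route` picks the tun interface for `192.168.1.1` regardless of who is sending. So unless connlib's own socket can bypass the routing table, its query to the home router comes straight back to connlib.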
from firezone.
Ah, I'm following now. Yeah this seems tricky on Windows. On Linux we have `fwmark`, Android `protect`, and Apple `NECP` that handle this nicely for us.
Can think more about it, but unless we make major changes in connlib's DNS proxy I don't know if responding with `NXDOMAIN` or `SERVFAIL` won't cause unexpected application behavior. How long are those cached for? If a DNS server doesn't respond, we're at least pretty confident the application will ask again and again on each request.
It might not be perfect but we could do this:

> but also keep them safe, and revert them, and if we crash the user is in deep trouble, and if DHCP kicks in we have to re-remove them and update our stashed servers.

and just run the risk of them not being reverted.
I would imagine our likelihood of crashing is lower than the likelihood of customers' Gateways going down or flapping
from firezone.