netbird

mirror of https://github.com/netbirdio/netbird.git synced 2026-04-25 11:46:40 +00:00

Author	SHA1	Message	Date
Zoltán Papp	77ec25796e	client/dns/mgmt: bypass overlay for control-plane FQDN resolution When an exit-node peer's network-map installs a 0.0.0.0/0 default route on the overlay interface before that peer's WireGuard key material is active, any UDP socket dialing an off-link address is routed into wt0 and the kernel returns ENOKEY. Two places needed fixing: 1. The mgmt cache refresh path. It reactively refreshes the control-plane FQDNs advertised by the mgmt (api/signal/stun/turn/the Relay pool root) after the daemon has installed its own resolv.conf pointing at the overlay listener. Previously the refresh dial followed the chain's upstream handler, which followed the overlay default route and deadlocked on ENOKEY. 2. Foreign relay FQDN resolution. When a remote peer is homed on a different relay instance than us, we need to resolve a streamline-* subdomain that is not in the cache. That lookup went through the same overlay-routed upstream and failed identically, deadlocking the exit-node test whenever the relay LB put the two peers on different instances. Fix both by giving the mgmt cache a dedicated net.Resolver that dials the original pre-NetBird system nameservers through nbnet.NewDialer. The dialer marks the socket as control-plane (SO_MARK on Linux, IP_BOUND_IF on darwin, IP_UNICAST_IF on Windows); the routemanager's policy rules keep those sockets on the underlay regardless of the overlay default. Pool-root domains (the Relay entries in ServerDomains) now register through a subdomain-matching wrapper so that instance subdomains like streamline-de-fra1-0.relay.netbird.io also hit the mgmt cache handler. On cache miss under a pool root, ServeDNS resolves the FQDN on demand through the bypass resolver, caches the result, and returns it. Pool-root membership is derived dynamically from mgmt-advertised ServerDomains.Relay[], with no hardcoded domain lists and no protocol change.
No hardcoded fallback nameservers: if the host had no original system resolver at all, the bypass resolver stays nil and the stale-while-revalidate cache keeps serving. The general upstream forwarder and the user DNS path are unchanged.	2026-04-24 17:40:33 +02:00
Viktor Liu	801de8c68d	[client] Add TTL-based refresh to mgmt DNS cache via handler chain (#5945)	2026-04-22 15:10:14 +02:00
Zoltan Papp	d18747e846	[client] Exclude Flow domain from caching to prevent TLS failures (#5433) * Exclude Flow domain from caching to prevent TLS failures due to stale records. * Fix test	2026-02-24 16:48:38 +01:00
Maycon Santos	433bc4ead9	[client] lookup for management domains using an additional timeout (#4983) In some cases iOS and macOS may be locked when looking for management domains during network changes. This change introduces an additional timeout on top of the context call	2025-12-22 20:04:52 +01:00
Viktor Liu	d4c067f0af	[client] Don't deactivate upstream resolvers on failure (#4128)	2025-08-29 17:40:05 +02:00

5 Commits