* [debug] fix port collision in TestUpload
TestUpload hardcoded :8080, so it failed deterministically when anything
was already on that port and collided across concurrent test runs.
Bind a :0 listener in the test to get a kernel-assigned free port, and
add Server.Serve so tests can hand the listener in without reaching
into unexported state.
* [debug] drop test-only Server.Serve, use SERVER_ADDRESS env
The previous commit added a Server.Serve method on the upload-server,
used only by TestUpload. That left production with an unused function.
Reserve an ephemeral loopback port in the test, release it, and pass
the address through SERVER_ADDRESS (which the server already reads).
A small wait helper ensures the server is accepting connections before
the upload runs, so the close/rebind gap does not cause a false failure.
* [client] Suppress ICE signaling and periodic offers in force-relay mode
When NB_FORCE_RELAY is enabled, skip WorkerICE creation entirely,
suppress ICE credentials in offer/answer messages, disable the
periodic ICE candidate monitor, and fix isConnectedOnAllWay to
only check relay status so the guard stops sending unnecessary offers.
* [client] Dynamically suppress ICE based on remote peer's offer credentials
Track whether the remote peer includes ICE credentials in its
offers/answers. When remote stops sending ICE credentials, skip
ICE listener dispatch, suppress ICE credentials in responses, and
exclude ICE from the guard connectivity check. When remote resumes
sending ICE credentials, re-enable all ICE behavior.
* [client] Fix nil SessionID panic and force ICE teardown on relay-only transition
Fix nil pointer dereference in signalOfferAnswer when SessionID is nil
(relay-only offers). Close stale ICE agent immediately when remote peer
stops sending ICE credentials to avoid traffic black-hole during the
ICE disconnect timeout.
* [client] Add relay-only fallback check when ICE is unavailable
Ensure the relay connection is supported with the peer when ICE is disabled to prevent connectivity issues.
* [client] Add tri-state connection status to guard for smarter ICE retry (#5828)
* [client] Add tri-state connection status to guard for smarter ICE retry
Refactor isConnectedOnAllWay to return a ConnStatus enum (Connected,
Disconnected, PartiallyConnected) instead of a boolean. When relay is
up but ICE is not (PartiallyConnected), limit ICE offers to 3 retries
with exponential backoff then fall back to hourly attempts, reducing
unnecessary signaling traffic. Fully disconnected peers continue to
retry aggressively. External events (relay/ICE disconnect, signal/relay
reconnect) reset retry state to give ICE a fresh chance.
* [client] Clarify guard ICE retry state and trace log trigger
Split iceRetryState.attempt into shouldRetry (pure predicate) and
enterHourlyMode (explicit state transition) so the caller in
reconnectLoopWithRetry reads top-to-bottom. Restore the original
trace-log behavior in isConnectedOnAllWay so it only logs on full
disconnection, not on the new PartiallyConnected state.
* [client] Extract pure evalConnStatus and add unit tests
Split isConnectedOnAllWay into a thin method that snapshots state and
a pure evalConnStatus helper that takes a connStatusInputs struct, so
the tri-state decision logic can be exercised without constructing
full Worker or Handshaker objects. Add table-driven tests covering
force-relay, ICE-unavailable and fully-available code paths, plus
unit tests for iceRetryState budget/hourly transitions and reset.
* [client] Improve grammar in logs and refactor ICE credential checks
* Add support for legacy IDP cache environment variable
* Centralize cache store creation to reuse a single Redis connection pool
Each cache consumer (IDP cache, token store, PKCE store, secrets manager,
EDR validator) was independently calling NewStore, creating separate Redis
clients with their own connection pools — up to 1400 potential connections
from a single management server process.
Introduce a shared CacheStore() singleton on BaseServer that creates one
store at boot and injects it into all consumers. Consumer constructors now
receive a store.StoreInterface instead of creating their own.
For Redis mode, all consumers share one connection pool (1000 max conns).
For in-memory mode, all consumers share one GoCache instance.
* Update management-integrations module to latest version
* sync go.sum
* Export `GetAddrFromEnv` to allow reuse across packages
* Update management-integrations module version in go.mod and go.sum
* Update management-integrations module version in go.mod and go.sum
Resolve conflict in setupAndroidRoutes: merge IPv6 fake IP route
with the explicit fake IP route storage from #5865.
Notifier now stores a slice of fake IP routes (v4 + v6) via
SetFakeIPRoutes to preserve the stale route re-injection fix.
extraInitialRoutes() was meant to preserve only the fake IP route
(240.0.0.0/8) across TUN rebuilds, but it re-injected any initial
route missing from the current set. When the management server
advertised exit node routes (0.0.0.0/0) that were later filtered
by the route selector, extraInitialRoutes() re-added them, causing
the Android VPN to capture all traffic with no peer to handle it.
Store the fake IP route explicitly and append only that in notify(),
removing the overly broad initial route diffing.
The packet tracer resolved 'self' to the v4 overlay address
unconditionally, causing "mixed address families" errors when tracing
v6 traffic. Pick the self address matching the peer's address family.
Add Engine.GetWgV6Addr() and rework parseAddress into
resolveTraceAddresses which parses the non-self address first to
determine the family, then resolves self accordingly.
The v6 NAT duplication only triggered for DomainSet destinations
(modern DNS path). Legacy dynamic routes use a 0.0.0.0/0 prefix
destination, so the v6 NAT rule was never created.
Add a Dynamic field to RouterPair so the firewall manager can
distinguish dynamic routes from exit nodes (both use /0 prefixes).
Set it from route.IsDynamic() in routeToRouterPair and propagate
through GetInversePair. Both nftables and iptables managers check
pair.Dynamic instead of destination shape.
Also accumulate errors in RemoveNatRule so v6 cleanup is attempted
even if v4 removal fails.
removeFromServerNetwork and CleanUp hardcoded useNewDNSRoute=false
when building the router pair for RemoveNatRule. This meant the
destination was a Prefix (0.0.0.0/0) instead of a DomainSet, so the
IsSet() branch in RemoveNatRule that removes the v6 duplicate never
triggered. The v6 NAT rule leaked until the next full Reset.
Store useNewDNSRoute on the Router from UpdateRoutes and use it
consistently in removeFromServerNetwork and CleanUp, making add
and remove symmetric.
- Add IPv6 router dispatch to AddOutputDNAT/RemoveOutputDNAT in both
nftables and iptables managers (was hardcoded to v4 router only).
- Fix all DNAT and AddDNATRule dispatch methods to check Is6() first,
then error with ErrIPv6NotInitialized if v6 components are missing.
Previously the hasIPv6() && Is6() pattern silently fell through to
the v4 router for v6 addresses when v6 was not initialized.
- Add ErrIPv6NotInitialized sentinel error, replace all ad-hoc
"IPv6 not initialized" format strings across both managers.
- Rename sourcePort/targetPort to originalPort/translatedPort in all
DNAT method signatures to reflect actual DNAT semantics.
- Remove stale "localAddr must be IPv4" comments from interface.
- Add GetSelectedClientRoutes() to the route manager that filters through FilterSelectedExitNodes, returning only active routes instead of all management routes
- Use GetSelectedClientRoutes() in the DNS route checker so deselected exit nodes' 0.0.0.0/0 no longer matches upstream DNS IPs — this prevented the resolver from switching
away from the utun-bound socket after exit node deselection
- Initialize iOS DNS server with host DNS fallback addresses (1.1.1.1:53, 1.0.0.1:53) and a permanent root zone handler, matching Android's behavior — without this, unmatched
DNS queries arriving via the 0.0.0.0/0 tunnel route had no handler and were silently dropped
Update the mgmProber interface to use HealthCheck() instead of the
now-unexported GetServerPublicKey(), aligning with the changes in the
management client API.
* Unexport GetServerPublicKey, add HealthCheck method
Internalize server key fetching into Login, Register,
GetDeviceAuthorizationFlow, and GetPKCEAuthorizationFlow methods,
removing the need for callers to fetch and pass the key separately.
Replace the exported GetServerPublicKey with a HealthCheck() error
method for connection validation, keeping IsHealthy() bool for
non-blocking background monitoring.
Fix test encryption to use correct key pairs (client public key as
remotePubKey instead of server private key).
* Refactor `doMgmLogin` to return only error, removing unused response
- DNS resolution broke after deselecting an exit node because the route checker used all client routes (including deselected ones) to decide how to forward upstream DNS
queries
- Added GetSelectedClientRoutes() to the route manager that filters out deselected exit nodes, and switched the DNS route checker to use it
- Confirmed fix via device testing: after deselecting exit node, DNS queries now correctly use a regular network socket instead of binding to the utun interface
* [client] Support embed.Client on Android with netstack mode
embed.Client.Start() calls ConnectClient.Run() which passes an empty
MobileDependency{}. On Android, the engine dereferences nil fields
(IFaceDiscover, NetworkChangeListener, DnsReadyListener) causing panics.
Provide complete no-op stubs so the engine's existing Android code
paths work unchanged — zero modifications to engine.go:
- Add androidRunOverride hook in Run() for Android-specific dispatch
- Add runOnAndroidEmbed() with complete MobileDependency (all stubs)
- Wire default stubs via init() in connect_android_default.go:
noopIFaceDiscover, noopNetworkChangeListener, noopDnsReadyListener
- Forward logPath to c.run()
Tested: embed.Client starts on Android arm64, joins mesh via relay,
discovers peers, localhost proxy works for TCP+UDP forwarding.
* [client] Fix TestServiceParamsPath for Windows path separators
Use filepath.Join in test assertions instead of hardcoded POSIX paths
so the test passes on Windows where filepath.Join uses backslashes.
Remove client secret from gRPC auth flow. The secret was originally included to support providers like Google Workspace that don't offer a proper PKCE flow, but this is no longer necessary with the embedded IdP. Deployments using such providers should migrate to the embedded IdP instead.
* [client] Add Expose support to embed library
Add ability to expose local services via the NetBird reverse proxy
from embedded client code.
Introduce ExposeSession with a blocking Wait method that keeps
the session alive until the context is cancelled.
Extract ProtocolType with ParseProtocolType into the expose package
and use it across CLI and embed layers.
* Fix TestNewRequest assertion to use ProtocolType instead of int
* Add documentation for Request and KeepAlive in expose manager
* Refactor ExposeSession to pass context explicitly in Wait method
* Refactor ExposeSession Wait method to explicitly pass context
* Update client/embed/expose.go
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Fix build
* Update client/embed/expose.go
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
---------
Co-authored-by: Viktor Liu <viktor@netbird.io>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Viktor Liu <17948409+lixmal@users.noreply.github.com>