Commit Graph

1249 Commits

Author SHA1 Message Date
Viktor Liu
5f3aef3198 Validate IP-declared lengths before synthesizing direct ICMP packet 2026-05-04 12:12:45 +02:00
Viktor Liu
1b2d7777a3 Skip iOS SetInterfaceIPv6 when no IPv6 overlay address is assigned 2026-05-04 12:12:05 +02:00
Viktor Liu
ad30faed5f Log v6 prefix decode failure at error level instead of warn 2026-05-04 12:10:18 +02:00
Viktor Liu
d7b971e157 Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay
# Conflicts:
#	client/internal/peer/status.go
2026-05-04 12:08:38 +02:00
Viktor Liu
ecf987c5b5 Use isDefaultRoute helper for exit node detection in UI 2026-05-04 12:03:31 +02:00
Viktor Liu
bcf006581d Roll back nftables init via deferred cleanup on any failure 2026-05-04 12:03:05 +02:00
Zoltan Papp
a21f6ecb0a [client] release Status.mux before invoking notifier callbacks (#6039)
The Status recorder used to fire notifier callbacks while holding d.mux:
- notifyPeerListChanged / notifyPeerStateChangeListeners ran from inside
  the locked section of every Update*/AddPeerStateRoute/etc.
- notifyAddressChanged ran from UpdateLocalPeerState and CleanLocalPeerState
  while d.mux was held.
- onConnectionChanged was registered with a defer above defer d.mux.Unlock,
  so it executed before the mutex was released in the Mark*Connected/
  Disconnected helpers.
- notifyPeerStateChangeListeners did a blocking channel send under d.mux,
  so a slow subscriber stalled every other d.mux holder.

A listener that re-enters the recorder (e.g. calls GetFullStatus from
within a callback) deadlocks against d.mux, and any callback that takes
longer than expected stalls every other state query for its duration.

Capture the values needed for notification under the lock, release d.mux,
then call the notifier. Build per-peer router-state snapshots inside the
lock and dispatch them via dispatchRouterPeers afterwards. The router-peer
channel send stays blocking, but now happens outside d.mux so a slow
consumer cannot stall any other d.mux holder, and no peer state
transitions are silently dropped.

The notifier itself is unchanged: its internal state was already protected
by its own locks, and the field d.notifier is set once in NewRecorder and
never reassigned, so reading it without d.mux is safe.

Also fix a pre-existing race in Test_notifier_RemoveListener /
Test_notifier_SetListener: setListener spawns a goroutine that writes
listener.peers, but the tests read listener.peers without waiting for it.
2026-05-04 11:59:01 +02:00
Viktor Liu
006d925d9c Use family-specific protocol token in iptables AddOutputDNAT 2026-05-04 11:57:18 +02:00
Viktor Liu
15070f0b13 Return error from trace selfAddr when no overlay address for family 2026-05-04 11:55:19 +02:00
Viktor Liu
03ac436d02 Guard v6 exit node merge against empty companion routes slice 2026-05-04 11:54:58 +02:00
Viktor Liu
d2d6e14b9e Guard isOwnAddress against nil wgInterface 2026-05-04 11:54:36 +02:00
Viktor Liu
b01a7da44f Clear anonymized IPv6 address when prefix encode fails 2026-05-04 11:52:33 +02:00
Viktor Liu
d8a5bdab88 Guard MSS clamp precompute against MTU smaller than TCP/IP header overhead 2026-05-04 11:52:16 +02:00
Viktor Liu
5cb82b26c8 Decode ICMP error payload using family-specific minimum length 2026-05-04 11:51:49 +02:00
Viktor Liu
61c64caf69 Skip nftables MSS clamping when MTU is below header overhead 2026-05-04 11:50:38 +02:00
Viktor Liu
0ce2d7406a Roll back v4 NAT rule when v6 mirror fails in nftables AddNatRule 2026-05-04 11:50:17 +02:00
Viktor Liu
fc34db6db1 Validate ip6tables-save stderr in nftables compatibility test 2026-05-04 11:49:47 +02:00
Viktor Liu
35332d6aa3 Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay
# Conflicts:
#	client/firewall/uspfilter/forwarder/endpoint.go
#	client/wasm/cmd/main.go
#	proxy/cmd/proxy/cmd/debug.go
2026-05-04 11:40:41 +02:00
Viktor Liu
50b58a6828 [client, relay] Advertise relay server IP via signal for foreign-relay fallback dial (#6004) 2026-05-04 11:40:25 +02:00
Viktor Liu
057d651d2e [client, proxy] Add packet capture to debug bundle and CLI (#5891) 2026-05-04 11:28:56 +02:00
Maycon Santos
3fc5a8d4a1 [misc] fix MSI generation add installer tests (#6031)
Add Windows installer build test workflow
2026-04-29 23:44:38 +02:00
Viktor Liu
ed828b7af4 Tolerate EEXIST when adding macOS scoped default routes (#6027) 2026-04-29 16:08:47 +02:00
Viktor Liu
11ac2af2f5 Use BindListener for all userspace bind in lazyconn activity (#6028) 2026-04-29 16:07:33 +02:00
Bethuel Mmbaga
df197d5001 [management] Prevent JWT reuse during peer login (#6002) 2026-04-29 15:04:27 +03:00
shuuri-labs
ad93dcf980 [client] Enable UI autostart for silent and MSI installs (#6026)
* fix(client): enable UI autostart for silent and MSI installs

The MSI installer had no autostart logic and the EXE silent installer
skipped the autostart page, leaving the registry entry unwritten. This
caused the NetBird UI tray to not start at login after RMM deployments.

Add an AUTOSTART property (default: 1) to the MSI that writes the
HKLM Run key, and initialize AutostartEnabled in the NSIS .onInit so
silent installs match the interactive default.

* add real guid for NetBirdAutoStart component
2026-04-29 13:14:46 +02:00
Viktor Liu
28fe26637b [client] Fix Windows installer upgrade detection for pre-0.70.1 installs (#6025) 2026-04-29 11:01:07 +02:00
Viktor Liu
c30f081d67 Merge branch 'main' into proto-ipv6-overlay
# Conflicts:
#	client/proto/daemon.pb.go
2026-04-29 10:09:34 +02:00
Viktor Liu
407e9d304b [client] Move macOS sleep detection into the daemon (purego) (#5926) 2026-04-29 08:09:55 +02:00
Viktor Liu
e5474e199f [client] Use WinRT COM for Windows toasts (#6013)
* Use WinRT COM for Windows toasts instead of fyne's PowerShell path

* Quote autostart path and split HKCU registry into per-user component
2026-04-28 20:54:06 +02:00
Zoltan Papp
8fc4265995 [relay] evict foreign client cache on disconnect (#6015)
* [relay] evict foreign client cache on disconnect

When a foreign relay's TCP connection drops, the manager's
onServerDisconnected handler only triggered reconnect logic for the
home server; the disconnected foreign entry stayed in the relayClients
cache. Subsequent OpenConn calls reused the closed client until the
60-second cleanup tick evicted it, breaking peer connectivity through
that relay for up to a minute.

Evict the foreign entry from the cache on disconnect so the next
OpenConn dials a fresh client.

Also:
- Make the reconnect backoff cap configurable via WithMaxBackoffInterval
  ManagerOption; the previous hard-coded 60s constant forced
  TestAutoReconnect to sleep ~61s. Test now polls Ready() and finishes
  in ~2s.
- Add NB_HOME_RELAY_SERVERS env var that overrides the relay URL list
  received from management, so a peer can be pinned to a specific home
  relay (used by the netbird-conn-lab Edge 4 reproducer).

* [client] treat empty NB_HOME_RELAY_SERVERS as unset

Returning (urls=[], ok=true) when the env var contained only separators or
whitespace caused callers to wipe the mgmt-provided relay list, leaving the
peer with no relays. Treat a parsed-empty result the same as an unset env.
2026-04-28 15:04:48 +02:00
Viktor Liu
b0f5d78df1 Ignore v6 exit node notification 2026-04-28 12:10:39 +02:00
Viktor Liu
e19d0c7d77 Merge branch 'main' into proto-ipv6-overlay
# Conflicts:
#	client/firewall/iptables/manager_linux.go
#	client/firewall/nftables/manager_linux.go
#	client/firewall/nftables/router_linux.go
2026-04-23 11:48:15 +02:00
Viktor Liu
801de8c68d [client] Add TTL-based refresh to mgmt DNS cache via handler chain (#5945) 2026-04-22 15:10:14 +02:00
Zoltan Papp
1165058fad [client] fix port collision in TestUpload (#5950)
* [debug] fix port collision in TestUpload

TestUpload hardcoded :8080, so it failed deterministically when anything
was already on that port and collided across concurrent test runs.
Bind a :0 listener in the test to get a kernel-assigned free port, and
add Server.Serve so tests can hand the listener in without reaching
into unexported state.

* [debug] drop test-only Server.Serve, use SERVER_ADDRESS env

The previous commit added a Server.Serve method on the upload-server,
used only by TestUpload. That left production with an unused function.
Reserve an ephemeral loopback port in the test, release it, and pass
the address through SERVER_ADDRESS (which the server already reads).
A small wait helper ensures the server is accepting connections before
the upload runs, so the close/rebind gap does not cause a false failure.
2026-04-21 19:07:20 +02:00
Zoltan Papp
2fb50aef6b [client] allow UDP packet loss in TestICEBind_HandlesConcurrentMixedTraffic (#5953)
The test writes 500 packets per family and asserted exact-count
delivery within a 5s window, even though its own comment says "Some
packet loss is acceptable for UDP". On FreeBSD/QEMU runners the writer
loops cannot always finish all 500 before the 5s deadline closes the
readers (we have seen 411/500 in CI).

The real assertion of this test is the routing check — IPv4 peer only
gets v4- packets, IPv6 peer only gets v6- packets — which remains
strict. Replace the exact-count assertions with a >=80% delivery
threshold so runner speed variance no longer causes false failures.
2026-04-21 19:05:58 +02:00
Viktor Liu
064ec1c832 [client] Trust wg interface in firewalld to bypass owner-flagged chains (#5928) 2026-04-21 17:57:16 +02:00
Viktor Liu
75e408f51c [client] Prefer systemd-resolved stub over file mode regardless of resolv.conf header (#5935) 2026-04-21 17:56:56 +02:00
Zoltan Papp
5a89e6621b [client] Supress ICE signaling (#5820)
* [client] Suppress ICE signaling and periodic offers in force-relay mode

When NB_FORCE_RELAY is enabled, skip WorkerICE creation entirely,
suppress ICE credentials in offer/answer messages, disable the
periodic ICE candidate monitor, and fix isConnectedOnAllWay to
only check relay status so the guard stops sending unnecessary offers.

* [client] Dynamically suppress ICE based on remote peer's offer credentials

Track whether the remote peer includes ICE credentials in its
offers/answers. When remote stops sending ICE credentials, skip
ICE listener dispatch, suppress ICE credentials in responses, and
exclude ICE from the guard connectivity check. When remote resumes
sending ICE credentials, re-enable all ICE behavior.

* [client] Fix nil SessionID panic and force ICE teardown on relay-only transition

Fix nil pointer dereference in signalOfferAnswer when SessionID is nil
(relay-only offers). Close stale ICE agent immediately when remote peer
stops sending ICE credentials to avoid traffic black-hole during the
ICE disconnect timeout.

* [client] Add relay-only fallback check when ICE is unavailable

Ensure the relay connection is supported with the peer when ICE is disabled to prevent connectivity issues.

* [client] Add tri-state connection status to guard for smarter ICE retry (#5828)

* [client] Add tri-state connection status to guard for smarter ICE retry

Refactor isConnectedOnAllWay to return a ConnStatus enum (Connected,
Disconnected, PartiallyConnected) instead of a boolean. When relay is
up but ICE is not (PartiallyConnected), limit ICE offers to 3 retries
with exponential backoff then fall back to hourly attempts, reducing
unnecessary signaling traffic. Fully disconnected peers continue to
retry aggressively. External events (relay/ICE disconnect, signal/relay
reconnect) reset retry state to give ICE a fresh chance.

* [client] Clarify guard ICE retry state and trace log trigger

Split iceRetryState.attempt into shouldRetry (pure predicate) and
enterHourlyMode (explicit state transition) so the caller in
reconnectLoopWithRetry reads top-to-bottom. Restore the original
trace-log behavior in isConnectedOnAllWay so it only logs on full
disconnection, not on the new PartiallyConnected state.

* [client] Extract pure evalConnStatus and add unit tests

Split isConnectedOnAllWay into a thin method that snapshots state and
a pure evalConnStatus helper that takes a connStatusInputs struct, so
the tri-state decision logic can be exercised without constructing
full Worker or Handshaker objects. Add table-driven tests covering
force-relay, ICE-unavailable and fully-available code paths, plus
unit tests for iceRetryState budget/hourly transitions and reset.

* [client] Improve grammar in logs and refactor ICE credential checks
2026-04-21 15:52:08 +02:00
Viktor Liu
3537e2234f Fix manager_test.go: use netip.MustParseAddr for PeerSSHInfo.IP 2026-04-20 19:58:59 +02:00
Viktor Liu
d2aaadbb8c Replace deprecated iptables --set with --match-set in ACL ipset match 2026-04-20 19:56:45 +02:00
Viktor Liu
4506f82b2d Merge branch 'main' into proto-ipv6-overlay 2026-04-20 19:45:14 +02:00
Zoltan Papp
3098f48b25 [client] fix ios network addresses mac filter (#5906)
* fix(client): skip MAC address filter for network addresses on iOS

iOS does not expose hardware (MAC) addresses due to Apple's privacy
restrictions (since iOS 14), causing networkAddresses() to return an
empty list because all interfaces are filtered out by the HardwareAddr
check. Move networkAddresses() to platform-specific files so iOS can
skip this filter.
2026-04-20 11:49:38 +02:00
Zoltan Papp
7f023ce801 [client] Android debug bundle support (#5888)
Add Android debug bundle support with Troubleshoot UI
2026-04-20 11:26:30 +02:00
Michael Uray
e361126515 [client] Fix WGIface.Close deadlock when DNS filter hook re-enters GetDevice (#5916)
WGIface.Close() took w.mu and held it across w.tun.Close(). The
underlying wireguard-go device waits for its send/receive goroutines to
drain before Close() returns, and some of those goroutines re-enter
WGIface during shutdown. In particular, the userspace packet filter DNS
hook in client/internal/dns.ServiceViaMemory.filterDNSTraffic calls
s.wgInterface.GetDevice() on every packet, which also needs w.mu. With
the Close-side holding the mutex, the read goroutine blocks in
GetDevice and Close waits forever for that goroutine to exit:

  goroutine N (TestDNSPermanent_updateUpstream):
    WGIface.Close -> holds w.mu -> tun.Close -> sync.WaitGroup.Wait
  goroutine M (wireguard read routine):
    FilteredDevice.Read -> filterOutbound -> udpHooksDrop ->
    filterDNSTraffic.func1 -> WGIface.GetDevice -> sync.Mutex.Lock

This surfaces as a 5 minute test timeout on the macOS Client/Unit
CI job (panic: test timed out after 5m0s, running tests:
TestDNSPermanent_updateUpstream).

Release w.mu before calling w.tun.Close(). The other Close steps
(wgProxyFactory.Free, waitUntilRemoved, Destroy) do not mutate any
fields guarded by w.mu beyond what Free() already does, so the lock
is not needed once the tun has started shutting down. A new unit test
in iface_close_test.go uses a fake WGTunDevice to reproduce the
deadlock deterministically without requiring CAP_NET_ADMIN.
2026-04-20 10:36:19 +02:00
Viktor Liu
95213f7157 [client] Use Match host+exec instead of Host+Match in SSH client config (#5903) 2026-04-20 10:24:11 +02:00
Viktor Liu
cec21034e8 [client] Reconcile external nft accept rules on external changes (#5912) 2026-04-20 10:23:44 +02:00
Viktor Liu
2e0e3a3601 [client] Replace exclusion routes with scoped default + IP_BOUND_IF on macOS (#5918) 2026-04-20 10:01:01 +02:00
Viktor Liu
02c0b20f21 Move IPv6 blackhole from notifier to platform tun (java/swift) 2026-04-18 12:28:16 +02:00
Viktor Liu
0c9f4706b2 Merge branch 'main' into proto-ipv6-overlay 2026-04-17 05:45:43 +02:00
Maycon Santos
53b04e512a [management] Reuse a single cache store across all management server consumers (#5889)
* Add support for legacy IDP cache environment variable

* Centralize cache store creation to reuse a single Redis connection pool

Each cache consumer (IDP cache, token store, PKCE store, secrets manager,
EDR validator) was independently calling NewStore, creating separate Redis
clients with their own connection pools — up to 1400 potential connections
from a single management server process.

Introduce a shared CacheStore() singleton on BaseServer that creates one
store at boot and injects it into all consumers. Consumer constructors now
receive a store.StoreInterface instead of creating their own.

For Redis mode, all consumers share one connection pool (1000 max conns).
For in-memory mode, all consumers share one GoCache instance.

* Update management-integrations module to latest version

* sync go.sum

* Export `GetAddrFromEnv` to allow reuse across packages

* Update management-integrations module version in go.mod and go.sum

* Update management-integrations module version in go.mod and go.sum
2026-04-16 16:04:53 +02:00