Commit Graph

799 Commits

Author SHA1 Message Date
Viktor Liu
4b298fb53c Fix legacy dynamic route NAT missing v6 duplicate
The v6 NAT duplication only triggered for DomainSet destinations
(modern DNS path). Legacy dynamic routes use a 0.0.0.0/0 prefix
destination, so the v6 NAT rule was never created.

Add a Dynamic field to RouterPair so the firewall manager can
distinguish dynamic routes from exit nodes (both use /0 prefixes).
Set it from route.IsDynamic() in routeToRouterPair and propagate
through GetInversePair. Both nftables and iptables managers check
pair.Dynamic instead of destination shape.

Also accumulate errors in RemoveNatRule so v6 cleanup is attempted
even if v4 removal fails.
2026-04-10 13:09:15 +02:00
Viktor Liu
8ddbcf6c5b Fix dynamic route v6 NAT rule not cleaned up on removal
removeFromServerNetwork and CleanUp hardcoded useNewDNSRoute=false
when building the router pair for RemoveNatRule. This meant the
destination was a Prefix (0.0.0.0/0) instead of a DomainSet, so the
IsSet() branch in RemoveNatRule that removes the v6 duplicate never
triggered. The v6 NAT rule leaked until the next full Reset.

Store useNewDNSRoute on the Router from UpdateRoutes and use it
consistently in removeFromServerNetwork and CleanUp, making add
and remove symmetric.
2026-04-10 12:50:09 +02:00
Viktor Liu
2f5d9fc0cd Add IPv6 dispatch for OutputDNAT, fix v6 guard pattern, rename DNAT params
- Add IPv6 router dispatch to AddOutputDNAT/RemoveOutputDNAT in both
  nftables and iptables managers (was hardcoded to v4 router only).
- Fix all DNAT and AddDNATRule dispatch methods to check Is6() first,
  then error with ErrIPv6NotInitialized if v6 components are missing.
  Previously the hasIPv6() && Is6() pattern silently fell through to
  the v4 router for v6 addresses when v6 was not initialized.
- Add ErrIPv6NotInitialized sentinel error, replace all ad-hoc
  "IPv6 not initialized" format strings across both managers.
- Rename sourcePort/targetPort to originalPort/translatedPort in all
  DNAT method signatures to reflect actual DNAT semantics.
- Remove stale "localAddr must be IPv4" comments from interface.
2026-04-10 12:32:41 +02:00
Viktor Liu
2a34f173c5 Anonymize SourcePrefixes in firewall rule debug output 2026-04-10 06:55:10 +02:00
Viktor Liu
6c5ff88569 Return error from EncodePrefix instead of silently clamping bits 2026-04-10 06:51:55 +02:00
Viktor Liu
456298864c Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay
# Conflicts:
#	client/firewall/iptables/manager_linux.go
#	client/firewall/nftables/manager_linux.go
2026-04-10 06:51:49 +02:00
Viktor Liu
6e05a2ebe9 Fix CodeRabbit review issues from IPv6 overlay PR (#5839) 2026-04-10 09:12:35 +08:00
Viktor Liu
d2cdc0efec [client] Use native firewall for peer ACLs in userspace WireGuard mode (#5668) 2026-04-10 09:12:13 +08:00
Viktor Liu
f484835292 Use net.JoinHostPort and net.SplitHostPort for IPv6-safe host:port handling (#5836) 2026-04-10 09:10:57 +08:00
Viktor Liu
ac816a8382 Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay 2026-04-09 11:58:06 +02:00
Viktor Liu
1c4e5e71d7 [client] Add IPv6 support to ACL manager, USP filter, and forwarder (#5688) 2026-04-09 10:56:08 +02:00
Viktor Liu
94a36cb53e [client] Handle UPnP routers that only support permanent leases (#5826) 2026-04-08 17:59:59 +02:00
Viktor Liu
413d95b740 [client] Include service.json in debug bundle (#5825)
* Include service.json in debug bundle

* Add tests for service params sanitization logic
2026-04-08 21:10:31 +08:00
Viktor Liu
d33cd4c95b [client] Add NAT-PMP/UPnP support (#5202) 2026-04-08 15:29:32 +08:00
Maycon Santos
e2c2f64be7 [client] Fix iOS DNS upstream routing for deselected exit nodes (#5803)
- Add GetSelectedClientRoutes() to the route manager that filters through FilterSelectedExitNodes, returning only active routes instead of all management routes              
  - Use GetSelectedClientRoutes() in the DNS route checker so deselected exit nodes' 0.0.0.0/0 no longer matches upstream DNS IPs — this prevented the resolver from switching
  away from the utun-bound socket after exit node deselection                                                                                                                   
  - Initialize iOS DNS server with host DNS fallback addresses (1.1.1.1:53, 1.0.0.1:53) and a permanent root zone handler, matching Android's behavior — without this, unmatched
   DNS queries arriving via the 0.0.0.0/0 tunnel route had no handler and were silently dropped
2026-04-08 08:43:48 +02:00
Viktor Liu
fa777684c5 Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay 2026-04-08 07:51:46 +02:00
Viktor Liu
cb73b94ffb [client] Add TCP DNS support for local listener (#5758) 2026-04-08 07:40:36 +02:00
Viktor Liu
939598c83c Collapse IPv6 toggle log to stay under Sonar file line limit 2026-04-07 20:09:12 +02:00
Viktor Liu
90c5065c66 Add missing SetInterfaceIPv6 to Android noop network listener 2026-04-07 18:44:27 +02:00
Viktor Liu
9592de1aac Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay
# Conflicts:
#	client/android/client.go
#	client/ssh/server/server.go
#	shared/management/proto/management.pb.go
2026-04-07 18:35:13 +02:00
Viktor Liu
aba5d6f0d2 [client] Error out on netbird expose when block inbound is enabled (#5818) 2026-04-07 17:55:35 +02:00
Zoltan Papp
6da34e483c [client] Fix mgmProber interface to match unexported GetServerPublicKey (#5815)
Update the mgmProber interface to use HealthCheck() instead of the
now-unexported GetServerPublicKey(), aligning with the changes in the
management client API.
2026-04-07 13:13:38 +02:00
Zoltan Papp
0efef671d7 [client] Unexport GetServerPublicKey, add HealthCheck method (#5735)
* Unexport GetServerPublicKey, add HealthCheck method

Internalize server key fetching into Login, Register,
GetDeviceAuthorizationFlow, and GetPKCEAuthorizationFlow methods,
removing the need for callers to fetch and pass the key separately.

Replace the exported GetServerPublicKey with a HealthCheck() error
method for connection validation, keeping IsHealthy() bool for
non-blocking background monitoring.

Fix test encryption to use correct key pairs (client public key as
remotePubKey instead of server private key).

* Refactor `doMgmLogin` to return only error, removing unused response
2026-04-07 12:18:21 +02:00
Maycon Santos
decb5dd3af [client] Add GetSelectedClientRoutes to route manager and update DNS route check (#5802)
- DNS resolution broke after deselecting an exit node because the route checker used all client routes (including deselected ones) to decide how to forward upstream DNS
  queries
  - Added GetSelectedClientRoutes() to the route manager that filters out deselected exit nodes, and switched the DNS route checker to use it
  - Confirmed fix via device testing: after deselecting exit node, DNS queries now correctly use a regular network socket instead of binding to the utun interface
2026-04-05 13:44:53 +02:00
Bethuel Mmbaga
9d1a37c644 [management,client] Revert gRPC client secret removal (#5781)
* This reverts commit e5914e4e8b

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

* Deprecate client secret in proto

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

* Fix lint

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

---------

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>
2026-04-02 18:21:00 +02:00
tham-le
81f45dab21 [client] Support embed.Client on Android with netstack mode (#5623)
* [client] Support embed.Client on Android with netstack mode

embed.Client.Start() calls ConnectClient.Run() which passes an empty
MobileDependency{}. On Android, the engine dereferences nil fields
(IFaceDiscover, NetworkChangeListener, DnsReadyListener) causing panics.

Provide complete no-op stubs so the engine's existing Android code
paths work unchanged — zero modifications to engine.go:

- Add androidRunOverride hook in Run() for Android-specific dispatch
- Add runOnAndroidEmbed() with complete MobileDependency (all stubs)
- Wire default stubs via init() in connect_android_default.go:
  noopIFaceDiscover, noopNetworkChangeListener, noopDnsReadyListener
- Forward logPath to c.run()

Tested: embed.Client starts on Android arm64, joins mesh via relay,
discovers peers, localhost proxy works for TCP+UDP forwarding.

* [client] Fix TestServiceParamsPath for Windows path separators

Use filepath.Join in test assertions instead of hardcoded POSIX paths
so the test passes on Windows where filepath.Join uses backslashes.
2026-04-01 16:19:34 +02:00
Bethuel Mmbaga
e5914e4e8b [management,client] Remove client secret from gRPC auth flow (#5751)
Remove client secret from gRPC auth flow. The secret was originally included to support providers like Google Workspace that don't offer a proper PKCE flow, but this is no longer necessary with the embedded IdP. Deployments using such providers should migrate to the embedded IdP instead.
2026-03-31 18:50:49 +03:00
Viktor Liu
6553ce4cea [client] Mock management client in TestUpdateOldManagementURL to fix CI flakiness (#5703) 2026-03-31 10:49:06 +02:00
Viktor Liu
a62d472bc4 [client] Include fake IP block routes in Android TUN rebuilds (#5739) 2026-03-31 10:36:27 +02:00
Zoltan Papp
c522506849 [client] Add Expose support to embed library (#5695)
* [client] Add Expose support to embed library

Add ability to expose local services via the NetBird reverse proxy
from embedded client code.

Introduce ExposeSession with a blocking Wait method that keeps
the session alive until the context is cancelled.

Extract ProtocolType with ParseProtocolType into the expose package
and use it across CLI and embed layers.

* Fix TestNewRequest assertion to use ProtocolType instead of int

* Add documentation for Request and KeepAlive in expose manager

* Refactor ExposeSession to pass context explicitly in Wait method

* Refactor ExposeSession Wait method to explicitly pass context

* Update client/embed/expose.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Fix build

* Update client/embed/expose.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: Viktor Liu <viktor@netbird.io>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Viktor Liu <17948409+lixmal@users.noreply.github.com>
2026-03-30 15:53:50 +02:00
Viktor Liu
1cc19e7355 Merge branch 'client-ipv6-dns' into client-ipv6-ssh-netflow 2026-03-26 13:00:10 +01:00
Viktor Liu
1a6cf9dfec Merge branch 'client-ipv6-iface' into client-ipv6-dns
# Conflicts:
#	client/internal/dns/upstream_ios.go
2026-03-26 12:59:58 +01:00
Viktor Liu
6acc6a13f1 Merge branch 'proto-ipv6-overlay' into client-ipv6-iface 2026-03-26 12:56:48 +01:00
Viktor Liu
58eb519dbc Merge remote-tracking branch 'origin/main' into proto-ipv6-overlay 2026-03-26 12:56:35 +01:00
Viktor Liu
145d82f322 [client] Replace iOS DNS IsPrivate heuristic with route manager check (#5694) 2026-03-26 18:11:05 +08:00
Viktor Liu
bc6ed1a97f Merge branch 'client-ipv6-dns' into client-ipv6-ssh-netflow 2026-03-25 10:58:33 +01:00
Viktor Liu
50c0bc583b Fix connect.go lint: use SetIPv6FromCompact instead of if-else chain 2026-03-25 10:57:40 +01:00
Viktor Liu
baf2c03508 Fix CodeRabbit findings: hasIPv6Changed restart loop, empty peerIPs panic, v6 validation 2026-03-25 10:18:06 +01:00
Viktor Liu
5a29fa8432 Merge branch 'client-ipv6-dns' into client-ipv6-ssh-netflow 2026-03-25 10:07:46 +01:00
Viktor Liu
5fcea07181 Merge branch 'client-ipv6-iface' into client-ipv6-dns
# Conflicts:
#	client/iface/wgaddr/address.go
2026-03-25 10:07:43 +01:00
Viktor Liu
3be5a5f230 Fix CodeRabbit findings: hasIPv6Changed restart loop, empty peerIPs panic, v6 validation 2026-03-25 10:06:55 +01:00
Viktor Liu
d81cd5d154 Add IPv6 support to SSH server, client config, and netflow logger 2026-03-25 09:57:58 +01:00
Viktor Liu
71962f88f8 Add IPv6 reverse DNS and host configurator support 2026-03-25 09:57:26 +01:00
Viktor Liu
878dc45abf Fix govet non-constant format string in log.Warnf 2026-03-25 09:55:44 +01:00
Viktor Liu
1a7e835949 Fix CodeRabbit findings: hasIPv6Changed restart loop, empty peerIPs panic, v6 validation 2026-03-25 09:55:44 +01:00
Viktor Liu
b852ce1a99 Add IPv6 overlay address support to client interface and engine 2026-03-25 09:55:44 +01:00
Viktor Liu
acdf8d981a Merge branch 'main' into proto-ipv6-overlay 2026-03-22 18:34:17 +01:00
Zoltan Papp
91f0d5cefd [client] Feature/client metrics (#5512)
* Add client metrics

* Add client metrics system with OpenTelemetry and VictoriaMetrics support

Implements a comprehensive client metrics system to track peer connection
stages and performance. The system supports multiple backend implementations
(OpenTelemetry, VictoriaMetrics, and no-op) and tracks detailed connection
stage durations from creation through WireGuard handshake.

Key changes:
- Add metrics package with pluggable backend implementations
- Implement OpenTelemetry metrics backend
- Implement VictoriaMetrics metrics backend
- Add no-op metrics implementation for disabled state
- Track connection stages: creation, semaphore, signaling, connection ready, and WireGuard handshake
- Move WireGuard watcher functionality to conn.go
- Refactor engine to integrate metrics tracking
- Add metrics export endpoint in debug server

* Add signaling metrics tracking for initial and reconnection attempts

* Reset connection stage timestamps during reconnections to exclude unnecessary metrics tracking

* Delete otel lib from client

* Update unit tests

* Invoke callback on handshake success in WireGuard watcher

* Add Netbird version tracking to client metrics

Integrate Netbird version into VictoriaMetrics backend and metrics labels. Update `ClientMetrics` constructor and metric name formatting to include version information.

* Add sync duration tracking to client metrics

Introduce `RecordSyncDuration` for measuring sync message processing time. Update all metrics implementations (VictoriaMetrics, no-op) to support the new method. Refactor `ClientMetrics` to use `AgentInfo` for static agent data.

* Remove no-op metrics implementation and simplify ClientMetrics constructor

Eliminate unused `noopMetrics` and refactor `ClientMetrics` to always use the VictoriaMetrics implementation. Update associated logic to reflect these changes.

* Add total duration tracking for connection attempts

Calculate total duration for both initial connections and reconnections, accounting for different timestamp scenarios. Update `Export` method to include Prometheus HELP comments.

* Add metrics push support to VictoriaMetrics integration

* [client] anchor connection metrics to first signal received

* Remove creation_to_semaphore connection stage metric

The semaphore queuing stage (Created → SemaphoreAcquired) is no longer
tracked. Connection metrics now start from SignalingReceived. Updated
docs and Grafana dashboard accordingly.

* [client] Add remote push config for metrics with version-based eligibility

Introduce remoteconfig.Manager that fetches a remote JSON config to control
metrics push interval and restrict pushing to a specific agent version
range. When NB_METRICS_INTERVAL is set, remote config is bypassed
entirely for local override.

* [client] Add WASM-compatible NewClientMetrics implementation

Replace NewClientMetrics in metrics.go with a WASM-specific stub in metrics_js.go, returning nil for compatibility with JS builds. Simplify method usage for WASM targets.

* Add missing file

* Update default case in DeploymentType.String to return "unknown" instead of "selfhosted"

* [client] Rework metrics to use timestamped samples instead of histograms

Replace cumulative Prometheus histograms with timestamped point-in-time
samples that are pushed once and cleared. This fixes metrics for sparse
events (connections/syncs that happen once at startup) where rate() and
increase() produced incorrect or empty results.

Changes:
- Switch from VictoriaMetrics histogram library to raw Prometheus text
  format with explicit millisecond timestamps
- Reset samples after successful push (no resending stale data)
- Rename connection_to_handshake → connection_to_wg_handshake
- Add netbird_peer_connection_count metric for ICE vs Relay tracking
- Simplify dashboard: point-based scatter plots, donut pie chart
- Add maxStalenessInterval=1m to VictoriaMetrics to prevent forward-fill
- Fix deployment_type Unknown returning "selfhosted" instead of "unknown"
- Fix inverted shouldPush condition in push.go

* [client] Add InfluxDB metrics backend alongside VictoriaMetrics

Add influxdb.go with timestamped line protocol export for sparse
one-shot events. Restore victoria.go to use proper Prometheus
histograms. Update Grafana dashboards, add InfluxDB datasource,
and update docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [client] Fix metrics issues and update dev docker setup

- Fix StopPush not clearing push state, preventing restart
- Fix race condition reading currentConnPriority without lock in recordConnectionMetrics
- Fix stale comment referencing old metrics server URL
- Update docker-compose for InfluxDB: add scoped tokens, .env config, init scripts
- Rename docker-compose.victoria.yml to docker-compose.yml

* [client] Add anonymised peer tracking to pushed metrics

Introduce peer_id and connection_pair_id tags to InfluxDB metrics.
Public keys are hashed (truncated SHA-256) for anonymisation. The
connection pair ID is deterministic regardless of which side computes
it, enabling deduplication of reconnections in the ICE vs Relay
dashboard. Also pin Grafana to v11.6.0 for file-based provisioning
and fix datasource UID references.

* Remove unused dependencies from go.mod and go.sum

* Refactor InfluxDB ingest pipeline: extract validation logic

- Move line validation logic to `validateLine` and `validateField` helper functions.
- Improve error handling with structured validation and clearer separation of concerns.
- Add stderr redirection for error messages in `create-tokens.sh`.

* Set non-root user in Dockerfile for Ingest service

* Fix Windows CI: command line too long

* Remove Victoria metrics

* Add hashed peer ID as Authorization header in metrics push

* Revert influxdb in docker compose

* Enable gzip compression and authorization validation for metrics push and ingest

* Reducate code of complexity

* Update debug documentation to include metrics.txt description

* Increase `maxBodySize` limit to 50 MB and update gzip reader wrapping logic

* Refactor deployment type detection to use URL parsing for improved accuracy

* Update readme

* Throttle remote config retries on fetch failure

* Preserve first WG handshake timestamp, ignore rekeys

* Skip adding empty metrics.txt to debug bundle in debug mode

* Update default metrics server URL to https://ingest.netbird.io

* Atomic metrics export-and-reset to prevent sample loss between Export and Reset calls

* Fix doc

* Refactor Push configuration to improve clarity and enforce minimum push interval

* Remove `minPushInterval` and update push interval validation logic

* Revert ExportAndReset, it is acceptable data loss

* Fix metrics review issues: rename env var, remove stale infra, add tests

- Rename NB_METRICS_ENABLED to NB_METRICS_PUSH_ENABLED to clarify that
  collection is always active (for debug bundles) and only push is opt-in
- Change default config URL from staging to production (ingest.netbird.io)
- Delete broken Prometheus dashboard (used non-existent metric names)
- Delete unused VictoriaMetrics datasource config
- Replace committed .env with .env.example containing placeholder values
- Wire Grafana admin credentials through env vars in docker-compose
- Make metricsStages a pointer to prevent reset-vs-write race on reconnect
- Fix typed-nil interface in debug bundle path (GetClientMetrics)
- Use deterministic field order in InfluxDB Export (sorted keys)
- Replace Authorization header with X-Peer-ID for metrics push
- Fix ingest server timeout to use time.Second instead of float
- Fix gzip double-close, stale comments, trim log levels
- Add tests for influxdb.go and MetricsStages

* Add login duration metric, ingest tag validation, and duration bounds

- Add netbird_login measurement recording login/auth duration to management
  server, with success/failure result tag
- Validate InfluxDB tags against per-measurement allowlists in ingest server
  to prevent arbitrary tag injection
- Cap all duration fields (*_seconds) at 300s instead of only total_seconds
- Add ingest server tests for tag/field validation, bounds, and auth

* Add arch tag to all metrics

* Fix Grafana dashboard: add arch to drop columns, add login panels

* Validate NB_METRICS_SERVER_URL is an absolute HTTP(S) URL

* Address review comments: fix README wording, update stale comments

* Clarify env var precedence does not bypass remote config eligibility

* Remove accidentally committed pprof files

---------

Co-authored-by: Viktor Liu <viktor@netbird.io>
2026-03-22 12:45:41 +01:00
Viktor Liu
82762280ee [client] Add health check flag to status command and expose daemon status in output (#5650) 2026-03-22 12:39:40 +01:00
Viktor Liu
01c4d5761d Fix gosec and staticcheck lint errors from proto deprecation 2026-03-21 18:20:34 +01:00