Commit Graph

2747 Commits

Author SHA1 Message Date
Bethuel Mmbaga
9d1a37c644 [management,client] Revert gRPC client secret removal (#5781)
* This reverts commit e5914e4e8b

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

* Deprecate client secret in proto

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

* Fix lint

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

---------

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>
v0.67.3
2026-04-02 18:21:00 +02:00
Viktor Liu
5bf2372c4d [management] Fix L4 service creation deadlock on single-connection databases (#5779) 2026-04-02 14:46:14 +02:00
Bethuel Mmbaga
c2c6396a04 [management] Allow updating embedded IdP user name and email (#5721) 2026-04-02 13:02:10 +03:00
Misha Bragin
aaf813fc0c Add selfhosted scaling note (#5769) v0.67.2 2026-04-01 19:23:39 +02:00
Vlad
d97fe84296 [management] fix race condition in the setup flow that enables creation of multiple owner users (#5754) 2026-04-01 16:25:35 +02:00
tham-le
81f45dab21 [client] Support embed.Client on Android with netstack mode (#5623)
* [client] Support embed.Client on Android with netstack mode

embed.Client.Start() calls ConnectClient.Run() which passes an empty
MobileDependency{}. On Android, the engine dereferences nil fields
(IFaceDiscover, NetworkChangeListener, DnsReadyListener) causing panics.

Provide complete no-op stubs so the engine's existing Android code
paths work unchanged — zero modifications to engine.go:

- Add androidRunOverride hook in Run() for Android-specific dispatch
- Add runOnAndroidEmbed() with complete MobileDependency (all stubs)
- Wire default stubs via init() in connect_android_default.go:
  noopIFaceDiscover, noopNetworkChangeListener, noopDnsReadyListener
- Forward logPath to c.run()

Tested: embed.Client starts on Android arm64, joins mesh via relay,
discovers peers, localhost proxy works for TCP+UDP forwarding.

* [client] Fix TestServiceParamsPath for Windows path separators

Use filepath.Join in test assertions instead of hardcoded POSIX paths
so the test passes on Windows where filepath.Join uses backslashes.
2026-04-01 16:19:34 +02:00
Zoltan Papp
d670e7382a [client] Fix ipv6 address in quic server (#5763)
* [client] Use `net.JoinHostPort` for consistency in constructing host-port pairs

* [client] Fix handling of IPv6 addresses by trimming brackets in `net.JoinHostPort`
2026-04-01 15:11:23 +02:00
Pascal Fischer
cd8c686339 [misc] add path traversal and file size protections (#5755) 2026-04-01 14:23:24 +02:00
Pascal Fischer
f5c41e3018 [misc] set permissions on env file for getting started scripts (#5761) 2026-04-01 14:13:53 +02:00
Pascal Fischer
2477f99d89 [proxy] Add pprof (#5764) 2026-04-01 14:10:41 +02:00
shuuri-labs
940f530ac2 [management] Legacy to embedded IdP migration tool (#5586) 2026-04-01 13:53:19 +02:00
Zoltan Papp
4d3e2f8ad3 Fix path join (#5762) 2026-04-01 13:21:19 +02:00
Vlad
5ae986e1c4 [management] fix panic on management reboot (#5759) 2026-04-01 12:31:30 +02:00
Bethuel Mmbaga
e5914e4e8b [management,client] Remove client secret from gRPC auth flow (#5751)
Remove client secret from gRPC auth flow. The secret was originally included to support providers like Google Workspace that don't offer a proper PKCE flow, but this is no longer necessary with the embedded IdP. Deployments using such providers should migrate to the embedded IdP instead.
2026-03-31 18:50:49 +03:00
Pascal Fischer
c238f5425f [management] proper module permission validation for posture check delete (#5742) 2026-03-31 16:43:49 +02:00
Pascal Fischer
3c3097ea74 [management] add target user account validation (#5741) 2026-03-31 16:43:16 +02:00
Maycon Santos
405c3f4003 [management] Feature/fleetdm api spec (#5597)
add fleetdm api spec
2026-03-31 14:03:34 +02:00
Viktor Liu
6553ce4cea [client] Mock management client in TestUpdateOldManagementURL to fix CI flakiness (#5703) 2026-03-31 10:49:06 +02:00
Viktor Liu
a62d472bc4 [client] Include fake IP block routes in Android TUN rebuilds (#5739) 2026-03-31 10:36:27 +02:00
Eduard Gert
434ac7f0f5 [docs] Update CONTRIBUTOR_LICENSE_AGREEMENT.md (#5131) 2026-03-31 09:31:03 +02:00
Akshay Ubale
7bbe71c3ac [client] Refactor Android PeerInfo to use proper ConnStatus enum type (#5644)
* Simplify Android ConnStatus API with integer constants

Replace dual field PeerInfo design with unified integer based
ConnStatus field and exported gomobile friendly constants.

Changes:
> PeerInfo.ConnStatus: changed from string to int
> Export three constants: ConnStatusIdle, ConnStatusConnecting,ConnStatusConnected (mapped to peer.ConnStatus enum values)
> Updated PeersList() to convert peer enum directly to int

Benefits:
> Simpler API surface with single ConnStatus field
> Better gomobile compatibility for cross-platform usage
> Type-safe integer constants across language boundaries

* test: add All group to setupTestAccount fixture

The setupTestAccount() test helper was missing the required "All" group,
causing "failed to get group all: no group ALL found" errors during
test execution. Add the All group with all test peers to match the
expected account structure.

Fixes the failing account and types package tests when GetGroupAll()
is called in test scenarios.
2026-03-30 17:55:01 +02:00
Viktor Liu
04dcaadabf [client] Persist service install parameters across reinstalls (#5732) 2026-03-30 16:25:14 +02:00
Zoltan Papp
c522506849 [client] Add Expose support to embed library (#5695)
* [client] Add Expose support to embed library

Add ability to expose local services via the NetBird reverse proxy
from embedded client code.

Introduce ExposeSession with a blocking Wait method that keeps
the session alive until the context is cancelled.

Extract ProtocolType with ParseProtocolType into the expose package
and use it across CLI and embed layers.

* Fix TestNewRequest assertion to use ProtocolType instead of int

* Add documentation for Request and KeepAlive in expose manager

* Refactor ExposeSession to pass context explicitly in Wait method

* Refactor ExposeSession Wait method to explicitly pass context

* Update client/embed/expose.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Fix build

* Update client/embed/expose.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: Viktor Liu <viktor@netbird.io>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Viktor Liu <17948409+lixmal@users.noreply.github.com>
2026-03-30 15:53:50 +02:00
Viktor Liu
0765352c99 [management] Persist proxy capabilities to database (#5720) 2026-03-30 13:03:42 +02:00
tobsec
13807f1b3d [client] Fix Exit Node submenu separator accumulation on Windows (#5691)
* client/ui: fix Exit Node submenu separator accumulation on Windows

On Windows the tray uses a background poller (every 10s) instead of
TrayOpenedCh to keep the Exit Node menu fresh. Each poll that has a
selected exit node called s.mExitNode.AddSeparator() before the
"Deselect All" item. Because AddSeparator() returns no handle the
separator was never removed in the cleanup pass of
recreateExitNodeMenu(), while every other item (exit node checkboxes
and the "Deselect All" entry) was properly tracked and removed.

After the client has been running for a while with an exit node
selected this leaves hundreds of separator lines stacked in the
submenu, filling the screen height with blank entries (#4702).

On Linux/FreeBSD this is masked because the parent mExitNode item
itself is removed and recreated each cycle, wiping all children
including orphaned separators.

Fix: replace the untracked AddSeparator() call with a regular disabled
sub-menu item that is stored in mExitNodeSeparator and removed at the
start of each recreateExitNodeMenu() call alongside mExitNodeDeselectAll.

Fixes #4702

* client/ui: extract addExitNodeDeselectAll to reduce cognitive complexity

Move the separator + deselect-all creation and its goroutine listener
out of recreateExitNodeMenu into a dedicated helper, bringing the
function's cognitive complexity back under the SonarCloud threshold.
2026-03-30 10:41:38 +02:00
Bethuel Mmbaga
c919ea149e [misc] Add missing OpenAPI definitions (#5690) 2026-03-30 11:20:17 +03:00
Pascal Fischer
be6fd119d8 [management] no events for temporary peers (#5719) 2026-03-30 10:08:02 +02:00
Pascal Fischer
7abf730d77 [management] update to latest grpc version (#5716) 2026-03-27 15:22:23 +01:00
Pascal Fischer
ec96c5ecaf [management] Extend blackbox tests (#5699) 2026-03-26 16:59:49 +01:00
Pascal Fischer
7e1cce4b9f [management] add terminated field to service (#5700) 2026-03-26 16:59:08 +01:00
Bethuel Mmbaga
7be8752a00 [management] Add notification endpoints (#5590) 2026-03-26 18:26:33 +03:00
Viktor Liu
145d82f322 [client] Replace iOS DNS IsPrivate heuristic with route manager check (#5694) v0.67.1 2026-03-26 18:11:05 +08:00
Viktor Liu
a8b9570700 [client] Enable RPM package signature verification in install script (#5676) 2026-03-26 09:50:43 +01:00
Viktor Liu
6ff6d84646 [client] Bump go-m1cpu to v0.2.1 to fix segfault on macOS 26 / M5 chips (#5701) 2026-03-26 09:49:02 +01:00
Viktor Liu
9aaa05e8ea Replace discontinued LocalStack image with MinIO in S3 test (#5680) 2026-03-25 15:51:29 +08:00
Bethuel Mmbaga
0af5a0441f [management] Fix DNS label uniqueness check on peer rename (#5679) 2026-03-24 20:25:29 +03:00
Viktor Liu
0fc63ea0ba [management] Allow multiple header auths with same header name (#5678) 2026-03-24 16:18:21 +01:00
Bethuel Mmbaga
0b329f7881 [management] Replace JumpCloud SDK with direct HTTP calls (#5591) 2026-03-24 13:21:42 +03:00
Viktor Liu
5b85edb753 [management] Omit proxy_protocol from API response when false (#5656)
The internal Target model uses a plain bool for ProxyProtocol,
which was always serialized to the API response as false even
when not configured. Only set the API field when true so it
gets omitted via omitempty when unset.
2026-03-23 17:53:17 +01:00
Maycon Santos
17cfa5fe1e [misc] Set signing env only if not fork and set license (#5659)
* Add condition to GPG key decoding to handle pull requests

* Add license field to deb and rpm package configurations

* Add condition to GPG key decoding for external pull requests
2026-03-23 17:16:23 +01:00
Viktor Liu
2313494e0e [client] Don't abort debug for command when up/down fails (#5657) 2026-03-23 14:04:03 +01:00
Viktor Liu
fd9d430334 [client] Simplify entrypoint by running netbird up unconditionally (#5652) v0.67.0 2026-03-23 09:39:32 +01:00
Zoltan Papp
91f0d5cefd [client] Feature/client metrics (#5512)
* Add client metrics

* Add client metrics system with OpenTelemetry and VictoriaMetrics support

Implements a comprehensive client metrics system to track peer connection
stages and performance. The system supports multiple backend implementations
(OpenTelemetry, VictoriaMetrics, and no-op) and tracks detailed connection
stage durations from creation through WireGuard handshake.

Key changes:
- Add metrics package with pluggable backend implementations
- Implement OpenTelemetry metrics backend
- Implement VictoriaMetrics metrics backend
- Add no-op metrics implementation for disabled state
- Track connection stages: creation, semaphore, signaling, connection ready, and WireGuard handshake
- Move WireGuard watcher functionality to conn.go
- Refactor engine to integrate metrics tracking
- Add metrics export endpoint in debug server

* Add signaling metrics tracking for initial and reconnection attempts

* Reset connection stage timestamps during reconnections to exclude unnecessary metrics tracking

* Delete otel lib from client

* Update unit tests

* Invoke callback on handshake success in WireGuard watcher

* Add Netbird version tracking to client metrics

Integrate Netbird version into VictoriaMetrics backend and metrics labels. Update `ClientMetrics` constructor and metric name formatting to include version information.

* Add sync duration tracking to client metrics

Introduce `RecordSyncDuration` for measuring sync message processing time. Update all metrics implementations (VictoriaMetrics, no-op) to support the new method. Refactor `ClientMetrics` to use `AgentInfo` for static agent data.

* Remove no-op metrics implementation and simplify ClientMetrics constructor

Eliminate unused `noopMetrics` and refactor `ClientMetrics` to always use the VictoriaMetrics implementation. Update associated logic to reflect these changes.

* Add total duration tracking for connection attempts

Calculate total duration for both initial connections and reconnections, accounting for different timestamp scenarios. Update `Export` method to include Prometheus HELP comments.

* Add metrics push support to VictoriaMetrics integration

* [client] anchor connection metrics to first signal received

* Remove creation_to_semaphore connection stage metric

The semaphore queuing stage (Created → SemaphoreAcquired) is no longer
tracked. Connection metrics now start from SignalingReceived. Updated
docs and Grafana dashboard accordingly.

* [client] Add remote push config for metrics with version-based eligibility

Introduce remoteconfig.Manager that fetches a remote JSON config to control
metrics push interval and restrict pushing to a specific agent version
range. When NB_METRICS_INTERVAL is set, remote config is bypassed
entirely for local override.

* [client] Add WASM-compatible NewClientMetrics implementation

Replace NewClientMetrics in metrics.go with a WASM-specific stub in metrics_js.go, returning nil for compatibility with JS builds. Simplify method usage for WASM targets.

* Add missing file

* Update default case in DeploymentType.String to return "unknown" instead of "selfhosted"

* [client] Rework metrics to use timestamped samples instead of histograms

Replace cumulative Prometheus histograms with timestamped point-in-time
samples that are pushed once and cleared. This fixes metrics for sparse
events (connections/syncs that happen once at startup) where rate() and
increase() produced incorrect or empty results.

Changes:
- Switch from VictoriaMetrics histogram library to raw Prometheus text
  format with explicit millisecond timestamps
- Reset samples after successful push (no resending stale data)
- Rename connection_to_handshake → connection_to_wg_handshake
- Add netbird_peer_connection_count metric for ICE vs Relay tracking
- Simplify dashboard: point-based scatter plots, donut pie chart
- Add maxStalenessInterval=1m to VictoriaMetrics to prevent forward-fill
- Fix deployment_type Unknown returning "selfhosted" instead of "unknown"
- Fix inverted shouldPush condition in push.go

* [client] Add InfluxDB metrics backend alongside VictoriaMetrics

Add influxdb.go with timestamped line protocol export for sparse
one-shot events. Restore victoria.go to use proper Prometheus
histograms. Update Grafana dashboards, add InfluxDB datasource,
and update docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* [client] Fix metrics issues and update dev docker setup

- Fix StopPush not clearing push state, preventing restart
- Fix race condition reading currentConnPriority without lock in recordConnectionMetrics
- Fix stale comment referencing old metrics server URL
- Update docker-compose for InfluxDB: add scoped tokens, .env config, init scripts
- Rename docker-compose.victoria.yml to docker-compose.yml

* [client] Add anonymised peer tracking to pushed metrics

Introduce peer_id and connection_pair_id tags to InfluxDB metrics.
Public keys are hashed (truncated SHA-256) for anonymisation. The
connection pair ID is deterministic regardless of which side computes
it, enabling deduplication of reconnections in the ICE vs Relay
dashboard. Also pin Grafana to v11.6.0 for file-based provisioning
and fix datasource UID references.

* Remove unused dependencies from go.mod and go.sum

* Refactor InfluxDB ingest pipeline: extract validation logic

- Move line validation logic to `validateLine` and `validateField` helper functions.
- Improve error handling with structured validation and clearer separation of concerns.
- Add stderr redirection for error messages in `create-tokens.sh`.

* Set non-root user in Dockerfile for Ingest service

* Fix Windows CI: command line too long

* Remove Victoria metrics

* Add hashed peer ID as Authorization header in metrics push

* Revert influxdb in docker compose

* Enable gzip compression and authorization validation for metrics push and ingest

* Reducate code of complexity

* Update debug documentation to include metrics.txt description

* Increase `maxBodySize` limit to 50 MB and update gzip reader wrapping logic

* Refactor deployment type detection to use URL parsing for improved accuracy

* Update readme

* Throttle remote config retries on fetch failure

* Preserve first WG handshake timestamp, ignore rekeys

* Skip adding empty metrics.txt to debug bundle in debug mode

* Update default metrics server URL to https://ingest.netbird.io

* Atomic metrics export-and-reset to prevent sample loss between Export and Reset calls

* Fix doc

* Refactor Push configuration to improve clarity and enforce minimum push interval

* Remove `minPushInterval` and update push interval validation logic

* Revert ExportAndReset, it is acceptable data loss

* Fix metrics review issues: rename env var, remove stale infra, add tests

- Rename NB_METRICS_ENABLED to NB_METRICS_PUSH_ENABLED to clarify that
  collection is always active (for debug bundles) and only push is opt-in
- Change default config URL from staging to production (ingest.netbird.io)
- Delete broken Prometheus dashboard (used non-existent metric names)
- Delete unused VictoriaMetrics datasource config
- Replace committed .env with .env.example containing placeholder values
- Wire Grafana admin credentials through env vars in docker-compose
- Make metricsStages a pointer to prevent reset-vs-write race on reconnect
- Fix typed-nil interface in debug bundle path (GetClientMetrics)
- Use deterministic field order in InfluxDB Export (sorted keys)
- Replace Authorization header with X-Peer-ID for metrics push
- Fix ingest server timeout to use time.Second instead of float
- Fix gzip double-close, stale comments, trim log levels
- Add tests for influxdb.go and MetricsStages

* Add login duration metric, ingest tag validation, and duration bounds

- Add netbird_login measurement recording login/auth duration to management
  server, with success/failure result tag
- Validate InfluxDB tags against per-measurement allowlists in ingest server
  to prevent arbitrary tag injection
- Cap all duration fields (*_seconds) at 300s instead of only total_seconds
- Add ingest server tests for tag/field validation, bounds, and auth

* Add arch tag to all metrics

* Fix Grafana dashboard: add arch to drop columns, add login panels

* Validate NB_METRICS_SERVER_URL is an absolute HTTP(S) URL

* Address review comments: fix README wording, update stale comments

* Clarify env var precedence does not bypass remote config eligibility

* Remove accidentally committed pprof files

---------

Co-authored-by: Viktor Liu <viktor@netbird.io>
2026-03-22 12:45:41 +01:00
Viktor Liu
82762280ee [client] Add health check flag to status command and expose daemon status in output (#5650) 2026-03-22 12:39:40 +01:00
Viktor Liu
b550a2face [management, proxy] Add require_subdomain capability for proxy clusters (#5628) 2026-03-20 11:29:50 +01:00
Viktor Liu
ab77508950 [client] Add env var for management gRPC max receive message size (#5622) 2026-03-19 17:33:50 +01:00
Viktor Liu
b9462f5c6b [client] Make raw table initialization non-fatal in firewall managers (#5621) 2026-03-19 17:33:38 +01:00
Viktor Liu
5ffaa5cdd6 [client] Fix duplicate log lines in containers (#5609) 2026-03-19 15:53:05 +01:00
Pascal Fischer
a1858a9cb7 [management] recover proxies after cleanup if heartbeat is still running (#5617) 2026-03-18 11:48:38 +01:00
Viktor Liu
212b34f639 [management] Add GET /reverse-proxies/clusters endpoint (#5611) 2026-03-18 11:15:56 +08:00