netbird

mirror of https://github.com/netbirdio/netbird.git synced 2026-04-16 07:16:38 +00:00


Zoltan Papp 13539543af [client] Fix/grpc retry (#5750 )

* [client] Fix flow client Receive retry loop not stopping after Close

Use backoff.Permanent for canceled gRPC errors so Receive returns
immediately instead of retrying until context deadline when the
connection is already closed. Add TestNewClient_PermanentClose to
verify the behavior.

The connectivity.Shutdown check was meaningless because when the connection is
shut down, c.realClient.Events(ctx, grpc.WaitForReady(true)) on the next line
already fails with codes.Canceled — which is now handled as a permanent error.
The explicit state check was just duplicating what gRPC already reports
through its normal error path.
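
The permanent-error idea can be sketched with the standard library alone. This is a minimal stand-in, not the flow client's code: `permanentError`, `permanent`, `retry`, and `openStream` are hypothetical names, `context.Canceled` stands in for a gRPC status with `codes.Canceled`, and the real client relies on `backoff.Permanent` rather than a hand-written wrapper.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// permanentError marks an error that the retry loop must not retry.
// Hypothetical stand-in for the wrapper produced by backoff.Permanent.
type permanentError struct{ err error }

func (p *permanentError) Error() string { return p.err.Error() }
func (p *permanentError) Unwrap() error { return p.err }

// permanent wraps err so that retry stops immediately.
func permanent(err error) error { return &permanentError{err: err} }

// retry calls op until it succeeds, returns a permanent error, or
// maxAttempts is reached. It returns the number of attempts made.
func retry(maxAttempts int, op func() error) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil {
			return attempt, nil
		}
		var p *permanentError
		if errors.As(err, &p) {
			return attempt, p.err // stop: retrying cannot help
		}
	}
	return maxAttempts, err
}

// openStream simulates opening a stream on a connection that is already closed.
func openStream() error {
	err := context.Canceled // assumption: stands in for a codes.Canceled status
	if errors.Is(err, context.Canceled) {
		return permanent(err)
	}
	return err
}

func main() {
	attempts, err := retry(5, openStream)
	fmt.Println(attempts, err) // prints: 1 context canceled
}
```

Without the `permanent` wrapper the loop would use all five attempts before giving up, which mirrors the "retrying until context deadline" behaviour the commit removes.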

* [client] remove WaitForReady from stream open call

grpc.WaitForReady(true) parks the RPC call internally until the
connection reaches READY, only unblocking on ctx cancellation.
This means the external backoff.Retry loop in Receive() never gets
control back during a connection outage — it cannot tick, log, or
apply its retry intervals while WaitForReady is blocking.

Removing it restores fail-fast behaviour: Events() returns immediately
with codes.Unavailable when the connection is not ready, which is
exactly what the backoff loop expects. The backoff becomes the single
authority over retry timing and cadence, as originally intended.

* [client] Add connection recreation and improve flow client error handling

Store gRPC dial options on the client to enable connection recreation
on Internal errors (RST_STREAM/PROTOCOL_ERROR). Treat Unauthenticated,
PermissionDenied, and Unimplemented as permanent failures. Unify mutex
usage and add reconnection logging for better observability.

* [client] Remove Unauthenticated, PermissionDenied, and Unimplemented from permanent error handling

* [client] Fix error handling in Receive to properly re-establish stream and improve reconnection messaging

* Fix test

* [client] Add graceful shutdown handling and test for concurrent Close during Receive

Prevent reconnection attempts after client closure by tracking a `closed` flag. Use `backoff.Permanent` for errors caused by operations on a closed client. Add a test to ensure `Close` does not block when `Receive` is actively running.
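
A minimal sketch of the `closed` flag, assuming a mutex-protected boolean. The `client` type, `reconnect` method, and `errClientClosed` sentinel are hypothetical names chosen for illustration; the real client has more state and would wrap the sentinel with `backoff.Permanent`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClientClosed is a hypothetical sentinel returned after Close.
var errClientClosed = errors.New("client is closed")

// client is a stripped-down stand-in for the flow client.
type client struct {
	mu     sync.Mutex
	closed bool
}

// Close marks the client as closed. It is safe to call more than once.
func (c *client) Close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.closed = true
}

// reconnect refuses to create a new connection once Close has been called.
// A caller would treat errClientClosed as permanent and stop retrying.
func (c *client) reconnect() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		return errClientClosed
	}
	// A real client would dial a new connection here.
	return nil
}

func main() {
	c := &client{}
	fmt.Println(c.reconnect()) // prints: <nil>
	c.Close()
	fmt.Println(c.reconnect()) // prints: client is closed
}
```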

* [client] Fix connection swap to properly close old gRPC connection

Close the old `grpc.ClientConn` after successfully swapping to a new connection during reconnection.

* [client] Reset backoff

* [client] Ensure stream closure on error during initialization

* [client] Add test for handling server-side stream closure and reconnection

Introduce `TestReceive_ServerClosesStream` to verify the client's ability to recover and process acknowledgments after the server closes the stream. Enhance test server with a controlled stream closure mechanism.

* [client] Add protocol error simulation and enhance reconnection test

Introduce `connTrackListener` to simulate HTTP/2 RST_STREAM with PROTOCOL_ERROR for testing. Refactor and rename `TestReceive_ServerClosesStream` to `TestReceive_ProtocolErrorStreamReconnect` to verify client recovery on protocol errors.

* [client] Update Close error message in test for clarity

* [client] Fine-tune the tests

* [client] Adjust connection tracking in reconnection test

* [client] Wait for Events handler to exit in RST_STREAM reconnection test

Ensure the old `Events` handler exits fully before proceeding in the reconnection test to avoid dropped acknowledgments on a broken stream. Add a `handlerDone` channel to synchronize handler exits.

* [client] Prevent panic on nil connection during Close

* [client] Refactor connection handling to use explicit target tracking

Introduce `target` field to store the gRPC connection target directly, simplifying reconnections and ensuring consistent connection reuse logic.

* [client] Rename `isCancellation` to `isContextDone` and extend handling for `DeadlineExceeded`

Refactor error handling to include `DeadlineExceeded` scenarios alongside `Canceled`. Update related condition checks for consistency.
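
A helper of this kind might look like the sketch below. Only the name `isContextDone` comes from the commit message; the body is an assumption that checks the standard context errors, and the real helper may also inspect gRPC status codes. `errWrappedDeadline` and `errTransport` are sample values for the demonstration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Sample errors used only for the demonstration below.
var (
	errWrappedDeadline = fmt.Errorf("receive: %w", context.DeadlineExceeded)
	errTransport       = errors.New("transport is closing")
)

// isContextDone reports whether err comes from a canceled or expired context.
// errors.Is also matches errors that wrap the context errors.
func isContextDone(err error) bool {
	return errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(isContextDone(context.Canceled))   // prints: true
	fmt.Println(isContextDone(errWrappedDeadline)) // prints: true
	fmt.Println(isContextDone(errTransport))       // prints: false
}
```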

* [client] Add connection generation tracking to prevent stale reconnections

Introduce `connGen` to track connection generations and ensure that stale `recreateConnection` calls do not override newer connections. Update stream establishment and reconnection logic to incorporate generation validation.

* [client] Add backoff reset condition to prevent short-lived retry cycles

Refine backoff reset logic to ensure it only occurs for sufficiently long-lived stream connections, avoiding interference with `MaxElapsedTime`.

* [client] Introduce `minHealthyDuration` to refine backoff reset logic

Add `minHealthyDuration` constant to ensure stream retries only reset the backoff timer if the stream survives beyond a minimum duration. Prevents unhealthy, short-lived streams from interfering with `MaxElapsedTime`.
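
The reset condition can be sketched as a single comparison. The constant name comes from the commit message, but its value here is illustrative (the message does not state it), and `shouldResetBackoff` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// minHealthyDuration is the shortest lifetime for a stream to count as healthy.
// Assumption: 5s is a placeholder, not the value used by the flow client.
const minHealthyDuration = 5 * time.Second

// shouldResetBackoff reports whether a stream lived long enough to justify
// resetting the backoff. Whether the boundary is inclusive is an assumption.
func shouldResetBackoff(streamLifetime time.Duration) bool {
	return streamLifetime >= minHealthyDuration
}

func main() {
	fmt.Println(shouldResetBackoff(200 * time.Millisecond)) // prints: false
	fmt.Println(shouldResetBackoff(2 * time.Minute))        // prints: true
}
```

A stream that fails after 200 ms leaves the backoff untouched, so repeated short-lived streams still accumulate elapsed time toward `MaxElapsedTime`.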

* [client] IPv6 friendly connection

parsedURL.Hostname() strips IPv6 brackets. For http://[::1]:443, this turns it into ::1:443, which is not a valid host:port target for gRPC. Additionally, fmt.Sprintf("%s:%s", hostname, port) produces a trailing colon when the URL has no explicit port—http://example.com becomes example.com:. Both cases break the initial dial and reconnect paths. Use parsedURL.Host directly instead.
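
The two failure cases can be reproduced with `net/url` alone. In this sketch `naiveTarget` and `hostTarget` are hypothetical helper names; the first rebuilds the target from `Hostname()` and `Port()` as described above, the second returns `Host` unchanged.

```go
package main

import (
	"fmt"
	"net/url"
)

// naiveTarget rebuilds host:port from Hostname and Port.
// Hostname drops IPv6 brackets and Port is empty when the URL has no port.
func naiveTarget(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s:%s", u.Hostname(), u.Port()), nil
}

// hostTarget returns u.Host unchanged, which keeps IPv6 brackets and
// does not append a colon when no port is present.
func hostTarget(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.Host, nil
}

func main() {
	for _, raw := range []string{"http://[::1]:443", "http://example.com"} {
		naive, _ := naiveTarget(raw)
		host, _ := hostTarget(raw)
		fmt.Printf("url=%q naive=%q host=%q\n", raw, naive, host)
	}
}
```

For `http://[::1]:443` the naive form yields `::1:443` while `Host` yields `[::1]:443`; for `http://example.com` the naive form yields `example.com:` while `Host` yields `example.com`.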

* [client] Add `handlerStarted` channel to synchronize stream establishment in tests

Introduce `handlerStarted` channel in the test server to signal when the server-side handler begins, ensuring robust synchronization between client and server during stream establishment. Update relevant test cases to wait for this signal before proceeding.

* [client] Replace `receivedAcks` map with atomic counter and improve stream establishment sync in tests

Refactor acknowledgment tracking in tests to use an `atomic.Int32` counter instead of a map. Replace fixed sleep with robust synchronization by waiting on `handlerStarted` signal for stream establishment.
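
The synchronization pattern can be sketched without gRPC. Everything here is a hypothetical in-process stand-in: `testServer`, `handle`, and `run` are illustrative names, and a plain channel of events replaces the real stream. Only `handlerStarted` and the use of `atomic.Int32` come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// testServer is a stand-in for the gRPC test server.
type testServer struct {
	handlerStarted chan struct{} // signaled when the handler begins
	acks           atomic.Int32  // number of acknowledgments received
}

// handle simulates a server-side stream handler that counts acks.
func (s *testServer) handle(events <-chan struct{}) {
	s.handlerStarted <- struct{}{}
	for range events {
		s.acks.Add(1)
	}
}

// run starts the handler, waits for it to begin, sends n acks,
// and returns the number the handler counted.
func run(n int) int32 {
	srv := &testServer{handlerStarted: make(chan struct{}, 1)}
	events := make(chan struct{})

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		srv.handle(events)
	}()

	<-srv.handlerStarted // takes the place of a fixed time.Sleep
	for i := 0; i < n; i++ {
		events <- struct{}{}
	}
	close(events)
	wg.Wait() // the handler has exited, so the counter is final
	return srv.acks.Load()
}

func main() {
	fmt.Println(run(3)) // prints: 3
}
```

Waiting on a signal instead of sleeping removes the timing guess from the test, and the atomic counter avoids the locking a shared map would need.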

* [client] Extract `handleReceiveError` to simplify receive logic

Refactor error handling in `receive` to a dedicated `handleReceiveError` method. Streamlines the main logic and isolates error recovery, including backoff reset and connection recreation.

* [client] recreate gRPC ClientConn on every retry to prevent dual backoff

The flow client had two competing retry loops: our custom exponential
backoff and gRPC's internal subchannel reconnection. When establishStream
failed, the same ClientConn was reused, allowing gRPC's internal backoff
state to accumulate and control dial timing independently.

Changes:
- Consolidate error handling into handleRetryableError, which now
handles context cancellation, permanent errors, backoff reset,
and connection recreation in a single path
- Call recreateConnection on every retryable error so each retry
gets a fresh ClientConn with no internal backoff state
- Remove connGen tracking since Receive is sequential and protected
by a new receiving guard against concurrent calls
- Reduce RandomizationFactor from 1 to 0.5 to avoid near-zero
backoff intervals
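
The effect of the `RandomizationFactor` change can be seen by computing the jitter range. This sketch assumes the common exponential-backoff convention in which the randomized interval is drawn from `[base - factor*base, base + factor*base]`; `jitterBounds` is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"time"
)

// jitterBounds returns the smallest and largest interval that a backoff with
// the given randomization factor can pick around a base interval.
func jitterBounds(base time.Duration, factor float64) (lo, hi time.Duration) {
	delta := time.Duration(factor * float64(base))
	return base - delta, base + delta
}

func main() {
	base := time.Second
	for _, f := range []float64{1, 0.5} {
		lo, hi := jitterBounds(base, f)
		fmt.Printf("factor=%.1f range=[%v, %v]\n", f, lo, hi)
	}
	// prints:
	// factor=1.0 range=[0s, 2s]
	// factor=0.5 range=[500ms, 1.5s]
}
```

With a factor of 1 the lower bound is zero, so a retry can fire almost immediately; with 0.5 every retry waits at least half of the base interval.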

2026-04-13 10:42:24 +02:00

.devcontainer

Revert "Revert "[relay] Update GO version and QUIC version (#4736 )" (#5055 )" (#5071 )

2026-01-08 18:58:22 +01:00

.githooks

[ci] Add local lint setup with pre-push hook to catch issues early (#4925 )

2025-12-15 10:34:48 +01:00

.github

[management] Legacy to embedded IdP migration tool (#5586 )

2026-04-01 13:53:19 +02:00

base62

Update GitHub Actions and Enhance golangci-lint (#1075 )

2023-09-04 17:03:44 +02:00

client

Fix Android internet blackhole caused by stale route re-injection on TUN rebuild (#5865 )

2026-04-13 09:38:38 +02:00

combined

[management] enable access log cleanup by default (#5842 )

2026-04-10 17:07:27 +02:00

dns

[client] Fall through dns chain for custom dns zones (#5081 )

2026-01-12 13:56:39 +01:00

docs/media

Update README.md (#524 )

2022-10-22 16:19:16 +02:00

encryption

[client,signal,management] Add browser client support (#4415 )

2025-10-01 20:10:11 +02:00

flow

[client] Fix/grpc retry (#5750 )

2026-04-13 10:42:24 +02:00

formatter

[misc] Update timestamp format with milliseconds (#5387 )

2026-02-19 11:23:42 +01:00

idp

[management] Legacy to embedded IdP migration tool (#5586 )

2026-04-01 13:53:19 +02:00

infrastructure_files

[misc] update dashboards (#5840 )

2026-04-10 12:15:58 +02:00

LICENSES

Dual license: apply AGPL‑3.0 to management/, signal/, and relay directories (BSD‑3 remains for the rest)

2025-08-05 11:37:21 +02:00

management

[management] add domain and service cleanup migration (#5850 )

2026-04-11 12:00:40 +02:00

monotime

[client] Fix elapsed time calculation when machine is in sleep mode (#4140 )

2025-07-12 11:10:45 +02:00

proxy

[proxy] Update proxy web packages (#5661 )

2026-04-07 10:35:09 +02:00

relay

[relay] Replace net.Conn with context-aware Conn interface (#5770 )

2026-04-08 09:38:31 +02:00

release_files

[client] Enable RPM package signature verification in install script (#5676 )

2026-03-26 09:50:43 +01:00

route

[management] incremental network map builder (#4753 )

2025-11-07 10:44:46 +01:00

shared

[client] Update RaceDial to accept context for improved cancellation handling (#5849 )

2026-04-10 20:51:04 +02:00

sharedsock

[client] Add IPv6 support to usersace bind (#5147 )

2026-01-22 10:20:43 +08:00

signal

[self-hosted] add netbird server (#5232 )

2026-02-12 19:24:43 +01:00

stun

[self-hosted] add netbird server (#5232 )

2026-02-12 19:24:43 +01:00

tools/idp-migrate

[management] Legacy to embedded IdP migration tool (#5586 )

2026-04-01 13:53:19 +02:00

upload-server

[misc] add path traversal and file size protections (#5755 )

2026-04-01 14:23:24 +02:00

util

[client] Fix duplicate log lines in containers (#5609 )

2026-03-19 15:53:05 +01:00

version

[client, management] auto-update (#4732 )

2025-12-19 19:57:39 +01:00

.dockerignore

[management, reverse proxy] Add reverse proxy feature (#5291 )

2026-02-13 19:37:43 +01:00

.dockerignore-client

client: container: support CLI with entrypoint addition (#4126 )

2025-07-25 11:44:30 +02:00

.editorconfig

Fix syslog output containing duplicated timestamps (#2292 )

2024-08-01 18:22:02 +02:00

.git-branches.toml

add git town config (#3555 )

2025-04-09 20:18:52 +01:00

.gitattributes

Run linter action on MacOS and Windows (#1198 )

2023-10-07 21:45:46 +02:00

.gitignore

[management, reverse proxy] Add reverse proxy feature (#5291 )

2026-02-13 19:37:43 +01:00

.gitmodules

[client,signal,management] Add browser client support (#4415 )

2025-10-01 20:10:11 +02:00

.golangci.yaml

Revert "Revert "[relay] Update GO version and QUIC version (#4736 )" (#5055 )" (#5071 )

2026-01-08 18:58:22 +01:00

.goreleaser_ui_darwin.yaml

[client] Add universal bin build and update sign workflow version (#2738 )

2024-10-15 15:03:17 +02:00

.goreleaser_ui.yaml

[misc] Add GPG signing key support for rpm packages (#5581 )

2026-03-13 09:47:00 +01:00

.goreleaser.yaml

[management] Legacy to embedded IdP migration tool (#5586 )

2026-04-01 13:53:19 +02:00

AUTHORS

[misc, client, management] Replace Wiretrustee with Netbird (#3267 )

2025-02-05 16:49:41 +01:00

CODE_OF_CONDUCT.md

Update CODE_OF_CONDUCT.md (#2048 )

2024-05-24 17:29:14 +02:00

CONTRIBUTING.md

[ci] Add local lint setup with pre-push hook to catch issues early (#4925 )

2025-12-15 10:34:48 +01:00

CONTRIBUTOR_LICENSE_AGREEMENT.md

[docs] Update CONTRIBUTOR_LICENSE_AGREEMENT.md (#5131 )

2026-03-31 09:31:03 +02:00

funding.json

Create funding.json (#2813 )

2024-10-30 17:18:27 +01:00

go.mod

[client] Add NAT-PMP/UPnP support (#5202 )

2026-04-08 15:29:32 +08:00

go.sum

[client] Add NAT-PMP/UPnP support (#5202 )

2026-04-08 15:29:32 +08:00

LICENSE

[management, reverse proxy] Add reverse proxy feature (#5291 )

2026-02-13 19:37:43 +01:00

Makefile

[ci] Add local lint setup with pre-push hook to catch issues early (#4925 )

2025-12-15 10:34:48 +01:00

README.md

[misc] Add netbird-tui to community projects (#5568 )

2026-03-17 05:33:13 +01:00

SECURITY.md

Add security policy file (#600 )

2022-12-02 13:54:22 +01:00

versioninfo.json

Add release version to windows binaries and update sign pipeline version (#2256 )

2024-07-11 19:06:55 +02:00

README.md

Start using NetBird at netbird.io
See Documentation
Join our Slack channel or our Community forum

🚀 We are hiring! Join us at careers.netbird.io

New: NetBird terraform provider

NetBird combines a configuration-free peer-to-peer private network and a centralized access control system in a single platform, making it easy to create secure private networks for your organization or home.

Connect. NetBird creates a WireGuard-based overlay network that automatically connects your machines over an encrypted tunnel, leaving behind the hassle of opening ports, complex firewall rules, VPN gateways, and so forth.

Secure. NetBird enables secure remote access by applying granular access policies while allowing you to manage them intuitively from a single place. Works universally on any infrastructure.

Open Source Network Security in a Single Platform

https://github.com/user-attachments/assets/10cec749-bb56-4ab3-97af-4e38850108d2

Self-Host NetBird (Video)

Key features

| Connectivity | Management | Security | Automation | Platforms |
|---|---|---|---|---|
| Kernel WireGuard | Admin Web UI | SSO & MFA support | Public API | Linux |
| Peer-to-peer connections | Auto peer discovery and configuration | Access control - groups & rules | Setup keys for bulk network provisioning | Mac |
| Connection relay fallback | IdP integrations | Activity logging | Self-hosting quickstart script | Windows |
| Routes to external networks | Private DNS | Device posture checks | IdP groups sync with JWT | Android |
| NAT traversal with BPF | Multiuser support | Peer-to-peer encryption | | iOS |
| | | Quantum-resistance with Rosenpass | | OpenWRT |
| | | Periodic re-authentication | | Serverless |
| | | | | Docker |

Quickstart with NetBird Cloud

1. Download and install NetBird at https://app.netbird.io/install
2. Follow the steps to sign up with Google, Microsoft, GitHub or your email address.
3. Check the NetBird admin UI.
4. Add more machines.

Quickstart with self-hosted NetBird

This is the quickest way to try self-hosted NetBird. It should take around 5 minutes to get started if you already have a public domain and a VM. Follow the Advanced guide with a custom identity provider for installations with different IDPs.

Infrastructure requirements:

- A Linux VM with at least 1 CPU and 2 GB of memory.
- The VM should be publicly accessible on TCP ports 80 and 443 and UDP port 3478.
- A public domain name pointing to the VM.

Software requirements:

- Docker installed on the VM with the docker-compose plugin (see the Docker installation guide), or Docker with docker-compose version 2 or higher.
- jq installed. It is usually available in the official repositories of most distributions and can be installed with sudo apt install jq or sudo yum install jq.
- curl installed. It is usually available in the official repositories and can be installed with sudo apt install curl or sudo yum install curl.

Steps

Download and run the installation script:

export NETBIRD_DOMAIN=netbird.example.com; curl -fsSL https://github.com/netbirdio/netbird/releases/latest/download/getting-started.sh | bash

Once the script has finished, you can manage the resources via docker-compose.

A bit on NetBird internals

Every machine in the network runs NetBird Agent (or Client) that manages WireGuard.
Every agent connects to Management Service that holds network state, manages peer IPs, and distributes network updates to agents (peers).
NetBird agent uses WebRTC ICE implemented in pion/ice library to discover connection candidates when establishing a peer-to-peer connection between machines.
Connection candidates are discovered with the help of STUN servers.
Agents negotiate a connection through Signal Service passing p2p encrypted messages with candidates.
Sometimes the NAT traversal is unsuccessful due to strict NATs (e.g. mobile carrier-grade NAT) and a p2p connection isn't possible. When this occurs the system falls back to a relay server called TURN, and a secure WireGuard tunnel is established via the TURN server.

Coturn has been used successfully as the STUN and TURN server in NetBird setups.

See a complete architecture overview for details.

Community projects

NetBird installer script
NetBird ansible collection by Dominion Solutions
netbird-tui — terminal UI for managing NetBird peers, routes, and settings

Note: The main branch may be in an unstable or even broken state during development. For stable versions, see releases.

Support acknowledgement

In November 2022, NetBird joined the StartUpSecure program sponsored by The Federal Ministry of Education and Research of The Federal Republic of Germany. Together with the CISPA Helmholtz Center for Information Security, NetBird brings security best practices and simplicity to private networking.

Testimonials

We use open-source technologies like WireGuard®, Pion ICE (WebRTC), Coturn, and Rosenpass. We very much appreciate the work these projects are doing, and we'd greatly appreciate it if you could support them in any way (e.g., by giving a star or a contribution).

Legal

This repository is licensed under the BSD-3-Clause license, which applies to all parts of the repository except for the directories management/, signal/ and relay/. Those directories are licensed under the GNU Affero General Public License version 3.0 (AGPLv3). See the respective LICENSE files inside each directory.

WireGuard and the WireGuard logo are registered trademarks of Jason A. Donenfeld.