Gate the peer-sync fast path on a runtime flag polled from Redis so operators can roll the optimisation out gradually and flip it off without a redeploy.
Without NB_PEER_SYNC_REDIS_ADDRESS the routine stays disabled, every Sync runs the full network map path, and no entries accumulate in the peer serial cache — bit-for-bit identical to the pre-fast-path behaviour. When the env var is set, a background goroutine polls the configured key (default "peerSyncFastPath") every minute; values "1" or "true" enable the fast path, anything else disables it.
- RunFastPathFlagRoutine mirrors shared/logleveloverrider: dedicated Redis connection, background ticker, redis.Nil treated as disabled.
- NewServer takes the flag handle; tryFastPathSync and the recordPeerSyncEntry helpers short-circuit when Enabled() is false.
- invalidatePeerSyncEntry still runs on Login regardless of flag state.
- NewFastPathFlag(bool) exposed for tests and callers that need to force a state without going through Redis.
Introduce a peer-sync cache keyed by WireGuard pubkey that records the
NetworkMap.Serial and meta hash the server last delivered to each peer.
When a Sync request arrives from a non-Android peer whose cached serial
matches the current account serial and whose meta hash matches the last
delivery, short-circuit SyncAndMarkPeer and reply with a NetbirdConfig-only
SyncResponse mirroring the shape TimeBasedAuthSecretsManager already pushes
for TURN/Relay token rotation. The client keeps its existing network map
state and refreshes only control-plane credentials.
The fast path avoids GetAccountWithBackpressure, the full per-peer map
assembly, posture-check recomputation and the large encrypted payload on
every reconnect of a peer whose account is quiescent. Slow path remains
the source of truth for any real state change; every full-map send (initial
sync or streamed NetworkMap update) rewrites the cache, and every Login
deletes it so a fresh map is guaranteed after SSH key rotation, approval
changes or re-registration.
Backend-only: no proto changes and no client changes. Compatibility is
provided by the existing client handling of nil NetworkMap in handleSync
(every version from v0.20.0 on). Android is gated out at the server because
its readInitialSettings path calls GrpcClient.GetNetworkMap which errors on
nil map. The cache is wired through BaseServer.CacheStore() so it shares
the same Redis/in-memory backend as OneTimeTokenStore and PKCEVerifierStore.
Test coverage lands in four layers:
- Pure decision function (peer_serial_cache_decision_test.go)
- Cache wrapper with TTL + concurrency (peer_serial_cache_test.go)
- Response shape unit tests (sync_fast_path_response_test.go)
- In-process gRPC behavioural tests covering first sync, reconnect skip,
android never-skip, meta change, login invalidation, and serial advance
(management/server/sync_fast_path_test.go)
- Frozen SyncRequest wire-format fixtures for v0.20.0 / v0.40.0 / v0.60.0
/ current / android replayed against the in-process server
(management/server/sync_legacy_wire_test.go + testdata fixtures)
* Add support for legacy IDP cache environment variable
* Centralize cache store creation to reuse a single Redis connection pool
Each cache consumer (IDP cache, token store, PKCE store, secrets manager,
EDR validator) was independently calling NewStore, creating separate Redis
clients with their own connection pools — up to 1400 potential connections
from a single management server process.
Introduce a shared CacheStore() singleton on BaseServer that creates one
store at boot and injects it into all consumers. Consumer constructors now
receive a store.StoreInterface instead of creating their own.
For Redis mode, all consumers share one connection pool (1000 max conns).
For in-memory mode, all consumers share one GoCache instance.
* Update management-integrations module to latest version
* sync go.sum
* Export `GetAddrFromEnv` to allow reuse across packages
* Update management-integrations module version in go.mod and go.sum
* Update management-integrations module version in go.mod and go.sum
* Add gRPC update debouncing mechanism
Implements backpressure handling for peer network map updates to
efficiently handle rapid changes. First update is sent immediately,
subsequent rapid updates are coalesced, ensuring only the latest
update is sent after a 1-second quiet period.
* Enhance unit test to verify peer count synchronization with debouncing and timeout handling
* Debounce based on type
* Refactor test to validate timer restart after pending update dispatch
* Simplify timer reset for Go 1.23+ automatic channel draining
Remove manual channel drain in resetTimer() since Go 1.23+ automatically
drains the timer channel when Stop() returns false, making the
select-case pattern unnecessary.
Embed Dex as a built-in IdP to simplify self-hosting setup.
Adds an embedded OIDC Identity Provider (Dex) with local user management and optional external IdP connectors (Google/GitHub/OIDC/SAML), plus device-auth flow for CLI login. Introduces instance onboarding/setup endpoints (including owner creation), field-level encryption for sensitive user data, a streamlined self-hosting provisioning script, and expanded APIs + test coverage for IdP management.
more at https://github.com/netbirdio/netbird/pull/5008#issuecomment-3718987393
This PR adds a validate flow response feature to the management server by integrating an IntegratedValidator component. The main purpose is to enable validation of PKCE authorization flows through an integrated validator interface.
- Adds a new ValidateFlowResponse method to the IntegratedValidator interface
- Integrates the validator into the management server to validate PKCE authorization flows
- Updates dependency version for management-integrations
This PR introduces a new configuration option `DisableDefaultPolicy` that prevents the creation of the default all-to-all policy when new accounts are created. This is useful for automation scenarios where explicit policies are preferred.
### Key Changes:
- Added DisableDefaultPolicy flag to the management server config
- Modified account creation logic to respect this flag
- Updated all test cases to explicitly pass the flag (defaulting to false to maintain backward compatibility)
- Propagated the flag through the account manager initialization chain
### Testing:
- Verified default behavior remains unchanged when flag is false
- Confirmed no default policy is created when flag is true
- All existing tests pass with the new parameter
This PR fixes configuration inconsistencies and updates the store engine type usage throughout the management code. Key changes include:
- Replacing outdated server.Config references with types.Config and updating related flag variables (e.g. types.MgmtConfigPath).
- Converting engine constants (SqliteStoreEngine, PostgresStoreEngine, MysqlStoreEngine) to use types.Engine for consistent type–safety.
- Adjusting various test and migration code paths to correctly reference the new configuration and engine types.
adds NetFlow functionality to track and log network traffic information between peers, with features including:
- Flow logging for TCP, UDP, and ICMP traffic
- Integration with connection tracking system
- Resource ID tracking in NetFlow events
- DNS and exit node collection configuration
- Flow API and Redis cache in management
- Memory-based flow storage implementation
- Kernel conntrack counters and userspace counters
- TCP state machine improvements for more accurate tracking
- Migration from net.IP to netip.Addr in the userspace firewall
This update adds new relay integration for NetBird clients. The new relay is based on web sockets and listens on a single port.
- Adds new relay implementation with websocket with single port relaying mechanism
- refactor peer connection logic, allowing upgrade and downgrade from/to P2P connection
- peer connections are faster since it connects first to relay and then upgrades to P2P
- maintains compatibility with old clients by not using the new relay
- updates infrastructure scripts with new relay service
* compile client under freebsd (#1620)
Compile netbird client under freebsd and now support netstack and userspace modes.
Refactoring linux specific code to share same code with FreeBSD, move to *_unix.go files.
Not implemented yet:
Kernel mode not supported
DNS probably does not work yet
Routing also probably does not work yet
SSH support did not tested yet
Lack of test environment for freebsd (dedicated VM for github runners under FreeBSD required)
Lack of tests for freebsd specific code
info reporting need to review and also implement, for example OS reported as GENERIC instead of FreeBSD (lack of FreeBSD icon in management interface)
Lack of proper client setup under FreeBSD
Lack of FreeBSD port/package
* Add DNS routes (#1943)
Given domains are resolved periodically and resolved IPs are replaced with the new ones. Unless the flag keep_route is set to true, then only new ones are added.
This option is helpful if there are long-running connections that might still point to old IP addresses from changed DNS records.
* Add process posture check (#1693)
Introduces a process posture check to validate the existence and active status of specific binaries on peer systems. The check ensures that files are present at specified paths, and that corresponding processes are running. This check supports Linux, Windows, and macOS systems.
Co-authored-by: Evgenii <mail@skillcoder.com>
Co-authored-by: Pascal Fischer <pascal@netbird.io>
Co-authored-by: Zoltan Papp <zoltan.pmail@gmail.com>
Co-authored-by: Viktor Liu <17948409+lixmal@users.noreply.github.com>
Co-authored-by: Bethuel Mmbaga <bethuelmbaga12@gmail.com>
* migrate sqlite store to
generic sql store
* fix conflicts
* init postgres store
* Add postgres store tests
* Refactor postgres store engine name
* fix tests
* Run postgres store tests on linux only
* fix tests
* Refactor
* cascade policy rules on policy deletion
* fix tests
* run postgres cases in new db
* close store connection after tests
* refactor
* using testcontainers
* sync go sum
* remove postgres service
* remove store cleanup
* go mod tidy
* remove env
* use postgres as engine and initialize test store with testcontainer
---------
Co-authored-by: Maycon Santos <mlsmaycon@gmail.com>
This PR implements the following posture checks:
* Agent minimum version allowed
* OS minimum version allowed
* Geo-location based on connection IP
For the geo-based location, we rely on GeoLite2 databases which are free IP geolocation databases. MaxMind was tested and we provide a script that easily allows to download of all necessary files, see infrastructure_files/download-geolite2.sh.
The OpenAPI spec should extensively cover the life cycle of current version posture checks.
With this change we should be able to collect and expose the following histograms:
* `management.updatechannel.create.duration.ms` with `closed` boolean label
* `management.updatechannel.create.duration.micro` with `closed` boolean label
* `management.updatechannel.close.one.duration.ms`
* `management.updatechannel.close.one.duration.micro`
* `management.updatechannel.close.multiple.duration.ms`
* `management.updatechannel.close.multiple.duration.micro`
* `management.updatechannel.close.multiple.channels`
* `management.updatechannel.send.duration.ms` with `found` and `dropped` boolean labels
* `management.updatechannel.send.duration.micro` with `found` and `dropped` boolean labels
* `management.updatechannel.get.all.duration.ms`
* `management.updatechannel.get.all.duration.micro`
* `management.updatechannel.get.all.peers`
Restructure data handling for improved performance and flexibility.
Introduce 'G'-prefixed fields to represent Gorm relations, simplifying resource management.
Eliminate complexity in lookup tables for enhanced query and write speed.
Enable independent operations on data structures, requiring adjustments in the Store interface and Account Manager.
Implement user deletion across all IDP-ss. Expires all user peers
when the user is deleted. Users are permanently removed from a local
store, but in IDP, we remove Netbird attributes for the user
untilUserDeleteFromIDPEnabled setting is not enabled.
To test, an admin user should remove any additional users.
Until the UI incorporates this feature, use a curl DELETE request
targeting the /users/<USER_ID> management endpoint. Note that this
request only removes user attributes and doesn't trigger a delete
from the IDP.
To enable user removal from the IdP, set UserDeleteFromIDPEnabled
to true in account settings. Until we have a UI for this, make this
change directly in the store file.
Store the deleted email addresses in encrypted in activity store.
The ephemeral manager keep the inactive ephemeral peers in a linked list. The manager schedule a cleanup procedure to the head of the linked list (to the most deprecated peer). At the end of cleanup schedule the next cleanup to the new head.
If a device connect back to the server the manager will remote it from the peers list.
The peer login expiration ACL check introduced in #714
filters out peers that are expired and agents receive a network map
without that expired peers.
However, the agents should see those peers in status "Disconnected".
This PR extends the Agent <-> Management protocol
by introducing a new field OfflinePeers
that contain expired peers. Agents keep track of those and display
then just in the Status response.
The Management gRPC API has too much business logic
happening while it has to be in the Account manager.
This also needs to make more requests to the store
through the account manager.
This PR adds system activity tracking.
The management service records events like
add/remove peer, group, rule, route, etc.
The activity events are stored in the SQLite event store
and can be queried by the HTTP API.