1080 Commits

Author SHA1 Message Date
mlsmaycon
3eb1298cb4 Refactor sync fast path tests and fix CI flakiness
- Introduce `skipOnWindows` helper to properly skip tests relying on Unix specific paths.
- Replace fixed sleep with `require.Eventually` in `waitForPeerDisconnect` to address flakiness in CI.
- Split `commitFastPath` logic out of `runFastPathSync` to close race conditions and improve clarity.
- Update tests to leverage new helpers and more precise assertions (e.g., `waitForPeerDisconnect`).
- Add `flakyStore` test helper to exercise fail-closed behavior in flag handling.
- Enhance `RunFastPathFlagRoutine` to disable the flag on store read errors.
2026-04-21 17:07:31 +02:00
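The switch from a fixed sleep to polling described in this commit can be sketched as follows. This is a minimal stdlib-only illustration, assuming a hypothetical `eventually` helper that stands in for testify's `require.Eventually`; it is not the repository's `waitForPeerDisconnect` implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// eventually is a hypothetical helper: it polls cond every tick until cond
// returns true or waitFor has elapsed. It returns whether cond became true.
func eventually(cond func() bool, waitFor, tick time.Duration) bool {
	deadline := time.Now().Add(waitFor)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(tick)
	}
}

func main() {
	// disconnected simulates the state a test waits for (assumption: the real
	// test inspects server-side peer state instead of a local variable).
	var disconnected atomic.Bool
	go func() {
		time.Sleep(20 * time.Millisecond)
		disconnected.Store(true)
	}()

	// Polling returns as soon as the condition holds, so the test neither
	// waits longer than necessary nor fails when CI is slower than a fixed sleep.
	ok := eventually(disconnected.Load, time.Second, 5*time.Millisecond)
	fmt.Println("peer disconnected:", ok)
}
```

A fixed sleep has to be tuned for the slowest CI runner; polling with a generous upper bound stays fast on quick machines and tolerant on slow ones.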
mlsmaycon
93391fc68f generate only current.bin and android_current.bin on ci/cd 2026-04-21 16:49:54 +02:00
mlsmaycon
48c080b861 Replace Redis dependency with a generic cache store for fast path flag handling 2026-04-21 16:28:24 +02:00
mlsmaycon
3716838c25 Remove unused cacheKey helper and testcontainers imports, simplify Redis container setup 2026-04-21 16:17:31 +02:00
mlsmaycon
8430b06f2a [management] Add Redis-backed kill switch for Sync fast path
Gate the peer-sync fast path on a runtime flag polled from Redis so operators can roll the optimisation out gradually and flip it off without a redeploy.

Without NB_PEER_SYNC_REDIS_ADDRESS the routine stays disabled, every Sync runs the full network map path, and no entries accumulate in the peer serial cache — bit-for-bit identical to the pre-fast-path behaviour. When the env var is set, a background goroutine polls the configured key (default "peerSyncFastPath") every minute; values "1" or "true" enable the fast path, anything else disables it.

- RunFastPathFlagRoutine mirrors shared/logleveloverrider: dedicated Redis connection, background ticker, redis.Nil treated as disabled.
- NewServer takes the flag handle; tryFastPathSync and the recordPeerSyncEntry helpers short-circuit when Enabled() is false.
- invalidatePeerSyncEntry still runs on Login regardless of flag state.
- NewFastPathFlag(bool) exposed for tests and callers that need to force a state without going through Redis.
2026-04-21 15:52:34 +02:00
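The flag semantics described in this commit can be sketched as below. This is a minimal illustration under stated assumptions: the `FastPathFlag` struct layout, `parseFlagValue`, and `apply` are hypothetical, the store read is reduced to a `(value, found)` pair, and exact string matching is assumed (the commit does not say whether case or whitespace is normalised). The polling goroutine and the Redis client are left out.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// FastPathFlag is a hypothetical stand-in for the flag handle passed to NewServer.
type FastPathFlag struct {
	enabled atomic.Bool
}

// NewFastPathFlag forces an initial state, as the commit describes for tests.
func NewFastPathFlag(initial bool) *FastPathFlag {
	f := &FastPathFlag{}
	f.enabled.Store(initial)
	return f
}

// Enabled reports the current state; a nil handle is treated as disabled
// (assumption: this keeps callers safe when no flag routine is configured).
func (f *FastPathFlag) Enabled() bool {
	return f != nil && f.enabled.Load()
}

// parseFlagValue enables the fast path only for "1" or "true".
func parseFlagValue(raw string) bool {
	return raw == "1" || raw == "true"
}

// apply records the result of one poll. A missing key (found == false)
// is treated as disabled, mirroring the redis.Nil handling in the commit.
func (f *FastPathFlag) apply(raw string, found bool) {
	f.enabled.Store(found && parseFlagValue(raw))
}

func main() {
	flag := NewFastPathFlag(false)
	for _, poll := range []struct {
		raw   string
		found bool
	}{{"true", true}, {"yes", true}, {"1", true}, {"", false}} {
		flag.apply(poll.raw, poll.found)
		fmt.Printf("value=%q found=%v enabled=%v\n", poll.raw, poll.found, flag.Enabled())
	}
}
```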
mlsmaycon
3f4ef0031b [management] Skip full network map on Sync when peer state is unchanged
Introduce a peer-sync cache keyed by WireGuard pubkey that records the
NetworkMap.Serial and meta hash the server last delivered to each peer.
When a Sync request arrives from a non-Android peer whose cached serial
matches the current account serial and whose meta hash matches the last
delivery, short-circuit SyncAndMarkPeer and reply with a NetbirdConfig-only
SyncResponse mirroring the shape TimeBasedAuthSecretsManager already pushes
for TURN/Relay token rotation. The client keeps its existing network map
state and refreshes only control-plane credentials.

The fast path avoids GetAccountWithBackpressure, the full per-peer map
assembly, posture-check recomputation and the large encrypted payload on
every reconnect of a peer whose account is quiescent. Slow path remains
the source of truth for any real state change; every full-map send (initial
sync or streamed NetworkMap update) rewrites the cache, and every Login
deletes it so a fresh map is guaranteed after SSH key rotation, approval
changes or re-registration.

Backend-only: no proto changes and no client changes. Compatibility is
provided by the existing client handling of nil NetworkMap in handleSync
(every version from v0.20.0 on). Android is gated out at the server because
its readInitialSettings path calls GrpcClient.GetNetworkMap which errors on
nil map. The cache is wired through BaseServer.CacheStore() so it shares
the same Redis/in-memory backend as OneTimeTokenStore and PKCEVerifierStore.

Test coverage lands in five layers:
- Pure decision function (peer_serial_cache_decision_test.go)
- Cache wrapper with TTL + concurrency (peer_serial_cache_test.go)
- Response shape unit tests (sync_fast_path_response_test.go)
- In-process gRPC behavioural tests covering first sync, reconnect skip,
  android never-skip, meta change, login invalidation, and serial advance
  (management/server/sync_fast_path_test.go)
- Frozen SyncRequest wire-format fixtures for v0.20.0 / v0.40.0 / v0.60.0
  / current / android replayed against the in-process server
  (management/server/sync_legacy_wire_test.go + testdata fixtures)
2026-04-17 16:20:04 +02:00
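The skip decision described in this commit can be sketched as a pure function. This is an illustration only: `peerSyncEntry`, `shouldSkipFullMap`, and the plain map used as a cache are hypothetical names, and the real cache also carries a TTL that is not modelled here.

```go
package main

import "fmt"

// peerSyncEntry is a hypothetical cache record: the serial and meta hash
// that were last delivered to a peer.
type peerSyncEntry struct {
	Serial   uint64
	MetaHash uint64
}

// shouldSkipFullMap reports whether a Sync can be answered with a
// config-only response instead of a full network map.
func shouldSkipFullMap(entry *peerSyncEntry, accountSerial, metaHash uint64, isAndroid bool) bool {
	if isAndroid {
		return false // Android clients error on a nil network map
	}
	if entry == nil {
		return false // first sync, or entry invalidated by Login
	}
	return entry.Serial == accountSerial && entry.MetaHash == metaHash
}

func main() {
	cache := map[string]*peerSyncEntry{} // keyed by WireGuard pubkey
	key := "example-wg-pubkey"

	fmt.Println("first sync:", shouldSkipFullMap(cache[key], 7, 42, false))

	// A full-map send rewrites the cache entry.
	cache[key] = &peerSyncEntry{Serial: 7, MetaHash: 42}
	fmt.Println("reconnect, unchanged:", shouldSkipFullMap(cache[key], 7, 42, false))
	fmt.Println("serial advanced:", shouldSkipFullMap(cache[key], 8, 42, false))
	fmt.Println("meta changed:", shouldSkipFullMap(cache[key], 7, 43, false))
	fmt.Println("android:", shouldSkipFullMap(cache[key], 7, 42, true))

	// Login deletes the entry, so the next Sync takes the slow path.
	delete(cache, key)
	fmt.Println("after login:", shouldSkipFullMap(cache[key], 7, 42, false))
}
```

Keeping the decision free of I/O is what makes the first test layer listed above (a pure decision function test) possible.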
Nicolas Frati
8ae8f2098f [management] chores: fix lint error on google workspace (#5907)
* chores: fix lint error on google workspace

* chores: updated google api dependency

* update google golang api sdk to latest
2026-04-16 20:02:09 +02:00
Maycon Santos
53b04e512a [management] Reuse a single cache store across all management server consumers (#5889)
* Add support for legacy IDP cache environment variable

* Centralize cache store creation to reuse a single Redis connection pool

Each cache consumer (IDP cache, token store, PKCE store, secrets manager,
EDR validator) was independently calling NewStore, creating separate Redis
clients with their own connection pools — up to 1400 potential connections
from a single management server process.

Introduce a shared CacheStore() singleton on BaseServer that creates one
store at boot and injects it into all consumers. Consumer constructors now
receive a store.StoreInterface instead of creating their own.

For Redis mode, all consumers share one connection pool (1000 max conns).
For in-memory mode, all consumers share one GoCache instance.

* Update management-integrations module to latest version

* sync go.sum

* Export `GetAddrFromEnv` to allow reuse across packages

* Update management-integrations module version in go.mod and go.sum

* Update management-integrations module version in go.mod and go.sum
2026-04-16 16:04:53 +02:00
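The shared-store pattern described in this commit can be sketched with `sync.Once`. This is a minimal illustration: the `Store` interface, `memStore`, and the two consumer types are hypothetical simplifications and do not reproduce the real store.StoreInterface or the Redis-backed store.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical, reduced cache interface.
type Store interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// memStore is an in-memory Store guarded by a mutex.
type memStore struct {
	mu   sync.RWMutex
	data map[string]string
}

func (m *memStore) Get(key string) (string, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[key]
	return v, ok
}

func (m *memStore) Set(key, value string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = value
}

// BaseServer creates the store lazily and exactly once.
type BaseServer struct {
	once    sync.Once
	store   Store
	created int // number of stores constructed, for demonstration only
}

// CacheStore returns the single shared store, building it on first use.
func (s *BaseServer) CacheStore() Store {
	s.once.Do(func() {
		s.created++
		s.store = &memStore{data: make(map[string]string)}
	})
	return s.store
}

// Consumers receive the store instead of constructing their own.
type tokenConsumer struct{ store Store }
type pkceConsumer struct{ store Store }

func newTokenConsumer(st Store) *tokenConsumer { return &tokenConsumer{store: st} }
func newPKCEConsumer(st Store) *pkceConsumer   { return &pkceConsumer{store: st} }

func main() {
	srv := &BaseServer{}
	tokens := newTokenConsumer(srv.CacheStore())
	pkce := newPKCEConsumer(srv.CacheStore())

	tokens.store.Set("k", "v")
	v, ok := pkce.store.Get("k")
	fmt.Println(v, ok, "stores created:", srv.created)
}
```

With constructor injection, the number of connection pools is decided in one place rather than by the number of consumers.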
Bethuel Mmbaga
08f624507d [management] Enforce peer or peer groups requirement for network routers (#5894) 2026-04-16 13:12:19 +03:00
Pascal Fischer
c5623307cc [management] add context cancel monitoring (#5879) 2026-04-14 12:49:18 +02:00
Vlad
7f666b8022 [management] revert ctx dependency in get account with backpressure (#5878) 2026-04-14 12:16:03 +02:00
Viktor Liu
0a30b9b275 [management, proxy] Add CrowdSec IP reputation integration for reverse proxy (#5722) 2026-04-14 12:14:58 +02:00
Pascal Fischer
5259e5df51 [management] add domain and service cleanup migration (#5850) 2026-04-11 12:00:40 +02:00
Pascal Fischer
cf86b9a528 [management] enable access log cleanup by default (#5842) 2026-04-10 17:07:27 +02:00
Pascal Fischer
ee588e1536 Revert "[management] allow local routing peer resource (#5814)" (#5847) 2026-04-10 14:53:47 +02:00
Pascal Fischer
2a8aacc5c9 [management] allow local routing peer resource (#5814) 2026-04-10 13:08:21 +02:00
Pascal Fischer
15709bc666 [management] update account delete with proper proxy domain and service cleanup (#5817) 2026-04-10 13:08:04 +02:00
Pascal Fischer
ee343d5d77 [management] use sql null vars (#5844) 2026-04-09 18:12:38 +02:00
Maycon Santos
099c493b18 [management] network map tests (#5795)
* Add network map benchmark and correctness test files

* Add tests for network map components correctness and edge cases

* Skip benchmarks in CI and enhance network map test coverage with new helper functions

* Remove legacy network map benchmarks and tests; refactor components-based test coverage for clarity and scalability.
2026-04-08 21:28:29 +02:00
Pascal Fischer
c1d1229ae0 [management] use NullBool for terminated flag (#5829) 2026-04-08 21:08:43 +02:00
Viktor Liu
0588d2dbe1 [management] Load missing service columns in pgx account loader (#5816) 2026-04-07 14:56:56 +02:00
Pascal Fischer
14b3b77bda [management] validate permissions on groups read with name (#5749) 2026-04-07 14:13:09 +02:00
Bethuel Mmbaga
9d1a37c644 [management,client] Revert gRPC client secret removal (#5781)
* This reverts commit e5914e4e8b

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

* Deprecate client secret in proto

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

* Fix lint

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>

---------

Signed-off-by: bcmmbaga <bethuelmbaga12@gmail.com>
2026-04-02 18:21:00 +02:00
Viktor Liu
5bf2372c4d [management] Fix L4 service creation deadlock on single-connection databases (#5779) 2026-04-02 14:46:14 +02:00
Bethuel Mmbaga
c2c6396a04 [management] Allow updating embedded IdP user name and email (#5721) 2026-04-02 13:02:10 +03:00
Vlad
d97fe84296 [management] fix race condition in the setup flow that enables creation of multiple owner users (#5754) 2026-04-01 16:25:35 +02:00
shuuri-labs
940f530ac2 [management] Legacy to embedded IdP migration tool (#5586) 2026-04-01 13:53:19 +02:00
Vlad
5ae986e1c4 [management] fix panic on management reboot (#5759) 2026-04-01 12:31:30 +02:00
Bethuel Mmbaga
e5914e4e8b [management,client] Remove client secret from gRPC auth flow (#5751)
Remove client secret from gRPC auth flow. The secret was originally included to support providers like Google Workspace that don't offer a proper PKCE flow, but this is no longer necessary with the embedded IdP. Deployments using such providers should migrate to the embedded IdP instead.
2026-03-31 18:50:49 +03:00
Pascal Fischer
c238f5425f [management] proper module permission validation for posture check delete (#5742) 2026-03-31 16:43:49 +02:00
Pascal Fischer
3c3097ea74 [management] add target user account validation (#5741) 2026-03-31 16:43:16 +02:00
Akshay Ubale
7bbe71c3ac [client] Refactor Android PeerInfo to use proper ConnStatus enum type (#5644)
* Simplify Android ConnStatus API with integer constants

Replace dual field PeerInfo design with unified integer based
ConnStatus field and exported gomobile friendly constants.

Changes:
> PeerInfo.ConnStatus: changed from string to int
> Export three constants: ConnStatusIdle, ConnStatusConnecting, ConnStatusConnected (mapped to peer.ConnStatus enum values)
> Updated PeersList() to convert peer enum directly to int

Benefits:
> Simpler API surface with single ConnStatus field
> Better gomobile compatibility for cross-platform usage
> Type-safe integer constants across language boundaries

* test: add All group to setupTestAccount fixture

The setupTestAccount() test helper was missing the required "All" group,
causing "failed to get group all: no group ALL found" errors during
test execution. Add the All group with all test peers to match the
expected account structure.

Fixes the failing account and types package tests when GetGroupAll()
is called in test scenarios.
2026-03-30 17:55:01 +02:00
Viktor Liu
0765352c99 [management] Persist proxy capabilities to database (#5720) 2026-03-30 13:03:42 +02:00
Pascal Fischer
be6fd119d8 [management] no events for temporary peers (#5719) 2026-03-30 10:08:02 +02:00
Pascal Fischer
ec96c5ecaf [management] Extend blackbox tests (#5699) 2026-03-26 16:59:49 +01:00
Pascal Fischer
7e1cce4b9f [management] add terminated field to service (#5700) 2026-03-26 16:59:08 +01:00
Bethuel Mmbaga
7be8752a00 [management] Add notification endpoints (#5590) 2026-03-26 18:26:33 +03:00
Bethuel Mmbaga
0af5a0441f [management] Fix DNS label uniqueness check on peer rename (#5679) 2026-03-24 20:25:29 +03:00
Viktor Liu
0fc63ea0ba [management] Allow multiple header auths with same header name (#5678) 2026-03-24 16:18:21 +01:00
Bethuel Mmbaga
0b329f7881 [management] Replace JumpCloud SDK with direct HTTP calls (#5591) 2026-03-24 13:21:42 +03:00
Viktor Liu
5b85edb753 [management] Omit proxy_protocol from API response when false (#5656)
The internal Target model uses a plain bool for ProxyProtocol,
which was always serialized to the API response as false even
when not configured. Only set the API field when true so it
gets omitted via omitempty when unset.
2026-03-23 17:53:17 +01:00
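The omitempty behaviour this commit relies on can be sketched as follows. This is an illustration under assumptions: `apiTarget`, its `host` field, `toAPI`, and `render` are hypothetical, and the API field is assumed to be a `*bool`, which is a common shape for optional fields in generated API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiTarget is a hypothetical API response type with an optional field.
type apiTarget struct {
	Host          string `json:"host"`
	ProxyProtocol *bool  `json:"proxy_protocol,omitempty"`
}

// toAPI sets the optional field only when the internal bool is true, so an
// unset value is omitted from the JSON instead of serialised as false.
func toAPI(host string, proxyProtocol bool) apiTarget {
	t := apiTarget{Host: host}
	if proxyProtocol {
		enabled := true
		t.ProxyProtocol = &enabled
	}
	return t
}

// render marshals a target to a JSON string.
func render(t apiTarget) string {
	b, err := json.Marshal(t)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(render(toAPI("10.0.0.1", false))) // {"host":"10.0.0.1"}
	fmt.Println(render(toAPI("10.0.0.1", true)))  // {"host":"10.0.0.1","proxy_protocol":true}
}
```

A non-nil pointer to false would still be serialised, which is why the field is left nil rather than assigned unconditionally.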
Viktor Liu
b550a2face [management, proxy] Add require_subdomain capability for proxy clusters (#5628) 2026-03-20 11:29:50 +01:00
Pascal Fischer
a1858a9cb7 [management] recover proxies after cleanup if heartbeat is still running (#5617) 2026-03-18 11:48:38 +01:00
Viktor Liu
212b34f639 [management] Add GET /reverse-proxies/clusters endpoint (#5611) 2026-03-18 11:15:56 +08:00
Viktor Liu
f0eed50678 [management] Accept domain target type for L4 reverse proxy services (#5612) 2026-03-17 16:29:03 +01:00
Viktor Liu
387e374e4b [proxy, management] Add header auth, access restrictions, and session idle timeout (#5587) 2026-03-16 15:22:00 +01:00
Viktor Liu
3e6baea405 [management,proxy,client] Add L4 capabilities (TLS/TCP/UDP) (#5530) 2026-03-13 18:36:44 +01:00
Zoltan Papp
fe9b844511 [client] refactor auto update workflow (#5448)
Auto-update logic moved out of the UI into a dedicated updatemanager.Manager service that runs in the connection layer. The UI no longer polls or checks for updates independently.
The update manager supports three modes driven by the management server's auto-update policy:
- No policy set by mgm: checks GitHub for the latest version and notifies the user (previous behavior, now centralized)
- mgm enforces update: the "About" menu triggers installation directly instead of just downloading the file; the user still initiates the action
- mgm forces update: installation proceeds automatically without user interaction
The updateManager lifecycle is now owned by the daemon, giving the daemon server direct control via a new TriggerUpdate RPC.
Introduces an EngineServices struct to group external service dependencies passed to NewEngine, reducing its argument count from 11 to 4.
2026-03-13 17:01:28 +01:00
Pascal Fischer
d86875aeac [management] Exclude proxy from peer approval (#5588) 2026-03-13 15:01:59 +01:00
Pascal Fischer
e50e124e70 [proxy] Fix domain switching update (#5585) 2026-03-12 17:12:26 +01:00