netbird

mirror of https://github.com/netbirdio/netbird.git synced 2026-06-09 09:29:57 +00:00

Author	SHA1	Message	Date
Zoltán Papp	c5ea58c5ad	ios: enable sync response persistence for debug bundle Turn on sync response persistence before starting the engine so DebugBundle can include the network map. On iOS the store is disk-backed (see syncstore) to keep the map out of the constrained process memory.	2026-06-03 14:26:24 +02:00
Zoltán Papp	56c958b436	Merge branch 'main' into fix/ios-debug-bundle	2026-06-03 14:23:27 +02:00
Zoltan Papp	3e61ccb162	[client] Persist sync response via pluggable store (disk on iOS) (#6331 ) * Persist sync response via pluggable store (disk on iOS) The latest Management sync response (which carries the network map) was kept in memory for debug bundle generation. On memory-constrained platforms like iOS the network map can be large enough to matter. Introduce a syncstore package with a Store interface and two backends: a memory backend (the previous behavior) and a disk backend that serializes the response to a file in the state directory. The backend is selected per-platform at build time: disk on iOS, memory elsewhere. The disk store clears any leftover file on construction so a fresh store never reads stale data from an earlier run (e.g. another profile's network map). In the engine, drop the separate persistSyncResponse bool: the store is only instantiated while persistence is enabled, and its presence is what marks persistence as active. The store is also cleared on engine close so the file does not linger on disk. * syncstore: silence nilnil linter on "nothing stored" returns Get returns (nil, nil) to signal that nothing is stored, which is part of the Store contract and preserves the original behaviour. Annotate both backends with //nolint:nilnil so golangci-lint does not flag it. * syncstore: hold syncRespMux for the whole store Set/Get Both handleSync and GetLatestSyncResponse snapshotted e.syncStore under the read lock and then released it before calling Set/Get. That allowed SetSyncResponsePersistence(false) or engine close to clear the store mid-call. In particular a concurrent Clear()+nil followed by a late Set could re-create the file that was just removed, defeating the leak/lingering protection. Hold syncRespMux for the duration of the store operation in both spots so the store cannot be cleared while a Set/Get is in flight. * syncstore: avoid StateDir "." when state path is empty On mobile the state path may be empty (the engine tolerates a missing state file). filepath.Dir("") returns ".", which would make a disk-backed syncstore write into the working directory instead of letting NewDiskStore fall back to os.TempDir(). Only set engineConfig.StateDir when path is non-empty.	2026-06-03 14:18:50 +02:00
Viktor Liu	a48c20d8d8	[client] Gate DNS forwarder on BlockInbound (#6257 )	2026-06-03 11:33:29 +02:00
Riccardo Manfrin	2b57a7d43b	[client, management, misc] expose VCS revision in dev build version output (#6263 ) * Refactor to use a common checker for development version * Adds commit sha to development version for cobra command only Leave dashboard unaffected * Adjust for "v0.31.1-dev" test case which must be considered pre-release * Drop synthetic "dev"/"0.50.0-dev" firewall feature-gate fixtures These test cases encoded the loose strings.Contains(v, "dev") semantics inherited from peerSupportedFirewallFeatures, but NetbirdVersion() never produces those values; only the literal "development" (and now "development-<sha>[-dirty]") ever flows through the wire. The agent owns the semantics of an ephemeral development build, so the tests should exercise the strings we actually emit. Replaced with development, development-<sha> and development-<sha>-dirty cases that match the HasPrefix("development") predicate introduced upstream. * Remove nonexistent tests on wire format The sha / dirty flag are added only when the CLI asks for the version. Account versions are unaffected and can only strictly match "development" * Adds tests for IsDevelopmentVersion	2026-06-03 08:56:50 +02:00
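
The difference between the old strings.Contains(v, "dev") check and the HasPrefix("development") predicate named in 2b57a7d43b can be shown with a short Go sketch. isDevelopmentVersion and looseDevCheck are illustrative stand-ins written for this example, not the repository's IsDevelopmentVersion, and the sample SHA is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// isDevelopmentVersion is an illustrative stand-in for the prefix predicate
// described in the commit message; the real function may differ.
func isDevelopmentVersion(v string) bool {
	return strings.HasPrefix(v, "development")
}

// looseDevCheck mirrors the older substring semantics the commit moves away from.
func looseDevCheck(v string) bool {
	return strings.Contains(v, "dev")
}

func main() {
	// "abc1234" is a made-up revision used only for illustration.
	versions := []string{
		"development",
		"development-abc1234",
		"development-abc1234-dirty",
		"v0.31.1-dev",
		"0.50.0-dev",
	}
	for _, v := range versions {
		fmt.Printf("%-28s prefix=%-5t substring=%t\n", v, isDevelopmentVersion(v), looseDevCheck(v))
	}
}
```

Under the prefix rule, "v0.31.1-dev" is no longer a development build, which matches the commit's note that it must be treated as a pre-release instead.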
Maycon Santos	fa1e241aea	[management, client, proxy] Follow-up fixes for private reverse-proxy services (#6268 ) * fix(proxy): gate tunnel-peer fast-path on inbound listener marker forwardWithTunnelPeer previously accepted any RFC1918 / ULA / CGNAT source IP, so a public client whose address happened to fall in those ranges could bypass the configured operator auth scheme by colliding with a known tunnel IP. The fast-path is now gated on TunnelLookupFromContext(r.Context()) being present — that context value is attached only by the per-account inbound (overlay) listener, so the host-facing listener never enters this branch. Tests updated to reflect the new requirement: requests that don't carry the inbound marker now fall through to the regular auth flow. * fix(proxy): harden inbound listener resource + startup-ctx handling Three correctness fixes on the per-account inbound path, with tests: - Close the logrus ErrorLog PipeWriter on tearDown. WriterLevel hands back an io.PipeWriter backed by a pipe + scanner goroutine that the caller owns; the two writers per account (https + plain) were never closed, leaking the pipe and goroutine on every teardown. - Run the post-Start hooks on context.Background(). runClientStartup is launched in a goroutine from AddPeer and was inheriting the caller's request-scoped ctx, so a cancelled request could abort the inbound bring-up or fail the management status notification. The tail is split into notifyClientReady so the contract is testable. Tests cover the PipeWriter close behaviour and assert the readyHandler + NotifyStatus calls receive a non-cancelled background context. feat(proxy): short-circuit peer-own-target loops with 421 When a peer that hosts the target of a private service dials its own service URL the request was being looped through the proxy and back over WireGuard to the same peer — twice the WG round-trip for no benefit, with no signal to the caller that something was wrong. 
Add isSelfTargetLoop to ReverseProxy.ServeHTTP: when the request arrived on the per-account overlay listener (IsOverlayOrigin) and the source tunnel IP matches the target host, refuse the request with 421 Misdirected Request and a body pointing the operator at the backend directly. The gate is scoped to overlay origin so requests on the public listener that happen to share a source IP with the target host are forwarded normally. * fix(management): private-service validation + tunnel-IP lookup semantics - Require an explicit port for L4 cluster targets. validateL4Target exempted TargetTypeCluster from the port check, but buildPathMappings serializes every L4 target via net.JoinHostPort(host, port) — port=0 shipped a ":0" upstream. Cluster targets use the same Host/Port fields, so the same requirement applies. - GetPeerByIP returns NotFound on a tunnel-IP miss instead of mapping every error to Internal. The proxy's ValidateTunnelPeer probes IPs that legitimately aren't in the roster; the miss is expected and now distinguishable from a real store failure. - Thread ctx into getClusterCapability's gorm query so a cancelled request doesn't keep the store busy. Tests updated for the L4-cluster port requirement and the GetPeerByIP NotFound path. * fix(client): include offlinePeers in PeerStateByIP lookup ReplaceOfflinePeers moves peers into d.offlinePeers but PeerStateByIP only scanned d.peers. Callers (the local DNS filter via localPeerConnectivity, embed.Client.IdentityForIP used by the proxy's tunnel-peer validator) were treating known-but-offline peers as unknown, which: - causes the DNS filter to keep returning records pointing at peers that have no live tunnel, AND - makes the proxy's local-roster check deny a request from such a peer rather than letting the cached management RPC carry the authorisation decision. Search both slices in PeerStateByIP. Adds a unit test for the IPv4 and IPv6 offline-match paths. 
* fix(rest): reject empty Delete path params in reverse-proxy clients ReverseProxyClustersAPI.Delete and ReverseProxyTokensAPI.Delete passed the path parameter into url.PathEscape without an empty check. PathEscape("") returns "" which collapses the request onto the collection endpoint ("/api/reverse-proxies/clusters/" / "/api/reverse-proxies/proxy-tokens/"), so a caller bug delete with no id reached a routable URL with surprising semantics (typically 405). Short-circuit with a typed error before the request is built. Tests mount a handler on the collection path that fails the test if hit, so the regression is impossible to reintroduce silently. * chore(api,ci,docs,test): private-service schema, proto-check, fixups Non-functional cleanups and contract/CI hardening around the private-service work: API schema (openapi.yml): - Require a non-empty access_groups and mode=http when private=true, on both Service and ServiceRequest, mirroring validatePrivateRequirements. mode stays optional-but-constrained (empty defaults to http server-side), matching runtime. CI (proto-version-check.yml): - Cover renamed .pb.go files (read base via previous_filename). - Match protoc-gen-go-grpc version headers (optional "- " prefix and -gen-go-grpc suffix) so grpc-generated files are in scope. Docs / comments: - Reword Config field docs to say defaults are applied at Server.Start (initDefaults), not New. - Rename the obsolete --private-inbound flag to --private across comments and the proto doc. Pre-existing test fixups surfaced by review: - Repair the integration-tagged validate_session_test.go (SignToken signature growth + new Manager interface methods). - Fix the CI-skip boolean precedence so Windows isn't skipped unconditionally. - Guard the router.HTTPListener type assertion with comma-ok. 
* fix(proxy): background ctx for already-started AddPeer notification The earlier ctx fix covered the async runClientStartup path but missed the synchronous branch: when a service is added to an already-started client, AddPeer called NotifyStatus with the caller's request-scoped ctx. A cancelled request/stream could drop the connected notification to management. Use context.Background() here too, matching notifyClientReady. Extends TestNetBird_AddPeer_ExistingStartedClient_NotifiesStatus to pass a pre-cancelled caller ctx and assert the notification still ran on a non-cancelled context. * use the cmd context for roundtripper	2026-06-02 13:40:09 +02:00
Viktor Liu	e7c9182ff9	[client] Offer injected ICMPv6 echo replies to packet capture (#6321 )	2026-06-01 19:38:00 +02:00
Pascal Fischer	9189625487	[management] enrich context in permissions manager (#6286 )	2026-05-29 16:36:38 +02:00
Bethuel Mmbaga	e9dbf9db6f	[management] Extend combined server initialization (#6156 )	2026-05-29 17:35:35 +03:00
Theodor Midtlien	5a9e9e7bc9	[Infrastructure] Pin actions with SHA and improve workflows (#6249 ) * Pin actions with SHA, replace unmaintained, add dependabot for actions * Update FreeBSD to version 15 for tests * Use shared actions * Update sign-pipelines version	2026-05-29 15:24:30 +02:00
Viktor Liu	43e041cf9f	[client] Apply netroute unspecified-destination workaround on android (#6192 )	2026-05-29 15:15:22 +02:00
Viktor Liu	77e5693200	[client] Recognize NetBird DNS forwarder port in capture text format (#6177 )	2026-05-29 15:14:32 +02:00
Zoltan Papp	174dc24867	[management] Add SSO session extend flow (management) (#6197 ) * add SSO session extend flow (management) Adds the management-server half of the SSO session-extension feature: - New ExtendAuthSession gRPC RPC that refreshes a peer's session expiry using a fresh JWT, validated through the same pipeline as Login but without tearing down the tunnel or redoing the NetworkMap sync. - Per-peer SessionExpiresAt timestamp on every LoginResponse and SyncResponse so connected clients learn the deadline on the existing long-lived stream, and admin-side changes (toggling expiration, changing the expiration window) reach every peer within seconds. - SessionExpiresAt(...) helper on Peer that derives the absolute UTC deadline from LastLogin + the account-level PeerLoginExpiration setting, returning zero when the peer is not SSO-tracked or expiration is disabled. The matching client-side consumer of these fields lands separately. * encode SessionExpiresAt as 3-state on the wire Previously the `sessionExpiresAt` field on LoginResponse, SyncResponse and ExtendAuthSessionResponse was 2-state: a valid timestamp meant "new deadline", and nil meant "clear". That conflated two distinct meanings — "no info in this snapshot" vs "expiry is explicitly off / peer is not SSO-tracked" — so a Sync push that legitimately couldn't compute the deadline (settings lookup failed) would silently clear the client's anchor and lose the warning window. Three states now, encoded on the same field number (no .proto schema churn — only comments and the server-side encoder change): - nil pointer (field absent) → "no info"; client preserves anchor - &Timestamp{} (seconds=0, nanos=0) → explicit "disabled / not SSO" sentinel; client clears - valid timestamp → new absolute UTC deadline A new encodeSessionExpiresAt helper centralises the zero/non-zero encoding and is shared by the Sync, Login and ExtendAuthSession builders. The Sync builder still emits nil when settings are missing. 
Login and ExtendAuthSession always carry an authoritative value. The matching client-side decoder lands on feature/session-extend. * add UserExtendedPeerSession activity event ExtendAuthSession previously reused UserLoggedInPeer for its audit record, which conflated two distinct user actions: a full interactive SSO login (tunnel re-established, network map resync) versus an in-place deadline refresh (tunnel untouched). Auditors reading the log couldn't tell which one happened, and downstream dashboards/alerts on "login" volume were polluted by routine extends. Adds a dedicated UserExtendedPeerSession Activity (code 125, "user.peer.session.extend") and switches ExtendPeerSession over to it. The peer-extend audit trail is now distinguishable from interactive logins. * make ExtendAuthSession JWT-retry backoff cancellable Skip the retry log and 200ms wait on the final attempt, and replace the uncancellable time.Sleep with a select on time.After/ctx.Done so an upstream cancellation aborts the wait instead of running it to completion.	2026-05-28 19:14:14 +02:00
Riccardo Manfrin	7ea5e37dd4	[client] Improve rosenpass support (#6136 ) * Updates rosenpass version go-rosenpass v0.4.0 → v0.5.42 bump — detailed findings Change summary cunicu.li/go-rosenpass v0.4.0 → v0.5.42 (target) cilium/ebpf v0.15.0 → v0.19.0 (transitive) gopacket/gopacket v1.1.1 → v1.4.0 (transitive) wireguard 2023-07 → 2023-12 (transitive) wireguard/wgctrl 2023-04 → 2024-12 (transitive) Wire interop v0.4.0 (in v0.70.5) <-> v0.5.42 OK v0.5.42 <-> v0.5.42 OK Quantum resistance: true both ends --- Replay error eliminated. Before (on v0.4.0): `ERROR Failed to handle message: failed to load biscuit (ICR1): detected replay` Recurring every ~50ms for minutes at a time. Gone entirely after both ends upgraded to v0.5.42. Upstream fix in biscuit/replay handling between v0.4.x and v0.5.x series. * Fixup [::]:port socket trying to send to v4 * Adds more tests on netbird<->rosenpass interactions * Anticipates rp handler creation before generateConfig * [client] Moves deterministic key gen into rosenpass * go mod tidy * Adds reminder to reason about rosenpass surface area * Apply code rabbit suggestions	2026-05-28 09:01:18 +02:00
Riccardo Manfrin	9d7ef9b255	[client] Fix statemanager possible deadlock (#6228 ) 1. Stop() takes m.mu.Lock() and defers m.mu.Unlock() 2. <-m.done under lock 3. periodicStateSave defers close(m.done) 4. periodicStateSave calls PersistState() (line 256) which does m.mu.Lock() Double Stop() remains idempotent: second cancel() on dead ctx (no-op) and reads done already closed (immediate return).	2026-05-28 08:54:15 +02:00
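
The four steps in 9d7ef9b255 describe a lock held across a channel wait while the goroutine that closes the channel needs the same lock. The Go sketch below shows one way to avoid that shape. It is an assumption about the kind of fix involved, not the actual statemanager code, and every name in it is hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// manager is a hypothetical reduction of a state manager with a
// background save loop.
type manager struct {
	mu     sync.Mutex
	cancel context.CancelFunc
	done   chan struct{}
	saves  int
}

func newManager() *manager {
	ctx, cancel := context.WithCancel(context.Background())
	m := &manager{cancel: cancel, done: make(chan struct{})}
	go m.loop(ctx)
	return m
}

// loop closes done only after a final persist, which takes the lock.
func (m *manager) loop(ctx context.Context) {
	defer close(m.done)
	<-ctx.Done()
	m.persist()
}

func (m *manager) persist() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.saves++
}

// Stop copies what it needs under the lock, releases the lock, and only
// then waits. Waiting on done while still holding mu would block persist
// and therefore block close(done).
func (m *manager) Stop() {
	m.mu.Lock()
	cancel, done := m.cancel, m.done
	m.mu.Unlock()

	cancel()
	<-done
}

func main() {
	m := newManager()
	m.Stop()
	m.Stop() // second call: cancel is a no-op and done is already closed
	fmt.Println("final saves:", m.saves) // prints: final saves: 1
}
```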
Zoltán Papp	a3352c8402	Merge tag 'v0.71.4' into fix/ios-debug-bundle	2026-05-27 17:41:09 +02:00
Pascal Fischer	944a258459	[management] extend nmap monitoring (#6271 )	2026-05-27 16:56:02 +02:00
Zoltán Papp	557b611b02	Include the iOS state file in the debug bundle addStateFile() resolved the state path via ServiceManager.GetStatePath(), which on iOS points at a hard-coded default that does not exist in the app sandbox, so the state file was silently skipped. Add an optional StatePath to GeneratorDependencies and use it when set, falling back to the ServiceManager default otherwise. The iOS DebugBundle passes the client's actual state file path (the App Group profile state), matching the Android bundle which includes the state file.	2026-05-27 15:55:10 +02:00
Zoltán Papp	485fa06c94	Add iOS debug bundle support in Go Thread cacheDir through NewClient -> RunOniOS -> MobileDependency.TempDir so the iOS client can pass its sandbox-writable cache directory for debug bundle zip file creation instead of os.TempDir(). Move log collection into platform-dispatched addPlatformLog(): - iOS: adds the file-based Go client log (with rotation, stderr/stdout companions and anonymization handled by addLogfile) plus the Swift app log (swift-log.log) written by the iOS app into the same log directory - Other non-Android platforms: existing file-based log + systemd fallback Narrow the debug_nonandroid.go build tag to !android && !ios so iOS no longer attempts the systemd journal fallback. Add a DebugBundle() entry point to the iOS Go client that generates a bundle, uploads it and returns the upload key. It works with or without a running engine: when the engine is up it reuses the live config, sync response and client metrics; otherwise it loads the config from disk (or the preloaded tvOS config). Guard the live config/ConnectClient behind a state mutex since DebugBundle may run on a different thread.	2026-05-27 15:32:47 +02:00
Pascal Fischer	1f9a829f2c	[management] update log levels (#6266 )	2026-05-27 11:43:49 +02:00
Bethuel Mmbaga	14af179556	[management] Refactor management server bootstrap (#6256 )	2026-05-26 17:44:28 +03:00
Pascal Fischer	1fbb5e6d5d	[management] fix owner role update (#6264 )	2026-05-26 16:37:58 +02:00
Viktor Liu	6771e35d57	[client] Release js.FuncOf callbacks in wasm ssh and rdp to prevent leaks (#5982 )	2026-05-26 14:32:39 +02:00
Viktor Liu	e89b1e0596	[proxy, client] Bound embed client WireGuard per-Device memory (#5962 )	2026-05-26 11:51:53 +02:00
Philip Laine	d542c60e21	Refactor Linux system info to use syscalls (#6230 )	2026-05-25 21:00:24 +02:00
Viktor Liu	4983b5cf17	[client] Match DNS wildcard handlers on label boundaries (#6255 )	2026-05-25 18:38:48 +02:00
Viktor Liu	b3b0feb3b8	[client] Filter scoped/cloned default routes from BSD network monitor RTM_ADD (#6208 )	2026-05-25 18:38:21 +02:00
Maycon Santos	7aebdd69dd	[management, client, proxy] add expose NetBird-only services over tunnel peers (#6226 ) Adds a new "private" service mode for the reverse proxy: services reachable exclusively over the embedded WireGuard tunnel, gated by per-peer group membership instead of operator auth schemes. Wire contract - ProxyMapping.private (field 13): the proxy MUST call ValidateTunnelPeer and fail closed; operator schemes are bypassed. - ProxyCapabilities.private (4) + supports_private_service (5): capability gate. Management never streams private mappings to proxies that don't claim the capability; the broadcast path applies the same filter via filterMappingsForProxy. - ValidateTunnelPeer RPC: resolves an inbound tunnel IP to a peer, checks the peer's groups against service.AccessGroups, and mints a session JWT on success. checkPeerGroupAccess fails closed when a private service has empty AccessGroups. - ValidateSession/ValidateTunnelPeer responses now carry peer_group_ids + peer_group_names so the proxy can authorise policy-aware middlewares without an extra management round-trip. - ProxyInboundListener + SendStatusUpdate.inbound_listener: per-account inbound listener state surfaced to dashboards. - PathTargetOptions.direct_upstream (11): bypass the embedded NetBird client and dial the target via the proxy host's network stack for upstreams reachable without WireGuard. Data model - Service.Private (bool) + Service.AccessGroups ([]string, JSON- serialised). Validate() rejects bearer auth on private services. Copy() deep-copies AccessGroups. pgx getServices loads the columns. - DomainConfig.Private threaded into the proxy auth middleware. Request handler routes private services through forwardWithTunnelPeer and returns 403 on validation failure. - Account-level SynthesizePrivateServiceZones (synthetic DNS) and injectPrivateServicePolicies (synthetic ACL) gate on len(svc.AccessGroups) > 0. 
Proxy - /netbird proxy --private (embedded mode) flag; Config.Private in proxy/lifecycle.go. - Per-account inbound listener (proxy/inbound.go) binding HTTP/HTTPS on the embedded NetBird client's WireGuard tunnel netstack. - proxy/internal/auth/tunnel_cache: ValidateTunnelPeer response cache with single-flight de-duplication and per-account eviction. - Local peerstore short-circuit: when the inbound IP isn't in the account roster, deny fast without an RPC. - proxy/server.go reports SupportsPrivateService=true and redacts the full ProxyMapping JSON from info logs (auth_token + header-auth hashed values now only at debug level). Identity forwarding - ValidateSessionJWT returns user_id, email, method, groups, group_names. sessionkey.Claims carries Email + Groups + GroupNames so the proxy can stamp identity onto upstream requests without an extra management round-trip on every cookie-bearing request. - CapturedData carries userEmail / userGroups / userGroupNames; the proxy stamps X-NetBird-User and X-NetBird-Groups on r.Out from the authenticated identity (strips client-supplied values first to prevent spoofing). - AccessLog.UserGroups: access-log enrichment captures the user's group memberships at write time so the dashboard can render group context without reverse-resolving stale memberships. OpenAPI/dashboard surface - ReverseProxyService gains private + access_groups; ReverseProxyCluster gains private + supports_private. ReverseProxyTarget target_type enum gains "cluster". ServiceTargetOptions gains direct_upstream. ProxyAccessLog gains user_groups.	2026-05-25 17:41:50 +02:00
Viktor Liu	0358be2313	[client] Revert "Clean up legacy 32-bit and HKCU registry entries on Windows install (#6176 )" (#6232 ) This reverts commit `d927ef468a`. (tag: v0.71.4)	2026-05-21 16:27:12 +02:00
Viktor Liu	37052fd5bc	[client] Fix nil channel panic in external chain monitor stop (#6224 ) (tag: v0.71.3)	2026-05-20 18:46:51 +02:00
Pascal Fischer	454ff66518	[management] scope network router update call (#6222 )	2026-05-20 18:24:00 +02:00
Pascal Fischer	6137a1fcc5	[proxy] concurrent proxy snapshot apply (#6207 )	2026-05-20 18:21:22 +02:00
Viktor Liu	4955c345d5	Clean up README header, key features table, and self-hosted quickstart (#6178 )	2026-05-20 16:25:56 +02:00
Viktor Liu	9192b4f029	[client] Bump macOS sleep callback timeout to 20s (#6220 )	2026-05-20 13:09:22 +02:00
Maycon Santos	c784b02550	[misc] Update contribution guidelines (#6219 ) Update contribution guidelines and PR template to require discussing impactful changes with the team	2026-05-20 12:21:03 +02:00
Maycon Santos	d250f92c43	feat(reverse-proxy): clusters API surfaces type, online status, and capability flags (#6148 ) The cluster listing now answers three questions in one round-trip instead of forcing the dashboard to cross-reference the domains API: which clusters can this account see, are they currently up, and what do they support. The ProxyCluster wire type drops the boolean self_hosted in favour of a `type` enum (`account` / `shared`) plus explicit `online`, `supports_custom_ports`, `require_subdomain`, and `supports_crowdsec` fields. Store query reworked so offline clusters still appear (no last_seen WHERE), with online and connected_proxies both derived from the existing 2-min active window via portable CASE expressions; the 1-hour heartbeat reaper still removes long-stale rows. Service manager enriches each cluster with the capability flags via the existing per-cluster lookups (CapabilityProvider now also exposes ClusterSupportsCrowdSec). GetActiveClusterAddresses* keep their tight 2-min filter so service routing and domain enumeration aren't pulled into the wider window. The hard cut removes self_hosted from the response — the dashboard is the only consumer and is updated in the matching PR; no transitional field is shipped. Adds a cross-engine regression test asserting offline clusters surface, connected_proxies counts only fresh proxies, and account-scoped BYOP clusters never leak across accounts.	2026-05-20 10:08:34 +02:00
Maycon Santos	80966ab1b0	[management] Ensure SessionStartedAt has a default value (#6211 ) * [management] Ensure SessionStartedAt has a default value Avoid null values for the new column * [management] Add PeerStatus with LastSeen in peer_test * [management] Add migration for PeerStatusSessionStartedAt default value * [management] Add PeerStatus with LastSeen in migration tests	2026-05-20 08:25:30 +02:00
Maycon Santos	af24fd7796	[management] Add metrics for peer status updates and ephemeral cleanup (#6196 ) * [management] Add metrics for peer status updates and ephemeral cleanup The session-fenced MarkPeerConnected / MarkPeerDisconnected path and the ephemeral peer cleanup loop both run silently today: when fencing rejects a stale stream, when a cleanup tick deletes peers, or when a batch delete fails, we have no operational signal beyond log lines. Add OpenTelemetry counters and a histogram so the same SLO-style dashboards that already exist for the network-map controller can cover peer connect/disconnect and ephemeral cleanup too. All new attributes are bounded enums: operation in {connect,disconnect} and outcome in {applied,stale,error,peer_not_found}. No account, peer, or user ID is ever written as a metric label — total cardinality is fixed at compile time (8 counter series, 2 histogram series, 4 unlabeled ephemeral series). Metric methods are nil-receiver safe so test composition that doesn't wire telemetry (the bulk of the existing tests) works unchanged. The ephemeral manager exposes a SetMetrics setter rather than taking the collector through its constructor, keeping the constructor signature stable across all test call sites. * [management] Add OpenTelemetry metrics for ephemeral peer cleanup Introduce counters for tracking ephemeral peer cleanup, including peers pending deletion, cleanup runs, successful deletions, and failed batches. Metrics are nil-receiver safe to ensure compatibility with test setups without telemetry.	2026-05-18 22:55:19 +02:00
Maycon Santos	13d32d274f	[management] Fence peer status updates with a session token (#6193 ) * [management] Fence peer status updates with a session token The connect/disconnect path used a best-effort LastSeen-after-streamStart comparison to decide whether a status update should land. Under contention — a re-sync arriving while the previous stream's disconnect was still in flight, or two management replicas seeing the same peer at once — the check was a read-then-decide-then-write window: any UPDATE in between caused the wrong row to be written. The Go-side time.Now() that fed the comparison also drifted under lock contention, since it was captured seconds before the write actually committed. Replace it with an integer-nanosecond fencing token stored alongside the status. Every gRPC sync stream uses its open time (UnixNano) as its token. Connects only land when the incoming token is strictly greater than the stored one; disconnects only land when the incoming token equals the stored one (i.e. we're the stream that owns the current session). Both are single optimistic-locked UPDATEs — no read-then-write, no transaction wrapper. LastSeen is now written by the database itself (CURRENT_TIMESTAMP). The caller never supplies it, so the value always reflects the real moment of the UPDATE rather than the moment the caller queued the work — which was already off by minutes under heavy lock contention. Side effects (geo lookup, peer-login-expiration scheduling, network-map fan-out) are explicitly documented as running after the fence UPDATE commits, never inside it. Geo also skips the update when realIP equals the stored ConnectionIP, dropping a redundant SavePeerLocation call on same-IP reconnects. Tests cover the three semantic cases (matched disconnect lands, stale disconnect dropped, stale connect dropped) plus a 16-goroutine race test that asserts the highest token always wins. 
* [management] Add SessionStartedAt to peer status updates Stored `SessionStartedAt` for fencing token propagation across goroutines and updated database queries/functions to handle the new field. Removed outdated geolocation handling logic and adjusted tests for concurrency safety. * Rename `peer_status_required_approval` to `peer_status_requires_approval` in SQL store fields	2026-05-18 20:25:12 +02:00
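
The fencing rule in 13d32d274f (connect lands only if the incoming token is strictly greater than the stored one, disconnect lands only if it is equal) can be sketched in Go as below. This is an in-memory illustration only: the mutex stands in for the single conditional UPDATE the commit describes, and peerStatus and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// peerStatus is a hypothetical in-memory stand-in for the stored row.
type peerStatus struct {
	mu             sync.Mutex
	connected      bool
	sessionStarted int64 // fencing token: stream open time in UnixNano
}

// markConnected applies only when token is strictly newer than the stored one.
func (p *peerStatus) markConnected(token int64) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if token <= p.sessionStarted {
		return false // stale connect from an older stream
	}
	p.connected, p.sessionStarted = true, token
	return true
}

// markDisconnected applies only when token matches the stream that owns
// the current session.
func (p *peerStatus) markDisconnected(token int64) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if token != p.sessionStarted {
		return false // stale disconnect from a superseded stream
	}
	p.connected = false
	return true
}

func main() {
	var p peerStatus
	fmt.Println(p.markConnected(100))    // true: first stream
	fmt.Println(p.markConnected(200))    // true: re-sync with a newer stream
	fmt.Println(p.markDisconnected(100)) // false: old stream's late disconnect is dropped
	fmt.Println(p.connected)             // true
	fmt.Println(p.markDisconnected(200)) // true: owning stream disconnects
}
```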
Nicolas Frati	705f87fc20	[management] fix: device redirect uri wasn't registered (#6191 ) * fix: device redirect uri wasn't registered * fix lint	2026-05-18 12:57:59 +02:00
Viktor Liu	3f91f49277	Clean up legacy 32-bit and HKCU registry entries on Windows install (#6176 ) (tag: v0.71.2)	2026-05-16 16:52:57 +02:00
Maycon Santos	347c5bf317	Avoid context cancellation in `cancelPeerRoutines` (#6175 ) When closing go routines and handling peer disconnect, we should avoid canceling the flow due to parent gRPC context cancellation. This change triggers disconnection handling with a context that is not bound to the parent gRPC cancellation.	2026-05-16 16:29:01 +02:00
Viktor Liu	22e2519d71	[management] Avoid peer IP reallocation when account settings update preserves the network range (#6173 )	2026-05-16 15:51:48 +02:00
Vlad	e916f12cca	[proxy] auth token generation on mapping (#6157 ) * [management / proxy] auth token generation on mapping * fix tests (tag: v0.71.1)	2026-05-15 19:13:44 +02:00
Viktor Liu	9ed2e2a5b4	[client] Drop DNS probes for passive health projection (#5971 )	2026-05-15 17:07:38 +02:00
Viktor Liu	2ccae7ec47	[client] Mirror v4 exit selection onto v6 pair and honour SkipAutoApply per route (#6150 )	2026-05-15 16:58:47 +02:00
Viktor Liu	07e5450117	[management] Bracket IPv6 reverse-proxy target hosts when building URL Host field (#6141 ) (tag: v0.71.0)	2026-05-14 16:42:40 +02:00
Viktor Liu	3f914090cb	[client] Bracket IPv6 in embed listeners, expand debug bundle (#6134 )	2026-05-14 16:22:53 +02:00
Viktor Liu	ea9fab4396	[management] Allocate and preserve IPv6 overlay addresses for embedded proxy peers (#6132 )	2026-05-14 16:05:33 +02:00
Vlad	77b479286e	[management] fix offline statuses for public proxy clusters (#6133 )	2026-05-14 13:27:50 +02:00

2924 Commits