netbird

mirror of https://github.com/netbirdio/netbird.git synced 2026-05-31 21:19:55 +00:00

Author	SHA1	Message	Date
mlsmaycon	75c8fa78e2	Merge branch 'main' into follow-up-private-services	2026-05-27 16:00:40 +02:00
Pascal Fischer	1f9a829f2c	[management] update log levels (#6266)	2026-05-27 11:43:49 +02:00
mlsmaycon	4f7c73369b	fix(proxy): background ctx for already-started AddPeer notification The earlier ctx fix covered the async runClientStartup path but missed the synchronous branch: when a service is added to an already-started client, AddPeer called NotifyStatus with the caller's request-scoped ctx. A cancelled request/stream could drop the connected notification to management. Use context.Background() here too, matching notifyClientReady. Extends TestNetBird_AddPeer_ExistingStartedClient_NotifiesStatus to pass a pre-cancelled caller ctx and assert the notification still ran on a non-cancelled context.	2026-05-26 22:52:11 +02:00
mlsmaycon	924be2116b	chore(api,ci,docs,test): private-service schema, proto-check, fixups Non-functional cleanups and contract/CI hardening around the private-service work: API schema (openapi.yml): - Require a non-empty access_groups and mode=http when private=true, on both Service and ServiceRequest, mirroring validatePrivateRequirements. mode stays optional-but-constrained (empty defaults to http server-side), matching runtime. CI (proto-version-check.yml): - Cover renamed .pb.go files (read base via previous_filename). - Match protoc-gen-go-grpc version headers (optional "- " prefix and -gen-go-grpc suffix) so grpc-generated files are in scope. Docs / comments: - Reword Config field docs to say defaults are applied at Server.Start (initDefaults), not New. - Rename the obsolete --private-inbound flag to --private across comments and the proto doc. Pre-existing test fixups surfaced by review: - Repair the integration-tagged validate_session_test.go (SignToken signature growth + new Manager interface methods). - Fix the CI-skip boolean precedence so Windows isn't skipped unconditionally. - Guard the router.HTTPListener type assertion with comma-ok.	2026-05-26 22:00:08 +02:00
mlsmaycon	43c0cb1dc2	fix(rest): reject empty Delete path params in reverse-proxy clients ReverseProxyClustersAPI.Delete and ReverseProxyTokensAPI.Delete passed the path parameter into url.PathEscape without an empty check. PathEscape("") returns "", which collapses the request onto the collection endpoint ("/api/reverse-proxies/clusters/" / "/api/reverse-proxies/proxy-tokens/"), so a caller bug (a delete with no id) reached a routable URL with surprising semantics (typically 405). Short-circuit with a typed error before the request is built. Tests mount a handler on the collection path that fails the test if hit, so the regression is impossible to reintroduce silently.	2026-05-26 21:59:34 +02:00
mlsmaycon	86e24a622d	fix(client): include offlinePeers in PeerStateByIP lookup ReplaceOfflinePeers moves peers into d.offlinePeers but PeerStateByIP only scanned d.peers. Callers (the local DNS filter via localPeerConnectivity, embed.Client.IdentityForIP used by the proxy's tunnel-peer validator) were treating known-but-offline peers as unknown, which: - causes the DNS filter to keep returning records pointing at peers that have no live tunnel, AND - makes the proxy's local-roster check deny a request from such a peer rather than letting the cached management RPC carry the authorisation decision. Search both slices in PeerStateByIP. Adds a unit test for the IPv4 and IPv6 offline-match paths.	2026-05-26 21:59:34 +02:00
mlsmaycon	af416c656b	fix(management): private-service validation + tunnel-IP lookup semantics - Require an explicit port for L4 cluster targets. validateL4Target exempted TargetTypeCluster from the port check, but buildPathMappings serializes every L4 target via net.JoinHostPort(host, port) — port=0 shipped a ":0" upstream. Cluster targets use the same Host/Port fields, so the same requirement applies. - GetPeerByIP returns NotFound on a tunnel-IP miss instead of mapping every error to Internal. The proxy's ValidateTunnelPeer probes IPs that legitimately aren't in the roster; the miss is expected and now distinguishable from a real store failure. - Thread ctx into getClusterCapability's gorm query so a cancelled request doesn't keep the store busy. Tests updated for the L4-cluster port requirement and the GetPeerByIP NotFound path.	2026-05-26 21:59:22 +02:00
mlsmaycon	6600f0d45f	feat(proxy): short-circuit peer-own-target loops with 421 When a peer that hosts the target of a private service dials its own service URL the request was being looped through the proxy and back over WireGuard to the same peer — twice the WG round-trip for no benefit, with no signal to the caller that something was wrong. Add isSelfTargetLoop to ReverseProxy.ServeHTTP: when the request arrived on the per-account overlay listener (IsOverlayOrigin) and the source tunnel IP matches the target host, refuse the request with 421 Misdirected Request and a body pointing the operator at the backend directly. The gate is scoped to overlay origin so requests on the public listener that happen to share a source IP with the target host are forwarded normally.	2026-05-26 21:58:58 +02:00
mlsmaycon	878344088f	fix(proxy): harden inbound listener resource + startup-ctx handling Three correctness fixes on the per-account inbound path, with tests: - Close the logrus ErrorLog PipeWriter on tearDown. WriterLevel hands back an *io.PipeWriter backed by a pipe + scanner goroutine that the caller owns; the two writers per account (https + plain) were never closed, leaking the pipe and goroutine on every teardown. - Run the post-Start hooks on context.Background(). runClientStartup is launched in a goroutine from AddPeer and was inheriting the caller's request-scoped ctx, so a cancelled request could abort the inbound bring-up or fail the management status notification. The tail is split into notifyClientReady so the contract is testable. Tests cover the PipeWriter close behaviour and assert the readyHandler + NotifyStatus calls receive a non-cancelled background context.	2026-05-26 21:58:52 +02:00
mlsmaycon	e09d51c1a8	fix(proxy): gate tunnel-peer fast-path on inbound listener marker forwardWithTunnelPeer previously accepted any RFC1918 / ULA / CGNAT source IP, so a public client whose address happened to fall in those ranges could bypass the configured operator auth scheme by colliding with a known tunnel IP. The fast-path is now gated on TunnelLookupFromContext(r.Context()) being present — that context value is attached only by the per-account inbound (overlay) listener, so the host-facing listener never enters this branch. Tests updated to reflect the new requirement: requests that don't carry the inbound marker now fall through to the regular auth flow.	2026-05-26 21:58:23 +02:00
Bethuel Mmbaga	14af179556	[management] Refactor management server bootstrap (#6256)	2026-05-26 17:44:28 +03:00
Pascal Fischer	1fbb5e6d5d	[management] fix owner role update (#6264)	2026-05-26 16:37:58 +02:00
Viktor Liu	6771e35d57	[client] Release js.FuncOf callbacks in wasm ssh and rdp to prevent leaks (#5982)	2026-05-26 14:32:39 +02:00
Viktor Liu	e89b1e0596	[proxy, client] Bound embed client WireGuard per-Device memory (#5962)	2026-05-26 11:51:53 +02:00
Philip Laine	d542c60e21	Refactor Linux system info to use syscalls (#6230)	2026-05-25 21:00:24 +02:00
Viktor Liu	4983b5cf17	[client] Match DNS wildcard handlers on label boundaries (#6255)	2026-05-25 18:38:48 +02:00
Viktor Liu	b3b0feb3b8	[client] Filter scoped/cloned default routes from BSD network monitor RTM_ADD (#6208)	2026-05-25 18:38:21 +02:00
Maycon Santos	7aebdd69dd	[management, client, proxy] add expose NetBird-only services over tunnel peers (#6226 ) Adds a new "private" service mode for the reverse proxy: services reachable exclusively over the embedded WireGuard tunnel, gated by per-peer group membership instead of operator auth schemes. Wire contract - ProxyMapping.private (field 13): the proxy MUST call ValidateTunnelPeer and fail closed; operator schemes are bypassed. - ProxyCapabilities.private (4) + supports_private_service (5): capability gate. Management never streams private mappings to proxies that don't claim the capability; the broadcast path applies the same filter via filterMappingsForProxy. - ValidateTunnelPeer RPC: resolves an inbound tunnel IP to a peer, checks the peer's groups against service.AccessGroups, and mints a session JWT on success. checkPeerGroupAccess fails closed when a private service has empty AccessGroups. - ValidateSession/ValidateTunnelPeer responses now carry peer_group_ids + peer_group_names so the proxy can authorise policy-aware middlewares without an extra management round-trip. - ProxyInboundListener + SendStatusUpdate.inbound_listener: per-account inbound listener state surfaced to dashboards. - PathTargetOptions.direct_upstream (11): bypass the embedded NetBird client and dial the target via the proxy host's network stack for upstreams reachable without WireGuard. Data model - Service.Private (bool) + Service.AccessGroups ([]string, JSON- serialised). Validate() rejects bearer auth on private services. Copy() deep-copies AccessGroups. pgx getServices loads the columns. - DomainConfig.Private threaded into the proxy auth middleware. Request handler routes private services through forwardWithTunnelPeer and returns 403 on validation failure. - Account-level SynthesizePrivateServiceZones (synthetic DNS) and injectPrivateServicePolicies (synthetic ACL) gate on len(svc.AccessGroups) > 0. 
Proxy - /netbird proxy --private (embedded mode) flag; Config.Private in proxy/lifecycle.go. - Per-account inbound listener (proxy/inbound.go) binding HTTP/HTTPS on the embedded NetBird client's WireGuard tunnel netstack. - proxy/internal/auth/tunnel_cache: ValidateTunnelPeer response cache with single-flight de-duplication and per-account eviction. - Local peerstore short-circuit: when the inbound IP isn't in the account roster, deny fast without an RPC. - proxy/server.go reports SupportsPrivateService=true and redacts the full ProxyMapping JSON from info logs (auth_token + header-auth hashed values now only at debug level). Identity forwarding - ValidateSessionJWT returns user_id, email, method, groups, group_names. sessionkey.Claims carries Email + Groups + GroupNames so the proxy can stamp identity onto upstream requests without an extra management round-trip on every cookie-bearing request. - CapturedData carries userEmail / userGroups / userGroupNames; the proxy stamps X-NetBird-User and X-NetBird-Groups on r.Out from the authenticated identity (strips client-supplied values first to prevent spoofing). - AccessLog.UserGroups: access-log enrichment captures the user's group memberships at write time so the dashboard can render group context without reverse-resolving stale memberships. OpenAPI/dashboard surface - ReverseProxyService gains private + access_groups; ReverseProxyCluster gains private + supports_private. ReverseProxyTarget target_type enum gains "cluster". ServiceTargetOptions gains direct_upstream. ProxyAccessLog gains user_groups.	2026-05-25 17:41:50 +02:00
Viktor Liu	0358be2313	[client] Revert "Clean up legacy 32-bit and HKCU registry entries on Windows install (#6176)" (#6232) This reverts commit `d927ef468a`. v0.71.4	2026-05-21 16:27:12 +02:00
Viktor Liu	37052fd5bc	[client] Fix nil channel panic in external chain monitor stop (#6224) v0.71.3	2026-05-20 18:46:51 +02:00
Pascal Fischer	454ff66518	[management] scope network router update call (#6222)	2026-05-20 18:24:00 +02:00
Pascal Fischer	6137a1fcc5	[proxy] concurrent proxy snapshot apply (#6207)	2026-05-20 18:21:22 +02:00
Viktor Liu	4955c345d5	Clean up README header, key features table, and self-hosted quickstart (#6178)	2026-05-20 16:25:56 +02:00
Viktor Liu	9192b4f029	[client] Bump macOS sleep callback timeout to 20s (#6220)	2026-05-20 13:09:22 +02:00
Maycon Santos	c784b02550	[misc] Update contribution guidelines (#6219) Update contribution guidelines and PR template to require discussing impactful changes with the team	2026-05-20 12:21:03 +02:00
Maycon Santos	d250f92c43	feat(reverse-proxy): clusters API surfaces type, online status, and capability flags (#6148 ) The cluster listing now answers three questions in one round-trip instead of forcing the dashboard to cross-reference the domains API: which clusters can this account see, are they currently up, and what do they support. The ProxyCluster wire type drops the boolean self_hosted in favour of a `type` enum (`account` / `shared`) plus explicit `online`, `supports_custom_ports`, `require_subdomain`, and `supports_crowdsec` fields. Store query reworked so offline clusters still appear (no last_seen WHERE), with online and connected_proxies both derived from the existing 2-min active window via portable CASE expressions; the 1-hour heartbeat reaper still removes long-stale rows. Service manager enriches each cluster with the capability flags via the existing per-cluster lookups (CapabilityProvider now also exposes ClusterSupportsCrowdSec). GetActiveClusterAddresses* keep their tight 2-min filter so service routing and domain enumeration aren't pulled into the wider window. The hard cut removes self_hosted from the response — the dashboard is the only consumer and is updated in the matching PR; no transitional field is shipped. Adds a cross-engine regression test asserting offline clusters surface, connected_proxies counts only fresh proxies, and account-scoped BYOP clusters never leak across accounts.	2026-05-20 10:08:34 +02:00
Maycon Santos	80966ab1b0	[management] Ensure SessionStartedAt has a default value (#6211) * [management] Ensure SessionStartedAt has a default value Avoid null values for the new column * [management] Add PeerStatus with LastSeen in peer_test * [management] Add migration for PeerStatusSessionStartedAt default value * [management] Add PeerStatus with LastSeen in migration tests	2026-05-20 08:25:30 +02:00
Maycon Santos	af24fd7796	[management] Add metrics for peer status updates and ephemeral cleanup (#6196 ) * [management] Add metrics for peer status updates and ephemeral cleanup The session-fenced MarkPeerConnected / MarkPeerDisconnected path and the ephemeral peer cleanup loop both run silently today: when fencing rejects a stale stream, when a cleanup tick deletes peers, or when a batch delete fails, we have no operational signal beyond log lines. Add OpenTelemetry counters and a histogram so the same SLO-style dashboards that already exist for the network-map controller can cover peer connect/disconnect and ephemeral cleanup too. All new attributes are bounded enums: operation in {connect,disconnect} and outcome in {applied,stale,error,peer_not_found}. No account, peer, or user ID is ever written as a metric label — total cardinality is fixed at compile time (8 counter series, 2 histogram series, 4 unlabeled ephemeral series). Metric methods are nil-receiver safe so test composition that doesn't wire telemetry (the bulk of the existing tests) works unchanged. The ephemeral manager exposes a SetMetrics setter rather than taking the collector through its constructor, keeping the constructor signature stable across all test call sites. * [management] Add OpenTelemetry metrics for ephemeral peer cleanup Introduce counters for tracking ephemeral peer cleanup, including peers pending deletion, cleanup runs, successful deletions, and failed batches. Metrics are nil-receiver safe to ensure compatibility with test setups without telemetry.	2026-05-18 22:55:19 +02:00
Maycon Santos	13d32d274f	[management] Fence peer status updates with a session token (#6193 ) * [management] Fence peer status updates with a session token The connect/disconnect path used a best-effort LastSeen-after-streamStart comparison to decide whether a status update should land. Under contention — a re-sync arriving while the previous stream's disconnect was still in flight, or two management replicas seeing the same peer at once — the check was a read-then-decide-then-write window: any UPDATE in between caused the wrong row to be written. The Go-side time.Now() that fed the comparison also drifted under lock contention, since it was captured seconds before the write actually committed. Replace it with an integer-nanosecond fencing token stored alongside the status. Every gRPC sync stream uses its open time (UnixNano) as its token. Connects only land when the incoming token is strictly greater than the stored one; disconnects only land when the incoming token equals the stored one (i.e. we're the stream that owns the current session). Both are single optimistic-locked UPDATEs — no read-then-write, no transaction wrapper. LastSeen is now written by the database itself (CURRENT_TIMESTAMP). The caller never supplies it, so the value always reflects the real moment of the UPDATE rather than the moment the caller queued the work — which was already off by minutes under heavy lock contention. Side effects (geo lookup, peer-login-expiration scheduling, network-map fan-out) are explicitly documented as running after the fence UPDATE commits, never inside it. Geo also skips the update when realIP equals the stored ConnectionIP, dropping a redundant SavePeerLocation call on same-IP reconnects. Tests cover the three semantic cases (matched disconnect lands, stale disconnect dropped, stale connect dropped) plus a 16-goroutine race test that asserts the highest token always wins. 
* [management] Add SessionStartedAt to peer status updates Stored `SessionStartedAt` for fencing token propagation across goroutines and updated database queries/functions to handle the new field. Removed outdated geolocation handling logic and adjusted tests for concurrency safety. * Rename `peer_status_required_approval` to `peer_status_requires_approval` in SQL store fields	2026-05-18 20:25:12 +02:00
Nicolas Frati	705f87fc20	[management] fix: device redirect uri wasn't registered (#6191) * fix: device redirect uri wasn't registered * fix lint	2026-05-18 12:57:59 +02:00
Viktor Liu	3f91f49277	Clean up legacy 32-bit and HKCU registry entries on Windows install (#6176) v0.71.2	2026-05-16 16:52:57 +02:00
Maycon Santos	347c5bf317	Avoid context cancellation in `cancelPeerRoutines` (#6175) When closing goroutines and handling peer disconnect, we should avoid canceling the flow due to parent gRPC context cancellation. This change triggers disconnection handling with a context that is not bound to the parent gRPC cancellation.	2026-05-16 16:29:01 +02:00
Viktor Liu	22e2519d71	[management] Avoid peer IP reallocation when account settings update preserves the network range (#6173)	2026-05-16 15:51:48 +02:00
Vlad	e916f12cca	[proxy] auth token generation on mapping (#6157) * [management / proxy] auth token generation on mapping * fix tests v0.71.1	2026-05-15 19:13:44 +02:00
Viktor Liu	9ed2e2a5b4	[client] Drop DNS probes for passive health projection (#5971)	2026-05-15 17:07:38 +02:00
Viktor Liu	2ccae7ec47	[client] Mirror v4 exit selection onto v6 pair and honour SkipAutoApply per route (#6150)	2026-05-15 16:58:47 +02:00
Viktor Liu	07e5450117	[management] Bracket IPv6 reverse-proxy target hosts when building URL Host field (#6141) v0.71.0	2026-05-14 16:42:40 +02:00
Viktor Liu	3f914090cb	[client] Bracket IPv6 in embed listeners, expand debug bundle (#6134)	2026-05-14 16:22:53 +02:00
Viktor Liu	ea9fab4396	[management] Allocate and preserve IPv6 overlay addresses for embedded proxy peers (#6132)	2026-05-14 16:05:33 +02:00
Vlad	77b479286e	[management] fix offline statuses for public proxy clusters (#6133)	2026-05-14 13:27:50 +02:00
Maycon Santos	ab2a8794e7	[client] Add short flags for status command options (#6137) * [client] Add short flags for status command options * uppercase filters	2026-05-14 12:30:42 +02:00
Viktor Liu	9126a192ca	[client] Set 0644 perms on SSH client config after os.CreateTemp (#6126)	2026-05-12 15:05:53 +02:00
Viktor Liu	1224d6e1ee	[client] Persist management URL and pre-shared key overrides on login (#6065)	2026-05-12 14:52:56 +02:00
Nicolas Frati	96672dd1f8	[management] chores: update dex version (#6124) * chores: update dex version * chore: update dex fork	2026-05-12 13:50:35 +02:00
Viktor Liu	946ce4c3da	[client] Fix --config flag default to point at profile path (#6122)	2026-05-11 17:48:21 +02:00
Vlad	07cbfdbede	[proxy] feature: bring your own proxy (#5627)	2026-05-11 14:31:38 +02:00
Viktor Liu	a4114a5e45	[client] Skip DNS upstream failover on definitive EDE (#6089)	2026-05-11 10:00:23 +02:00
Viktor Liu	6b08e89c7b	[relay] Preserve non-standard port in WS dialer URL prep (#6061)	2026-05-11 09:59:33 +02:00
Viktor Liu	a852b3bd34	[client, proxy] Harden uspfilter conntrack and share TCP relay (#5936)	2026-05-11 09:59:13 +02:00
Viktor Liu	afb83b3049	[client] Use unique temp file and clean up on failure when writing ssh config (#6064)	2026-05-11 09:58:49 +02:00
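
The fencing rule described in commit 13d32d274f (connects land only with a strictly greater token, disconnects only with an equal token) can be illustrated with a small in-memory sketch. This is not the repository's code: the commit describes single optimistic-locked SQL UPDATEs, whereas the sketch below swaps in a mutex-guarded struct, and the names peerStatus, markConnected and markDisconnected are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// peerStatus is a hypothetical in-memory stand-in for the stored peer
// status row; the real change keeps this state in the database.
type peerStatus struct {
	mu               sync.Mutex
	connected        bool
	sessionStartedAt int64 // fencing token: stream open time in UnixNano
}

// markConnected applies a connect only when token is strictly greater
// than the stored token, so a stale stream cannot overwrite a newer one.
func (p *peerStatus) markConnected(token int64) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if token <= p.sessionStartedAt {
		return false // stale connect dropped
	}
	p.connected = true
	p.sessionStartedAt = token
	return true
}

// markDisconnected applies a disconnect only when token equals the stored
// token, i.e. the caller is the stream that owns the current session.
func (p *peerStatus) markDisconnected(token int64) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if token != p.sessionStartedAt {
		return false // stale disconnect dropped
	}
	p.connected = false
	return true
}

func main() {
	p := &peerStatus{}
	fmt.Println(p.markConnected(100))    // first stream connects
	fmt.Println(p.markConnected(200))    // newer stream takes over
	fmt.Println(p.markDisconnected(100)) // old stream's disconnect is ignored
	fmt.Println(p.markDisconnected(200)) // owning stream's disconnect lands
}
```

In this sketch the stored token is left unchanged after a disconnect; whether the real store does the same is an assumption.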


2914 Commits