Commit Graph

2889 Commits

Author SHA1 Message Date
Zoltán Papp
7de3242a27 encode SessionExpiresAt as 3-state on the wire
Previously the `sessionExpiresAt` field on LoginResponse, SyncResponse
and ExtendAuthSessionResponse was 2-state: a valid timestamp meant
"new deadline", and nil meant "clear". That conflated two distinct
meanings — "no info in this snapshot" vs "expiry is explicitly off /
peer is not SSO-tracked" — so a Sync push that legitimately couldn't
compute the deadline (settings lookup failed) would silently clear the
client's anchor and lose the warning window.

Three states now, encoded on the same field number (no .proto schema
churn — only comments and the server-side encoder change):

  - nil pointer (field absent) → "no info"; client preserves anchor
  - &Timestamp{} (seconds=0, nanos=0) → explicit "disabled / not SSO"
    sentinel; client clears
  - valid timestamp → new absolute UTC deadline

A new encodeSessionExpiresAt helper centralises the zero/non-zero
encoding and is shared by the Sync, Login and ExtendAuthSession
builders. The Sync builder still emits nil when settings are missing.
Login and ExtendAuthSession always carry an authoritative value.

The matching client-side decoder lands on feature/session-extend.
2026-05-19 14:46:27 +02:00
Zoltán Papp
6dcba89a46 add SSO session extend flow (management)
Adds the management-server half of the SSO session-extension feature:

- New ExtendAuthSession gRPC RPC that refreshes a peer's session expiry
  using a fresh JWT, validated through the same pipeline as Login but
  without tearing down the tunnel or redoing the NetworkMap sync.
- Per-peer SessionExpiresAt timestamp on every LoginResponse and
  SyncResponse so connected clients learn the deadline on the existing
  long-lived stream, and admin-side changes (toggling expiration,
  changing the expiration window) reach every peer within seconds.
- SessionExpiresAt(...) helper on Peer that derives the absolute UTC
  deadline from LastLogin + the account-level PeerLoginExpiration
  setting, returning zero when the peer is not SSO-tracked or expiration
  is disabled.

The matching client-side consumer of these fields lands separately.
2026-05-18 23:37:02 +02:00
Maycon Santos
af24fd7796 [management] Add metrics for peer status updates and ephemeral cleanup (#6196)
* [management] Add metrics for peer status updates and ephemeral cleanup

The session-fenced MarkPeerConnected / MarkPeerDisconnected path and
the ephemeral peer cleanup loop both run silently today: when fencing
rejects a stale stream, when a cleanup tick deletes peers, or when a
batch delete fails, we have no operational signal beyond log lines.

Add OpenTelemetry counters and a histogram so the same SLO-style
dashboards that already exist for the network-map controller can cover
peer connect/disconnect and ephemeral cleanup too.

All new attributes are bounded enums: operation in {connect,disconnect}
and outcome in {applied,stale,error,peer_not_found}. No account, peer,
or user ID is ever written as a metric label — total cardinality is
fixed at compile time (8 counter series, 2 histogram series, 4 unlabeled
ephemeral series).

Metric methods are nil-receiver safe so test composition that doesn't
wire telemetry (the bulk of the existing tests) works unchanged. The
ephemeral manager exposes a SetMetrics setter rather than taking the
collector through its constructor, keeping the constructor signature
stable across all test call sites.

* [management] Add OpenTelemetry metrics for ephemeral peer cleanup

Introduce counters for tracking ephemeral peer cleanup, including peers pending deletion, cleanup runs, successful deletions, and failed batches. Metrics are nil-receiver safe to ensure compatibility with test setups without telemetry.
2026-05-18 22:55:19 +02:00
Maycon Santos
13d32d274f [management] Fence peer status updates with a session token (#6193)
* [management] Fence peer status updates with a session token

The connect/disconnect path used a best-effort LastSeen-after-streamStart
comparison to decide whether a status update should land. Under contention
— a re-sync arriving while the previous stream's disconnect was still in
flight, or two management replicas seeing the same peer at once — the
check was a read-then-decide-then-write window: any UPDATE in between
caused the wrong row to be written. The Go-side time.Now() that fed the
comparison also drifted under lock contention, since it was captured
seconds before the write actually committed.

Replace it with an integer-nanosecond fencing token stored alongside the
status. Every gRPC sync stream uses its open time (UnixNano) as its token.
Connects only land when the incoming token is strictly greater than the
stored one; disconnects only land when the incoming token equals the
stored one (i.e. we're the stream that owns the current session). Both
are single optimistic-locked UPDATEs — no read-then-write, no transaction
wrapper.

LastSeen is now written by the database itself (CURRENT_TIMESTAMP). The
caller never supplies it, so the value always reflects the real moment
of the UPDATE rather than the moment the caller queued the work — which
was already off by minutes under heavy lock contention.

Side effects (geo lookup, peer-login-expiration scheduling, network-map
fan-out) are explicitly documented as running after the fence UPDATE
commits, never inside it. Geo also skips the update when realIP equals
the stored ConnectionIP, dropping a redundant SavePeerLocation call on
same-IP reconnects.

Tests cover the three semantic cases (matched disconnect lands, stale
disconnect dropped, stale connect dropped) plus a 16-goroutine race test
that asserts the highest token always wins.

* [management] Add SessionStartedAt to peer status updates

Stored `SessionStartedAt` for fencing token propagation across goroutines and updated database queries/functions to handle the new field. Removed outdated geolocation handling logic and adjusted tests for concurrency safety.

* Rename `peer_status_required_approval` to `peer_status_requires_approval` in SQL store fields
2026-05-18 20:25:12 +02:00
Nicolas Frati
705f87fc20 [management] fix: device redirect uri wasn't registered (#6191)
* fix: device redirect uri wasn't registered

* fix lint
2026-05-18 12:57:59 +02:00
Viktor Liu
3f91f49277 Clean up legacy 32-bit and HKCU registry entries on Windows install (#6176) v0.71.2 2026-05-16 16:52:57 +02:00
Maycon Santos
347c5bf317 Avoid context cancellation in cancelPeerRoutines (#6175)
When closing go routines and handling peer disconnect, we should avoid canceling the flow due to parent gRPC context cancellation.

This change triggers disconnection handling with a context that is not bound to the parent gRPC cancellation.
2026-05-16 16:29:01 +02:00
Viktor Liu
22e2519d71 [management] Avoid peer IP reallocation when account settings update preserves the network range (#6173) 2026-05-16 15:51:48 +02:00
Vlad
e916f12cca [proxy] auth token generation on mapping (#6157)
* [management / proxy] auth token generation on mapping

* fix tests
v0.71.1
2026-05-15 19:13:44 +02:00
Viktor Liu
9ed2e2a5b4 [client] Drop DNS probes for passive health projection (#5971) 2026-05-15 17:07:38 +02:00
Viktor Liu
2ccae7ec47 [client] Mirror v4 exit selection onto v6 pair and honour SkipAutoApply per route (#6150) 2026-05-15 16:58:47 +02:00
Viktor Liu
07e5450117 [management] Bracket IPv6 reverse-proxy target hosts when building URL Host field (#6141) v0.71.0 2026-05-14 16:42:40 +02:00
Viktor Liu
3f914090cb [client] Bracket IPv6 in embed listeners, expand debug bundle (#6134) 2026-05-14 16:22:53 +02:00
Viktor Liu
ea9fab4396 [management] Allocate and preserve IPv6 overlay addresses for embedded proxy peers (#6132) 2026-05-14 16:05:33 +02:00
Vlad
77b479286e [management] fix offline statuses for public proxy clusters (#6133) 2026-05-14 13:27:50 +02:00
Maycon Santos
ab2a8794e7 [client] Add short flags for status command options (#6137)
* [client] Add short flags for status command options

* uppercase filters
2026-05-14 12:30:42 +02:00
Viktor Liu
9126a192ca [client] Set 0644 perms on SSH client config after os.CreateTemp (#6126) 2026-05-12 15:05:53 +02:00
Viktor Liu
1224d6e1ee [client] Persist management URL and pre-shared key overrides on login (#6065) 2026-05-12 14:52:56 +02:00
Nicolas Frati
96672dd1f8 [management] chores: update dex version (#6124)
* chores: update dex version

* chore: update dex fork
2026-05-12 13:50:35 +02:00
Viktor Liu
946ce4c3da [client] Fix --config flag default to point at profile path (#6122) 2026-05-11 17:48:21 +02:00
Vlad
07cbfdbede [proxy] feature: bring your own proxy (#5627) 2026-05-11 14:31:38 +02:00
Viktor Liu
a4114a5e45 [client] Skip DNS upstream failover on definitive EDE (#6089) 2026-05-11 10:00:23 +02:00
Viktor Liu
6b08e89c7b [relay] Preserve non-standard port in WS dialer URL prep (#6061) 2026-05-11 09:59:33 +02:00
Viktor Liu
a852b3bd34 [client, proxy] Harden uspfilter conntrack and share TCP relay (#5936) 2026-05-11 09:59:13 +02:00
Viktor Liu
afb83b3049 [client] Use unique temp file and clean up on failure when writing ssh config (#6064) 2026-05-11 09:58:49 +02:00
Nicolas Frati
e89aad09f5 [management] Enable MFA for local users (#5804)
* wip: totp for local users

* fix providers not getting populated

* polished UI and fix post_login_redirect_uri

* fix: make sure logout is only prompted from oidc flow

Signed-off-by: jnfrati <nicofrati@gmail.com>

* update templates

Signed-off-by: jnfrati <nicofrati@gmail.com>

* deps: update dex dependency

Signed-off-by: jnfrati <nicofrati@gmail.com>

* fix qube issues

Signed-off-by: jnfrati <nicofrati@gmail.com>

* replace window with globalThis on home html

Signed-off-by: jnfrati <nicofrati@gmail.com>

* fixed coderabbit comments

Signed-off-by: jnfrati <nicofrati@gmail.com>

* debug

* remove unused config and rename totp issuer

* deps: update dex reference to latest

* add dashboard post logout redirect uri to embedded config

* implemented api for mfa configuration

* update docs and config parsing

* catch error on idp manager init mfa

* fix tests

* Add remember me  for MFA

* Add cookie encryption and session share between tabs

* fixed logout showing non actionable error and session cookie encription key

* fixed missing mfa settings on sql query for account

* fix code index for mfa activity

---------

Signed-off-by: jnfrati <nicofrati@gmail.com>
Co-authored-by: braginini <bangvalo@gmail.com>
2026-05-08 16:31:20 +02:00
Maycon Santos
7da94a4956 [misc] Update CONTRIBUTING.md (#6076) 2026-05-07 16:16:48 +02:00
Pascal Fischer
39eac377e4 [management] add update reason to buffered calls (#6103) 2026-05-07 15:55:59 +02:00
Viktor Liu
205ebcfda2 [management, client] Add IPv6 overlay support (#5631) 2026-05-07 11:33:37 +02:00
Zoltan Papp
f23aaa9ae7 [client] iOS: structured ResolvedIPs collection for domain routes (#6090)
* [client] iOS: structured ResolvedIPs collection for domain routes

Replace comma-joined ResolvedIPs string with a gomobile-friendly
ResolvedIPs collection (Add/Get/Size), mirroring the Android bridge
in client/android/network_domains.go.

This allows the iOS app to match domain-route resolved IPs against
connected peer routes without parsing CSV strings, fixing the route
status indicator for dynamic (DNS) routes.

* [client] iOS: align dynamic route exposure with Android bridge

For dynamic (DNS) routes the Swift side previously received
"invalid Prefix" as the Network value, forcing UI code to special-case
that sentinel. The Android bridge uses Domains.SafeString() instead so
peer.routes entries (which also derive from Domains.SafeString()) match
directly. Mirror that here.

Also fix the resolved IP lookup: resolvedDomains is keyed by the
resolved domain (e.g. api.ipify.org), not the configured pattern
(e.g. *.ipify.org). Group entries by ParentDomain like the daemon does
in client/server/network.go, so wildcard route patterns get their
resolved IPs populated.
2026-05-06 17:14:11 +02:00
Viktor Liu
f532976e05 [client] Add public key to debug bundle config.txt (#6092) 2026-05-06 13:42:47 +02:00
Viktor Liu
71a400f90f [client] Include MTU and SSH auth/JWT cache config in debug bundle (#6071) 2026-05-06 13:23:43 +02:00
Pascal Fischer
bfeb9b19ec [management] remove permissions from geolocations api (#6091) 2026-05-06 13:07:01 +02:00
Pascal Fischer
b19b7464ea [management] fix flaky invite token test (#6077) v0.70.5 2026-05-05 18:48:51 +02:00
Pascal Fischer
cfb1b3fe31 [proxy] consolidate mapping update (#6072) 2026-05-05 18:40:42 +02:00
Bethuel Mmbaga
3c28d29725 [management] Map Entra oid claim as Dex user ID (#6067) 2026-05-05 18:12:18 +03:00
Nicolas Frati
1795bc801d chores: updated discussions and issues templates (#6073) 2026-05-05 07:53:01 -07:00
Viktor Liu
31395f8bd2 [client] Use fwmark-aware route lookup for raw socket UDP checksum source (#6070)
* Use fwmark-aware route lookup for raw socket UDP checksum source

* Guard nil raw socket in sharedsock WriteTo
2026-05-05 16:18:22 +02:00
Viktor Liu
cd8e71002f [client] Bump go-netroute to v0.4.0 and drop fork (#6062) 2026-05-05 15:26:27 +02:00
Pascal Fischer
97db824929 [management] fix proxy reconnect (#6063) 2026-05-04 20:43:25 +02:00
Viktor Liu
77a0992dc2 [misc] Disable govet inline analyzer and tidy go.mod (#6066) 2026-05-05 02:59:41 +09:00
JungwooShin
104990dfdd [client] Display QR code for device auth login URL (#5415) 2026-05-04 18:59:29 +02:00
alexsavio
bde632c3b2 [client] Replace WG interface monitor polling with netlink subscription on Linux (#5857) 2026-05-04 18:49:39 +02:00
Lauri Tirkkonen
4268a5cfb7 [client] Use atomic write/rename pattern for ssh config 2026-05-04 18:24:52 +02:00
Zoltan Papp
a547fc74ed [client] Use ctx.Err() instead of gRPC codes.Canceled to detect shutdown (#6019)
Detecting shutdown by inspecting the gRPC status code conflates a local
context cancellation with a server- or proxy-sent codes.Canceled. When
the latter occurs (e.g. an intermediary proxy resets the stream), the
retry loop silently terminates and the client never reconnects.

Switch to ctx.Err() in the signal Receive loop and management Sync/Job
handlers, and stop matching gRPC Canceled/DeadlineExceeded in the flow
client's isContextDone helper. With this change, a server-sent Canceled
is treated as a transient error and the backoff retry loop continues.
2026-05-04 11:59:25 +02:00
Zoltan Papp
a21f6ecb0a [client] release Status.mux before invoking notifier callbacks (#6039)
The Status recorder used to fire notifier callbacks while holding d.mux:
- notifyPeerListChanged / notifyPeerStateChangeListeners ran from inside
  the locked section of every Update*/AddPeerStateRoute/etc.
- notifyAddressChanged ran from UpdateLocalPeerState and CleanLocalPeerState
  while d.mux was held.
- onConnectionChanged was registered with a defer above defer d.mux.Unlock,
  so it executed before the mutex was released in the Mark*Connected/
  Disconnected helpers.
- notifyPeerStateChangeListeners did a blocking channel send under d.mux,
  so a slow subscriber stalled every other d.mux holder.

A listener that re-enters the recorder (e.g. calls GetFullStatus from
within a callback) deadlocks against d.mux, and any callback that takes
longer than expected stalls every other state query for its duration.

Capture the values needed for notification under the lock, release d.mux,
then call the notifier. Build per-peer router-state snapshots inside the
lock and dispatch them via dispatchRouterPeers afterwards. The router-peer
channel send stays blocking, but now happens outside d.mux so a slow
consumer cannot stall any other d.mux holder, and no peer state
transitions are silently dropped.

The notifier itself is unchanged: its internal state was already protected
by its own locks, and the field d.notifier is set once in NewRecorder and
never reassigned, so reading it without d.mux is safe.

Also fix a pre-existing race in Test_notifier_RemoveListener /
Test_notifier_SetListener: setListener spawns a goroutine that writes
listener.peers, but the tests read listener.peers without waiting for it.
2026-05-04 11:59:01 +02:00
Bethuel Mmbaga
6262b0d841 [management] Track pending approval in peer event metadata (#6040) 2026-05-04 12:47:13 +03:00
Viktor Liu
50b58a6828 [client, relay] Advertise relay server IP via signal for foreign-relay fallback dial (#6004) 2026-05-04 11:40:25 +02:00
Viktor Liu
057d651d2e [client, proxy] Add packet capture to debug bundle and CLI (#5891) 2026-05-04 11:28:56 +02:00
Misha Bragin
c4b2da4c92 [management] Add public connection ipv4 and ipv6 posture check (#6038)
This change enables admins to configure posture checks for connecting public IPs of their peers.

It changes the behavior of the check as well and now the evaluation is if the received network is part of the configured network.
2026-04-30 18:36:50 +02:00