Compare commits

..

45 Commits

Author SHA1 Message Date
riccardom
7706f578fe use API 2026-06-18 19:25:55 +02:00
riccardom
daf5026192 Adds restart for MDM 2026-06-18 19:23:41 +02:00
riccardom
ec18b07959 NOP 2026-06-18 17:27:18 +02:00
riccardom
9628f016da Bridges embed to use the supervisor + remove providing of establishedChan from caller
(We now do AsyncStart + waitEstablishedOrDone, so everything is managed inside the
supervisor)
2026-06-18 17:10:17 +02:00
riccardom
b39e9df194 Renaming 2026-06-18 16:51:06 +02:00
riccardom
0388e0f262 Merge branch 'main' into lock_removal
# Conflicts:
#	client/internal/connect.go
#	client/ios/NetBirdSDK/client.go
#	client/server/server.go
#	client/server/server_test.go
2026-06-18 16:10:40 +02:00
riccardom
86f896723d Wait on broadcasted ended signal for establishment / done 2026-06-18 15:20:11 +02:00
Viktor Liu
d3710d4bb2 [signal] Serialize concurrent sends to a peer signal stream (#6463) 2026-06-18 15:00:19 +02:00
riccardom
29ee84999c conn established (success) or done (end/failure..) are signals of the supervisor 2026-06-18 14:40:03 +02:00
riccardom
0e8fd22f36 Highlight what signals that the sup is running (which means the Connection is running because of UP/auto start) 2026-06-18 14:40:03 +02:00
riccardom
ff98105212 Clarifies: service -> ServiceRunning -> up -> ConnectionRunning -> connestablished ->connEstablished -> end of run -> connDone 2026-06-18 14:40:03 +02:00
riccardom
6465997a69 Aligns tests 2026-06-18 14:40:03 +02:00
riccardom
3204270c4b Removes all unrequired checks, since the lifetime now guarantees the non nil presence of connectClient 2026-06-18 14:40:03 +02:00
riccardom
6d3bcef2c4 Rename to clarify 2026-06-18 14:40:03 +02:00
riccardom
5d7cb30e5b Removes other occurrencies of connectClient check 2026-06-18 14:40:03 +02:00
riccardom
aff5da2c8e Log something better when UP doesn't find service grpc socket 2026-06-18 14:40:03 +02:00
Theodor Midtlien
ee360963f9 [client] Migrate profile identity from display name to ID and allow renaming of profiles (#6367)
* Migrate to profile ids

* Migrate android profile manager

* Clean up

* Fix review

* Add ID type

* Fix test and runes in ShortID()

* Fix profile switch on up and android comments

* Revert android profile to string id

* Fix feedback

* Fix UI feedback

* Fix id assignment

* Add renaming of profiles

* Fix review

* Remove ui binary
* Fix getProfileConfigPath not validating id

* Change resolve handle order and fix server merge problems

* Fix mdm test
2026-06-18 08:49:19 +02:00
riccardom
9b179be324 Defines an API for knowing if the SERVICE is running (regardless of up and down state)
New() builds s.connecClient and is called when the gRPC service is started.
Up() is invoked only IF a gRPC service IS running which is possible only if the New() was
called.
2026-06-17 23:30:30 +02:00
riccardom
33e7b6a8f1 Align tests 2026-06-17 23:00:53 +02:00
riccardom
e0cff5e240 If New creates, Starts MUST find a connectClient 2026-06-17 23:00:53 +02:00
Maycon Santos
8d9580e491 [misc] improve goreleaser with RC handling and update docker builds (#6438)
- introduce variables to avoid publishing latest docker tags and installers
- Refactor .goreleaser.yaml to simplify docker configurations and add environment-driven flags
- removed management debug containers (it was doing only log var)
- Stopped building arm v6 32bits in favor of v7 32 bits for services (not client)
- Add target argument to docker files
2026-06-17 20:13:13 +02:00
Viktor Liu
5bd7c6c7ea [client] Detect and recover from a stalled signal receive stream (#6459) 2026-06-17 18:48:09 +02:00
Zoltan Papp
8ae2cd0a08 [client] Fix ios route notify ordering (#6454)
* [client] fix iOS route-update reordering that black-holed IPv6 on exit-node disable

On iOS the route notifier delivered each prefix update from its own
fire-and-forget goroutine (notify -> `go func`), so Go provided no ordering
guarantee between consecutive updates. It also read currentPrefixes inside
that goroutine without holding the lock, racing the next OnNewPrefixes write.

On exit-node disable the core removes the default routes as two separate
prefix updates (0.0.0.0/0, then the synthesized ::/0). When the two
goroutines were reordered, the stale snapshot still containing ::/0 was
delivered last and clobbered the correct default-free one. iOS then kept the
::/0 default route on the tunnel with no exit node to carry it, black-holing
all IPv6 traffic while IPv4 recovered correctly.

Fix: deliver updates through a single worker goroutine fed by a buffered
channel, preserving production order, and snapshot the joined prefix string
under the mutex so it can't race a concurrent update. Buffered so producers
(which run under the route manager lock) don't block on the listener callback.

* [client] close iOS notifier delivery goroutine on Stop, unbounded queue

The delivery goroutine was never stopped, leaking on every engine
restart. Add Notifier.Close, called from the route manager Stop after
routing cleanup.

Replace the buffered update channel with a cond-driven linked-list
queue so route-update producers (running under the route manager lock)
never block when the listener callback is slow.
2026-06-17 18:29:33 +02:00
Pascal Fischer
e4397d4d46 [management] remove nmap calc from login (#6449) 2026-06-17 16:37:24 +02:00
Viktor Liu
6fbc90b4d3 [client, relay] Expose relay transport and connection errors in status and metrics (#6342) 2026-06-17 15:41:48 +02:00
Riccardo Manfrin
5095e17cc5 [management] fix flaky Test_SaveAccount_Large from random IP collision (#6452) 2026-06-17 14:00:50 +02:00
riccardom
0085aebf77 Removes external connectWithRetryRuns 2026-06-17 08:52:02 +02:00
riccardom
91d2d341b7 Guard is done inside not from external. Stop called unconditionally 2026-06-16 23:18:50 +02:00
riccardom
8d46580c13 Not the run duty 2026-06-16 23:09:59 +02:00
riccardom
b42fe6a10f And now let's just avoid it at all 2026-06-16 21:58:05 +02:00
riccardom
0f5d7fdc07 Removes deadlock 2026-06-16 21:57:07 +02:00
riccardom
13c78d98f5 Client (and the supervisor within) now lives forever.
So checkin that it's nil isn't anymore an indirect
way to know the cleanup has succeeded
2026-06-16 18:55:08 +02:00
riccardom
d1229ed84c Restores context MD passed to GetInfo to mgmt (is it valuable data?) 2026-06-16 18:55:08 +02:00
riccardom
9758145517 Discriminate auth fails from mgmt unreachable.
Needed to avoid this workaround

if status != internal.StatusIdle && status != internal.StatusConnected && status != internal.StatusConnecting {
     s.actCancel()
}

in server.go Status(...)
2026-06-16 18:55:08 +02:00
riccardom
200a5a6a70 Rename. We keep only start command (long lasting command)
For stop we do synchronously.
2026-06-16 18:55:08 +02:00
riccardom
1f7b1ea863 IsRunning V1 2026-06-16 18:55:08 +02:00
riccardom
4abb10c1aa Fixes DisableAutoConnect semantics
DisableAutoConnect semantics
============================

Scope: governs ONLY the service-Start auto-connect decision. Once the
connection goroutine has been spawned (by any path), the flag is never
consulted again — the retry loop keeps trying to connect until ctx is
cancelled by Down / Stop / Logout.

Cases
-----

1. Service Start + DisableAutoConnect = true
   - No connection goroutine spawned.
   - clientRunning stays false.
   - State set to StatusIdle.
   - Daemon stays passive until an explicit Up RPC.

2. Service Start + DisableAutoConnect = false
   - Spawn connectWithRetryRuns.
   - clientRunning = true.
   - Retry loop runs until ctx cancelled.

3. Up RPC (any value of DisableAutoConnect)
   - Flag ignored. The user / admin explicitly asked to connect — by
     definition not "auto".
   - Spawn connectWithRetryRuns.
   - clientRunning = true.

4. MDM-triggered restart (any value of DisableAutoConnect)
   - Flag ignored. An MDM policy change applies new config to an
     already-running engine; treated as an implicit Up.
   - Spawn connectWithRetryRuns.
   - clientRunning = true.

5. Down / Stop / Logout
   - Cancels ctx → connectWithRetryRuns exits → close(giveUpChan).
   - cleanupConnection clears clientRunning = false.
   - DisableAutoConnect not involved.
Prepare for next step

Collapse error log into s.connect and rename it to more explicit connectOnce

# Conflicts:
#	client/server/server.go
2026-06-16 18:55:08 +02:00
riccardom
a45cefe57a IsRunning V0 2026-06-16 18:55:08 +02:00
riccardom
a6d504633f Use Stop not other direct calls like actCancel() things.. for now adding then removing 2026-06-16 18:55:08 +02:00
riccardom
70f2097fff config becomes start/run arg so we are sure it gets updated on any (re)start
This is because we now have the supervisor not destroyed over time

Still intermediate step where server.go regenerates the client entirely.
2026-06-16 18:31:46 +02:00
riccardom
befa9a879c DO NOT recreate ConnectClient! 2026-06-16 18:31:46 +02:00
riccardom
4152c41796 Wire the stop-for-any-reason to the sup stop (and remove race on engine!) 2026-06-16 18:31:46 +02:00
riccardom
8b76b3d824 Keep the command DON'T DO COPIES!!! 2026-06-16 18:31:46 +02:00
riccardom
0503a18644 Wraps client lifetime into a supervisor
- define needed supervisor context/variables
- will use runCancel as knob to know if the client is running. No extra boolean flags
- runWaiter is used to signal to the async run caller
2026-06-16 18:31:46 +02:00
riccardom
ec6512d660 Docker build env for windows/android 2026-06-16 15:31:52 +02:00
90 changed files with 2159 additions and 1618 deletions

View File

@@ -9,10 +9,13 @@ on:
pull_request:
env:
SIGN_PIPE_VER: "v0.1.5"
GORELEASER_VER: "v2.14.3"
SIGN_PIPE_VER: "v0.1.6"
GORELEASER_VER: "v2.16.0"
PRODUCT_NAME: "NetBird"
COPYRIGHT: "NetBird GmbH"
flags: ""
SKIP_PUBLISH: "true"
SKIP_DOCKER_PUSH: "false"
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref || github.actor_id }}
@@ -130,8 +133,6 @@ jobs:
windows_packages_artifact_url: ${{ steps.upload_windows_packages.outputs.artifact-url }}
macos_packages_artifact_url: ${{ steps.upload_macos_packages.outputs.artifact-url }}
ghcr_images: ${{ steps.tag_and_push_images.outputs.images_markdown }}
env:
flags: ""
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -143,8 +144,27 @@ jobs:
id: semver_parser
uses: netbirdio/shared-actions/actions/parse-semver@be5df6047383da2236e02243cceb857d8567c27e # v0.0.2
- if: ${{ !startsWith(github.ref, 'refs/tags/v') }}
run: echo "flags=--snapshot" >> $GITHUB_ENV
- name: Set snapshot flag
if: ${{ !startsWith(github.ref, 'refs/tags/v') }}
run: |
echo "flags=--snapshot" >> $GITHUB_ENV
- name: Set build vars
if: ${{ startsWith(github.ref, 'refs/tags/v') }}
run: |
if [[ "x-${{ steps.semver_parser.outputs.prerelease }}" == "x-" && "x-${{ github.repository }}" == "x-netbirdio/netbird" ]]; then
echo "x-${{ github.repository }}"
echo "x-${{ steps.semver_parser.outputs.prerelease }}"
echo "SKIP_PUBLISH=false" >> $GITHUB_ENV
else
echo "x-${{ github.repository }}"
echo "x-${{ steps.semver_parser.outputs.prerelease }}"
fi
if [[ "x-${{ github.repository }}" != "x-netbirdio/netbird" ]]; then
echo "SKIP_DOCKER_PUSH=true" >> $GITHUB_ENV
fi
- name: Set up Go
uses: actions/setup-go@4b73464bb391d4059bd26b0524d20df3927bd417 # v6.3.0
with:
@@ -212,6 +232,8 @@ jobs:
UPLOAD_YUM_SECRET: ${{ secrets.PKG_UPLOAD_SECRET }}
GPG_RPM_KEY_FILE: ${{ env.GPG_RPM_KEY_FILE }}
NFPM_NETBIRD_RPM_PASSPHRASE: ${{ secrets.GPG_RPM_PASSPHRASE }}
SKIP_PUBLISH: ${{ env.SKIP_PUBLISH }}
SKIP_DOCKER_PUSH: ${{ env.SKIP_DOCKER_PUSH }}
- name: Verify RPM signatures
run: |
docker run --rm -v $(pwd)/dist:/dist fedora:41 bash -c '
@@ -334,8 +356,22 @@ jobs:
id: semver_parser
uses: netbirdio/shared-actions/actions/parse-semver@be5df6047383da2236e02243cceb857d8567c27e # v0.0.2
- if: ${{ !startsWith(github.ref, 'refs/tags/v') }}
run: echo "flags=--snapshot" >> $GITHUB_ENV
- name: Set snapshot flag
if: ${{ !startsWith(github.ref, 'refs/tags/v') }}
run: |
echo "flags=--snapshot" >> $GITHUB_ENV
- name: Set build vars
if: ${{ startsWith(github.ref, 'refs/tags/v') }}
run: |
if [[ "x-${{ steps.semver_parser.outputs.prerelease }}" == "x-" && "x-${{ github.repository }}" == "x-netbirdio/netbird" ]]; then
echo "x-${{ github.repository }}"
echo "x-${{ steps.semver_parser.outputs.prerelease }}"
echo "SKIP_PUBLISH=false" >> $GITHUB_ENV
else
echo "x-${{ github.repository }}"
echo "x-${{ steps.semver_parser.outputs.prerelease }}"
fi
- name: Set up Go
uses: actions/setup-go@4b73464bb391d4059bd26b0524d20df3927bd417 # v6.3.0
@@ -395,6 +431,7 @@ jobs:
UPLOAD_YUM_SECRET: ${{ secrets.PKG_UPLOAD_SECRET }}
GPG_RPM_KEY_FILE: ${{ env.GPG_RPM_KEY_FILE }}
NFPM_NETBIRD_UI_RPM_PASSPHRASE: ${{ secrets.GPG_RPM_PASSPHRASE }}
SKIP_PUBLISH: ${{ env.SKIP_PUBLISH }}
- name: Verify RPM signatures
run: |
docker run --rm -v $(pwd)/dist:/dist fedora:41 bash -c '

View File

@@ -1,5 +1,7 @@
version: 2
env:
- SKIP_PUBLISH={{ if index .Env "SKIP_PUBLISH" }}{{ .Env.SKIP_PUBLISH }}{{ else }}true{{ end }}
- SKIP_DOCKER_PUSH={{ if index .Env "SKIP_DOCKER_PUSH" }}{{ .Env.SKIP_DOCKER_PUSH }}{{ else }}false{{ end }}
project_name: netbird
builds:
- id: netbird-wasm
@@ -74,6 +76,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X github.com/netbirdio/netbird/version.version={{.Version}} -X main.commit={{.Commit}} -X main.date={{.CommitDate}} -X main.builtBy=goreleaser
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -88,6 +92,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X github.com/netbirdio/netbird/version.version={{.Version}} -X main.commit={{.Commit}} -X main.date={{.CommitDate}} -X main.builtBy=goreleaser
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -102,6 +108,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X github.com/netbirdio/netbird/version.version={{.Version}} -X main.commit={{.Commit}} -X main.date={{.CommitDate}} -X main.builtBy=goreleaser
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -122,6 +130,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X github.com/netbirdio/netbird/version.version={{.Version}} -X main.commit={{.Commit}} -X main.date={{.CommitDate}} -X main.builtBy=goreleaser
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -136,6 +146,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X github.com/netbirdio/netbird/version.version={{.Version}} -X main.commit={{.Commit}} -X main.date={{.CommitDate}} -X main.builtBy=goreleaser
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -150,6 +162,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X main.Version={{.Version}} -X main.Commit={{.Commit}} -X main.BuildDate={{.CommitDate}}
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -170,6 +184,8 @@ builds:
- amd64
- arm64
- arm
goarm:
- 7
ldflags:
- -s -w -X github.com/netbirdio/netbird/version.version={{.Version}} -X main.commit={{.Commit}} -X main.date={{.CommitDate}} -X main.builtBy=goreleaser
mod_timestamp: "{{ .CommitTimestamp }}"
@@ -222,670 +238,192 @@ nfpms:
rpm:
signature:
key_file: '{{ if index .Env "GPG_RPM_KEY_FILE" }}{{ .Env.GPG_RPM_KEY_FILE }}{{ end }}'
dockers:
- image_templates:
- netbirdio/netbird:{{ .Version }}-amd64
- ghcr.io/netbirdio/netbird:{{ .Version }}-amd64
ids:
- netbird
goarch: amd64
use: buildx
dockerfile: client/Dockerfile
extra_files:
- client/netbird-entrypoint.sh
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/netbird:{{ .Version }}-arm64v8
ids:
- netbird
goarch: arm64
use: buildx
dockerfile: client/Dockerfile
extra_files:
- client/netbird-entrypoint.sh
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird:{{ .Version }}-arm
- ghcr.io/netbirdio/netbird:{{ .Version }}-arm
ids:
- netbird
goarch: arm
goarm: 6
use: buildx
dockerfile: client/Dockerfile
extra_files:
- client/netbird-entrypoint.sh
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird:{{ .Version }}-rootless-amd64
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-amd64
ids:
- netbird
goarch: amd64
use: buildx
dockerfile: client/Dockerfile-rootless
extra_files:
- client/netbird-entrypoint.sh
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird:{{ .Version }}-rootless-arm64v8
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-arm64v8
ids:
- netbird
goarch: arm64
use: buildx
dockerfile: client/Dockerfile-rootless
extra_files:
- client/netbird-entrypoint.sh
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird:{{ .Version }}-rootless-arm
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-arm
ids:
- netbird
goarch: arm
goarm: 6
use: buildx
dockerfile: client/Dockerfile-rootless
extra_files:
- client/netbird-entrypoint.sh
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/relay:{{ .Version }}-amd64
- ghcr.io/netbirdio/relay:{{ .Version }}-amd64
ids:
- netbird-relay
goarch: amd64
use: buildx
dockerfile: relay/Dockerfile
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/relay:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/relay:{{ .Version }}-arm64v8
ids:
- netbird-relay
goarch: arm64
use: buildx
dockerfile: relay/Dockerfile
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/relay:{{ .Version }}-arm
- ghcr.io/netbirdio/relay:{{ .Version }}-arm
ids:
- netbird-relay
goarch: arm
goarm: 6
use: buildx
dockerfile: relay/Dockerfile
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/signal:{{ .Version }}-amd64
- ghcr.io/netbirdio/signal:{{ .Version }}-amd64
ids:
- netbird-signal
goarch: amd64
use: buildx
dockerfile: signal/Dockerfile
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/signal:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/signal:{{ .Version }}-arm64v8
ids:
- netbird-signal
goarch: arm64
use: buildx
dockerfile: signal/Dockerfile
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/signal:{{ .Version }}-arm
- ghcr.io/netbirdio/signal:{{ .Version }}-arm
ids:
- netbird-signal
goarch: arm
goarm: 6
use: buildx
dockerfile: signal/Dockerfile
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/management:{{ .Version }}-amd64
- ghcr.io/netbirdio/management:{{ .Version }}-amd64
ids:
- netbird-mgmt
goarch: amd64
use: buildx
dockerfile: management/Dockerfile
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/management:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/management:{{ .Version }}-arm64v8
ids:
- netbird-mgmt
goarch: arm64
use: buildx
dockerfile: management/Dockerfile
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/management:{{ .Version }}-arm
- ghcr.io/netbirdio/management:{{ .Version }}-arm
ids:
- netbird-mgmt
goarch: arm
goarm: 6
use: buildx
dockerfile: management/Dockerfile
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/management:{{ .Version }}-debug-amd64
- ghcr.io/netbirdio/management:{{ .Version }}-debug-amd64
ids:
- netbird-mgmt
goarch: amd64
use: buildx
dockerfile: management/Dockerfile.debug
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/management:{{ .Version }}-debug-arm64v8
- ghcr.io/netbirdio/management:{{ .Version }}-debug-arm64v8
ids:
- netbird-mgmt
goarch: arm64
use: buildx
dockerfile: management/Dockerfile.debug
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/management:{{ .Version }}-debug-arm
- ghcr.io/netbirdio/management:{{ .Version }}-debug-arm
ids:
- netbird-mgmt
goarch: arm
goarm: 6
use: buildx
dockerfile: management/Dockerfile.debug
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/upload:{{ .Version }}-amd64
- ghcr.io/netbirdio/upload:{{ .Version }}-amd64
ids:
- netbird-upload
goarch: amd64
use: buildx
dockerfile: upload-server/Dockerfile
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/upload:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/upload:{{ .Version }}-arm64v8
ids:
- netbird-upload
goarch: arm64
use: buildx
dockerfile: upload-server/Dockerfile
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/upload:{{ .Version }}-arm
- ghcr.io/netbirdio/upload:{{ .Version }}-arm
ids:
- netbird-upload
goarch: arm
goarm: 6
use: buildx
dockerfile: upload-server/Dockerfile
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird-server:{{ .Version }}-amd64
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-amd64
ids:
- netbird-server
goarch: amd64
use: buildx
dockerfile: combined/Dockerfile
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird-server:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-arm64v8
ids:
- netbird-server
goarch: arm64
use: buildx
dockerfile: combined/Dockerfile
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/netbird-server:{{ .Version }}-arm
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-arm
ids:
- netbird-server
goarch: arm
goarm: 6
use: buildx
dockerfile: combined/Dockerfile
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/reverse-proxy:{{ .Version }}-amd64
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-amd64
ids:
- netbird-proxy
goarch: amd64
use: buildx
dockerfile: proxy/Dockerfile
build_flag_templates:
- "--platform=linux/amd64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/reverse-proxy:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-arm64v8
ids:
- netbird-proxy
goarch: arm64
use: buildx
dockerfile: proxy/Dockerfile
build_flag_templates:
- "--platform=linux/arm64"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
- image_templates:
- netbirdio/reverse-proxy:{{ .Version }}-arm
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-arm
ids:
- netbird-proxy
goarch: arm
goarm: 6
use: buildx
dockerfile: proxy/Dockerfile
build_flag_templates:
- "--platform=linux/arm"
- "--label=org.opencontainers.image.created={{.Date}}"
- "--label=org.opencontainers.image.title={{.ProjectName}}"
- "--label=org.opencontainers.image.version={{.Version}}"
- "--label=org.opencontainers.image.revision={{.FullCommit}}"
- "--label=org.opencontainers.image.source=https://github.com/netbirdio/{{.ProjectName}}"
- "--label=maintainer=dev@netbird.io"
docker_manifests:
- name_template: netbirdio/netbird:{{ .Version }}
image_templates:
- netbirdio/netbird:{{ .Version }}-arm64v8
- netbirdio/netbird:{{ .Version }}-arm
- netbirdio/netbird:{{ .Version }}-amd64
- name_template: netbirdio/netbird:latest
image_templates:
- netbirdio/netbird:{{ .Version }}-arm64v8
- netbirdio/netbird:{{ .Version }}-arm
- netbirdio/netbird:{{ .Version }}-amd64
- name_template: netbirdio/netbird:{{ .Version }}-rootless
image_templates:
- netbirdio/netbird:{{ .Version }}-rootless-arm64v8
- netbirdio/netbird:{{ .Version }}-rootless-arm
- netbirdio/netbird:{{ .Version }}-rootless-amd64
- name_template: netbirdio/netbird:rootless-latest
image_templates:
- netbirdio/netbird:{{ .Version }}-rootless-arm64v8
- netbirdio/netbird:{{ .Version }}-rootless-arm
- netbirdio/netbird:{{ .Version }}-rootless-amd64
- name_template: netbirdio/relay:{{ .Version }}
image_templates:
- netbirdio/relay:{{ .Version }}-arm64v8
- netbirdio/relay:{{ .Version }}-arm
- netbirdio/relay:{{ .Version }}-amd64
- name_template: netbirdio/relay:latest
image_templates:
- netbirdio/relay:{{ .Version }}-arm64v8
- netbirdio/relay:{{ .Version }}-arm
- netbirdio/relay:{{ .Version }}-amd64
- name_template: netbirdio/signal:{{ .Version }}
image_templates:
- netbirdio/signal:{{ .Version }}-arm64v8
- netbirdio/signal:{{ .Version }}-arm
- netbirdio/signal:{{ .Version }}-amd64
- name_template: netbirdio/signal:latest
image_templates:
- netbirdio/signal:{{ .Version }}-arm64v8
- netbirdio/signal:{{ .Version }}-arm
- netbirdio/signal:{{ .Version }}-amd64
- name_template: netbirdio/management:{{ .Version }}
image_templates:
- netbirdio/management:{{ .Version }}-arm64v8
- netbirdio/management:{{ .Version }}-arm
- netbirdio/management:{{ .Version }}-amd64
- name_template: netbirdio/management:latest
image_templates:
- netbirdio/management:{{ .Version }}-arm64v8
- netbirdio/management:{{ .Version }}-arm
- netbirdio/management:{{ .Version }}-amd64
- name_template: netbirdio/management:debug-latest
image_templates:
- netbirdio/management:{{ .Version }}-debug-arm64v8
- netbirdio/management:{{ .Version }}-debug-arm
- netbirdio/management:{{ .Version }}-debug-amd64
- name_template: netbirdio/upload:{{ .Version }}
image_templates:
- netbirdio/upload:{{ .Version }}-arm64v8
- netbirdio/upload:{{ .Version }}-arm
- netbirdio/upload:{{ .Version }}-amd64
- name_template: netbirdio/upload:latest
image_templates:
- netbirdio/upload:{{ .Version }}-arm64v8
- netbirdio/upload:{{ .Version }}-arm
- netbirdio/upload:{{ .Version }}-amd64
- name_template: netbirdio/netbird-server:{{ .Version }}
image_templates:
- netbirdio/netbird-server:{{ .Version }}-arm64v8
- netbirdio/netbird-server:{{ .Version }}-arm
- netbirdio/netbird-server:{{ .Version }}-amd64
- name_template: netbirdio/netbird-server:latest
image_templates:
- netbirdio/netbird-server:{{ .Version }}-arm64v8
- netbirdio/netbird-server:{{ .Version }}-arm
- netbirdio/netbird-server:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/netbird:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/netbird:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/netbird:{{ .Version }}-arm
- ghcr.io/netbirdio/netbird:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/netbird:latest
image_templates:
- ghcr.io/netbirdio/netbird:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/netbird:{{ .Version }}-arm
- ghcr.io/netbirdio/netbird:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/netbird:{{ .Version }}-rootless
image_templates:
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-arm64v8
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-arm
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-amd64
- name_template: ghcr.io/netbirdio/netbird:rootless-latest
image_templates:
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-arm64v8
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-arm
- ghcr.io/netbirdio/netbird:{{ .Version }}-rootless-amd64
- name_template: ghcr.io/netbirdio/relay:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/relay:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/relay:{{ .Version }}-arm
- ghcr.io/netbirdio/relay:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/relay:latest
image_templates:
- ghcr.io/netbirdio/relay:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/relay:{{ .Version }}-arm
- ghcr.io/netbirdio/relay:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/signal:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/signal:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/signal:{{ .Version }}-arm
- ghcr.io/netbirdio/signal:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/signal:latest
image_templates:
- ghcr.io/netbirdio/signal:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/signal:{{ .Version }}-arm
- ghcr.io/netbirdio/signal:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/management:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/management:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/management:{{ .Version }}-arm
- ghcr.io/netbirdio/management:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/management:latest
image_templates:
- ghcr.io/netbirdio/management:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/management:{{ .Version }}-arm
- ghcr.io/netbirdio/management:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/management:debug-latest
image_templates:
- ghcr.io/netbirdio/management:{{ .Version }}-debug-arm64v8
- ghcr.io/netbirdio/management:{{ .Version }}-debug-arm
- ghcr.io/netbirdio/management:{{ .Version }}-debug-amd64
- name_template: ghcr.io/netbirdio/upload:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/upload:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/upload:{{ .Version }}-arm
- ghcr.io/netbirdio/upload:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/upload:latest
image_templates:
- ghcr.io/netbirdio/upload:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/upload:{{ .Version }}-arm
- ghcr.io/netbirdio/upload:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/netbird-server:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-arm
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/netbird-server:latest
image_templates:
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-arm
- ghcr.io/netbirdio/netbird-server:{{ .Version }}-amd64
- name_template: netbirdio/reverse-proxy:{{ .Version }}
image_templates:
- netbirdio/reverse-proxy:{{ .Version }}-arm64v8
- netbirdio/reverse-proxy:{{ .Version }}-arm
- netbirdio/reverse-proxy:{{ .Version }}-amd64
- name_template: netbirdio/reverse-proxy:latest
image_templates:
- netbirdio/reverse-proxy:{{ .Version }}-arm64v8
- netbirdio/reverse-proxy:{{ .Version }}-arm
- netbirdio/reverse-proxy:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/reverse-proxy:{{ .Version }}
image_templates:
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-arm
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-amd64
- name_template: ghcr.io/netbirdio/reverse-proxy:latest
image_templates:
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-arm64v8
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-arm
- ghcr.io/netbirdio/reverse-proxy:{{ .Version }}-amd64
dockers_v2:
- id: netbird
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird
images:
- netbirdio/netbird
- ghcr.io/netbirdio/netbird
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: client/Dockerfile
extra_files:
- client/netbird-entrypoint.sh
platforms:
- linux/amd64
- linux/arm64
- linux/arm/6
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: netbird-rootless
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird
images:
- netbirdio/netbird
- ghcr.io/netbirdio/netbird
tags:
- "v{{ .Version }}-rootless"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: client/Dockerfile-rootless
extra_files:
- client/netbird-entrypoint.sh
platforms:
- linux/amd64
- linux/arm64
- linux/arm/6
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: relay
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird-relay
images:
- netbirdio/relay
- ghcr.io/netbirdio/relay
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: relay/Dockerfile
platforms:
- linux/amd64
- linux/arm64
- linux/arm
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: signal
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird-signal
images:
- netbirdio/signal
- ghcr.io/netbirdio/signal
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: signal/Dockerfile
platforms:
- linux/amd64
- linux/arm64
- linux/arm
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: management
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird-mgmt
images:
- netbirdio/management
- ghcr.io/netbirdio/management
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: management/Dockerfile
platforms:
- linux/amd64
- linux/arm64
- linux/arm
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: upload
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird-upload
images:
- netbirdio/upload
- ghcr.io/netbirdio/upload
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: upload-server/Dockerfile
platforms:
- linux/amd64
- linux/arm64
- linux/arm
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: netbird-server
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird-server
images:
- netbirdio/netbird-server
- ghcr.io/netbirdio/netbird-server
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: combined/Dockerfile
platforms:
- linux/amd64
- linux/arm64
- linux/arm
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
- id: netbird-proxy
disable: "{{ .Env.SKIP_DOCKER_PUSH }}"
ids:
- netbird-proxy
images:
- netbirdio/reverse-proxy
- ghcr.io/netbirdio/reverse-proxy
tags:
- "v{{ .Version }}"
- "{{ if eq .Env.SKIP_PUBLISH \"false\" }}latest{{ end }}"
dockerfile: proxy/Dockerfile
platforms:
- linux/amd64
- linux/arm64
- linux/arm
annotations:
"org.opencontainers.image.created": "{{.Date}}"
"org.opencontainers.image.title": "{{.ProjectName}}"
"org.opencontainers.image.version": "{{.Version}}"
"org.opencontainers.image.revision": "{{.FullCommit}}"
"org.opencontainers.image.source": "{{.GitURL}}"
"maintainer": "dev@netbird.io"
brews:
- ids:
- default
skip_upload: "{{ .Env.SKIP_PUBLISH }}"
repository:
owner: netbirdio
name: homebrew-tap
@@ -902,6 +440,7 @@ brews:
uploads:
- name: debian
skip: "{{ .Env.SKIP_PUBLISH }}"
ids:
- netbird_deb
mode: archive
@@ -910,6 +449,7 @@ uploads:
method: PUT
- name: yum
skip: "{{ .Env.SKIP_PUBLISH }}"
ids:
- netbird_rpm
mode: archive

View File

@@ -1,5 +1,6 @@
version: 2
env:
- SKIP_PUBLISH={{ if index .Env "SKIP_PUBLISH" }}{{ .Env.SKIP_PUBLISH }}{{ else }}true{{ end }}
project_name: netbird-ui
builds:
- id: netbird-ui
@@ -101,6 +102,7 @@ nfpms:
uploads:
- name: debian
skip: "{{ .Env.SKIP_PUBLISH }}"
ids:
- netbird_ui_deb
mode: archive
@@ -109,6 +111,7 @@ uploads:
method: PUT
- name: yum
skip: "{{ .Env.SKIP_PUBLISH }}"
ids:
- netbird_ui_rpm
mode: archive

View File

@@ -4,7 +4,7 @@
# sudo podman build -t localhost/netbird:latest -f client/Dockerfile --ignorefile .dockerignore-client .
# sudo podman run --rm -it --cap-add={BPF,NET_ADMIN,NET_RAW} localhost/netbird:latest
FROM alpine:3.23.3
FROM alpine:3.24
# iproute2: busybox doesn't display ip rules properly
RUN apk add --no-cache \
bash \
@@ -21,7 +21,7 @@ ENV \
NB_ENTRYPOINT_SERVICE_TIMEOUT="30"
ENTRYPOINT [ "/usr/local/bin/netbird-entrypoint.sh" ]
ARG NETBIRD_BINARY=netbird
ARG TARGETPLATFORM
ARG NETBIRD_BINARY=$TARGETPLATFORM/netbird
COPY client/netbird-entrypoint.sh /usr/local/bin/netbird-entrypoint.sh
COPY "${NETBIRD_BINARY}" /usr/local/bin/netbird

View File

@@ -4,7 +4,7 @@
# podman build -t localhost/netbird:latest -f client/Dockerfile --ignorefile .dockerignore-client .
# podman run --rm -it --cap-add={BPF,NET_ADMIN,NET_RAW} localhost/netbird:latest
FROM alpine:3.22.0
FROM alpine:3.24
RUN apk add --no-cache \
bash \
@@ -27,7 +27,7 @@ ENV \
NB_ENTRYPOINT_SERVICE_TIMEOUT="30"
ENTRYPOINT [ "/usr/local/bin/netbird-entrypoint.sh" ]
ARG NETBIRD_BINARY=netbird
ARG TARGETPLATFORM
ARG NETBIRD_BINARY=$TARGETPLATFORM/netbird
COPY client/netbird-entrypoint.sh /usr/local/bin/netbird-entrypoint.sh
COPY "${NETBIRD_BINARY}" /usr/local/bin/netbird

View File

@@ -151,9 +151,9 @@ func (c *Client) Run(platformFiles PlatformFiles, urlOpener URLOpener, isAndroid
// todo do not throw error in case of cancelled context
ctx = internal.CtxInitState(ctx)
connectClient := internal.NewConnectClient(ctx, cfg, c.recorder)
connectClient := internal.NewConnectClient(ctx, c.recorder)
c.setState(cfg, cacheDir, connectClient)
return connectClient.RunOnAndroid(c.tunAdapter, c.iFaceDiscover, c.networkChangeListener, slices.Clone(dns.items), dnsReadyListener, stateFile, cacheDir)
return connectClient.RunOnAndroid(cfg, c.tunAdapter, c.iFaceDiscover, c.networkChangeListener, slices.Clone(dns.items), dnsReadyListener, stateFile, cacheDir)
}
// RunWithoutLogin we apply this type of run function when the backed has been started without UI (i.e. after reboot).
@@ -186,9 +186,9 @@ func (c *Client) RunWithoutLogin(platformFiles PlatformFiles, dns *DNSList, dnsR
// todo do not throw error in case of cancelled context
ctx = internal.CtxInitState(ctx)
connectClient := internal.NewConnectClient(ctx, cfg, c.recorder)
connectClient := internal.NewConnectClient(ctx, c.recorder)
c.setState(cfg, cacheDir, connectClient)
return connectClient.RunOnAndroid(c.tunAdapter, c.iFaceDiscover, c.networkChangeListener, slices.Clone(dns.items), dnsReadyListener, stateFile, cacheDir)
return connectClient.RunOnAndroid(cfg, c.tunAdapter, c.iFaceDiscover, c.networkChangeListener, slices.Clone(dns.items), dnsReadyListener, stateFile, cacheDir)
}
// Stop the internal client and free the resources

View File

@@ -20,6 +20,7 @@ import (
"github.com/spf13/cobra"
"github.com/spf13/pflag"
"google.golang.org/grpc"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
daddr "github.com/netbirdio/netbird/client/internal/daemonaddr"
@@ -261,17 +262,46 @@ func FlagNameToEnvVar(cmdFlag string, prefix string) string {
return prefix + upper
}
// DialClientGRPCServer returns client connection to the daemon server.
// DialClientGRPCServer returns client connection to the daemon server. It waits
// (up to the timeout) for the daemon to become reachable so an `up` issued right
// after `service start` tolerates the startup race. Instead of grpc's blocking
// dial — whose raw "transport failed" retry warnings are silenced by the logger
// config — we drive the wait ourselves and emit one clean line per failed attempt.
func DialClientGRPCServer(ctx context.Context, addr string) (*grpc.ClientConn, error) {
ctx, cancel := context.WithTimeout(ctx, time.Second*10)
defer cancel()
return grpc.DialContext(
conn, err := grpc.DialContext(
ctx,
strings.TrimPrefix(addr, "tcp://"),
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithBlock(),
)
if err != nil {
return nil, err
}
conn.Connect()
for {
state := conn.GetState()
if state == connectivity.Ready {
return conn, nil
}
// Log only once the connection has actually failed — not during the
// brief Idle/Connecting phase on a healthy daemon (avoids a spurious
// line + wait when the daemon is already up).
if state == connectivity.TransientFailure {
log.Infof("waiting for the netbird daemon to become available at %s...", addr)
}
// Wake on the next state change, but at least every second so a stuck
// TransientFailure re-logs at a steady cadence until the timeout.
waitCtx, waitCancel := context.WithTimeout(ctx, time.Second)
conn.WaitForStateChange(waitCtx, state)
waitCancel()
if ctx.Err() != nil {
_ = conn.Close()
return nil, fmt.Errorf("daemon not reachable at %s: %w", addr, ctx.Err())
}
}
}
// WithBackOff execute function in backoff cycle.

View File

@@ -201,10 +201,10 @@ func runInForegroundMode(ctx context.Context, cmd *cobra.Command, activeProf *pr
r := peer.NewRecorder(config.ManagementURL.String())
r.GetFullStatus()
connectClient := internal.NewConnectClient(ctx, config, r)
connectClient := internal.NewConnectClient(ctx, r)
SetupDebugHandler(ctx, config, r, connectClient, "")
return connectClient.Run(nil, util.FindFirstLogPath(logFiles))
return connectClient.Run(config, nil, util.FindFirstLogPath(logFiles))
}
func runInDaemonMode(ctx context.Context, cmd *cobra.Command, pm *profilemanager.ProfileManager, activeProf *profilemanager.Profile, profileSwitched bool) error {

View File

@@ -264,32 +264,24 @@ func (c *Client) Start(startCtx context.Context) error {
if err, _ := authClient.Login(ctx, c.setupKey, c.jwtToken); err != nil {
return fmt.Errorf("login: %w", err)
}
client := internal.NewConnectClient(ctx, c.config, c.recorder)
client := internal.NewConnectClient(ctx, c.recorder)
client.SetSyncResponsePersistence(true)
// either startup error (permanent backoff err) or nil err (successful engine up)
// The supervisor owns the run; we wait until it is established, ends with a
// startup error (permanent backoff err), or startCtx expires.
// TODO: make after-startup backoff err available
run := make(chan struct{})
clientErr := make(chan error, 1)
go func() {
if err := client.Run(run, ""); err != nil {
clientErr <- err
}
}()
client.RunAsync(c.config, nil)
select {
case <-startCtx.Done():
// Cancel the client context before stopping: Engine.Start blocks on the
// signal stream while holding the engine mutex and only unblocks on
// cancellation. Stopping first would deadlock on that mutex.
if err := client.WaitEstablishedOrDone(startCtx); err != nil {
// Either startCtx expired while connecting, or the run ended before it
// established. Cancel the client context before stopping: Engine.Start
// blocks on the signal stream while holding the engine mutex and only
// unblocks on cancellation. Stopping first would deadlock on that mutex.
cancel()
if stopErr := client.Stop(); stopErr != nil {
return fmt.Errorf("stop error after context done. Stop error: %w. Context done: %w", stopErr, startCtx.Err())
return fmt.Errorf("stop error after startup failure. Stop error: %w. Startup: %w", stopErr, err)
}
return startCtx.Err()
case err := <-clientErr:
return fmt.Errorf("startup: %w", err)
case <-run:
}
c.connect = client

View File

@@ -18,6 +18,7 @@ import (
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/metadata"
gstatus "google.golang.org/grpc/status"
"github.com/netbirdio/netbird/client/iface/wgaddr"
@@ -48,13 +49,23 @@ import (
"github.com/netbirdio/netbird/version"
)
// androidRunOverride is set on Android to inject mobile dependencies
// when using embed.Client (which calls Run() with empty MobileDependency).
var androidRunOverride func(c *ConnectClient, runningChan chan struct{}, logPath string) error
// androidMobileDep is set on Android to inject the MobileDependency for runs
// started through the generic entry points (Run/RunAsync, e.g. embed.Client).
// nil on other platforms, where the dependency is empty.
var androidMobileDep func(config *profilemanager.Config) MobileDependency
// mobileDependency returns the MobileDependency for a run started via the
// generic entry points. On Android the androidMobileDep provider supplies
// platform stubs (or real implementations); elsewhere it is empty.
func (c *ConnectClient) mobileDependency(config *profilemanager.Config) MobileDependency {
if androidMobileDep != nil {
return androidMobileDep(config)
}
return MobileDependency{}
}
type ConnectClient struct {
ctx context.Context
config *profilemanager.Config
statusRecorder *peer.Status
engine *Engine
@@ -63,35 +74,62 @@ type ConnectClient struct {
updateManager *updater.Manager
persistSyncResponse bool
// sup serializes all start/stop requests so two lifecycle operations can
// never overlap. See connect_lifecycle.go.
sup *supervisor
}
func NewConnectClient(
ctx context.Context,
config *profilemanager.Config,
statusRecorder *peer.Status,
) *ConnectClient {
return &ConnectClient{
c := &ConnectClient{
ctx: ctx,
config: config,
statusRecorder: statusRecorder,
engineMutex: sync.Mutex{},
}
c.sup = newSupervisor(ctx, c.run)
return c
}
func (c *ConnectClient) SetUpdateManager(um *updater.Manager) {
c.updateManager = um
}
// Run with main logic.
func (c *ConnectClient) Run(runningChan chan struct{}, logPath string) error {
if androidRunOverride != nil {
return androidRunOverride(c, runningChan, logPath)
}
return c.run(MobileDependency{}, runningChan, logPath)
// Run with main logic. md carries optional gRPC metadata (e.g. the UI
// user-agent) to forward to the management/signal services; nil when none.
func (c *ConnectClient) Run(config *profilemanager.Config, md metadata.MD, logPath string) error {
return c.sup.start(config, md, c.mobileDependency(config), logPath)
}
// RunAsync starts a client run without blocking. Used by the daemon and embed,
// which drive the lifecycle through the supervisor rather than blocking on Run;
// they then wait for the outcome via WaitEstablishedOrDone. The run's lifecycle
// channels are created and owned by the supervisor — callers never hold them.
func (c *ConnectClient) RunAsync(config *profilemanager.Config, md metadata.MD) {
c.sup.startAsync(config, md, c.mobileDependency(config), "", nil)
}
// Restart atomically stops any in-flight run and starts a fresh one with the
// given config. The stop+start happens as a single supervisor operation, so no
// other lifecycle request can interleave between them — used for explicit
// restarts (e.g. an MDM policy change) that must not expose a "stopped" window.
func (c *ConnectClient) Restart(config *profilemanager.Config, md metadata.MD) {
c.sup.restartAsync(config, md, c.mobileDependency(config), "")
}
// WaitEstablishedOrDone blocks until the in-flight run becomes established (nil),
// ends before that (the run error, or a sentinel on a clean stop), or ctx is
// cancelled. Returns errNoRunInFlight if no run is in flight. Wraps the wait on
// the supervisor-owned channels so callers never touch them directly.
func (c *ConnectClient) WaitEstablishedOrDone(ctx context.Context) error {
return c.sup.waitEstablishedOrDone(ctx)
}
// RunOnAndroid with main logic on mobile system
func (c *ConnectClient) RunOnAndroid(
config *profilemanager.Config,
tunAdapter device.TunAdapter,
iFaceDiscover stdnet.ExternalIFaceDiscover,
networkChangeListener listener.NetworkChangeListener,
@@ -110,10 +148,11 @@ func (c *ConnectClient) RunOnAndroid(
StateFilePath: stateFilePath,
TempDir: cacheDir,
}
return c.run(mobileDependency, nil, "")
return c.sup.start(config, nil, mobileDependency, "")
}
func (c *ConnectClient) RunOniOS(
config *profilemanager.Config,
fileDescriptor int32,
networkChangeListener listener.NetworkChangeListener,
dnsManager dns.IosDnsManager,
@@ -131,10 +170,12 @@ func (c *ConnectClient) RunOniOS(
StateFilePath: stateFilePath,
TempDir: cacheDir,
}
return c.run(mobileDependency, nil, logFilePath)
return c.sup.start(config, nil, mobileDependency, logFilePath)
}
func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan struct{}, logPath string) error {
// run executes a single client run. runCtx is owned by the supervisor: cancelling
// it tears the run down (it is the parent of the per-attempt engine context).
func (c *ConnectClient) run(runCtx context.Context, config *profilemanager.Config, mobileDependency MobileDependency, connEstablishedChan chan struct{}, logPath string) error {
defer func() {
if r := recover(); r != nil {
rec := c.statusRecorder
@@ -198,18 +239,18 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
}()
wrapErr := state.Wrap
myPrivateKey, err := wgtypes.ParseKey(c.config.PrivateKey)
myPrivateKey, err := wgtypes.ParseKey(config.PrivateKey)
if err != nil {
log.Errorf("failed parsing Wireguard key %s: [%s]", c.config.PrivateKey, err.Error())
log.Errorf("failed parsing Wireguard key %s: [%s]", config.PrivateKey, err.Error())
return wrapErr(err)
}
var mgmTlsEnabled bool
if c.config.ManagementURL.Scheme == "https" {
if config.ManagementURL.Scheme == "https" {
mgmTlsEnabled = true
}
publicSSHKey, err := ssh.GeneratePublicKey([]byte(c.config.SSHKey))
publicSSHKey, err := ssh.GeneratePublicKey([]byte(config.SSHKey))
if err != nil {
return err
}
@@ -243,13 +284,13 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
defer c.statusRecorder.ClientStop()
operation := func() error {
// if context cancelled we not start new backoff cycle
if c.ctx.Err() != nil {
if runCtx.Err() != nil {
return nil
}
state.Set(StatusConnecting)
engineCtx, cancel := context.WithCancel(c.ctx)
engineCtx, cancel := context.WithCancel(runCtx)
defer func() {
_, err := state.Status()
c.statusRecorder.MarkManagementDisconnected(err)
@@ -257,8 +298,8 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
cancel()
}()
log.Debugf("connecting to the Management service %s", c.config.ManagementURL.Host)
mgmClient, err := mgm.NewClient(engineCtx, c.config.ManagementURL.Host, myPrivateKey, mgmTlsEnabled)
log.Debugf("connecting to the Management service %s", config.ManagementURL.Host)
mgmClient, err := mgm.NewClient(engineCtx, config.ManagementURL.Host, myPrivateKey, mgmTlsEnabled)
if err != nil {
return wrapErr(gstatus.Errorf(codes.FailedPrecondition, "failed connecting to Management Service : %s", err))
}
@@ -275,7 +316,7 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
}
c.clientMetrics.UpdateAgentInfo(agentInfo, myPrivateKey.PublicKey().String())
log.Debugf("connected to the Management service %s", c.config.ManagementURL.Host)
log.Debugf("connected to the Management service %s", config.ManagementURL.Host)
defer func() {
if err = mgmClient.Close(); err != nil {
log.Warnf("failed to close the Management service client %v", err)
@@ -284,13 +325,14 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
// connect (just a connection, no stream yet) and login to Management Service to get an initial global Netbird config
loginStarted := time.Now()
loginResp, err := loginToManagement(engineCtx, mgmClient, publicSSHKey, c.config)
loginResp, err := loginToManagement(engineCtx, mgmClient, publicSSHKey, config)
if err != nil {
c.clientMetrics.RecordLoginDuration(engineCtx, time.Since(loginStarted), false)
log.Debug(err)
if s, ok := gstatus.FromError(err); ok && (s.Code() == codes.PermissionDenied) {
state.Set(StatusNeedsLogin)
_ = c.Stop()
// No teardown needed: login fails before the engine is started
// (engine.Start is below), so there is nothing running to stop.
return backoff.Permanent(wrapErr(err)) // unrecoverable error
}
return wrapErr(err)
@@ -344,7 +386,7 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
}
peerConfig := loginResp.GetPeerConfig()
engineConfig, err := createEngineConfig(myPrivateKey, c.config, peerConfig, logPath)
engineConfig, err := createEngineConfig(myPrivateKey, config, peerConfig, logPath)
if err != nil {
log.Error(err)
return wrapErr(err)
@@ -388,7 +430,7 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
c.engine = engine
c.engineMutex.Unlock()
if err := engine.Start(loginResp.GetNetbirdConfig(), c.config.ManagementURL); err != nil {
if err := engine.Start(loginResp.GetNetbirdConfig(), config.ManagementURL); err != nil {
log.Errorf("error while starting Netbird Connection Engine: %s", err)
return wrapErr(err)
}
@@ -396,12 +438,13 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
log.Infof("Netbird engine started, the IP is: %s", peerConfig.GetAddress())
state.Set(StatusConnected)
if runningChan != nil {
select {
case <-runningChan:
default:
close(runningChan)
}
// The supervisor owns connEstablishedChan and it is always present. Guard
// against a double close: operation re-runs on ErrResetConnection retries
// within the same run, and the channel is closed only on the first connect.
select {
case <-connEstablishedChan:
default:
close(connEstablishedChan)
}
<-engineCtx.Done()
@@ -410,14 +453,12 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
c.engine = nil
c.engineMutex.Unlock()
// todo: consider to remove this condition. Is not thread safe.
// We should always call Stop(), but we need to verify that it is idempotent
if engine.wgInterface != nil {
log.Infof("ensuring %s is removed, Netbird engine context cancelled", engine.wgInterface.Name())
if err := engine.Stop(); err != nil {
log.Errorf("Failed to stop engine: %v", err)
}
// Always tear the engine down once its context is cancelled. engine.Stop
// is nil-guarded per component, so calling it unconditionally is safe and
// avoids both the data race on engine.wgInterface and skipping teardown
// when the interface was never brought up (e.g. a mid-start failure).
if err := engine.Stop(); err != nil {
log.Errorf("Failed to stop engine: %v", err)
}
c.statusRecorder.ClientTeardown()
@@ -437,8 +478,9 @@ func (c *ConnectClient) run(mobileDependency MobileDependency, runningChan chan
if err != nil {
log.Debugf("exiting client retry loop due to unrecoverable error: %s", err)
if s, ok := gstatus.FromError(err); ok && (s.Code() == codes.PermissionDenied) {
// Login failed permanently: the engine was never started, so there
// is nothing to tear down — just record that a login is needed.
state.Set(StatusNeedsLogin)
_ = c.Stop()
}
return err
}
@@ -459,6 +501,22 @@ func parseRelayInfo(loginResp *mgmProto.LoginResponse) ([]string, *hmac.Token) {
return relayCfg.GetUrls(), token
}
// ConnectionRunning reports whether a connection run is currently in flight
// (connecting, connected, or reconnecting). Answered by the supervisor via a
// serialized query, so it settles behind an in-flight stop. Distinct from
// ServiceRunning, which reports whether the service itself is alive.
func (c *ConnectClient) ConnectionRunning() bool {
return c.sup.isRunning()
}
// ServiceRunning reports whether the client's lifecycle supervisor is alive and
// able to accept start/stop commands — i.e. its context has not been cancelled
// (the daemon is not shutting down). Independent of whether a connection run is
// up (that is ConnectionRunning).
func (c *ConnectClient) ServiceRunning() bool {
return c.sup.ctx.Err() == nil
}
func (c *ConnectClient) Engine() *Engine {
if c == nil {
return nil
@@ -515,14 +573,10 @@ func (c *ConnectClient) Status() StatusType {
return status
}
// Stop serializes a stop request through the lifecycle supervisor and blocks
// until the in-flight run is fully torn down.
func (c *ConnectClient) Stop() error {
engine := c.Engine()
if engine != nil {
if err := engine.Stop(); err != nil {
return fmt.Errorf("stop engine: %w", err)
}
}
return nil
return c.sup.stop()
}
// SetSyncResponsePersistence enables or disables sync response persistence.

View File

@@ -7,6 +7,7 @@ import (
"github.com/netbirdio/netbird/client/internal/dns"
"github.com/netbirdio/netbird/client/internal/listener"
"github.com/netbirdio/netbird/client/internal/profilemanager"
"github.com/netbirdio/netbird/client/internal/stdnet"
)
@@ -59,19 +60,17 @@ var _ listener.NetworkChangeListener = noopNetworkChangeListener{}
var _ dns.ReadyListener = noopDnsReadyListener{}
func init() {
// Wire up the default override so embed.Client.Start() works on Android
// with netstack mode. Provides complete no-op stubs for all mobile
// Wire up the default MobileDependency provider so embed.Client.Start() works
// on Android with netstack mode. Provides complete no-op stubs for all mobile
// dependencies so the engine's existing Android code paths work unchanged.
// Applications that need P2P ICE or real DNS should replace this by
// setting androidRunOverride before calling Start().
androidRunOverride = func(c *ConnectClient, runningChan chan struct{}, logPath string) error {
return c.runOnAndroidEmbed(
// Applications that need P2P ICE or real DNS should replace this by setting
// androidMobileDep before calling Start().
androidMobileDep = func(config *profilemanager.Config) MobileDependency {
return mobileDependencyForEmbed(
noopIFaceDiscover{},
noopNetworkChangeListener{},
[]netip.AddrPort{},
noopDnsReadyListener{},
runningChan,
logPath,
)
}
}

View File

@@ -10,23 +10,18 @@ import (
"github.com/netbirdio/netbird/client/internal/stdnet"
)
// runOnAndroidEmbed is like RunOnAndroid but accepts a runningChan
// so embed.Client.Start() can detect when the engine is ready.
// It provides complete MobileDependency so the engine's existing
// Android code paths work unchanged.
func (c *ConnectClient) runOnAndroidEmbed(
// mobileDependencyForEmbed builds the MobileDependency used by embed.Client on
// Android so the engine's existing Android code paths work unchanged.
func mobileDependencyForEmbed(
iFaceDiscover stdnet.ExternalIFaceDiscover,
networkChangeListener listener.NetworkChangeListener,
dnsAddresses []netip.AddrPort,
dnsReadyListener dns.ReadyListener,
runningChan chan struct{},
logPath string,
) error {
mobileDependency := MobileDependency{
) MobileDependency {
return MobileDependency{
IFaceDiscover: iFaceDiscover,
NetworkChangeListener: networkChangeListener,
HostDNSAddresses: dnsAddresses,
DnsReadyListener: dnsReadyListener,
}
return c.run(mobileDependency, runningChan, logPath)
}

View File

@@ -0,0 +1,362 @@
package internal
import (
"context"
"errors"
"google.golang.org/grpc/metadata"
"github.com/netbirdio/netbird/client/internal/profilemanager"
)
// errAlreadyRunning is returned when a start is requested while a run is already
// in flight.
var errAlreadyRunning = errors.New("client is already running")
// errNoRunInFlight is returned by waitEstablishedOrDone when no run is active.
var errNoRunInFlight = errors.New("no connection run in flight")
// errStoppedBeforeEstablished is returned when a run ended (cleanly) before the
// connection was established.
var errStoppedBeforeEstablished = errors.New("run stopped before the connection was established")
// lifecycleOp is a serialized lifecycle operation processed by the supervisor.
type lifecycleOp int
const (
opStart lifecycleOp = iota
opStop
opRestart
opStatus
opWaitEstablished
)
// lifecycleCmd is a single lifecycle request handed to the supervisor goroutine.
// They all flow through the same cmdCh so they are strictly ordered (FIFO) with
// respect to each other.
type lifecycleCmd struct {
op lifecycleOp
config *profilemanager.Config
md metadata.MD
mobileDep MobileDependency
logPath string
// done is the caller's notification channel (nil for fire-and-forget). Its
// meaning depends on op:
// - opStart: receives the run's end result when the run terminates, or
// errAlreadyRunning immediately if a run is already in flight.
// - opStop: receives nil once the in-flight run has fully unwound.
// - opWaitEstablished: receives the wait outcome (see waitEstablishedOrDone).
done chan error
reply chan bool // opStatus only: receives whether a run is in flight
waitCtx context.Context // opWaitEstablished only: the waiter's cancellation context
}
// runState holds the lifecycle channels of a single in-flight run, owned by the
// loop goroutine. It never escapes the supervisor as an API; the only readers
// are the per-wait goroutines the loop spawns for opWaitEstablished.
//
// connEstablishedChan is closed by the run once the connection is established.
// The supervisor creates and owns it — callers no longer supply it; they observe
// it through waitEstablishedOrDone. ended is closed (broadcast) when the run
// terminates, so any number of waiters can observe it; err is the run's end
// result, valid only after ended is closed.
type runState struct {
connEstablishedChan chan struct{} // closed by the run on established
ended chan struct{} // closed by finishRun when the run terminates
err error // run end result, valid after ended is closed
}
// runEndResult is sent by the run goroutine to the supervisor when a run ends,
// whether on its own (error / external context cancellation) or because of a Stop.
type runEndResult struct {
err error
}
// runFunc executes a single client run bound to the supervisor-owned context,
// with the config supplied by the start request.
type runFunc func(ctx context.Context, config *profilemanager.Config, mobileDep MobileDependency, connEstablishedChan chan struct{}, logPath string) error
// supervisor serializes start/stop of a single client run. Every request goes
// through cmdCh and is handled one at a time by the loop goroutine, so two
// lifecycle operations can never overlap and their order is preserved (FIFO).
// The loop goroutine is the sole owner of curStart/runCancel, so that state
// needs no locking. The loop exits when the parent context is cancelled.
type supervisor struct {
ctx context.Context
run runFunc
cmdCh chan lifecycleCmd
runEnded chan runEndResult
// owned exclusively by the loop goroutine. curStart is the in-flight start
// command (nil = idle); its done channel is notified when the run ends.
// curRun holds that run's lifecycle channels; runCancel cancels it.
curStart *lifecycleCmd
curRun *runState
runCancel context.CancelFunc
}
func newSupervisor(ctx context.Context, run runFunc) *supervisor {
s := &supervisor{
ctx: ctx,
run: run,
cmdCh: make(chan lifecycleCmd, 16),
runEnded: make(chan runEndResult, 1),
}
go s.loop()
return s
}
func (s *supervisor) loop() {
for {
select {
case <-s.ctx.Done():
s.shutdown()
return
case cmd := <-s.cmdCh:
switch cmd.op {
case opStart:
s.handleStart(cmd)
case opStop:
s.handleStop(cmd)
case opRestart:
s.handleRestart(cmd)
case opStatus:
cmd.reply <- (s.isRunningInternal())
case opWaitEstablished:
s.handleWaitEstablished(cmd)
}
case res := <-s.runEnded:
// Run ended on its own, without an explicit Stop.
s.finishRun(res.err)
}
}
}
func (s *supervisor) handleStart(cmd lifecycleCmd) {
if s.isRunningInternal() {
notify(cmd.done, errAlreadyRunning)
return
}
runCtx, cancel := context.WithCancel(s.ctx)
if cmd.md != nil {
// Carry caller-supplied gRPC metadata (e.g. UI user-agent) into the run
// context so the engine's management/signal calls forward it. The cancel
// still drives runCtx (metadata wrapping preserves cancellation).
runCtx = metadata.NewOutgoingContext(runCtx, cmd.md)
}
s.runCancel = cancel
s.curStart = &cmd
s.curRun = &runState{connEstablishedChan: make(chan struct{}), ended: make(chan struct{})}
go func(ctx context.Context, cfg *profilemanager.Config, m MobileDependency, established chan struct{}, lp string) {
err := s.run(ctx, cfg, m, established, lp)
s.runEnded <- runEndResult{err: err}
}(runCtx, cmd.config, cmd.mobileDep, s.curRun.connEstablishedChan, cmd.logPath)
}
func (s *supervisor) handleStop(cmd lifecycleCmd) {
if !s.isRunningInternal() {
notify(cmd.done, nil)
return
}
s.stopCurrentRun()
notify(cmd.done, nil)
}
// handleRestart tears down any in-flight run and starts a fresh one in a single
// loop turn. No other command can interleave between the stop and the start
// (the loop is single-threaded), so the swap is atomic without relying on any
// daemon-side lock — that is what an explicit restart (e.g. MDM config change)
// needs to avoid a window where the client is observably stopped.
func (s *supervisor) handleRestart(cmd lifecycleCmd) {
if s.isRunningInternal() {
s.stopCurrentRun()
}
s.handleStart(cmd)
}
// stopCurrentRun cancels the in-flight run and blocks the supervisor until it
// has fully unwound, so the next action starts from a clean slate. The run
// goroutine reports completion via runEnded. Caller must hold an in-flight run
// (curStart != nil).
func (s *supervisor) stopCurrentRun() {
s.runCancel()
res := <-s.runEnded
s.finishRun(res.err)
}
// finishRun resets lifecycle state after a run terminates and hands the run
// error back to whoever asked to be notified of the start.
func (s *supervisor) finishRun(err error) {
s.runCancel = nil
if s.isRunningInternal() {
// Publish the result to the broadcast channel before nil-ing curRun, so
// any opWaitEstablished goroutines blocked on ended observe err.
s.curRun.err = err
close(s.curRun.ended)
s.curRun = nil
notify(s.curStart.done, err)
s.curStart = nil
}
}
// handleWaitEstablished answers an opWaitEstablished request. The select itself
// runs in a spawned goroutine on the run's channels so it never blocks the loop;
// the loop only snapshots the in-flight run's channels (which it owns) here.
func (s *supervisor) handleWaitEstablished(cmd lifecycleCmd) {
caller := cmd.done
if !s.isRunningInternal() {
notify(caller, errNoRunInFlight)
return
}
rs := s.curRun
established := rs.connEstablishedChan
ctx := cmd.waitCtx
go func() {
select {
case <-established:
notify(caller, nil)
case <-rs.ended:
if rs.err != nil {
notify(caller, rs.err)
return
}
notify(caller, errStoppedBeforeEstablished)
case <-ctx.Done():
notify(caller, ctx.Err())
}
}()
}
// shutdown tears down the in-flight run when the parent context is cancelled,
// then fails any still-queued commands so their callers never hang.
func (s *supervisor) shutdown() {
if s.runCancel != nil {
s.runCancel()
res := <-s.runEnded
s.finishRun(res.err)
}
for {
select {
case cmd := <-s.cmdCh:
notify(cmd.done, s.ctx.Err())
default:
return
}
}
}
// startAsync enqueues a start without blocking. If done is non-nil it receives
// the run's end result (or errAlreadyRunning on rejection, or the context error
// on shutdown).
func (s *supervisor) startAsync(config *profilemanager.Config, md metadata.MD, mobileDep MobileDependency, logPath string, done chan error) {
cmd := lifecycleCmd{op: opStart, config: config, md: md, mobileDep: mobileDep, logPath: logPath, done: done}
select {
case s.cmdCh <- cmd:
case <-s.ctx.Done():
notify(done, s.ctx.Err())
}
}
// restartAsync enqueues an atomic stop+start without blocking. The supervisor
// tears down any in-flight run and starts a fresh one with the supplied config
// in a single loop turn (see handleRestart). Fire-and-forget: the new run owns
// its lifecycle channels, observed via waitEstablishedOrDone.
func (s *supervisor) restartAsync(config *profilemanager.Config, md metadata.MD, mobileDep MobileDependency, logPath string) {
cmd := lifecycleCmd{op: opRestart, config: config, md: md, mobileDep: mobileDep, logPath: logPath}
select {
case s.cmdCh <- cmd:
case <-s.ctx.Done():
}
}
// start enqueues a start and blocks until the run terminates, preserving the
// blocking contract of the legacy Run entry points.
func (s *supervisor) start(config *profilemanager.Config, md metadata.MD, mobileDep MobileDependency, logPath string) error {
done := make(chan error, 1)
s.startAsync(config, md, mobileDep, logPath, done)
select {
case err := <-done:
return err
case <-s.ctx.Done():
return s.ctx.Err()
}
}
// isRunning asks the loop whether a run is in flight. The query is serialized
// with start/stop, so during a stop it waits for the teardown to settle and
// then reports the final state — never a transient "half-stopped".
func (s *supervisor) isRunning() bool {
reply := make(chan bool, 1)
select {
case s.cmdCh <- lifecycleCmd{op: opStatus, reply: reply}:
case <-s.ctx.Done():
return false
}
select {
case r := <-reply:
return r
case <-s.ctx.Done():
return false
}
}
func (s *supervisor) isRunningInternal() bool {
return s.curStart != nil
}
// waitEstablishedOrDone blocks until the in-flight run becomes established
// (returns nil) or ends before that (returns the run error, or
// errStoppedBeforeEstablished on a clean stop), or ctx is cancelled. Returns
// errNoRunInFlight if no run is in flight. The wait is performed by a goroutine
// spawned inside the loop (see handleWaitEstablished); the run's channels never
// leave the supervisor.
func (s *supervisor) waitEstablishedOrDone(ctx context.Context) error {
reply := make(chan error, 1)
select {
case s.cmdCh <- lifecycleCmd{op: opWaitEstablished, waitCtx: ctx, done: reply}:
case <-ctx.Done():
return ctx.Err()
case <-s.ctx.Done():
return s.ctx.Err()
}
select {
case err := <-reply:
return err
case <-s.ctx.Done():
return s.ctx.Err()
}
}
// stop enqueues a stop and blocks until the in-flight run is fully torn down.
func (s *supervisor) stop() error {
done := make(chan error, 1)
select {
case s.cmdCh <- lifecycleCmd{op: opStop, done: done}:
case <-s.ctx.Done():
return s.ctx.Err()
}
select {
case err := <-done:
return err
case <-s.ctx.Done():
return s.ctx.Err()
}
}
// notify sends on a caller-supplied channel without blocking. The channel is
// expected to be buffered (cap 1); a nil channel means the caller did not ask
// to be notified.
func notify(ch chan error, err error) {
if ch == nil {
return
}
select {
case ch <- err:
default:
}
}

View File

@@ -22,6 +22,8 @@ import (
log "github.com/sirupsen/logrus"
"golang.zx2c4.com/wireguard/tun/netstack"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
"google.golang.org/grpc/codes"
gstatus "google.golang.org/grpc/status"
nberrors "github.com/netbirdio/netbird/client/errors"
"github.com/netbirdio/netbird/client/firewall"
@@ -1125,6 +1127,20 @@ func (e *Engine) hasIPv6Changed(conf *mgmProto.PeerConfig) bool {
return !current.HasIPv6() || current.IPv6 != prefix.Addr() || current.IPv6Net != prefix.Masked()
}
// wrapDisconnectError classifies a receive-loop failure before the run is torn
// down. An auth rejection (PermissionDenied/Unauthenticated) means the session
// needs re-login and retrying is futile, so mark it terminal (NeedsLogin) — run()
// then exits on its own instead of spinning the backoff. Any other failure is a
// recoverable connection reset that the backoff should retry.
func (e *Engine) wrapDisconnectError(err error) {
state := CtxGetState(e.ctx)
if s, ok := gstatus.FromError(err); ok && (s.Code() == codes.PermissionDenied || s.Code() == codes.Unauthenticated) {
state.Set(StatusNeedsLogin)
return
}
_ = state.Wrap(ErrResetConnection)
}
func (e *Engine) receiveJobEvents() {
e.jobExecutorWG.Add(1)
go func() {
@@ -1151,9 +1167,9 @@ func (e *Engine) receiveJobEvents() {
}
})
if err != nil {
// happens if management is unavailable for a long time.
// We want to cancel the operation of the whole client
_ = CtxGetState(e.ctx).Wrap(ErrResetConnection)
// happens if management is unavailable for a long time, or rejects
// us (auth). wrapDisconnectError decides retry vs needs-login.
e.wrapDisconnectError(err)
e.clientCancel()
return
}
@@ -1235,9 +1251,9 @@ func (e *Engine) receiveManagementEvents() {
err = e.mgmClient.Sync(e.ctx, info, e.handleSync)
if err != nil {
// happens if management is unavailable for a long time.
// We want to cancel the operation of the whole client
_ = CtxGetState(e.ctx).Wrap(ErrResetConnection)
// happens if management is unavailable for a long time, or rejects
// us (auth). wrapDisconnectError decides retry vs needs-login.
e.wrapDisconnectError(err)
e.clientCancel()
return
}
@@ -1714,6 +1730,13 @@ func (e *Engine) receiveSignalEvents() {
return e.ctx.Err()
}
// Self-addressed heartbeat: the signal client's receive watchdog
// round-trips this through the server to confirm the receive stream
// is delivering. Liveness is already recorded before this handler.
if msg.GetBody().GetType() == sProto.Body_HEARTBEAT {
return nil
}
conn, ok := e.peerStore.PeerConn(msg.Key)
if !ok {
return fmt.Errorf("wrongly addressed message %s", msg.Key)
@@ -1754,9 +1777,9 @@ func (e *Engine) receiveSignalEvents() {
return nil
})
if err != nil {
// happens if signal is unavailable for a long time.
// We want to cancel the operation of the whole client
_ = CtxGetState(e.ctx).Wrap(ErrResetConnection)
// happens if signal is unavailable for a long time, or rejects us
// (auth). wrapDisconnectError decides retry vs needs-login.
e.wrapDisconnectError(err)
e.clientCancel()
return
}

View File

@@ -1024,14 +1024,17 @@ func (d *Status) GetRelayStates() []relay.ProbeResult {
return d.relayStates
}
// extend the list of stun, turn servers with relay address
// extend the list of stun, turn servers with the relay server connections
relayStates := slices.Clone(d.relayStates)
// if the server connection is not established then we will use the general address
// in case of connection we will use the instance specific address
instanceAddr, _, err := d.relayMgr.RelayInstanceAddress()
if err != nil {
// TODO add their status
states := d.relayMgr.RelayStates()
if len(states) == 0 {
// no relay connection tracked yet; surface configured servers as
// unavailable with the real reconnect error when known
err := relayClient.ErrRelayClientNotConnected
if connErr := d.relayMgr.RelayConnectError(); connErr != nil {
err = connErr
}
for _, r := range d.relayMgr.ServerURLs() {
relayStates = append(relayStates, relay.ProbeResult{
URI: r,
@@ -1041,10 +1044,14 @@ func (d *Status) GetRelayStates() []relay.ProbeResult {
return relayStates
}
relayState := relay.ProbeResult{
URI: instanceAddr,
for _, rs := range states {
relayStates = append(relayStates, relay.ProbeResult{
URI: rs.URL,
Err: rs.Err,
Transport: rs.Transport,
})
}
return append(relayStates, relayState)
return relayStates
}
func (d *Status) ForwardingRules() []firewall.ForwardRule {
@@ -1405,6 +1412,7 @@ func (fs FullStatus) ToProto() *proto.FullStatus {
pbRelayState := &proto.RelayState{
URI: relayState.URI,
Available: relayState.Err == nil,
Transport: relayState.Transport,
}
if err := relayState.Err; err != nil {
pbRelayState.Error = err.Error()

View File

@@ -575,8 +575,8 @@ func (s *ServiceManager) activeProfileID() (ID, bool) {
}
// ResolveProfile turns a user-supplied handle into a Profile. Resolution
// precedence is: exact ID match, then unique ID prefix, then unique exact
// name. Ambiguous matches return *ErrAmbiguousHandle so callers can
// precedence is: exact ID match, then unique exact name, then unique ID
// prefix. Ambiguous matches return *ErrAmbiguousHandle so callers can
// surface the candidates.
func (s *ServiceManager) ResolveProfile(handle, username string) (*Profile, error) {
if handle == "" {
@@ -594,6 +594,23 @@ func (s *ServiceManager) ResolveProfile(handle, username string) (*Profile, erro
}
}
var nameMatches []Profile
for i := range profiles {
if profiles[i].Name == handle {
nameMatches = append(nameMatches, profiles[i])
}
}
if len(nameMatches) == 1 {
return &nameMatches[0], nil
}
if len(nameMatches) > 1 {
return nil, &ErrAmbiguousHandle{
Handle: handle,
Candidates: nameMatches,
Kind: AmbiguityKindName,
}
}
// ID prefix match. Skip the default profile so `select d` does not
// accidentally pick it via prefix.
var prefixMatches []Profile
@@ -616,22 +633,5 @@ func (s *ServiceManager) ResolveProfile(handle, username string) (*Profile, erro
}
}
var nameMatches []Profile
for i := range profiles {
if profiles[i].Name == handle {
nameMatches = append(nameMatches, profiles[i])
}
}
if len(nameMatches) == 1 {
return &nameMatches[0], nil
}
if len(nameMatches) > 1 {
return nil, &ErrAmbiguousHandle{
Handle: handle,
Candidates: nameMatches,
Kind: AmbiguityKindName,
}
}
return nil, ErrProfileNotFound
}

View File

@@ -32,6 +32,9 @@ type ProbeResult struct {
URI string
Err error
Addr string
// Transport is the negotiated relay transport, empty
// for stun/turn probes or when not connected.
Transport string
}
type StunTurnProbe struct {

View File

@@ -333,6 +333,8 @@ func (m *DefaultManager) Stop(stateManager *statemanager.Manager) {
}
}
m.notifier.Close()
m.mux.Lock()
defer m.mux.Unlock()
m.clientRoutes = nil

View File

@@ -16,7 +16,7 @@ import (
type Notifier struct {
initialRoutes []*route.Route
currentRoutes []*route.Route
fakeIPRoutes []*route.Route
fakeIPRoutes []*route.Route
listener listener.NetworkChangeListener
listenerMux sync.Mutex
@@ -119,3 +119,7 @@ func (n *Notifier) GetInitialRouteRanges() []string {
sort.Strings(initialStrings)
return initialStrings
}
func (n *Notifier) Close() {
// unused
}

View File

@@ -3,6 +3,7 @@
package notifier
import (
"container/list"
"net/netip"
"slices"
"sort"
@@ -14,19 +15,26 @@ import (
)
type Notifier struct {
mu sync.Mutex
cond *sync.Cond
currentPrefixes []string
listener listener.NetworkChangeListener
listenerMux sync.Mutex
listener listener.NetworkChangeListener
queue *list.List
closed bool
}
func NewNotifier() *Notifier {
return &Notifier{}
n := &Notifier{
queue: list.New(),
}
n.cond = sync.NewCond(&n.mu)
go n.deliverLoop()
return n
}
func (n *Notifier) SetListener(listener listener.NetworkChangeListener) {
n.listenerMux.Lock()
defer n.listenerMux.Unlock()
n.mu.Lock()
defer n.mu.Unlock()
n.listener = listener
}
@@ -43,32 +51,52 @@ func (n *Notifier) OnNewRoutes(route.HAMap) {
}
func (n *Notifier) OnNewPrefixes(prefixes []netip.Prefix) {
newNets := make([]string, 0)
newNets := make([]string, 0, len(prefixes))
for _, prefix := range prefixes {
newNets = append(newNets, prefix.String())
}
sort.Strings(newNets)
n.mu.Lock()
if slices.Equal(n.currentPrefixes, newNets) {
n.mu.Unlock()
return
}
n.currentPrefixes = newNets
n.notify()
routes := strings.Join(n.currentPrefixes, ",")
n.queue.PushBack(routes)
n.cond.Signal()
n.mu.Unlock()
}
func (n *Notifier) notify() {
n.listenerMux.Lock()
defer n.listenerMux.Unlock()
if n.listener == nil {
return
}
go func(l listener.NetworkChangeListener) {
l.OnNetworkChanged(strings.Join(n.currentPrefixes, ","))
}(n.listener)
func (n *Notifier) Close() {
n.mu.Lock()
n.closed = true
n.cond.Signal()
n.mu.Unlock()
}
func (n *Notifier) GetInitialRouteRanges() []string {
return nil
}
func (n *Notifier) deliverLoop() {
for {
n.mu.Lock()
for n.queue.Len() == 0 && !n.closed {
n.cond.Wait()
}
if n.closed && n.queue.Len() == 0 {
n.mu.Unlock()
return
}
routes := n.queue.Remove(n.queue.Front()).(string)
l := n.listener
n.mu.Unlock()
if l != nil {
l.OnNetworkChanged(routes)
}
}
}

View File

@@ -38,3 +38,7 @@ func (n *Notifier) OnNewPrefixes(prefixes []netip.Prefix) {
func (n *Notifier) GetInitialRouteRanges() []string {
return []string{}
}
func (n *Notifier) Close() {
// unused
}

View File

@@ -171,13 +171,13 @@ func (c *Client) Run(fd int32, interfaceName string, envList *EnvList) error {
c.onHostDnsFn = func([]string) {}
cfg.WgIface = interfaceName
connectClient := internal.NewConnectClient(ctx, cfg, c.recorder)
connectClient := internal.NewConnectClient(ctx, c.recorder)
c.setState(cfg, connectClient)
// Persist the latest sync response so DebugBundle can include the network
// map. On iOS this is backed by disk to keep it out of the constrained
// process memory (see the syncstore package).
connectClient.SetSyncResponsePersistence(true)
return connectClient.RunOniOS(fd, c.networkChangeListener, c.dnsManager, c.stateFile, c.cacheDir, c.logFilePath)
return connectClient.RunOniOS(cfg, fd, c.networkChangeListener, c.dnsManager, c.stateFile, c.cacheDir, c.logFilePath)
}
// Stop the internal client and free the resources

View File

@@ -1849,10 +1849,13 @@ func (x *ManagementState) GetError() string {
// RelayState contains the latest state of the relay
type RelayState struct {
state protoimpl.MessageState `protogen:"open.v1"`
URI string `protobuf:"bytes,1,opt,name=URI,proto3" json:"URI,omitempty"`
Available bool `protobuf:"varint,2,opt,name=available,proto3" json:"available,omitempty"`
Error string `protobuf:"bytes,3,opt,name=error,proto3" json:"error,omitempty"`
state protoimpl.MessageState `protogen:"open.v1"`
URI string `protobuf:"bytes,1,opt,name=URI,proto3" json:"URI,omitempty"`
Available bool `protobuf:"varint,2,opt,name=available,proto3" json:"available,omitempty"`
Error string `protobuf:"bytes,3,opt,name=error,proto3" json:"error,omitempty"`
// transport is the negotiated relay transport (e.g. "ws", "quic"),
// empty for stun/turn probes or when not connected.
Transport string `protobuf:"bytes,4,opt,name=transport,proto3" json:"transport,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
@@ -1908,6 +1911,13 @@ func (x *RelayState) GetError() string {
return ""
}
func (x *RelayState) GetTransport() string {
if x != nil {
return x.Transport
}
return ""
}
type NSGroupState struct {
state protoimpl.MessageState `protogen:"open.v1"`
Servers []string `protobuf:"bytes,1,rep,name=servers,proto3" json:"servers,omitempty"`
@@ -6646,12 +6656,13 @@ const file_daemon_proto_rawDesc = "" +
"\x0fManagementState\x12\x10\n" +
"\x03URL\x18\x01 \x01(\tR\x03URL\x12\x1c\n" +
"\tconnected\x18\x02 \x01(\bR\tconnected\x12\x14\n" +
"\x05error\x18\x03 \x01(\tR\x05error\"R\n" +
"\x05error\x18\x03 \x01(\tR\x05error\"p\n" +
"\n" +
"RelayState\x12\x10\n" +
"\x03URI\x18\x01 \x01(\tR\x03URI\x12\x1c\n" +
"\tavailable\x18\x02 \x01(\bR\tavailable\x12\x14\n" +
"\x05error\x18\x03 \x01(\tR\x05error\"r\n" +
"\x05error\x18\x03 \x01(\tR\x05error\x12\x1c\n" +
"\ttransport\x18\x04 \x01(\tR\ttransport\"r\n" +
"\fNSGroupState\x12\x18\n" +
"\aservers\x18\x01 \x03(\tR\aservers\x12\x18\n" +
"\adomains\x18\x02 \x03(\tR\adomains\x12\x18\n" +

View File

@@ -380,6 +380,9 @@ message RelayState {
string URI = 1;
bool available = 2;
string error = 3;
// transport is the negotiated relay transport (e.g. "ws", "quic"),
// empty for stun/turn probes or when not connected.
string transport = 4;
}
message NSGroupState {

View File

@@ -344,9 +344,6 @@ func (s *Server) clearCaptureIfOwner(sess *capture.Session, engine *internal.Eng
}
func (s *Server) getCaptureEngineLocked() (*internal.Engine, error) {
if s.connectClient == nil {
return nil, status.Error(codes.FailedPrecondition, "client not connected")
}
engine := s.connectClient.Engine()
if engine == nil {
return nil, status.Error(codes.FailedPrecondition, "engine not initialized")

View File

@@ -5,7 +5,6 @@ package server
import (
"bytes"
"context"
"errors"
"fmt"
"runtime/pprof"
@@ -28,11 +27,9 @@ func (s *Server) DebugBundle(_ context.Context, req *proto.DebugBundleRequest) (
}
var clientMetrics debug.MetricsExporter
if s.connectClient != nil {
if engine := s.connectClient.Engine(); engine != nil {
if cm := engine.GetClientMetrics(); cm != nil {
clientMetrics = cm
}
if engine := s.connectClient.Engine(); engine != nil {
if cm := engine.GetClientMetrics(); cm != nil {
clientMetrics = cm
}
}
@@ -48,13 +45,10 @@ func (s *Server) DebugBundle(_ context.Context, req *proto.DebugBundleRequest) (
defer s.cleanupBundleCapture()
var refreshStatus func()
if s.connectClient != nil {
engine := s.connectClient.Engine()
if engine != nil {
refreshStatus = func() {
log.Debug("refreshing system health status for debug bundle")
engine.RunHealthProbes(true)
}
if engine := s.connectClient.Engine(); engine != nil {
refreshStatus = func() {
log.Debug("refreshing system health status for debug bundle")
engine.RunHealthProbes(true)
}
}
@@ -118,9 +112,7 @@ func (s *Server) SetLogLevel(_ context.Context, req *proto.SetLogLevelRequest) (
log.SetLevel(level)
if s.connectClient != nil {
s.connectClient.SetLogLevel(level)
}
s.connectClient.SetLogLevel(level)
log.Infof("Log level set to %s", level.String())
@@ -134,20 +126,13 @@ func (s *Server) SetSyncResponsePersistence(_ context.Context, req *proto.SetSyn
enabled := req.GetEnabled()
s.persistSyncResponse = enabled
if s.connectClient != nil {
s.connectClient.SetSyncResponsePersistence(enabled)
}
s.connectClient.SetSyncResponsePersistence(enabled)
return &proto.SetSyncResponsePersistenceResponse{}, nil
}
func (s *Server) getLatestSyncResponse() (*mgmProto.SyncResponse, error) {
cClient := s.connectClient
if cClient == nil {
return nil, errors.New("connect client is not initialized")
}
return cClient.GetLatestSyncResponse()
return s.connectClient.GetLatestSyncResponse()
}
// StartCPUProfile starts CPU profiling in the daemon.

View File

@@ -3,7 +3,6 @@ package server
import (
"context"
"fmt"
"time"
log "github.com/sirupsen/logrus"
"google.golang.org/grpc/codes"
@@ -39,12 +38,11 @@ type conflictCheck struct {
// OS-native managed-config store reports a diff vs the last observation.
//
// Restart sequence:
// 1. Cancel the active engine context (terminates connectWithRetryRuns).
// 2. Wait briefly for that goroutine to exit (giveUpChan is closed on exit).
// 3. Re-resolve Config from disk + MDM policy (Config.apply re-runs
// 1. Stop the in-flight run via the supervisor (blocks until fully torn down).
// 2. Re-resolve Config from disk + MDM policy (Config.apply re-runs
// applyMDMPolicy with the freshly loaded Policy).
// 4. Spawn a fresh connectWithRetryRuns with the new context and config.
// 5. Broadcast a SystemEvent so any GUI / CLI subscriber (SubscribeEvents
// 3. Start a fresh run with the new config.
// 4. Broadcast a SystemEvent so any GUI / CLI subscriber (SubscribeEvents
// RPC) can refresh its cached config view without polling.
//
// The callback runs in the ticker's own goroutine. Ticker has already
@@ -52,39 +50,24 @@ type conflictCheck struct {
func (s *Server) onMDMPolicyChange(_, _ *mdm.Policy) error {
log.Warn("MDM policy changed; restarting engine to apply new configuration")
// Hold s.mutex for the entire restart sequence (cancel + quiescence
// wait + re-spawn). Any concurrent Up/Down/Status arriving while
// MDM is restarting blocks on the Lock until we are done — they
// then observe the post-restart state coherently. This is safe
// because the connectWithRetryRuns goroutine no longer acquires
// s.mutex in its defer (intent vs. goroutine-alive concerns are
// fully separated; see the connectionGoroutineRunning helper).
// Hold s.mutex for the entire restart sequence (stop + re-start). Any
// concurrent Up/Down/Status arriving while MDM is restarting blocks on the
// Lock until we are done — they then observe the post-restart state coherently.
s.mutex.Lock()
defer s.mutex.Unlock()
if !s.clientRunning {
// The client is not running, so there's no engine to restart.
if !s.connectClient.ConnectionRunning() {
// No run in flight, so there's no engine to restart.
return nil
}
// Cancel daemon-side login/status activities tied to the old run; the run
// itself is torn down atomically by the supervisor inside Restart (see
// restartEngineForMDMLocked), which stops and re-starts in one operation.
if s.actCancel != nil {
s.actCancel()
}
// Wait for previous connectWithRetryRuns to exit so we don't end up
// with two goroutines fighting over the same status recorder + engine.
// The teardown engages a fan-out of engine goroutines (peer workers,
// signal handler, route manager, ...). close(clientGiveUpChan)
// happens in the function-scope defer of connectWithRetryRuns, on
// every exit path (ctx cancel, backoff exhausted, panic) — see the
// defer in server.go.
if s.clientGiveUpChan != nil {
select {
case <-s.clientGiveUpChan:
case <-time.After(10 * time.Second):
return fmt.Errorf("failed to restart the engine due to timeout")
}
}
if err := s.restartEngineForMDMLocked(); err != nil {
log.Errorf("MDM restart failed: %v", err)
return err
@@ -131,14 +114,13 @@ func (s *Server) publishConfigChangedEvent(source string) {
}
// restartEngineForMDMLocked re-resolves the active profile config
// (re-running applyMDMPolicy via Config.apply) and re-spawns
// connectWithRetryRuns. Mirrors the tail of Server.Start so a runtime
// MDM change behaves identically to a fresh boot under the new policy.
// (re-running applyMDMPolicy via Config.apply) and starts a fresh run.
// Mirrors the tail of Server.Start so a runtime MDM change behaves
// identically to a fresh boot under the new policy.
//
// MUST be called with s.mutex held — onMDMPolicyChange holds the lock
// for the entire restart sequence (cancel + quiescence wait + re-spawn)
// so concurrent Up/Down/Status RPCs observe a coherent post-restart
// state.
// for the entire restart sequence so concurrent Up/Down/Status RPCs
// observe a coherent post-restart state.
func (s *Server) restartEngineForMDMLocked() error {
activeProf, err := s.profileManager.GetActiveProfileState()
if err != nil {
@@ -154,13 +136,13 @@ func (s *Server) restartEngineForMDMLocked() error {
s.statusRecorder.UpdateRosenpass(config.RosenpassEnabled, config.RosenpassPermissive)
s.statusRecorder.UpdateLazyConnection(config.LazyConnectionEnabled)
ctx, cancel := context.WithCancel(s.rootCtx)
_, cancel := context.WithCancel(s.rootCtx)
s.actCancel = cancel
s.clientRunning = true
s.clientRunningChan = make(chan struct{})
s.clientGiveUpChan = make(chan struct{})
log.Info("MDM restart: spawning connectWithRetryRuns with re-resolved config")
go s.connectWithRetryRuns(ctx, config, s.statusRecorder, s.clientRunningChan, s.clientGiveUpChan)
log.Info("MDM restart: atomically restarting the run with re-resolved config")
// MDM restart has no incoming RPC metadata; fire and forget. Restart is a
// single supervisor op (atomic stop+start), so there is no observable
// "stopped" window between tearing down the old run and starting the new.
s.connectClient.Restart(config, nil)
s.publishConfigChangedEvent("mdm")
return nil
}

View File

@@ -34,10 +34,6 @@ func (s *Server) ListNetworks(context.Context, *proto.ListNetworksRequest) (*pro
return nil, gstatus.Errorf(codes.Unavailable, errNetworksDisabled)
}
if s.connectClient == nil {
return nil, fmt.Errorf("not connected")
}
engine := s.connectClient.Engine()
if engine == nil {
return nil, fmt.Errorf("not connected")
@@ -147,10 +143,6 @@ func (s *Server) SelectNetworks(_ context.Context, req *proto.SelectNetworksRequ
return nil, gstatus.Errorf(codes.Unavailable, errNetworksDisabled)
}
if s.connectClient == nil {
return nil, fmt.Errorf("not connected")
}
engine := s.connectClient.Engine()
if engine == nil {
return nil, fmt.Errorf("not connected")
@@ -199,10 +191,6 @@ func (s *Server) DeselectNetworks(_ context.Context, req *proto.SelectNetworksRe
return nil, gstatus.Errorf(codes.Unavailable, errNetworksDisabled)
}
if s.connectClient == nil {
return nil, fmt.Errorf("not connected")
}
engine := s.connectClient.Engine()
if engine == nil {
return nil, fmt.Errorf("not connected")

View File

@@ -8,12 +8,10 @@ import (
"os"
"os/exec"
"runtime"
"strconv"
"sync"
"sync/atomic"
"time"
"github.com/cenkalti/backoff/v4"
log "github.com/sirupsen/logrus"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
"google.golang.org/grpc/codes"
@@ -39,15 +37,7 @@ import (
)
const (
probeThreshold = time.Second * 5
retryInitialIntervalVar = "NB_CONN_RETRY_INTERVAL_TIME"
maxRetryIntervalVar = "NB_CONN_MAX_RETRY_INTERVAL_TIME"
maxRetryTimeVar = "NB_CONN_MAX_RETRY_TIME_TIME"
retryMultiplierVar = "NB_CONN_RETRY_MULTIPLIER"
defaultInitialRetryTime = 30 * time.Minute
defaultMaxRetryInterval = 60 * time.Minute
defaultMaxRetryTime = 14 * 24 * time.Hour
defaultRetryMultiplier = 1.7
probeThreshold = time.Second * 5
// JWT token cache TTL for the client daemon (disabled by default)
defaultJWTCacheTTL = 0
@@ -72,15 +62,8 @@ type Server struct {
mutex sync.Mutex
config *profilemanager.Config
proto.UnimplementedDaemonServiceServer
// clientRunning tracks "the daemon wants to be connected" — set true by
// Start / Up, cleared by Down / Logout. Persists across retry
// loops, signal disconnects, and ErrResetConnection cycles. NOT
// changed by connectWithRetryRuns goroutine exit — for that
// (goroutine-still-alive) check, see connectionGoroutineRunning() which
// derives from clientGiveUpChan close state. Protected by s.mutex.
clientRunning bool
clientRunningChan chan struct{}
clientGiveUpChan chan struct{} // closed when connectWithRetryRuns goroutine exits
// Run state (in-flight? established/done channels?) is owned entirely by the
// supervisor inside connectClient — the daemon keeps no per-run fields.
connectClient *internal.ConnectClient
@@ -136,6 +119,13 @@ func New(ctx context.Context, logFile string, configFile string, profilesDisable
networksDisabled: networksDisabled,
jwtCache: newJWTCache(),
}
// The ConnectClient is daemon-lifetime: build it exactly once, here. Its
// supervisor lives as long as the daemon; Up/Down/MDM and reconnects all
// drive this same instance. updateManager isn't ready yet (created in
// Start) and is injected there via SetUpdateManager.
s.connectClient = internal.NewConnectClient(ctx, s.statusRecorder)
s.connectClient.SetSyncResponsePersistence(s.persistSyncResponse)
agent := &serverAgent{s}
s.sleepHandler = sleephandler.New(agent)
s.startSleepDetector()
@@ -147,7 +137,7 @@ func (s *Server) Start() error {
s.mutex.Lock()
defer s.mutex.Unlock()
if s.clientRunning {
if s.connectClient.ConnectionRunning() {
return nil
}
@@ -165,6 +155,7 @@ func (s *Server) Start() error {
stateMgr := statemanager.New(s.profileManager.GetStatePath())
s.updateManager = updater.NewManager(s.statusRecorder, stateMgr)
s.updateManager.CheckUpdateSuccess(s.rootCtx)
s.connectClient.SetUpdateManager(s.updateManager)
}
// MDM policy reload ticker: every minute the desktop daemon re-reads
@@ -190,7 +181,9 @@ func (s *Server) Start() error {
return nil
}
ctx, cancel := context.WithCancel(s.rootCtx)
// actCancel cancels in-flight foreground operations (login/status); the run
// itself is owned by the supervisor and stopped via Stop, not this cancel.
_, cancel := context.WithCancel(s.rootCtx)
s.actCancel = cancel
// copy old default config
@@ -232,99 +225,14 @@ func (s *Server) Start() error {
return nil
}
s.clientRunning = true
s.clientRunningChan = make(chan struct{})
s.clientGiveUpChan = make(chan struct{})
go s.connectWithRetryRuns(ctx, config, s.statusRecorder, s.clientRunningChan, s.clientGiveUpChan)
// Boot autoconnect: no incoming RPC metadata. The supervisor runs the
// client and reconnects internally; we just fire and forget (the run owns
// its established/done channels).
s.connectClient.RunAsync(config, nil)
s.publishConfigChangedEvent("startup")
return nil
}
// connectWithRetryRuns runs the client connection with a backoff strategy where we retry the operation as additional
// mechanism to keep the client connected even when the connection is lost.
// we cancel retry if the client receive a stop or down command, or if disable auto connect is configured.
//
// The goroutine's exit is signalled to the daemon via close(giveUpChan)
// — placed in the function-scope defer so every return path (panic,
// DisableAutoConnect early-exit, backoff exhausted, ctx cancel) closes
// it. Callers that need to observe "is the goroutine still alive?" use
// Server.connectionGoroutineRunning() which non-blockingly checks the close state
// of clientGiveUpChan. The defer does NOT touch s.mutex; the daemon's
// "intent" (clientRunning) is maintained by the RPC handlers, not by this
// goroutine.
func (s *Server) connectWithRetryRuns(ctx context.Context, profileConfig *profilemanager.Config, statusRecorder *peer.Status, runningChan chan struct{}, giveUpChan chan struct{}) {
defer func() {
if giveUpChan != nil {
close(giveUpChan)
}
}()
if s.config.DisableAutoConnect {
if err := s.connect(ctx, s.config, s.statusRecorder, runningChan); err != nil {
log.Debugf("run client connection exited with error: %v", err)
}
log.Tracef("client connection exited")
return
}
backOff := getConnectWithBackoff(ctx)
go func() {
t := time.NewTicker(24 * time.Hour)
for {
select {
case <-ctx.Done():
t.Stop()
return
case <-t.C:
mgmtState := statusRecorder.GetManagementState()
signalState := statusRecorder.GetSignalState()
if mgmtState.Connected && signalState.Connected {
log.Tracef("resetting status")
backOff.Reset()
} else {
log.Tracef("not resetting status: mgmt: %v, signal: %v", mgmtState.Connected, signalState.Connected)
}
}
}
}()
runOperation := func() error {
err := s.connect(ctx, profileConfig, statusRecorder, runningChan)
if err != nil {
log.Debugf("run client connection exited with error: %v. Will retry in the background", err)
return err
}
log.Tracef("client connection exited gracefully, do not need to retry")
return nil
}
if err := backoff.Retry(runOperation, backOff); err != nil {
log.Errorf("operation failed: %v", err)
}
// giveUpChan is closed by the function-scope defer.
}
// connectionGoroutineRunning reports whether the connectWithRetryRuns goroutine is
// still running. Returns false when no goroutine has ever been started
// AND when the most recent one has already closed clientGiveUpChan on
// exit (whether due to ctx cancel, DisableAutoConnect single-shot
// completion, or backoff retry exhaustion).
//
// MUST be called with s.mutex held — accesses s.clientGiveUpChan which
// is written by Start/Up under the same lock.
func (s *Server) connectionGoroutineRunning() bool {
if s.clientGiveUpChan == nil {
return false
}
select {
case <-s.clientGiveUpChan:
return false
default:
return true
}
}
// loginAttempt attempts to login using the provided information. it returns a status in case something fails
func (s *Server) loginAttempt(ctx context.Context, setupKey, jwtToken string) (internal.StatusType, error) {
authClient, err := auth.NewAuth(ctx, s.config.PrivateKey, s.config.ManagementURL, s.config)
@@ -375,7 +283,7 @@ func (s *Server) SetConfig(callerCtx context.Context, msg *proto.SetConfigReques
return nil, err
}
config, err := setConfigInputFromRequest(msg)
config, err := s.setConfigInputFromRequest(msg)
if err != nil {
return nil, err
}
@@ -398,17 +306,18 @@ func (s *Server) SetConfig(callerCtx context.Context, msg *proto.SetConfigReques
// field is its own optional case. Returns the resolved ConfigInput
// and a non-nil error only when the active profile file path cannot
// be determined.
func setConfigInputFromRequest(msg *proto.SetConfigRequest) (profilemanager.ConfigInput, error) {
func (s *Server) setConfigInputFromRequest(msg *proto.SetConfigRequest) (profilemanager.ConfigInput, error) {
var config profilemanager.ConfigInput
resolved, err := s.resolveProfileHandle(msg.ProfileName, msg.Username)
if err != nil {
log.Errorf("failed to resolve profile %q: %v", msg.ProfileName, err)
return nil, err
return config, err
}
profPath := resolved.Path
if profPath == "" {
profPath = profilemanager.DefaultConfigPath
}
config.ConfigPath = profPath
if msg.ManagementUrl != "" {
@@ -719,13 +628,22 @@ func (s *Server) WaitSSOLogin(callerCtx context.Context, msg *proto.WaitSSOLogin
// Up starts engine work in the daemon.
func (s *Server) Up(callerCtx context.Context, msg *proto.UpRequest) (*proto.UpResponse, error) {
s.mutex.Lock()
// clientRunning is the daemon-intent flag (set by previous Up/Start, cleared
// by Down). connectionGoroutineRunning() reports whether the previous retry-loop
// goroutine is still trying. When intent is up AND goroutine is alive,
// the existing engine is on the job — just wait for it. When intent
// is up but the goroutine has given up (backoff exhausted) OR when
// intent is down, fall through to spawn a fresh retry loop.
if s.clientRunning && s.connectionGoroutineRunning() {
// The client (and its supervisor) is built once in New(), so a nil here
// never happens in production — Up is only reachable after New() has run and
// the gRPC server is serving. The real case this guards is the daemon
// SHUTTING DOWN: rootCtx is cancelled, the supervisor is no longer accepting
// commands, so ServiceRunning() is false even though the client exists. Bail
// loud instead of enqueuing a run that will never start. (nil only happens in
// tests that build a Server without New(); ServiceRunning is nil-safe.)
if !s.connectClient.ServiceRunning() {
s.mutex.Unlock()
return nil, fmt.Errorf("service is not running, start the netbird service for 'up' to take effect")
}
// If a connection run is already in flight, the existing engine is on the
// job — just wait for it. Otherwise fall through to start a fresh run.
if s.connectClient.ConnectionRunning() {
state := internal.CtxGetState(s.rootCtx)
status, err := state.Status()
if err != nil {
@@ -763,14 +681,14 @@ func (s *Server) Up(callerCtx context.Context, msg *proto.UpRequest) (*proto.UpR
if s.actCancel != nil {
s.actCancel()
}
ctx, cancel := context.WithCancel(s.rootCtx)
md, ok := metadata.FromIncomingContext(callerCtx)
if ok {
ctx = metadata.NewOutgoingContext(ctx, md)
}
// actCancel cancels in-flight foreground ops (login/status); the run is
// owned by the supervisor and stopped via Stop, not this cancel.
_, cancel := context.WithCancel(s.rootCtx)
s.actCancel = cancel
// Forward the caller's gRPC metadata (e.g. UI user-agent) into the run.
md, _ := metadata.FromIncomingContext(callerCtx)
if s.config == nil {
s.mutex.Unlock()
return nil, fmt.Errorf("config is not defined, please call login command first")
@@ -811,35 +729,26 @@ func (s *Server) Up(callerCtx context.Context, msg *proto.UpRequest) (*proto.UpR
s.statusRecorder.UpdateManagementAddress(s.config.ManagementURL.String())
s.statusRecorder.UpdateRosenpass(s.config.RosenpassEnabled, s.config.RosenpassPermissive)
s.clientRunning = true
s.clientRunningChan = make(chan struct{})
s.clientGiveUpChan = make(chan struct{})
go s.connectWithRetryRuns(ctx, s.config, s.statusRecorder, s.clientRunningChan, s.clientGiveUpChan)
s.connectClient.RunAsync(s.config, md)
s.publishConfigChangedEvent("up_rpc")
s.mutex.Unlock()
return s.waitForUp(callerCtx)
}
// todo: handle potential race conditions
// waitForUp blocks until the in-flight run becomes established (success) or ends
// before that (failure). The wait is owned by the supervisor (via the client) —
// the daemon holds no per-run state here.
func (s *Server) waitForUp(callerCtx context.Context) (*proto.UpResponse, error) {
timeoutCtx, cancel := context.WithTimeout(callerCtx, 50*time.Second)
defer cancel()
select {
case <-s.clientGiveUpChan:
return nil, fmt.Errorf("client gave up to connect")
case <-s.clientRunningChan:
s.isSessionActive.Store(true)
return &proto.UpResponse{}, nil
case <-callerCtx.Done():
log.Debug("context done, stopping the wait for engine to become ready")
return nil, callerCtx.Err()
case <-timeoutCtx.Done():
log.Debug("up is timed out, stopping the wait for engine to become ready")
return nil, timeoutCtx.Err()
if err := s.connectClient.WaitEstablishedOrDone(timeoutCtx); err != nil {
log.Debugf("waiting for the connection to be established failed: %v", err)
return nil, fmt.Errorf("connection not established: %w", err)
}
s.isSessionActive.Store(true)
return &proto.UpResponse{}, nil
}
// resolveProfileHandle resolves a wire-level profile handle (display
@@ -934,11 +843,11 @@ func (s *Server) SwitchProfile(callerCtx context.Context, msg *proto.SwitchProfi
// Down engine work in the daemon.
func (s *Server) Down(ctx context.Context, _ *proto.DownRequest) (*proto.DownResponse, error) {
s.mutex.Lock()
defer s.mutex.Unlock()
giveUpChan := s.clientGiveUpChan
// cleanupConnection stops the run through the supervisor, which blocks until
// the run has fully unwound — no separate goroutine-quiescence wait needed.
if err := s.cleanupConnection(); err != nil {
s.mutex.Unlock()
// todo review to update the status in case any type of error
log.Errorf("failed to shut down properly: %v", err)
return nil, err
@@ -947,20 +856,6 @@ func (s *Server) Down(ctx context.Context, _ *proto.DownRequest) (*proto.DownRes
state := internal.CtxGetState(s.rootCtx)
state.Set(internal.StatusIdle)
s.mutex.Unlock()
// Wait for the connectWithRetryRuns goroutine to finish with a short timeout.
// This prevents the goroutine from setting ErrResetConnection after Down() returns.
// The giveUpChan is closed at the end of connectWithRetryRuns.
if giveUpChan != nil {
select {
case <-giveUpChan:
log.Debugf("client goroutine finished successfully")
case <-time.After(5 * time.Second):
log.Warnf("timeout waiting for client goroutine to finish, proceeding anyway")
}
}
return &proto.DownResponse{}, nil
}
@@ -971,34 +866,19 @@ func (s *Server) cleanupConnection() error {
return ErrServiceNotUp
}
// Daemon intent flips to "down" — all callers (Down RPC,
// Logout RPC handlers) tear down the connection because the user
// explicitly asked for it. MDM restart does NOT go through this
// path, so its clientRunning stays true.
s.clientRunning = false
// Capture the engine reference before cancelling the context.
// After actCancel(), the connectWithRetryRuns goroutine wakes up
// and sets connectClient.engine = nil, causing connectClient.Stop()
// to skip the engine shutdown entirely.
var engine *internal.Engine
if s.connectClient != nil {
engine = s.connectClient.Engine()
// Tear the client down through the lifecycle supervisor BEFORE cancelling
// the retry context. Stop serializes on the supervisor queue and blocks
// until the in-flight run has fully unwound (a clean, synchronous teardown).
// It must run before actCancel: cancelling the context first would make
// Stop observe a dead context and return early without waiting.
if err := s.connectClient.Stop(); err != nil {
return err
}
// Stop the retry goroutine so it does not start a fresh run. The client
// itself is daemon-lifetime and intentionally kept (a later Up reuses it).
s.actCancel()
if s.connectClient == nil {
return nil
}
if engine != nil {
if err := engine.Stop(); err != nil {
return err
}
}
s.connectClient = nil
s.isSessionActive.Store(false)
log.Infof("service is down")
@@ -1133,7 +1013,7 @@ func (s *Server) validateProfileOperation(id profilemanager.ID, allowActiveProfi
func (s *Server) logoutFromProfile(ctx context.Context, profile *profilemanager.Profile) error {
activeProf, err := s.profileManager.GetActiveProfileState()
if err == nil && activeProf.ID == profile.ID && s.connectClient != nil {
if err == nil && activeProf.ID == profile.ID && s.connectClient.ConnectionRunning() {
return s.sendLogoutRequest(ctx)
}
@@ -1179,48 +1059,13 @@ func (s *Server) Status(
ctx context.Context,
msg *proto.StatusRequest,
) (*proto.StatusResponse, error) {
s.mutex.Lock()
// Only wait if the retry-loop goroutine is alive and making
// progress. clientRunning=true with connectionGoroutineRunning=false means the
// backoff has given up — there is nothing to wait for; let the
// caller observe the failed status directly.
alive := s.connectionGoroutineRunning()
s.mutex.Unlock()
if msg.WaitForReady != nil && *msg.WaitForReady && alive {
state := internal.CtxGetState(s.rootCtx)
status, err := state.Status()
if err != nil {
return nil, err
}
if status != internal.StatusIdle && status != internal.StatusConnected && status != internal.StatusConnecting {
s.actCancel()
}
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
loop:
for {
select {
case <-s.clientGiveUpChan:
ticker.Stop()
break loop
case <-s.clientRunningChan:
ticker.Stop()
break loop
case <-ticker.C:
status, err := state.Status()
if err != nil {
continue
}
if status != internal.StatusIdle && status != internal.StatusConnected && status != internal.StatusConnecting {
s.actCancel()
}
continue
case <-ctx.Done():
return nil, ctx.Err()
}
// A run that hits a terminal auth failure now exits on its own (engine marks
// NeedsLogin), so we no longer poll-and-cancel: we wait for the in-flight run
// to become established or to end. With no run in flight this returns
// immediately (errNoRunInFlight); either way we then report the status below.
if msg.WaitForReady != nil && *msg.WaitForReady {
if err := s.connectClient.WaitEstablishedOrDone(ctx); err != nil && ctx.Err() != nil {
return nil, ctx.Err()
}
}
@@ -1258,10 +1103,6 @@ func (s *Server) getSSHServerState() *proto.SSHServerState {
connectClient := s.connectClient
s.mutex.Unlock()
if connectClient == nil {
return nil
}
engine := connectClient.Engine()
if engine == nil {
return nil
@@ -1299,10 +1140,6 @@ func (s *Server) GetPeerSSHHostKey(
statusRecorder := s.statusRecorder
s.mutex.Unlock()
if connectClient == nil {
return nil, errors.New("client not initialized")
}
engine := connectClient.Engine()
if engine == nil {
return nil, errors.New("engine not started")
@@ -1469,17 +1306,13 @@ func (s *Server) WaitJWTToken(
// ExposeService exposes a local port via the NetBird reverse proxy.
func (s *Server) ExposeService(req *proto.ExposeServiceRequest, srv proto.DaemonService_ExposeServiceServer) error {
s.mutex.Lock()
if !s.clientRunning {
if !s.connectClient.ConnectionRunning() {
s.mutex.Unlock()
return gstatus.Errorf(codes.FailedPrecondition, "client is not running, run 'netbird up' first")
}
connectClient := s.connectClient
s.mutex.Unlock()
if connectClient == nil {
return gstatus.Errorf(codes.FailedPrecondition, "client not initialized")
}
engine := connectClient.Engine()
if engine == nil {
return gstatus.Errorf(codes.FailedPrecondition, "engine not initialized")
@@ -1533,10 +1366,6 @@ func isUnixRunningDesktop() bool {
}
func (s *Server) runProbes(waitForProbeResult bool) {
if s.connectClient == nil {
return
}
engine := s.connectClient.Engine()
if engine == nil {
return
@@ -1815,22 +1644,6 @@ func (s *Server) GetFeatures(ctx context.Context, msg *proto.GetFeaturesRequest)
return features, nil
}
func (s *Server) connect(ctx context.Context, config *profilemanager.Config, statusRecorder *peer.Status, runningChan chan struct{}) error {
log.Tracef("running client connection")
client := internal.NewConnectClient(ctx, config, statusRecorder)
client.SetUpdateManager(s.updateManager)
client.SetSyncResponsePersistence(s.persistSyncResponse)
s.mutex.Lock()
s.connectClient = client
s.mutex.Unlock()
if err := client.Run(runningChan, s.logFile); err != nil {
return err
}
return nil
}
// MDM authority: when the platform-native MDM source sets a kill switch
// key (regardless of true/false value), that value wins. The CLI flag
// supplied at service install time is the fallback used only when the
@@ -1892,45 +1705,6 @@ func (s *Server) onSessionExpire() {
}
}
// getConnectWithBackoff returns a backoff with exponential backoff strategy for connection retries
func getConnectWithBackoff(ctx context.Context) backoff.BackOff {
initialInterval := parseEnvDuration(retryInitialIntervalVar, defaultInitialRetryTime)
maxInterval := parseEnvDuration(maxRetryIntervalVar, defaultMaxRetryInterval)
maxElapsedTime := parseEnvDuration(maxRetryTimeVar, defaultMaxRetryTime)
multiplier := defaultRetryMultiplier
if envValue := os.Getenv(retryMultiplierVar); envValue != "" {
// parse the multiplier from the environment variable string value to float64
value, err := strconv.ParseFloat(envValue, 64)
if err != nil {
log.Warnf("unable to parse environment variable %s: %s. using default: %f", retryMultiplierVar, envValue, multiplier)
} else {
multiplier = value
}
}
return backoff.WithContext(&backoff.ExponentialBackOff{
InitialInterval: initialInterval,
RandomizationFactor: 1,
Multiplier: multiplier,
MaxInterval: maxInterval,
MaxElapsedTime: maxElapsedTime, // 14 days
Stop: backoff.Stop,
Clock: backoff.SystemClock,
}, ctx)
}
// parseEnvDuration parses the environment variable and returns the duration
func parseEnvDuration(envVar string, defaultDuration time.Duration) time.Duration {
if envValue := os.Getenv(envVar); envValue != "" {
if duration, err := time.ParseDuration(envValue); err == nil {
return duration
}
log.Warnf("unable to parse environment variable %s: %s. using default: %s", envVar, envValue, defaultDuration)
}
return defaultDuration
}
// sendTerminalNotification sends a terminal notification message
// to inform the user that the NetBird connection session has expired.
func sendTerminalNotification() error {

View File

@@ -15,14 +15,19 @@ import (
)
func newTestServer() *Server {
return &Server{
rootCtx: context.Background(),
ctx := context.Background()
s := &Server{
rootCtx: ctx,
statusRecorder: peer.NewRecorder(""),
}
// Honor the production invariant: the daemon-lifetime client always exists
// (built in New). Server methods rely on s.connectClient being non-nil.
s.connectClient = internal.NewConnectClient(ctx, s.statusRecorder)
return s
}
func newDummyConnectClient(ctx context.Context) *internal.ConnectClient {
return internal.NewConnectClient(ctx, nil, nil)
return internal.NewConnectClient(ctx, nil)
}
// TestConnectSetsClientWithMutex validates that connect() sets s.connectClient
@@ -87,41 +92,36 @@ func TestConcurrentConnectClientAccess(t *testing.T) {
assert.Equal(t, 50, nilCount+setCount, "all goroutines should complete without panic")
}
// TestCleanupConnection_ClearsConnectClient validates that cleanupConnection
// properly nils out connectClient.
func TestCleanupConnection_ClearsConnectClient(t *testing.T) {
// TestCleanupConnection_KeepsClientStopsRunning validates that cleanupConnection
// clears the daemon "up" intent but KEEPS the daemon-lifetime ConnectClient
// (it is reused across Up/Down; only the run is stopped).
func TestCleanupConnection_KeepsClientStopsRunning(t *testing.T) {
s := newTestServer()
_, cancel := context.WithCancel(context.Background())
s.actCancel = cancel
s.connectClient = newDummyConnectClient(context.Background())
s.clientRunning = true
err := s.cleanupConnection()
require.NoError(t, err)
assert.Nil(t, s.connectClient, "connectClient should be nil after cleanup")
assert.False(t, s.clientRunning, "clientRunning should be cleared after cleanup (intent = down)")
assert.NotNil(t, s.connectClient, "connectClient is daemon-lifetime and must persist after cleanup")
assert.False(t, s.connectClient.ConnectionRunning(), "no run should be in flight after cleanup")
}
// TestCleanState_NilConnectClient validates that CleanState doesn't panic
// when connectClient is nil.
func TestCleanState_NilConnectClient(t *testing.T) {
// TestCleanState_NotConnected validates that CleanState doesn't panic when no
// connection run is in flight.
func TestCleanState_NotConnected(t *testing.T) {
s := newTestServer()
s.connectClient = nil
s.profileManager = nil // will cause error if it tries to proceed past the nil check
s.profileManager = nil // will cause error if it tries to proceed
// Should not panic — the nil check should prevent calling Status() on nil
assert.NotPanics(t, func() {
_, _ = s.CleanState(context.Background(), &proto.CleanStateRequest{All: true})
})
}
// TestDeleteState_NilConnectClient validates that DeleteState doesn't panic
// when connectClient is nil.
func TestDeleteState_NilConnectClient(t *testing.T) {
// TestDeleteState_NotConnected validates that DeleteState doesn't panic when no
// connection run is in flight.
func TestDeleteState_NotConnected(t *testing.T) {
s := newTestServer()
s.connectClient = nil
s.profileManager = nil
assert.NotPanics(t, func() {
@@ -129,60 +129,6 @@ func TestDeleteState_NilConnectClient(t *testing.T) {
})
}
// TestDownThenUp_StaleRunningChan documents the known state issue where
// clientRunningChan from a previous connection is already closed, causing
// waitForUp() to return immediately on reconnect.
func TestDownThenUp_StaleRunningChan(t *testing.T) {
s := newTestServer()
// Simulate state after a successful connection
s.clientRunning = true
s.clientRunningChan = make(chan struct{})
close(s.clientRunningChan) // closed when engine started
s.clientGiveUpChan = make(chan struct{})
s.connectClient = newDummyConnectClient(context.Background())
_, cancel := context.WithCancel(context.Background())
s.actCancel = cancel
// Simulate Down(): cleanupConnection sets connectClient = nil and
// flips clientRunning to false (intent = down). The connectionGoroutineRunning state
// remains independent of intent — derived from clientGiveUpChan.
s.mutex.Lock()
err := s.cleanupConnection()
s.mutex.Unlock()
require.NoError(t, err)
// After cleanup: connectClient is nil, clientRunning is false (intent
// cleared by cleanupConnection), connectionGoroutineRunning may still be true
// (goroutine teardown is independent of the intent flag).
s.mutex.Lock()
assert.Nil(t, s.connectClient, "connectClient should be nil after cleanup")
assert.False(t, s.clientRunning, "clientRunning should be cleared by cleanupConnection (intent = down)")
s.mutex.Unlock()
// waitForUp() returns immediately due to stale closed clientRunningChan
ctx, ctxCancel := context.WithTimeout(context.Background(), 2*time.Second)
defer ctxCancel()
waitDone := make(chan error, 1)
go func() {
_, err := s.waitForUp(ctx)
waitDone <- err
}()
select {
case err := <-waitDone:
assert.NoError(t, err, "waitForUp returns success on stale channel")
// But connectClient is still nil — this is the stale state issue
s.mutex.Lock()
assert.Nil(t, s.connectClient, "connectClient is nil despite waitForUp success")
s.mutex.Unlock()
case <-time.After(1 * time.Second):
t.Fatal("waitForUp should have returned immediately due to stale closed channel")
}
}
// TestConnectClient_EngineNilOnFreshClient validates that a newly created
// ConnectClient has nil Engine (before Run is called).
func TestConnectClient_EngineNilOnFreshClient(t *testing.T) {

View File

@@ -31,7 +31,6 @@ import (
"google.golang.org/grpc/keepalive"
"github.com/netbirdio/netbird/client/internal"
"github.com/netbirdio/netbird/client/internal/peer"
"github.com/netbirdio/netbird/client/internal/profilemanager"
daemonProto "github.com/netbirdio/netbird/client/proto"
"github.com/netbirdio/netbird/management/server"
@@ -61,65 +60,6 @@ var (
}
)
// TestConnectWithRetryRuns checks that the connectWithRetry function runs and runs the retries according to the times specified via environment variables
// we will use a management server started via to simulate the server and capture the number of retries
func TestConnectWithRetryRuns(t *testing.T) {
// start the signal server
_, signalAddr, err := startSignal(t)
if err != nil {
t.Fatalf("failed to start signal server: %v", err)
}
counter := 0
// start the management server
_, mgmtAddr, err := startManagement(t, signalAddr, &counter)
if err != nil {
t.Fatalf("failed to start management server: %v", err)
}
ctx := internal.CtxInitState(context.Background())
ctx, cancel := context.WithDeadline(ctx, time.Now().Add(30*time.Second))
defer cancel()
// create new server
ic := profilemanager.ConfigInput{
ManagementURL: "http://" + mgmtAddr,
ConfigPath: t.TempDir() + "/test-profile.json",
}
config, err := profilemanager.UpdateOrCreateConfig(ic)
if err != nil {
t.Fatalf("failed to create config: %v", err)
}
currUser, err := user.Current()
require.NoError(t, err)
pm := profilemanager.ServiceManager{}
err = pm.SetActiveProfileState(&profilemanager.ActiveProfileState{
ID: "test-profile",
Username: currUser.Username,
})
if err != nil {
t.Fatalf("failed to set active profile state: %v", err)
}
s := New(ctx, "debug", "", false, false, false, false)
s.config = config
s.statusRecorder = peer.NewRecorder(config.ManagementURL.String())
t.Setenv(retryInitialIntervalVar, "1s")
t.Setenv(maxRetryIntervalVar, "2s")
t.Setenv(maxRetryTimeVar, "5s")
t.Setenv(retryMultiplierVar, "1")
s.connectWithRetryRuns(ctx, config, s.statusRecorder, nil, nil)
if counter < 3 {
t.Fatalf("expected counter > 2, got %d", counter)
}
}
func TestServer_Up(t *testing.T) {
tempDir := t.TempDir()
origDefaultProfileDir := profilemanager.DefaultConfigPathDir

View File

@@ -62,7 +62,7 @@ func setupServerWithProfile(t *testing.T) (s *Server, ctx context.Context, profN
pm := profilemanager.ServiceManager{}
require.NoError(t, pm.SetActiveProfileState(&profilemanager.ActiveProfileState{
Name: profName,
ID: profilemanager.ID(profName),
Username: currUser.Username,
}))
@@ -107,9 +107,9 @@ func TestSetConfig_MDMReject_SingleField(t *testing.T) {
func TestSetConfig_MDMReject_MultipleFields(t *testing.T) {
withMDMPolicy(t, mdm.NewPolicy(map[string]any{
mdm.KeyManagementURL: "https://mdm.example.com:443",
mdm.KeyBlockInbound: true,
mdm.KeyRosenpassEnabled: true,
mdm.KeyManagementURL: "https://mdm.example.com:443",
mdm.KeyBlockInbound: true,
mdm.KeyRosenpassEnabled: true,
}))
s, ctx, profName, username, _ := setupServerWithProfile(t)

View File

@@ -9,7 +9,6 @@ import (
"google.golang.org/grpc/status"
nberrors "github.com/netbirdio/netbird/client/errors"
"github.com/netbirdio/netbird/client/internal"
"github.com/netbirdio/netbird/client/internal/routemanager/systemops"
"github.com/netbirdio/netbird/client/internal/statemanager"
"github.com/netbirdio/netbird/client/proto"
@@ -38,7 +37,7 @@ func (s *Server) ListStates(_ context.Context, _ *proto.ListStatesRequest) (*pro
// CleanState handles cleaning of states (performing cleanup operations)
func (s *Server) CleanState(ctx context.Context, req *proto.CleanStateRequest) (*proto.CleanStateResponse, error) {
if s.connectClient != nil && (s.connectClient.Status() == internal.StatusConnected || s.connectClient.Status() == internal.StatusConnecting) {
if s.connectClient.ConnectionRunning() {
return nil, status.Errorf(codes.FailedPrecondition, "cannot clean state while connecting or connected, run 'netbird down' first.")
}
@@ -81,7 +80,7 @@ func (s *Server) CleanState(ctx context.Context, req *proto.CleanStateRequest) (
// DeleteState handles deletion of states without cleanup
func (s *Server) DeleteState(ctx context.Context, req *proto.DeleteStateRequest) (*proto.DeleteStateResponse, error) {
if s.connectClient != nil && (s.connectClient.Status() == internal.StatusConnected || s.connectClient.Status() == internal.StatusConnecting) {
if s.connectClient.ConnectionRunning() {
return nil, status.Errorf(codes.FailedPrecondition, "cannot clean state while connecting or connected, run 'netbird down' first.")
}

View File

@@ -62,10 +62,6 @@ func (s *Server) TracePacket(_ context.Context, req *proto.TracePacketRequest) (
}
func (s *Server) getPacketTracer() (packetTracer, *internal.Engine, error) {
if s.connectClient == nil {
return nil, nil, fmt.Errorf("connect client not initialized")
}
engine := s.connectClient.Engine()
if engine == nil {
return nil, nil, fmt.Errorf("engine not initialized")

View File

@@ -98,6 +98,7 @@ type RelayStateOutputDetail struct {
URI string `json:"uri" yaml:"uri"`
Available bool `json:"available" yaml:"available"`
Error string `json:"error" yaml:"error"`
Transport string `json:"transport,omitempty" yaml:"transport,omitempty"`
}
type RelayStateOutput struct {
@@ -219,7 +220,8 @@ func mapRelays(relays []*proto.RelayState) RelayStateOutput {
RelayStateOutputDetail{
URI: relay.URI,
Available: available,
Error: relay.GetError(),
Error: relayErrorString(relay.GetError()),
Transport: relay.GetTransport(),
},
)
@@ -235,6 +237,12 @@ func mapRelays(relays []*proto.RelayState) RelayStateOutput {
}
}
// relayErrorString flattens a newline-joined aggregated relay error onto a
// single line for status output.
func relayErrorString(s string) string {
return strings.ReplaceAll(s, "\n", "; ")
}
func mapNSGroups(servers []*proto.NSGroupState) []NsServerGroupStateOutput {
mappedNSGroups := make([]NsServerGroupStateOutput, 0, len(servers))
for _, pbNsGroupServer := range servers {
@@ -441,6 +449,8 @@ func (o *OutputOverview) GeneralSummary(showURL bool, showRelays bool, showNameS
available = "Unavailable"
reason = fmt.Sprintf(", reason: %s", relay.Error)
}
} else if relay.Transport != "" {
available = fmt.Sprintf("%s via %s", available, relay.Transport)
}
relaysString += fmt.Sprintf("\n [%s] is %s%s", relay.URI, available, reason)

View File

@@ -647,3 +647,13 @@ func TestTimeAgo(t *testing.T) {
})
}
}
func TestMapRelaysTransport(t *testing.T) {
out := mapRelays([]*proto.RelayState{
{URI: "rels://relay.example:443", Available: true, Transport: "quic"},
{URI: "rels://relay2.example:443", Available: true, Transport: "ws"},
})
require.Len(t, out.Details, 2)
assert.Equal(t, "quic", out.Details[0].Transport)
assert.Equal(t, "ws", out.Details[1].Transport)
}

View File

@@ -2,4 +2,5 @@ FROM ubuntu:24.04
RUN apt update && apt install -y ca-certificates && rm -fr /var/cache/apt
ENTRYPOINT [ "/go/bin/netbird-server" ]
CMD ["--config", "/etc/netbird/config.yaml"]
COPY netbird-server /go/bin/netbird-server
ARG TARGETPLATFORM
COPY ${TARGETPLATFORM}/netbird-server /go/bin/netbird-server

View File

@@ -0,0 +1,56 @@
# Build environments
Dockerfiles that pin the same toolchain CI uses, so a developer can
reproduce a CI build locally without installing platform SDKs on their
workstation. The version pins in each `Dockerfile` must stay in lockstep
with `.github/workflows/`.
## `android/`
Mirrors `.github/workflows/mobile-build-validation.yml` (`android_build`
job). Carries Go 1.25.5, Adopt JDK 11, Android cmdline-tools 8512546,
NDK 23.1.7779620 and gomobile pinned at the CI commit. Use it to
produce `netbird.aar` from `./client/android`:
```bash
docker build -t netbird/build-android docker/build-env/android
docker run --rm -v "$PWD:/src" -w /src netbird/build-android \
gomobile bind \
-o netbird.aar \
-javapkg=io.netbird.gomobile \
-ldflags="-checklinkname=0 \
-X golang.zx2c4.com/wireguard/ipc.socketDirectory=/data/data/io.netbird.client/cache/wireguard \
-X github.com/netbirdio/netbird/version.version=local" \
./client/android
```
To build the full Android APK, bind-mount the `android-client` repo as
well and run its own `./gradlew assembleDebug` from inside the
container (the gradle wrapper ships with `android-client`).
## `windows-cross/`
Cross-compiles Windows binaries from Linux using `mingw-w64`. Lets you
verify that `GOOS=windows go build ./...` compiles cleanly without
needing a Windows VM. Cannot run Windows tests — the `golang-test-windows`
CI job executes on a native `windows-latest` runner with wintun.dll
and PsExec, neither of which lives under Linux containers.
```bash
docker build -t netbird/build-windows docker/build-env/windows-cross
docker run --rm -v "$PWD:/src" -w /src netbird/build-windows \
bash -c 'GOOS=windows GOARCH=amd64 go build ./...'
```
## What is NOT here
- **iOS / macOS**: cannot legally run macOS in Docker (Apple EULA),
and Xcode is not redistributable. The `ios_build` CI job uses a
`macos-latest` GitHub runner; locally you need a real Mac.
- **Native Windows tests**: see note above. The Linux+mingw image
builds, it does not execute Windows-host code paths
(registry, wintun, services, PsExec workflows).
When CI version pins change, update the corresponding `ARG` lines in
the Dockerfiles and the README's table of versions.

View File

@@ -0,0 +1,86 @@
# Android build environment.
#
# Mirrors the toolchain pinned by .github/workflows/mobile-build-validation.yml
# so a `gomobile bind` against ./client/android in this image produces the
# same netbird.aar that CI builds.
#
# Tooling versions (must stay in sync with the CI workflow):
# - Ubuntu 22.04 (matches the ubuntu-latest GitHub runner)
# - Go 1.25.5 (matches go.mod)
# - Adopt JDK 11 (matches actions/setup-java@v3 java-version: 11, distribution: adopt)
# - Android SDK cmdline-tools 8512546
# - Android NDK 23.1.7779620
# - gomobile commit v0.0.0-20251113184115-a159579294ab
#
# Usage (from the netbird repo root):
#
# docker build -t netbird/build-android docker/build-env/android
#
# # bind the netbird checkout in and run the same gomobile command CI runs
# docker run --rm -v "$PWD:/src" -w /src netbird/build-android \
# gomobile bind \
# -o netbird.aar \
# -javapkg=io.netbird.gomobile \
# -ldflags="-checklinkname=0 \
# -X golang.zx2c4.com/wireguard/ipc.socketDirectory=/data/data/io.netbird.client/cache/wireguard \
# -X github.com/netbirdio/netbird/version.version=local" \
# ./client/android
#
# To build the full APK, mount the android-client repo too and run
# `./gradlew assembleDebug` from /android-client (this image carries
# gradle's prerequisites JDK + Android SDK but not the gradle wrapper —
# that ships with android-client).
FROM ubuntu:22.04
ARG DEBIAN_FRONTEND=noninteractive
# Versions — bump in lockstep with .github/workflows/mobile-build-validation.yml.
ARG GO_VERSION=1.25.5
ARG ANDROID_CMDLINE_TOOLS_VERSION=8512546
ARG ANDROID_NDK_VERSION=23.1.7779620
ARG GOMOBILE_VERSION=v0.0.0-20251113184115-a159579294ab
ENV ANDROID_HOME=/opt/android-sdk
ENV ANDROID_NDK_HOME=${ANDROID_HOME}/ndk/${ANDROID_NDK_VERSION}
ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ENV GOPATH=/go
ENV GOTOOLCHAIN=local
ENV CGO_ENABLED=0
ENV PATH=${GOPATH}/bin:/usr/local/go/bin:${ANDROID_HOME}/cmdline-tools/latest/bin:${ANDROID_HOME}/platform-tools:${JAVA_HOME}/bin:${PATH}
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
curl \
unzip \
git \
openjdk-11-jdk-headless \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Install Go (matches go.mod). actions/setup-go fetches the same tarball.
RUN curl -fsSL "https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz" \
| tar -C /usr/local -xz \
&& go version
# Install Android SDK command-line tools, accept licenses, install NDK.
RUN mkdir -p "${ANDROID_HOME}/cmdline-tools" \
&& curl -fsSL -o /tmp/cmdline.zip \
"https://dl.google.com/android/repository/commandlinetools-linux-${ANDROID_CMDLINE_TOOLS_VERSION}_latest.zip" \
&& unzip -q /tmp/cmdline.zip -d "${ANDROID_HOME}/cmdline-tools" \
&& mv "${ANDROID_HOME}/cmdline-tools/cmdline-tools" "${ANDROID_HOME}/cmdline-tools/latest" \
&& rm /tmp/cmdline.zip \
&& yes | sdkmanager --licenses > /dev/null \
&& sdkmanager --install "ndk;${ANDROID_NDK_VERSION}" "platform-tools" > /dev/null
# Install gomobile at the same commit CI pins. Don't run `gomobile init` here:
# `init` resolves the NDK at runtime, do it on the first bind in the mounted
# workspace so the cache lands on the host volume.
RUN GOBIN=/usr/local/bin go install "golang.org/x/mobile/cmd/gomobile@${GOMOBILE_VERSION}" \
&& gomobile version
WORKDIR /src
# Default entrypoint is a plain shell so the image is composable: callers pass
# the full gomobile / gradle command they want to run.
CMD ["/bin/bash"]

View File

@@ -0,0 +1,63 @@
# Windows-cross build environment.
#
# Cross-compiles Windows .exe targets from a Linux container using
# mingw-w64. Mirrors the toolchain set used by
# .github/workflows/golang-test-windows.yml insofar as that is possible
# without a Windows kernel.
#
# IMPORTANT — what this image CAN do:
# - `GOOS=windows go build ./...` to validate that Windows builds compile
# - CGO Windows cross-compile via x86_64-w64-mingw32-gcc when CGO_ENABLED=1
# (matches CI's choco-installed mingw-w64)
#
# IMPORTANT — what this image CANNOT do:
# - Run Windows binaries (no Windows kernel under Docker on Linux).
# - Replicate the CI's `go test` runs which execute on a real
# windows-latest runner (wintun.dll, PsExec, registry, etc.).
# Use the CI for that or a native Windows VM.
#
# Usage (from the netbird repo root):
#
# docker build -t netbird/build-windows docker/build-env/windows-cross
#
# # Cross-compile a static client (.exe) from Linux:
# docker run --rm -v "$PWD:/src" -w /src netbird/build-windows \
# bash -c 'CGO_ENABLED=1 GOOS=windows GOARCH=amd64 \
# CC=x86_64-w64-mingw32-gcc CXX=x86_64-w64-mingw32-g++ \
# go build -o netbird.exe ./client'
#
# # Just validate that everything *compiles* on Windows (no CGO):
# docker run --rm -v "$PWD:/src" -w /src netbird/build-windows \
# bash -c 'GOOS=windows GOARCH=amd64 go build ./...'
#
# Tooling versions (keep in sync with go.mod and any future explicit pin
# documented in golang-test-windows.yml):
# - Ubuntu 22.04
# - Go 1.25.5 (matches go.mod)
# - mingw-w64 (Ubuntu package — pin further if drift becomes a problem)
FROM ubuntu:22.04
ARG DEBIAN_FRONTEND=noninteractive
ARG GO_VERSION=1.25.5
ENV GOPATH=/go
ENV GOTOOLCHAIN=local
ENV PATH=${GOPATH}/bin:/usr/local/go/bin:${PATH}
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
curl \
git \
build-essential \
mingw-w64 \
&& rm -rf /var/lib/apt/lists/*
# Install Go (matches go.mod).
RUN curl -fsSL "https://go.dev/dl/go${GO_VERSION}.linux-amd64.tar.gz" \
| tar -C /usr/local -xz \
&& go version
WORKDIR /src
CMD ["/bin/bash"]

View File

@@ -2,4 +2,5 @@ FROM ubuntu:24.04
RUN apt update && apt install -y ca-certificates && rm -fr /var/cache/apt
ENTRYPOINT [ "/go/bin/netbird-mgmt","management"]
CMD ["--log-file", "console"]
COPY netbird-mgmt /go/bin/netbird-mgmt
ARG TARGETPLATFORM
COPY ${TARGETPLATFORM}/netbird-mgmt /go/bin/netbird-mgmt

View File

@@ -1,5 +0,0 @@
FROM ubuntu:24.04
RUN apt update && apt install -y ca-certificates && rm -fr /var/cache/apt
ENTRYPOINT [ "/go/bin/netbird-mgmt","management","--log-level","debug"]
CMD ["--log-file", "console"]
COPY netbird-mgmt /go/bin/netbird-mgmt

View File

@@ -585,66 +585,66 @@ func (b *bufferAffectedUpdate) setTimer(d time.Duration, f func()) {
b.next.Reset(d)
}
func (c *Controller) GetValidatedPeerWithMap(ctx context.Context, isRequiresApproval bool, accountID string, peer *nbpeer.Peer) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, int64, error) {
func (c *Controller) GetValidatedPeerWithMap(ctx context.Context, isRequiresApproval bool, accountID string, peerID string) (*types.NetworkMap, []*posture.Checks, int64, error) {
if isRequiresApproval {
network, err := c.repo.GetAccountNetwork(ctx, accountID)
if err != nil {
return nil, nil, nil, 0, err
return nil, nil, 0, err
}
emptyMap := &types.NetworkMap{
Network: network.Copy(),
}
return peer, emptyMap, nil, 0, nil
return emptyMap, nil, 0, nil
}
account, err := c.requestBuffer.GetAccountWithBackpressure(ctx, accountID)
if err != nil {
return nil, nil, nil, 0, err
return nil, nil, 0, err
}
account.InjectProxyPolicies(ctx)
approvedPeersMap, err := c.integratedPeerValidator.GetValidatedPeers(ctx, account.Id, maps.Values(account.Groups), maps.Values(account.Peers), account.Settings.Extra)
if err != nil {
return nil, nil, nil, 0, err
return nil, nil, 0, err
}
startPosture := time.Now()
postureChecks, err := c.getPeerPostureChecks(account, peer.ID)
postureChecks, err := c.getPeerPostureChecks(account, peerID)
if err != nil {
return nil, nil, nil, 0, err
return nil, nil, 0, err
}
log.WithContext(ctx).Debugf("getPeerPostureChecks took %s", time.Since(startPosture))
accountZones, err := c.repo.GetAccountZones(ctx, account.Id)
if err != nil {
log.WithContext(ctx).Errorf("failed to get account zones: %v", err)
return nil, nil, nil, 0, err
return nil, nil, 0, err
}
dnsDomain := c.GetDNSDomain(account.Settings)
peersCustomZone := account.GetPeersCustomZone(ctx, dnsDomain)
proxyNetworkMaps, err := c.proxyController.GetProxyNetworkMaps(ctx, account.Id, peer.ID, account.Peers)
proxyNetworkMaps, err := c.proxyController.GetProxyNetworkMaps(ctx, account.Id, peerID, account.Peers)
if err != nil {
log.WithContext(ctx).Errorf("failed to get proxy network maps: %v", err)
return nil, nil, nil, 0, err
return nil, nil, 0, err
}
resourcePolicies := account.GetResourcePoliciesMap()
routers := account.GetResourceRoutersMap()
groupIDToUserIDs := account.GetActiveGroupUsers()
networkMap := account.GetPeerNetworkMapFromComponents(ctx, peer.ID, peersCustomZone, accountZones, approvedPeersMap, resourcePolicies, routers, c.accountManagerMetrics, groupIDToUserIDs)
networkMap := account.GetPeerNetworkMapFromComponents(ctx, peerID, peersCustomZone, accountZones, approvedPeersMap, resourcePolicies, routers, c.accountManagerMetrics, groupIDToUserIDs)
proxyNetworkMap, ok := proxyNetworkMaps[peer.ID]
proxyNetworkMap, ok := proxyNetworkMaps[peerID]
if ok {
networkMap.Merge(proxyNetworkMap)
}
dnsFwdPort := computeForwarderPort(maps.Values(account.Peers), network_map.DnsForwarderPortMinVersion)
return peer, networkMap, postureChecks, dnsFwdPort, nil
return networkMap, postureChecks, dnsFwdPort, nil
}
// GetDNSDomain returns the configured dnsDomain

View File

@@ -23,7 +23,7 @@ type Controller interface {
BufferUpdateAffectedPeers(ctx context.Context, accountID string, peerIDs []string, reason types.UpdateReason) error
UpdateAccountPeer(ctx context.Context, accountId string, peerId string) error
BufferUpdateAccountPeers(ctx context.Context, accountID string, reason types.UpdateReason) error
GetValidatedPeerWithMap(ctx context.Context, isRequiresApproval bool, accountID string, p *nbpeer.Peer) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, int64, error)
GetValidatedPeerWithMap(ctx context.Context, isRequiresApproval bool, accountID string, peerID string) (*types.NetworkMap, []*posture.Checks, int64, error)
GetDNSDomain(settings *types.Settings) string
StartWarmup(context.Context)
GetNetworkMap(ctx context.Context, peerID string) (*types.NetworkMap, error)

View File

@@ -127,21 +127,20 @@ func (mr *MockControllerMockRecorder) GetNetworkMap(ctx, peerID any) *gomock.Cal
}
// GetValidatedPeerWithMap mocks base method.
func (m *MockController) GetValidatedPeerWithMap(ctx context.Context, isRequiresApproval bool, accountID string, p *peer.Peer) (*peer.Peer, *types.NetworkMap, []*posture.Checks, int64, error) {
func (m *MockController) GetValidatedPeerWithMap(ctx context.Context, isRequiresApproval bool, accountID string, peerID string) (*types.NetworkMap, []*posture.Checks, int64, error) {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "GetValidatedPeerWithMap", ctx, isRequiresApproval, accountID, p)
ret0, _ := ret[0].(*peer.Peer)
ret1, _ := ret[1].(*types.NetworkMap)
ret2, _ := ret[2].([]*posture.Checks)
ret3, _ := ret[3].(int64)
ret4, _ := ret[4].(error)
return ret0, ret1, ret2, ret3, ret4
ret := m.ctrl.Call(m, "GetValidatedPeerWithMap", ctx, isRequiresApproval, accountID, peerID)
ret0, _ := ret[0].(*types.NetworkMap)
ret1, _ := ret[1].([]*posture.Checks)
ret2, _ := ret[2].(int64)
ret3, _ := ret[3].(error)
return ret0, ret1, ret2, ret3
}
// GetValidatedPeerWithMap indicates an expected call of GetValidatedPeerWithMap.
func (mr *MockControllerMockRecorder) GetValidatedPeerWithMap(ctx, isRequiresApproval, accountID, p any) *gomock.Call {
func (mr *MockControllerMockRecorder) GetValidatedPeerWithMap(ctx, isRequiresApproval, accountID, peerID any) *gomock.Call {
mr.mock.ctrl.T.Helper()
return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetValidatedPeerWithMap", reflect.TypeOf((*MockController)(nil).GetValidatedPeerWithMap), ctx, isRequiresApproval, accountID, p)
return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetValidatedPeerWithMap", reflect.TypeOf((*MockController)(nil).GetValidatedPeerWithMap), ctx, isRequiresApproval, accountID, peerID)
}
// OnPeerConnected mocks base method.

View File

@@ -242,7 +242,7 @@ func (m *managerImpl) CreateProxyPeer(ctx context.Context, accountID string, pee
},
}
_, _, _, err = m.accountManager.AddPeer(ctx, accountID, "", "", peer, true)
_, _, _, _, err = m.accountManager.AddPeer(ctx, accountID, "", "", peer, true)
if err != nil {
return fmt.Errorf("failed to create proxy peer: %w", err)
}

View File

@@ -778,7 +778,7 @@ func (s *Server) Login(ctx context.Context, req *proto.EncryptedMessage) (*proto
sshKey = loginReq.GetPeerKeys().GetSshPubKey()
}
peer, netMap, postureChecks, err := s.accountManager.LoginPeer(ctx, types.PeerLogin{
peer, network, postureChecks, enableSSH, err := s.accountManager.LoginPeer(ctx, types.PeerLogin{
WireGuardPubKey: peerKey.String(),
SSHKey: string(sshKey),
Meta: peerMeta,
@@ -792,7 +792,7 @@ func (s *Server) Login(ctx context.Context, req *proto.EncryptedMessage) (*proto
return nil, mapError(ctx, err)
}
loginResp, err := s.prepareLoginResponse(ctx, peer, netMap, postureChecks)
loginResp, err := s.prepareLoginResponse(ctx, peer, network, postureChecks, enableSSH)
if err != nil {
log.WithContext(ctx).Warnf("failed preparing login response for peer %s: %s", peerKey, err)
return nil, status.Errorf(codes.Internal, "failed logging in peer")
@@ -895,7 +895,7 @@ func (s *Server) ExtendAuthSession(ctx context.Context, req *proto.EncryptedMess
}, nil
}
func (s *Server) prepareLoginResponse(ctx context.Context, peer *nbpeer.Peer, netMap *types.NetworkMap, postureChecks []*posture.Checks) (*proto.LoginResponse, error) {
func (s *Server) prepareLoginResponse(ctx context.Context, peer *nbpeer.Peer, network *types.Network, postureChecks []*posture.Checks, enableSSH bool) (*proto.LoginResponse, error) {
var relayToken *Token
var err error
if s.config.Relay != nil && len(s.config.Relay.Addresses) > 0 {
@@ -914,7 +914,7 @@ func (s *Server) prepareLoginResponse(ctx context.Context, peer *nbpeer.Peer, ne
// if peer has reached this point then it has logged in
loginResp := &proto.LoginResponse{
NetbirdConfig: toNetbirdConfig(s.config, nil, relayToken, nil),
PeerConfig: toPeerConfig(peer, netMap.Network, s.networkMapController.GetDNSDomain(settings), settings, s.config.HttpConfig, s.config.DeviceAuthorizationFlow, netMap.EnableSSH),
PeerConfig: toPeerConfig(peer, network, s.networkMapController.GetDNSDomain(settings), settings, s.config.HttpConfig, s.config.DeviceAuthorizationFlow, enableSSH),
Checks: toProtocolChecks(ctx, postureChecks),
}

View File

@@ -70,7 +70,7 @@ type Manager interface {
UpdatePeerIPv6(ctx context.Context, accountID, userID, peerID string, newIPv6 netip.Addr) error
GetNetworkMap(ctx context.Context, peerID string) (*types.NetworkMap, error)
GetPeerNetwork(ctx context.Context, peerID string) (*types.Network, error)
AddPeer(ctx context.Context, accountID, setupKey, userID string, p *nbpeer.Peer, temporary bool) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error)
AddPeer(ctx context.Context, accountID, setupKey, userID string, p *nbpeer.Peer, temporary bool) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error)
CreatePAT(ctx context.Context, accountID string, initiatorUserID string, targetUserID string, tokenName string, expiresIn int) (*types.PersonalAccessTokenGenerated, error)
DeletePAT(ctx context.Context, accountID string, initiatorUserID string, targetUserID string, tokenID string) error
GetPAT(ctx context.Context, accountID string, initiatorUserID string, targetUserID string, tokenID string) (*types.PersonalAccessToken, error)
@@ -109,7 +109,7 @@ type Manager interface {
GetPeer(ctx context.Context, accountID, peerID, userID string) (*nbpeer.Peer, error)
UpdateAccountSettings(ctx context.Context, accountID, userID string, newSettings *types.Settings) (*types.Settings, error)
UpdateAccountOnboarding(ctx context.Context, accountID, userID string, newOnboarding *types.AccountOnboarding) (*types.AccountOnboarding, error)
LoginPeer(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error) // used by peer gRPC API
LoginPeer(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error) // used by peer gRPC API
ExtendPeerSession(ctx context.Context, peerPubKey, userID string) (time.Time, error) // used by peer gRPC API for ExtendAuthSession
SyncPeer(ctx context.Context, sync types.PeerSync, accountID string) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, int64, error) // used by peer gRPC API
GetExternalCacheManager() ExternalCacheManager

View File

@@ -80,14 +80,15 @@ func (mr *MockManagerMockRecorder) AccountExists(ctx, accountID interface{}) *go
}
// AddPeer mocks base method.
func (m *MockManager) AddPeer(ctx context.Context, accountID, setupKey, userID string, p *peer.Peer, temporary bool) (*peer.Peer, *types.NetworkMap, []*posture.Checks, error) {
func (m *MockManager) AddPeer(ctx context.Context, accountID, setupKey, userID string, p *peer.Peer, temporary bool) (*peer.Peer, *types.Network, []*posture.Checks, bool, error) {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "AddPeer", ctx, accountID, setupKey, userID, p, temporary)
ret0, _ := ret[0].(*peer.Peer)
ret1, _ := ret[1].(*types.NetworkMap)
ret1, _ := ret[1].(*types.Network)
ret2, _ := ret[2].([]*posture.Checks)
ret3, _ := ret[3].(error)
return ret0, ret1, ret2, ret3
ret3, _ := ret[3].(bool)
ret4, _ := ret[4].(error)
return ret0, ret1, ret2, ret3, ret4
}
// AddPeer indicates an expected call of AddPeer.
@@ -1289,14 +1290,15 @@ func (mr *MockManagerMockRecorder) ListUsers(ctx, accountID interface{}) *gomock
}
// LoginPeer mocks base method.
func (m *MockManager) LoginPeer(ctx context.Context, login types.PeerLogin) (*peer.Peer, *types.NetworkMap, []*posture.Checks, error) {
func (m *MockManager) LoginPeer(ctx context.Context, login types.PeerLogin) (*peer.Peer, *types.Network, []*posture.Checks, bool, error) {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "LoginPeer", ctx, login)
ret0, _ := ret[0].(*peer.Peer)
ret1, _ := ret[1].(*types.NetworkMap)
ret1, _ := ret[1].(*types.Network)
ret2, _ := ret[2].([]*posture.Checks)
ret3, _ := ret[3].(error)
return ret0, ret1, ret2, ret3
ret3, _ := ret[3].(bool)
ret4, _ := ret[4].(error)
return ret0, ret1, ret2, ret3, ret4
}
// LoginPeer indicates an expected call of LoginPeer.

View File

@@ -84,7 +84,7 @@ func verifyCanAddPeerToAccount(t *testing.T, manager nbAccount.Manager, account
setupKey = key.Key
}
_, _, _, err := manager.AddPeer(context.Background(), "", setupKey, userID, peer, false)
_, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey, userID, peer, false)
if err != nil {
t.Error("expected to add new peer successfully after creating new account, but failed", err)
}
@@ -1092,7 +1092,7 @@ func TestAccountManager_AddPeer(t *testing.T) {
}
expectedPeerKey := key.PublicKey().String()
peer, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: expectedPeerKey,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
}, false)
@@ -1156,7 +1156,7 @@ func TestAccountManager_AddPeerWithUserID(t *testing.T) {
expectedPeerKey := key.PublicKey().String()
expectedUserID := userID
peer, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: expectedPeerKey,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
}, false)
@@ -1504,7 +1504,7 @@ func TestAccountManager_DeletePeer(t *testing.T) {
peerKey := key.PublicKey().String()
peer, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey,
Meta: nbpeer.PeerSystemMeta{Hostname: peerKey},
}, false)
@@ -1826,7 +1826,7 @@ func TestDefaultAccountManager_UpdatePeer_PeerLoginExpiration(t *testing.T) {
key, err := wgtypes.GenerateKey()
require.NoError(t, err, "unable to generate WireGuard key")
peer, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer"},
LoginExpirationEnabled: true,
@@ -1882,7 +1882,7 @@ func TestDefaultAccountManager_MarkPeerConnected_PeerLoginExpiration(t *testing.
key, err := wgtypes.GenerateKey()
require.NoError(t, err, "unable to generate WireGuard key")
_, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer"},
LoginExpirationEnabled: true,
@@ -1927,7 +1927,7 @@ func TestDefaultAccountManager_OnPeerDisconnected_LastSeenCheck(t *testing.T) {
require.NoError(t, err, "unable to generate WireGuard key")
peerPubKey := key.PublicKey().String()
_, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: peerPubKey,
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer"},
}, false)
@@ -2017,7 +2017,7 @@ func TestDefaultAccountManager_MarkPeerConnected_ConcurrentRace(t *testing.T) {
require.NoError(t, err, "unable to generate WireGuard key")
peerPubKey := key.PublicKey().String()
_, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: peerPubKey,
Meta: nbpeer.PeerSystemMeta{Hostname: "race-peer"},
}, false)
@@ -2080,7 +2080,7 @@ func TestDefaultAccountManager_UpdateAccountSettings_PeerLoginExpiration(t *test
key, err := wgtypes.GenerateKey()
require.NoError(t, err, "unable to generate WireGuard key")
_, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer"},
LoginExpirationEnabled: true,
@@ -3276,7 +3276,7 @@ func setupNetworkMapTest(t *testing.T) (*DefaultAccountManager, *update_channel.
}
expectedPeerKey := key.PublicKey().String()
peer, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: expectedPeerKey,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
Status: &nbpeer.PeerStatus{
@@ -3444,7 +3444,7 @@ func BenchmarkLoginPeer_ExistingPeer(b *testing.B) {
b.ResetTimer()
start := time.Now()
for i := 0; i < b.N; i++ {
_, _, _, err := manager.LoginPeer(context.Background(), types.PeerLogin{
_, _, _, _, err := manager.LoginPeer(context.Background(), types.PeerLogin{
WireGuardPubKey: account.Peers["peer-1"].Key,
SSHKey: "someKey",
Meta: nbpeer.PeerSystemMeta{Hostname: strconv.Itoa(i)},
@@ -3513,7 +3513,7 @@ func BenchmarkLoginPeer_NewPeer(b *testing.B) {
b.ResetTimer()
start := time.Now()
for i := 0; i < b.N; i++ {
_, _, _, err := manager.LoginPeer(context.Background(), types.PeerLogin{
_, _, _, _, err := manager.LoginPeer(context.Background(), types.PeerLogin{
WireGuardPubKey: "some-new-key" + strconv.Itoa(i),
SSHKey: "someKey",
Meta: nbpeer.PeerSystemMeta{Hostname: strconv.Itoa(i)},
@@ -3908,13 +3908,13 @@ func TestDefaultAccountManager_UpdatePeerIP(t *testing.T) {
key2, err := wgtypes.GenerateKey()
require.NoError(t, err, "unable to generate WireGuard key")
peer1, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-1"},
}, false)
require.NoError(t, err, "unable to add peer1")
peer2, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer2, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)

View File

@@ -1663,7 +1663,7 @@ func addPeerToAccount(t *testing.T, manager *DefaultAccountManager, _, setupKeyK
key, err := wgtypes.GeneratePrivateKey()
require.NoError(t, err)
peer, _, _, err := manager.AddPeer(context.Background(), "", setupKeyKey, "", &nbpeer.Peer{
peer, _, _, _, err := manager.AddPeer(context.Background(), "", setupKeyKey, "", &nbpeer.Peer{
Key: key.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: key.PublicKey().String()},
}, false)

View File

@@ -298,11 +298,11 @@ func initTestDNSAccount(t *testing.T, am *DefaultAccountManager) (*types.Account
return nil, err
}
savedPeer1, _, _, err := am.AddPeer(context.Background(), "", "", dnsAdminUserID, peer1, false)
savedPeer1, _, _, _, err := am.AddPeer(context.Background(), "", "", dnsAdminUserID, peer1, false)
if err != nil {
return nil, err
}
_, _, _, err = am.AddPeer(context.Background(), "", "", dnsAdminUserID, peer2, false)
_, _, _, _, err = am.AddPeer(context.Background(), "", "", dnsAdminUserID, peer2, false)
if err != nil {
return nil, err
}

View File

@@ -55,7 +55,7 @@ func TestGroupIPv6Assignment(t *testing.T) {
key, err := wgtypes.GeneratePrivateKey()
require.NoError(t, err)
peer, _, _, err := am.AddPeer(ctx, "", setupKey.Key, "", &nbpeer.Peer{
peer, _, _, _, err := am.AddPeer(ctx, "", setupKey.Key, "", &nbpeer.Peer{
Key: key.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "ipv6-test-host"},
}, false)

View File

@@ -479,7 +479,7 @@ func (h *Handler) CreateTemporaryAccess(w http.ResponseWriter, r *http.Request)
return
}
peer, _, _, err := h.accountManager.AddPeer(r.Context(), userAuth.AccountId, "", userAuth.UserId, newPeer, true)
peer, _, _, _, err := h.accountManager.AddPeer(r.Context(), userAuth.AccountId, "", userAuth.UserId, newPeer, true)
if err != nil {
util.WriteError(r.Context(), err, w)
return

View File

@@ -728,7 +728,7 @@ func Test_LoginPerformance(t *testing.T) {
}
login := func() error {
_, _, _, err = am.LoginPeer(context.Background(), peerLogin)
_, _, _, _, err = am.LoginPeer(context.Background(), peerLogin)
if err != nil {
t.Logf("failed to login peer: %v", err)
return err
@@ -746,7 +746,7 @@ func Test_LoginPerformance(t *testing.T) {
go func(peerLogin types.PeerLogin, counterStart *int32) {
defer wgPeer.Done()
_, _, _, err = am.LoginPeer(context.Background(), peerLogin)
_, _, _, _, err = am.LoginPeer(context.Background(), peerLogin)
if err != nil {
t.Logf("failed to login peer: %v", err)
return

View File

@@ -45,7 +45,7 @@ type MockAccountManager struct {
DeletePeerFunc func(ctx context.Context, accountID, peerKey, userID string) error
GetNetworkMapFunc func(ctx context.Context, peerKey string) (*types.NetworkMap, error)
GetPeerNetworkFunc func(ctx context.Context, peerKey string) (*types.Network, error)
AddPeerFunc func(ctx context.Context, accountID string, setupKey string, userId string, peer *nbpeer.Peer, temporary bool) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error)
AddPeerFunc func(ctx context.Context, accountID string, setupKey string, userId string, peer *nbpeer.Peer, temporary bool) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error)
GetGroupFunc func(ctx context.Context, accountID, groupID, userID string) (*types.Group, error)
GetAllGroupsFunc func(ctx context.Context, accountID, userID string) ([]*types.Group, error)
GetGroupByNameFunc func(ctx context.Context, groupName, accountID, userID string) (*types.Group, error)
@@ -98,7 +98,7 @@ type MockAccountManager struct {
SaveDNSSettingsFunc func(ctx context.Context, accountID, userID string, dnsSettingsToSave *types.DNSSettings) error
GetPeerFunc func(ctx context.Context, accountID, peerID, userID string) (*nbpeer.Peer, error)
UpdateAccountSettingsFunc func(ctx context.Context, accountID, userID string, newSettings *types.Settings) (*types.Settings, error)
LoginPeerFunc func(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error)
LoginPeerFunc func(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error)
ExtendPeerSessionFunc func(ctx context.Context, peerPubKey, userID string) (time.Time, error)
SyncPeerFunc func(ctx context.Context, sync types.PeerSync, accountID string) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, int64, error)
InviteUserFunc func(ctx context.Context, accountID string, initiatorUserID string, targetUserEmail string) error
@@ -424,11 +424,11 @@ func (am *MockAccountManager) AddPeer(
userId string,
peer *nbpeer.Peer,
temporary bool,
) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error) {
) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error) {
if am.AddPeerFunc != nil {
return am.AddPeerFunc(ctx, accountID, setupKey, userId, peer, temporary)
}
return nil, nil, nil, status.Errorf(codes.Unimplemented, "method AddPeer is not implemented")
return nil, nil, nil, false, status.Errorf(codes.Unimplemented, "method AddPeer is not implemented")
}
// GetGroupByName mock implementation of GetGroupByName from server.AccountManager interface
@@ -862,11 +862,11 @@ func (am *MockAccountManager) UpdateAccountSettings(ctx context.Context, account
}
// LoginPeer mocks LoginPeer of the AccountManager interface
func (am *MockAccountManager) LoginPeer(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error) {
func (am *MockAccountManager) LoginPeer(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error) {
if am.LoginPeerFunc != nil {
return am.LoginPeerFunc(ctx, login)
}
return nil, nil, nil, status.Errorf(codes.Unimplemented, "method LoginPeer is not implemented")
return nil, nil, nil, false, status.Errorf(codes.Unimplemented, "method LoginPeer is not implemented")
}
// ExtendPeerSession mocks ExtendPeerSession of the AccountManager interface

View File

@@ -896,11 +896,11 @@ func initTestNSAccount(t *testing.T, am *DefaultAccountManager) (*types.Account,
return nil, err
}
_, _, _, err = am.AddPeer(context.Background(), "", "", userID, peer1, false)
_, _, _, _, err = am.AddPeer(context.Background(), "", "", userID, peer1, false)
if err != nil {
return nil, err
}
_, _, _, err = am.AddPeer(context.Background(), "", "", userID, peer2, false)
_, _, _, _, err = am.AddPeer(context.Background(), "", "", userID, peer2, false)
if err != nil {
return nil, err
}

View File

@@ -718,10 +718,10 @@ func (am *DefaultAccountManager) handleSetupKeyAddedPeer(ctx context.Context, en
// to it. We also add the User ID to the peer metadata to identify registrant. If no userID provided, then fail with status.PermissionDenied
// Each new Peer will be assigned a new next net.IP from the Account.Network and Account.Network.LastIP will be updated (IP's are not reused).
// The peer property is just a placeholder for the Peer properties to pass further
func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKey, userID string, peer *nbpeer.Peer, temporary bool) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error) {
func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKey, userID string, peer *nbpeer.Peer, temporary bool) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error) {
if setupKey == "" && userID == "" && !peer.ProxyMeta.Embedded {
// no auth method provided => reject access
return nil, nil, nil, status.Errorf(status.Unauthenticated, "no peer auth method provided, please use a setup key or interactive SSO login")
return nil, nil, nil, false, status.Errorf(status.Unauthenticated, "no peer auth method provided, please use a setup key or interactive SSO login")
}
upperKey := strings.ToUpper(setupKey)
@@ -737,7 +737,7 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
// The connecting peer should be able to recover with a retry.
_, err := am.Store.GetPeerByPeerPubKey(ctx, store.LockingStrengthNone, peer.Key)
if err == nil {
return nil, nil, nil, status.Errorf(status.PreconditionFailed, "peer has been already registered")
return nil, nil, nil, false, status.Errorf(status.PreconditionFailed, "peer has been already registered")
}
opEvent := &activity.Event{
@@ -748,7 +748,7 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
peerAddConfig, err := am.processPeerAddAuth(ctx, accountID, userID, encodedHashedKey, peer, temporary, addedByUser, addedBySetupKey, opEvent)
if err != nil {
return nil, nil, nil, err
return nil, nil, nil, false, err
}
accountID = peerAddConfig.AccountID
ephemeral := peerAddConfig.Ephemeral
@@ -763,7 +763,7 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
}
if err := domain.ValidateDomainsList(peer.ExtraDNSLabels); err != nil {
return nil, nil, nil, status.Errorf(status.InvalidArgument, "invalid extra DNS labels: %v", err)
return nil, nil, nil, false, status.Errorf(status.InvalidArgument, "invalid extra DNS labels: %v", err)
}
registrationTime := time.Now().UTC()
@@ -789,7 +789,7 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
}
settings, err := am.Store.GetAccountSettings(ctx, store.LockingStrengthNone, accountID)
if err != nil {
return nil, nil, nil, fmt.Errorf("failed to get account settings: %w", err)
return nil, nil, nil, false, fmt.Errorf("failed to get account settings: %w", err)
}
if am.geo != nil && newPeer.Location.ConnectionIP != nil {
@@ -807,30 +807,30 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
network, err := am.Store.GetAccountNetwork(ctx, store.LockingStrengthNone, accountID)
if err != nil {
return nil, nil, nil, fmt.Errorf("failed getting network: %w", err)
return nil, nil, nil, false, fmt.Errorf("failed getting network: %w", err)
}
maxAttempts := 10
for attempt := 1; attempt <= maxAttempts; attempt++ {
netPrefix, err := netip.ParsePrefix(network.Net.String())
if err != nil {
return nil, nil, nil, fmt.Errorf("parse network prefix: %w", err)
return nil, nil, nil, false, fmt.Errorf("parse network prefix: %w", err)
}
freeIP, err := types.AllocateRandomPeerIP(netPrefix)
if err != nil {
return nil, nil, nil, fmt.Errorf("failed to get free IP: %w", err)
return nil, nil, nil, false, fmt.Errorf("failed to get free IP: %w", err)
}
var freeLabel string
if ephemeral || attempt > 1 {
freeLabel, err = getPeerIPDNSLabel(freeIP, peer.Meta.Hostname)
if err != nil {
return nil, nil, nil, fmt.Errorf("failed to get free DNS label: %w", err)
return nil, nil, nil, false, fmt.Errorf("failed to get free DNS label: %w", err)
}
} else {
freeLabel, err = nbdns.GetParsedDomainLabel(peer.Meta.Hostname)
if err != nil {
return nil, nil, nil, fmt.Errorf("failed to get free DNS label: %w", err)
return nil, nil, nil, false, fmt.Errorf("failed to get free DNS label: %w", err)
}
}
newPeer.DNSLabel = freeLabel
@@ -852,11 +852,11 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
if allocate {
v6Prefix, err := netip.ParsePrefix(network.NetV6.String())
if err != nil {
return nil, nil, nil, fmt.Errorf("parse IPv6 prefix: %w", err)
return nil, nil, nil, false, fmt.Errorf("parse IPv6 prefix: %w", err)
}
freeIPv6, err := types.AllocateRandomPeerIPv6(v6Prefix)
if err != nil {
return nil, nil, nil, fmt.Errorf("allocate peer IPv6: %w", err)
return nil, nil, nil, false, fmt.Errorf("allocate peer IPv6: %w", err)
}
newPeer.IPv6 = freeIPv6
}
@@ -929,10 +929,10 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
continue
}
return nil, nil, nil, fmt.Errorf("failed to add peer to database: %w", err)
return nil, nil, nil, false, fmt.Errorf("failed to add peer to database: %w", err)
}
if newPeer == nil {
return nil, nil, nil, fmt.Errorf("new peer is nil")
return nil, nil, nil, false, fmt.Errorf("new peer is nil")
}
opEvent.TargetID = newPeer.ID
@@ -940,7 +940,8 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
if !addedByUser {
opEvent.Meta["setup_key_name"] = peerAddConfig.SetupKeyName
}
if newPeer.Status != nil && newPeer.Status.RequiresApproval {
requiresApproval := newPeer.Status != nil && newPeer.Status.RequiresApproval
if requiresApproval {
opEvent.Meta["pending_approval"] = true
}
@@ -948,18 +949,18 @@ func (am *DefaultAccountManager) AddPeer(ctx context.Context, accountID, setupKe
am.StoreEvent(ctx, opEvent.InitiatorID, opEvent.TargetID, opEvent.AccountID, opEvent.Activity, opEvent.Meta)
}
p, nmap, pc, _, err := am.networkMapController.GetValidatedPeerWithMap(ctx, false, accountID, newPeer)
network, postureChecks, enableSSH, err := getPeerLoginInfo(ctx, am.Store, accountID, newPeer, !requiresApproval)
if err != nil {
return p, nmap, pc, err
return nil, nil, nil, false, err
}
changedPeerIDs := []string{newPeer.ID}
affectedPeerIDs := affectedPeerIDsFromNetworkMap(nmap, newPeer.ID)
affectedPeerIDs := am.resolveAffectedPeersForPeerChanges(ctx, am.Store, accountID, changedPeerIDs)
if err := am.networkMapController.OnPeersAdded(ctx, accountID, changedPeerIDs, affectedPeerIDs); err != nil {
log.WithContext(ctx).Errorf("failed to update network map cache for peer %s: %v", newPeer.ID, err)
}
return p, nmap, pc, nil
return newPeer, network, postureChecks, enableSSH, nil
}
func getPeerIPDNSLabel(ip netip.Addr, peerHostName string) (string, error) {
@@ -1041,7 +1042,7 @@ func (am *DefaultAccountManager) SyncPeer(ctx context.Context, sync types.PeerSy
return nil, nil, nil, 0, err
}
resPeer, nmap, resPostureChecks, dnsFwdPort, err := am.networkMapController.GetValidatedPeerWithMap(ctx, peerNotValid, accountID, peer)
nmap, resPostureChecks, dnsFwdPort, err := am.networkMapController.GetValidatedPeerWithMap(ctx, peerNotValid, accountID, peer.ID)
if err != nil {
return nil, nil, nil, 0, err
}
@@ -1054,7 +1055,7 @@ func (am *DefaultAccountManager) SyncPeer(ctx context.Context, sync types.PeerSy
}
}
return resPeer, nmap, resPostureChecks, dnsFwdPort, nil
return peer, nmap, resPostureChecks, dnsFwdPort, nil
}
// syncPeerAffectedPeers resolves the peers affected by a SyncPeer change. The
@@ -1085,7 +1086,7 @@ func (am *DefaultAccountManager) markConnectedAffectedPeers(ctx context.Context,
return affectedPeerIDsFromNetworkMap(nmap, peerID)
}
func (am *DefaultAccountManager) handlePeerLoginNotFound(ctx context.Context, login types.PeerLogin, err error) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error) {
func (am *DefaultAccountManager) handlePeerLoginNotFound(ctx context.Context, login types.PeerLogin, err error) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error) {
if errStatus, ok := status.FromError(err); ok && errStatus.Type() == status.NotFound {
// we couldn't find this peer by its public key which can mean that peer hasn't been registered yet.
// Try registering it.
@@ -1101,12 +1102,12 @@ func (am *DefaultAccountManager) handlePeerLoginNotFound(ctx context.Context, lo
}
log.WithContext(ctx).Errorf("failed while logging in peer %s: %v", login.WireGuardPubKey, err)
return nil, nil, nil, status.Errorf(status.Internal, "failed while logging in peer")
return nil, nil, nil, false, status.Errorf(status.Internal, "failed while logging in peer")
}
// LoginPeer logs in or registers a peer.
// If peer doesn't exist the function checks whether a setup key or a user is present and registers a new peer if so.
func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.NetworkMap, []*posture.Checks, error) {
func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.PeerLogin) (*nbpeer.Peer, *types.Network, []*posture.Checks, bool, error) {
accountID, err := am.Store.GetAccountIDByPeerPubKey(ctx, login.WireGuardPubKey)
if err != nil {
return am.handlePeerLoginNotFound(ctx, login, err)
@@ -1118,20 +1119,17 @@ func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.Peer
if login.UserID == "" {
err = am.checkIFPeerNeedsLoginWithoutLock(ctx, accountID, login)
if err != nil {
return nil, nil, nil, err
return nil, nil, nil, false, err
}
}
var peer *nbpeer.Peer
var updateRemotePeers bool
var isPeerUpdated bool
var ipv6CapabilityChanged bool
var postureChecks []*posture.Checks
var shouldStorePeer bool
var peerGroupIDs []string
settings, err := am.Store.GetAccountSettings(ctx, store.LockingStrengthNone, accountID)
if err != nil {
return nil, nil, nil, err
return nil, nil, nil, false, err
}
err = am.Store.ExecuteInTransaction(ctx, func(transaction store.Store) error {
@@ -1140,9 +1138,6 @@ func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.Peer
return err
}
// this flag prevents unnecessary calls to the persistent store.
shouldStorePeer := false
if login.UserID != "" {
if peer.UserID != login.UserID {
log.Warnf("user mismatch when logging in peer %s: peer user %s, login user %s ", peer.ID, peer.UserID, login.UserID)
@@ -1156,7 +1151,6 @@ func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.Peer
if changed {
shouldStorePeer = true
updateRemotePeers = true
}
}
@@ -1165,23 +1159,9 @@ func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.Peer
return err
}
oldHasIPv6Cap := peer.HasCapability(nbpeer.PeerCapabilityIPv6Overlay)
isPeerUpdated, _ = peer.UpdateMetaIfNew(login.Meta)
ipv6CapabilityChanged = oldHasIPv6Cap != peer.HasCapability(nbpeer.PeerCapabilityIPv6Overlay)
if isPeerUpdated {
am.metrics.AccountManagerMetrics().CountPeerMetUpdate()
shouldStorePeer = true
postureChecks, err = getPeerPostureChecks(ctx, transaction, accountID, peer.ID)
if err != nil {
return err
}
}
if peer.SSHKey != login.SSHKey {
peer.SSHKey = login.SSHKey
shouldStorePeer = true
updateRemotePeers = true
}
if !peer.AllowExtraDNSLabels && len(login.ExtraDNSLabels) > 0 {
@@ -1197,28 +1177,28 @@ func (am *DefaultAccountManager) LoginPeer(ctx context.Context, login types.Peer
return nil
})
if err != nil {
return nil, nil, nil, err
return nil, nil, nil, false, err
}
isRequiresApproval, isStatusChanged, err := am.integratedPeerValidator.IsNotValidPeer(ctx, accountID, peer, peerGroupIDs, settings.Extra)
if err != nil {
return nil, nil, nil, err
return nil, nil, nil, false, err
}
p, nmap, pc, _, err := am.networkMapController.GetValidatedPeerWithMap(ctx, isRequiresApproval, accountID, peer)
network, postureChecks, enableSSH, err := getPeerLoginInfo(ctx, am.Store, accountID, peer, !isRequiresApproval)
if err != nil {
return nil, nil, nil, err
return nil, nil, nil, false, err
}
if updateRemotePeers || isStatusChanged || ipv6CapabilityChanged || (isPeerUpdated && len(postureChecks) > 0) {
if isStatusChanged || shouldStorePeer {
changedPeerIDs := []string{peer.ID}
affectedPeerIDs := am.syncPeerAffectedPeers(ctx, accountID, peer.ID, nmap, isRequiresApproval, isPeerUpdated, len(postureChecks) > 0)
affectedPeerIDs := am.resolveAffectedPeersForPeerChanges(ctx, am.Store, accountID, changedPeerIDs)
if err = am.networkMapController.OnPeersUpdated(ctx, accountID, changedPeerIDs, affectedPeerIDs); err != nil {
return nil, nil, nil, fmt.Errorf("notify network map controller of peer update: %w", err)
return nil, nil, nil, false, fmt.Errorf("notify network map controller of peer update: %w", err)
}
}
return p, nmap, pc, nil
return peer, network, postureChecks, enableSSH, nil
}
// ExtendPeerSession refreshes the peer's SSO session deadline by updating
@@ -1294,6 +1274,50 @@ func (am *DefaultAccountManager) ExtendPeerSession(ctx context.Context, peerPubK
return refreshed.SessionExpiresAt(settings.PeerLoginExpirationEnabled, settings.PeerLoginExpiration), nil
}
// getPeerLoginInfo computes the login/register response data (network, posture
// checks, SSH) from the store without building the peer's full network map.
func getPeerLoginInfo(ctx context.Context, transaction store.Store, accountID string, peer *nbpeer.Peer, isValid bool) (*types.Network, []*posture.Checks, bool, error) {
network, err := transaction.GetAccountNetwork(ctx, store.LockingStrengthNone, accountID)
if err != nil {
return nil, nil, false, fmt.Errorf("get account network: %w", err)
}
if !isValid {
return network, nil, false, nil
}
postureChecks, err := getPeerPostureChecks(ctx, transaction, accountID, peer.ID)
if err != nil {
return nil, nil, false, err
}
enableSSH, err := isPeerSSHEnabled(ctx, transaction, accountID, peer)
if err != nil {
return nil, nil, false, err
}
return network, postureChecks, enableSSH, nil
}
func isPeerSSHEnabled(ctx context.Context, transaction store.Store, accountID string, peer *nbpeer.Peer) (bool, error) {
policies, err := transaction.GetAccountPolicies(ctx, store.LockingStrengthNone, accountID)
if err != nil {
return false, err
}
peerGroups, err := transaction.GetPeerGroups(ctx, store.LockingStrengthNone, accountID, peer.ID)
if err != nil {
return false, err
}
peerGroupIDs := make(map[string]struct{}, len(peerGroups))
for _, g := range peerGroups {
peerGroupIDs[g.ID] = struct{}{}
}
return types.PeerSSHEnabledFromPolicies(policies, peer.ID, peerGroupIDs, peer.SSHEnabled), nil
}
// getPeerPostureChecks returns the posture checks for the peer.
func getPeerPostureChecks(ctx context.Context, transaction store.Store, accountID, peerID string) ([]*posture.Checks, error) {
policies, err := transaction.GetAccountPolicies(ctx, store.LockingStrengthNone, accountID)

View File

@@ -205,7 +205,7 @@ func testGetNetworkMapGeneral(t *testing.T) {
return
}
peer1, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-1"},
}, false)
@@ -219,7 +219,7 @@ func testGetNetworkMapGeneral(t *testing.T) {
t.Fatal(err)
return
}
_, _, _, err = manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)
@@ -278,7 +278,7 @@ func TestAccountManager_GetNetworkMapWithPolicy(t *testing.T) {
return
}
peer1, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-1"},
}, false)
@@ -292,7 +292,7 @@ func TestAccountManager_GetNetworkMapWithPolicy(t *testing.T) {
t.Fatal(err)
return
}
peer2, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer2, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)
@@ -454,7 +454,7 @@ func TestAccountManager_GetPeerNetwork(t *testing.T) {
return
}
peer1, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-1"},
}, false)
@@ -468,7 +468,7 @@ func TestAccountManager_GetPeerNetwork(t *testing.T) {
t.Fatal(err)
return
}
_, _, _, err = manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)
@@ -526,7 +526,7 @@ func TestDefaultAccountManager_GetPeer(t *testing.T) {
return
}
peer1, _, _, err := manager.AddPeer(context.Background(), "", "", someUser, &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", "", someUser, &nbpeer.Peer{
Key: peerKey1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)
@@ -542,7 +542,7 @@ func TestDefaultAccountManager_GetPeer(t *testing.T) {
}
// the second peer added with a setup key
peer2, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
peer2, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", &nbpeer.Peer{
Key: peerKey2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)
@@ -698,7 +698,7 @@ func TestDefaultAccountManager_GetPeers(t *testing.T) {
return
}
_, _, _, err = manager.AddPeer(context.Background(), "", "", someUser, &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", someUser, &nbpeer.Peer{
Key: peerKey1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-1"},
}, false)
@@ -707,7 +707,7 @@ func TestDefaultAccountManager_GetPeers(t *testing.T) {
return
}
_, _, _, err = manager.AddPeer(context.Background(), "", "", adminUser, &nbpeer.Peer{
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", adminUser, &nbpeer.Peer{
Key: peerKey2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "test-peer-2"},
}, false)
@@ -1332,7 +1332,7 @@ func Test_RegisterPeerByUser(t *testing.T) {
},
}
addedPeer, _, _, err := am.AddPeer(context.Background(), "", "", existingUserID, newPeer, false)
addedPeer, _, _, _, err := am.AddPeer(context.Background(), "", "", existingUserID, newPeer, false)
require.NoError(t, err)
assert.Equal(t, newPeer.ExtraDNSLabels, addedPeer.ExtraDNSLabels)
@@ -1465,7 +1465,7 @@ func Test_RegisterPeerBySetupKey(t *testing.T) {
ExtraDNSLabels: newPeerTemplate.ExtraDNSLabels,
}
addedPeer, _, _, err := am.AddPeer(context.Background(), "", tc.existingSetupKeyID, "", currentPeer, false)
addedPeer, _, _, _, err := am.AddPeer(context.Background(), "", tc.existingSetupKeyID, "", currentPeer, false)
if tc.expectAddPeerError {
require.Error(t, err, "Expected an error when adding peer with setup key: %s", tc.existingSetupKeyID)
@@ -1577,7 +1577,7 @@ func Test_RegisterPeerRollbackOnFailure(t *testing.T) {
SSHEnabled: false,
}
_, _, _, err = am.AddPeer(context.Background(), "", faultyKey, "", newPeer, false)
_, _, _, _, err = am.AddPeer(context.Background(), "", faultyKey, "", newPeer, false)
require.Error(t, err)
_, err = s.GetPeerByPeerPubKey(context.Background(), store.LockingStrengthNone, newPeer.Key)
@@ -1723,7 +1723,7 @@ func Test_LoginPeer(t *testing.T) {
if sk.AllowExtraDNSLabels {
currentPeer.ExtraDNSLabels = newPeerTemplate.ExtraDNSLabels
}
_, _, _, err = am.AddPeer(context.Background(), "", tc.setupKey, "", currentPeer, false)
_, _, _, _, err = am.AddPeer(context.Background(), "", tc.setupKey, "", currentPeer, false)
require.NoError(t, err, "Expected no error when adding peer with setup key: %s", tc.setupKey)
loginInput := types.PeerLogin{
@@ -1739,12 +1739,12 @@ func Test_LoginPeer(t *testing.T) {
loginInput.ExtraDNSLabels = tc.extraDNSLabels
}
loggedinPeer, networkMap, postureChecks, loginErr := am.LoginPeer(context.Background(), loginInput)
loggedinPeer, network, postureChecks, _, loginErr := am.LoginPeer(context.Background(), loginInput)
if tc.expectLoginError {
require.Error(t, loginErr, "Expected an error during LoginPeer with setup key: %s", tc.setupKey)
assert.Contains(t, loginErr.Error(), tc.expectedErrorMsgSubstring, "Error message mismatch")
assert.Nil(t, loggedinPeer, "LoggedinPeer should be nil on error")
assert.Nil(t, networkMap, "NetworkMap should be nil on error")
assert.Nil(t, network, "Network should be nil on error")
assert.Nil(t, postureChecks, "PostureChecks should be empty or nil on error")
return
}
@@ -1757,7 +1757,7 @@ func Test_LoginPeer(t *testing.T) {
} else {
assert.Equal(t, currentPeer.ExtraDNSLabels, loggedinPeer.ExtraDNSLabels, "ExtraDNSLabels mismatch on loggedinPeer")
}
assert.NotNil(t, networkMap, "networkMap should not be nil on success")
assert.NotNil(t, network, "network should not be nil on success")
assert.Equal(t, existingAccountID, loggedinPeer.AccountID, "AccountID mismatch for logged peer")
@@ -1863,7 +1863,7 @@ func TestPeerAccountPeersUpdate(t *testing.T) {
require.NoError(t, err)
expectedPeerKey := key.PublicKey().String()
peer4, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser1", &nbpeer.Peer{
peer4, _, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser1", &nbpeer.Peer{
Key: expectedPeerKey,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
}, false)
@@ -1986,7 +1986,7 @@ func TestPeerAccountPeersUpdate(t *testing.T) {
require.NoError(t, err)
expectedPeerKey := key.PublicKey().String()
peer4, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser1", &nbpeer.Peer{
peer4, _, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser1", &nbpeer.Peer{
Key: expectedPeerKey,
LoginExpirationEnabled: true,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
@@ -2053,7 +2053,7 @@ func TestPeerAccountPeersUpdate(t *testing.T) {
require.NoError(t, err)
expectedPeerKey := key.PublicKey().String()
peer5, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser2", &nbpeer.Peer{
peer5, _, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser2", &nbpeer.Peer{
Key: expectedPeerKey,
LoginExpirationEnabled: true,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
@@ -2108,7 +2108,7 @@ func TestPeerAccountPeersUpdate(t *testing.T) {
require.NoError(t, err)
expectedPeerKey := key.PublicKey().String()
peer6, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser3", &nbpeer.Peer{
peer6, _, _, _, err = manager.AddPeer(context.Background(), "", "", "regularUser3", &nbpeer.Peer{
Key: expectedPeerKey,
LoginExpirationEnabled: true,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
@@ -2286,7 +2286,7 @@ func Test_AddPeer(t *testing.T) {
<-start
_, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", newPeer, false)
_, _, _, _, err := manager.AddPeer(context.Background(), "", setupKey.Key, "", newPeer, false)
if err != nil {
errs <- fmt.Errorf("AddPeer failed for peer %d: %w", i, err)
return
@@ -2366,7 +2366,7 @@ func TestAddPeer_UserPendingApprovalBlocked(t *testing.T) {
},
}
_, _, _, err = manager.AddPeer(context.Background(), "", "", pendingUser.Id, peer, false)
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", pendingUser.Id, peer, false)
require.Error(t, err)
assert.Contains(t, err.Error(), "user pending approval cannot add peers")
}
@@ -2401,7 +2401,7 @@ func TestAddPeer_ApprovedUserCanAddPeers(t *testing.T) {
},
}
_, _, _, err = manager.AddPeer(context.Background(), "", "", regularUser.Id, peer, false)
_, _, _, _, err = manager.AddPeer(context.Background(), "", "", regularUser.Id, peer, false)
require.NoError(t, err, "Regular user should be able to add peers")
}
@@ -2444,7 +2444,7 @@ func TestLoginPeer_UserPendingApprovalBlocked(t *testing.T) {
WtVersion: "0.28.0",
},
}
existingPeer, _, _, err := manager.AddPeer(context.Background(), "", "", pendingUser.Id, newPeer, false)
existingPeer, _, _, _, err := manager.AddPeer(context.Background(), "", "", pendingUser.Id, newPeer, false)
require.NoError(t, err)
// Now set the user back to pending approval after peer was created
@@ -2463,7 +2463,7 @@ func TestLoginPeer_UserPendingApprovalBlocked(t *testing.T) {
},
}
_, _, _, err = manager.LoginPeer(context.Background(), login)
_, _, _, _, err = manager.LoginPeer(context.Background(), login)
require.Error(t, err)
e, ok := status.FromError(err)
require.True(t, ok, "error is not a gRPC status error")
@@ -2500,7 +2500,7 @@ func TestLoginPeer_ApprovedUserCanLogin(t *testing.T) {
WtVersion: "0.28.0",
},
}
existingPeer, _, _, err := manager.AddPeer(context.Background(), "", "", regularUser.Id, newPeer, false)
existingPeer, _, _, _, err := manager.AddPeer(context.Background(), "", "", regularUser.Id, newPeer, false)
require.NoError(t, err)
// Try to login with regular user
@@ -2513,7 +2513,7 @@ func TestLoginPeer_ApprovedUserCanLogin(t *testing.T) {
},
}
_, _, _, err = manager.LoginPeer(context.Background(), login)
_, _, _, _, err = manager.LoginPeer(context.Background(), login)
require.NoError(t, err, "Regular user should be able to login peers")
}
@@ -2837,7 +2837,7 @@ func TestUpdatePeer_DnsLabelCollisionWithFQDN(t *testing.T) {
// Add first peer with hostname that produces DNS label "netbird1"
key1, err := wgtypes.GenerateKey()
require.NoError(t, err)
peer1, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "netbird1.netbird.cloud"},
}, false)
@@ -2847,7 +2847,7 @@ func TestUpdatePeer_DnsLabelCollisionWithFQDN(t *testing.T) {
// Add second peer with a different hostname
key2, err := wgtypes.GenerateKey()
require.NoError(t, err)
peer2, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer2, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "ip-10-29-5-130"},
}, false)
@@ -2871,7 +2871,7 @@ func TestUpdatePeer_DnsLabelUniqueName(t *testing.T) {
key1, err := wgtypes.GenerateKey()
require.NoError(t, err)
peer1, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer1, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key1.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "web-server"},
}, false)
@@ -2881,7 +2881,7 @@ func TestUpdatePeer_DnsLabelUniqueName(t *testing.T) {
// Add second peer and rename it to a unique FQDN whose first label doesn't collide
key2, err := wgtypes.GenerateKey()
require.NoError(t, err)
peer2, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
peer2, _, _, _, err := manager.AddPeer(context.Background(), "", "", userID, &nbpeer.Peer{
Key: key2.PublicKey().String(),
Meta: nbpeer.PeerSystemMeta{Hostname: "old-name"},
}, false)

View File

@@ -6,7 +6,6 @@ import (
b64 "encoding/base64"
"encoding/binary"
"fmt"
"math/rand"
"net"
"net/netip"
"os"
@@ -92,7 +91,7 @@ func runLargeTest(t *testing.T, store Store) {
account.SetupKeys[setupKey.Key] = setupKey
const numPerAccount = 6000
for n := 0; n < numPerAccount; n++ {
netIP := randomIPv4()
netIP := sequentialIPv4(n)
peerID := fmt.Sprintf("%s-peer-%d", account.Id, n)
addr, _ := netip.AddrFromSlice(netIP)
@@ -216,12 +215,12 @@ func runLargeTest(t *testing.T, store Store) {
}
}
func randomIPv4() net.IP {
rand.New(rand.NewSource(time.Now().UnixNano()))
// sequentialIPv4 returns a unique IPv4 address for the given index, avoiding
// the random collisions that would otherwise violate the unique (account_id, ip)
// index when generating a large number of peers.
func sequentialIPv4(n int) net.IP {
b := make([]byte, 4)
for i := range b {
b[i] = byte(rand.Intn(256))
}
binary.BigEndian.PutUint32(b, 0x0A000000+uint32(n))
return net.IP(b)
}

View File

@@ -1156,6 +1156,47 @@ func policyRuleImpliesLegacySSH(rule *PolicyRule) bool {
return rule.Protocol == PolicyRuleProtocolALL || (rule.Protocol == PolicyRuleProtocolTCP && (portsIncludesSSH(rule.Ports) || portRangeIncludesSSH(rule.PortRanges)))
}
// PeerSSHEnabledFromPolicies is the network-map-free equivalent of the sshEnabled
// determination in GetPeerConnectionResources / CalculateNetworkMapFromComponents.
func PeerSSHEnabledFromPolicies(policies []*Policy, peerID string, peerGroupIDs map[string]struct{}, peerSSHEnabled bool) bool {
for _, policy := range policies {
if !policy.Enabled {
continue
}
for _, rule := range policy.Rules {
if !rule.Enabled {
continue
}
isSSHRule := rule.Protocol == PolicyRuleProtocolNetbirdSSH ||
(policyRuleImpliesLegacySSH(rule) && peerSSHEnabled)
if !isSSHRule {
continue
}
if ruleHasDestination(rule, peerID, peerGroupIDs) {
return true
}
}
}
return false
}
func ruleHasDestination(rule *PolicyRule, peerID string, peerGroupIDs map[string]struct{}) bool {
if rule.DestinationResource.Type == ResourceTypePeer && rule.DestinationResource.ID != "" {
return rule.DestinationResource.ID == peerID
}
for _, groupID := range rule.Destinations {
if _, ok := peerGroupIDs[groupID]; ok {
return true
}
}
return false
}
func portRangeIncludesSSH(portRanges []RulePortRange) bool {
for _, pr := range portRanges {
if (pr.Start <= defaultSSHPortNumber && pr.End >= defaultSSHPortNumber) || (pr.Start <= nativeSSHPortNumber && pr.End >= nativeSSHPortNumber) {

View File

@@ -1233,3 +1233,97 @@ func TestComponents_DisabledRuleInEnabledPolicy(t *testing.T) {
assert.True(t, has3000, "enabled rule should generate firewall rule for port 3000")
assert.False(t, has3001, "disabled rule should NOT generate firewall rule for port 3001")
}
func peerGroupIDSet(account *types.Account, peerID string) map[string]struct{} {
return account.GetPeerGroups(peerID)
}
func assertSSHEquivalence(t *testing.T, account *types.Account, peerID string, validatedPeers map[string]struct{}) {
t.Helper()
nm := componentsNetworkMap(account, peerID, validatedPeers)
require.NotNil(t, nm)
got := types.PeerSSHEnabledFromPolicies(account.Policies, peerID, peerGroupIDSet(account, peerID), account.Peers[peerID].SSHEnabled)
assert.Equalf(t, nm.EnableSSH, got, "PeerSSHEnabledFromPolicies mismatch for %s", peerID)
}
func TestPeerSSHEnabledFromPolicies_MatchesMap_NetbirdSSHProtocol(t *testing.T) {
account, validatedPeers := scalableTestAccount(20, 2)
account.Groups["ssh-users"] = &types.Group{ID: "ssh-users", Name: "SSH Users", Peers: []string{}}
account.Policies = append(account.Policies, &types.Policy{
ID: "policy-ssh", Name: "SSH Access", Enabled: true, AccountID: "test-account",
Rules: []*types.PolicyRule{{
ID: "rule-ssh", Name: "Allow SSH", Enabled: true,
Action: types.PolicyTrafficActionAccept, Protocol: types.PolicyRuleProtocolNetbirdSSH,
Bidirectional: false,
Sources: []string{"group-0"}, Destinations: []string{"group-1"},
AuthorizedGroups: map[string][]string{"ssh-users": {"root"}},
}},
})
assertSSHEquivalence(t, account, "peer-10", validatedPeers)
assertSSHEquivalence(t, account, "peer-0", validatedPeers)
}
func TestPeerSSHEnabledFromPolicies_MatchesMap_NoSSHPolicy(t *testing.T) {
account, validatedPeers := scalableTestAccount(20, 2)
assertSSHEquivalence(t, account, "peer-0", validatedPeers)
}
func TestPeerSSHEnabledFromPolicies_MatchesMap_LegacyImpliedSSH(t *testing.T) {
account, validatedPeers := scalableTestAccount(20, 2)
account.Peers["peer-10"].SSHEnabled = true
assertSSHEquivalence(t, account, "peer-10", validatedPeers)
assertSSHEquivalence(t, account, "peer-11", validatedPeers)
}
func TestPeerSSHEnabledFromPolicies_MatchesMap_PeerAsDestinationResource(t *testing.T) {
account, validatedPeers := scalableTestAccountWithoutDefaultPolicy(20, 2)
account.Policies = append(account.Policies, &types.Policy{
ID: "policy-ssh-res", Name: "SSH to peer", Enabled: true, AccountID: "test-account",
Rules: []*types.PolicyRule{{
ID: "rule-ssh-res", Name: "SSH to peer-5", Enabled: true,
Action: types.PolicyTrafficActionAccept, Protocol: types.PolicyRuleProtocolNetbirdSSH,
Sources: []string{"group-0"},
DestinationResource: types.Resource{ID: "peer-5", Type: types.ResourceTypePeer},
}},
})
assertSSHEquivalence(t, account, "peer-5", validatedPeers)
assertSSHEquivalence(t, account, "peer-6", validatedPeers)
}
func TestPeerSSHEnabledFromPolicies_MatchesMap_DisabledSSHPolicy(t *testing.T) {
account, validatedPeers := scalableTestAccountWithoutDefaultPolicy(20, 2)
account.Policies = append(account.Policies, &types.Policy{
ID: "policy-ssh-off", Name: "SSH disabled", Enabled: false, AccountID: "test-account",
Rules: []*types.PolicyRule{{
ID: "rule-ssh-off", Name: "Allow SSH", Enabled: true,
Action: types.PolicyTrafficActionAccept, Protocol: types.PolicyRuleProtocolNetbirdSSH,
Sources: []string{"group-0"}, Destinations: []string{"group-1"},
}},
})
assertSSHEquivalence(t, account, "peer-10", validatedPeers)
}
func TestPeerSSHEnabledFromPolicies_MatchesMap_Sweep(t *testing.T) {
account, validatedPeers := scalableTestAccount(60, 6)
account.Policies = append(account.Policies, &types.Policy{
ID: "policy-ssh-sweep", Name: "SSH sweep", Enabled: true, AccountID: "test-account",
Rules: []*types.PolicyRule{{
ID: "rule-ssh-sweep", Name: "Allow SSH", Enabled: true,
Action: types.PolicyTrafficActionAccept, Protocol: types.PolicyRuleProtocolNetbirdSSH,
Sources: []string{"group-0"}, Destinations: []string{"group-2"},
}},
})
for peerID := range account.Peers {
account.Peers[peerID].SSHEnabled = len(peerID)%2 == 0
}
for peerID := range account.Peers {
if _, ok := validatedPeers[peerID]; !ok {
continue
}
assertSSHEquivalence(t, account, peerID, validatedPeers)
}
}

View File

@@ -1565,7 +1565,7 @@ func TestUserAccountPeersUpdate(t *testing.T) {
require.NoError(t, err)
expectedPeerKey := key.PublicKey().String()
peer4, _, _, err := manager.AddPeer(context.Background(), "", "", "regularUser2", &nbpeer.Peer{
peer4, _, _, _, err := manager.AddPeer(context.Background(), "", "", "regularUser2", &nbpeer.Peer{
Key: expectedPeerKey,
Meta: nbpeer.PeerSystemMeta{Hostname: expectedPeerKey},
}, false)

View File

@@ -7,7 +7,8 @@ RUN echo "netbird:x:1000:1000:netbird:/var/lib/netbird:/sbin/nologin" > /tmp/pas
mkdir -p /tmp/certs
FROM gcr.io/distroless/base:debug
COPY netbird-proxy /go/bin/netbird-proxy
ARG TARGETPLATFORM
COPY ${TARGETPLATFORM}/netbird-proxy /go/bin/netbird-proxy
COPY --from=builder /tmp/passwd /etc/passwd
COPY --from=builder /tmp/group /etc/group
COPY --from=builder --chown=1000:1000 /tmp/var/lib/netbird /var/lib/netbird

View File

@@ -1,4 +1,5 @@
FROM gcr.io/distroless/base:debug
ENTRYPOINT [ "/go/bin/netbird-relay" ]
ENV NB_LOG_FILE=console
COPY netbird-relay /go/bin/netbird-relay
ARG TARGETPLATFORM
COPY ${TARGETPLATFORM}/netbird-relay /go/bin/netbird-relay

View File

@@ -6,6 +6,7 @@ import (
"time"
log "github.com/sirupsen/logrus"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
)
@@ -119,8 +120,8 @@ func NewMetrics(ctx context.Context, meter metric.Meter) (*Metrics, error) {
}
// PeerConnected increments the number of connected peers and increments number of idle connections
func (m *Metrics) PeerConnected(id string) {
m.peers.Add(m.ctx, 1)
func (m *Metrics) PeerConnected(id, transport string) {
m.peers.Add(m.ctx, 1, metric.WithAttributes(attribute.String("transport", transport)))
m.mutexActivity.Lock()
defer m.mutexActivity.Unlock()
@@ -138,8 +139,8 @@ func (m *Metrics) RecordPeerStoreTime(duration time.Duration) {
}
// PeerDisconnected decrements the number of connected peers and decrements number of idle or active connections
func (m *Metrics) PeerDisconnected(id string) {
m.peers.Add(m.ctx, -1)
func (m *Metrics) PeerDisconnected(id, transport string) {
m.peers.Add(m.ctx, -1, metric.WithAttributes(attribute.String("transport", transport)))
m.mutexActivity.Lock()
defer m.mutexActivity.Unlock()

View File

@@ -11,4 +11,6 @@ type Conn interface {
Write(ctx context.Context, b []byte) (n int, err error)
RemoteAddr() net.Addr
Close() error
// Protocol returns the transport name.
Protocol() string
}

View File

@@ -42,6 +42,11 @@ func (c *Conn) RemoteAddr() net.Addr {
return c.session.RemoteAddr()
}
// Protocol returns the transport name for this connection.
func (c *Conn) Protocol() string {
return "quic"
}
func (c *Conn) Close() error {
c.closedMu.Lock()
if c.closed {

View File

@@ -64,6 +64,11 @@ func (c *Conn) RemoteAddr() net.Addr {
return c.rAddr
}
// Protocol returns the transport name for this connection.
func (c *Conn) Protocol() string {
return "ws"
}
func (c *Conn) Close() error {
c.closedMu.Lock()
c.closed = true

View File

@@ -154,15 +154,16 @@ func (r *Relay) Accept(conn listener.Conn) {
}
r.notifier.PeerCameOnline(peer.ID())
transport := conn.Protocol()
r.metrics.RecordPeerStoreTime(time.Since(storeTime))
r.metrics.PeerConnected(peer.String())
r.metrics.PeerConnected(peer.String(), transport)
go func() {
peer.Work()
if deleted := r.store.DeletePeer(peer); deleted {
r.notifier.PeerWentOffline(peer.ID())
}
peer.log.Debugf("relay connection closed")
r.metrics.PeerDisconnected(peer.String())
r.metrics.PeerDisconnected(peer.String(), transport)
}()
if err := h.handshakeResponse(hsCtx); err != nil {

View File

@@ -145,6 +145,11 @@ func (cc *connContainer) close() {
}
}
// transportConn is implemented by relay connections that know their transport.
type transportConn interface {
Protocol() string
}
// Client is a client for the relay server. It is responsible for establishing a connection to the relay server and
// managing connections to other peers. All exported functions are safe to call concurrently. After close the connection,
// the client can be reused by calling Connect again. When the client is closed, all connections are closed too.
@@ -182,6 +187,18 @@ type Client struct {
// datagramFallbackTriggered guards a single fallback per connection so a
// burst of oversized datagrams triggers one reconnect, not many.
datagramFallbackTriggered atomic.Bool
// transport is the negotiated relay transport of the
// current connection, guarded by mu.
transport string
}
// Transport returns the negotiated relay transport of the current connection,
// or an empty string when not connected.
func (c *Client) Transport() string {
c.mu.Lock()
defer c.mu.Unlock()
return c.transport
}
// SetTransportFallback wires the shared datagram-transport fallback tracker.
@@ -402,6 +419,9 @@ func (c *Client) connect(ctx context.Context) (*RelayAddr, error) {
}
c.relayConn = conn
c.datagramFallbackTriggered.Store(false)
if tc, ok := conn.(transportConn); ok {
c.transport = tc.Protocol()
}
instanceURL, err := c.handShake(ctx)
if err != nil {
@@ -792,6 +812,7 @@ func (c *Client) close(gracefullyExit bool) error {
return nil
}
c.serviceIsRunning = false
c.transport = ""
c.muInstanceURL.Lock()
c.instanceURL = nil

View File

@@ -57,6 +57,11 @@ func (c *Conn) Write(b []byte) (int, error) {
return len(b), nil
}
// Protocol returns the transport name for this connection.
func (c *Conn) Protocol() string {
return Network
}
func (c *Conn) RemoteAddr() net.Addr {
return c.session.RemoteAddr()
}

View File

@@ -59,14 +59,12 @@ func (d Dialer) Dial(ctx context.Context, address, serverName string) (net.Conn,
udpConn, err := nbnet.ListenUDP("udp", &net.UDPAddr{Port: 0})
if err != nil {
log.Errorf("failed to listen on UDP: %s", err)
return nil, err
return nil, fmt.Errorf("listen udp: %w", err)
}
udpAddr, err := net.ResolveUDPAddr("udp", quicURL)
if err != nil {
log.Errorf("failed to resolve UDP address: %s", err)
return nil, err
return nil, fmt.Errorf("resolve %s: %w", quicURL, err)
}
session, err := quic.Dial(ctx, udpConn, udpAddr, tlsClientConfig, quicConfig)
@@ -74,7 +72,7 @@ func (d Dialer) Dial(ctx context.Context, address, serverName string) (net.Conn,
if errors.Is(err, context.Canceled) {
return nil, err
}
log.Errorf("failed to dial to Relay server via QUIC '%s': %s", quicURL, err)
log.Debugf("failed to dial to Relay server via QUIC '%s': %s", quicURL, err)
return nil, err
}

View File

@@ -3,6 +3,7 @@ package dialer
import (
"context"
"errors"
"fmt"
"net"
"time"
@@ -71,6 +72,7 @@ func (r *RaceDial) Dial(ctx context.Context) (net.Conn, error) {
connChan := make(chan dialResult, len(r.dialerFns))
winnerConn := make(chan net.Conn, 1)
errChan := make(chan error, 1)
abortCtx, abort := context.WithCancel(ctx)
defer abort()
@@ -78,11 +80,11 @@ func (r *RaceDial) Dial(ctx context.Context) (net.Conn, error) {
go r.dial(dfn, abortCtx, connChan)
}
go r.processResults(connChan, winnerConn, abort)
go r.processResults(connChan, winnerConn, errChan, abort)
conn, ok := <-winnerConn
if !ok {
return nil, errors.New("failed to dial to Relay server on any protocol")
return nil, <-errChan
}
return conn, nil
}
@@ -90,6 +92,7 @@ func (r *RaceDial) Dial(ctx context.Context) (net.Conn, error) {
// dialSequential tries each dialer in order, returning the first connection and
// falling back to the next on failure.
func (r *RaceDial) dialSequential(ctx context.Context) (net.Conn, error) {
var errs []error
for _, dfn := range r.dialerFns {
if err := ctx.Err(); err != nil {
return nil, err
@@ -103,12 +106,13 @@ func (r *RaceDial) dialSequential(ctx context.Context) (net.Conn, error) {
return nil, err
}
r.log.Errorf("failed to dial via %s: %s", dfn.Protocol(), err)
errs = append(errs, fmt.Errorf("%s: %w", dfn.Protocol(), err))
continue
}
r.log.Infof("successfully dialed via: %s", dfn.Protocol())
return conn, nil
}
return nil, errors.New("failed to dial to Relay server on any protocol")
return nil, dialErr(errs)
}
func (r *RaceDial) dial(dfn DialeFn, abortCtx context.Context, connChan chan dialResult) {
@@ -120,8 +124,9 @@ func (r *RaceDial) dial(dfn DialeFn, abortCtx context.Context, connChan chan dia
connChan <- dialResult{Conn: conn, Protocol: dfn.Protocol(), Err: err}
}
func (r *RaceDial) processResults(connChan chan dialResult, winnerConn chan net.Conn, abort context.CancelFunc) {
func (r *RaceDial) processResults(connChan chan dialResult, winnerConn chan net.Conn, errChan chan error, abort context.CancelFunc) {
var hasWinner bool
errsByProtocol := make(map[string]error)
for i := 0; i < len(r.dialerFns); i++ {
dr := <-connChan
if dr.Err != nil {
@@ -129,6 +134,7 @@ func (r *RaceDial) processResults(connChan chan dialResult, winnerConn chan net.
r.log.Infof("connection attempt aborted via: %s", dr.Protocol)
} else {
r.log.Errorf("failed to dial via %s: %s", dr.Protocol, dr.Err)
errsByProtocol[dr.Protocol] = fmt.Errorf("%s: %w", dr.Protocol, dr.Err)
}
continue
}
@@ -146,5 +152,29 @@ func (r *RaceDial) processResults(connChan chan dialResult, winnerConn chan net.
hasWinner = true
winnerConn <- dr.Conn
}
if !hasWinner {
errChan <- dialErr(r.orderedErrs(errsByProtocol))
}
close(winnerConn)
}
// orderedErrs returns the per-protocol errors in dialer order, so the combined
// error is stable regardless of which attempt failed first.
func (r *RaceDial) orderedErrs(byProtocol map[string]error) []error {
errs := make([]error, 0, len(byProtocol))
for _, dfn := range r.dialerFns {
if err, ok := byProtocol[dfn.Protocol()]; ok {
errs = append(errs, err)
}
}
return errs
}
// dialErr combines per-dialer failures, preserving the underlying reasons
// (e.g. "connection refused") rather than a generic message.
func dialErr(errs []error) error {
if len(errs) == 0 {
return errors.New("no relay transport available")
}
return errors.Join(errs...)
}

View File

@@ -33,6 +33,11 @@ func NewConn(wsConn *websocket.Conn, serverAddress string, underlying net.Conn)
}
}
// Protocol returns the transport name for this connection.
func (c *Conn) Protocol() string {
return Network
}
func (c *Conn) Read(b []byte) (n int, err error) {
t, ioReader, err := c.Conn.Reader(c.ctx)
if err != nil {

View File

@@ -22,7 +22,7 @@ type Dialer struct {
}
func (d Dialer) Protocol() string {
return "WS"
return Network
}
func (d Dialer) Dial(ctx context.Context, address, serverName string) (net.Conn, error) {
@@ -39,7 +39,12 @@ func (d Dialer) Dial(ctx context.Context, address, serverName string) (net.Conn,
if errors.Is(err, context.Canceled) {
return nil, err
}
log.Errorf("failed to dial to Relay server '%s': %s", wsURL, err)
// websocket.Dial wraps the cause in verbose layers; surface the
// underlying network error when present.
var opErr *net.OpError
if errors.As(err, &opErr) {
return nil, opErr
}
return nil, err
}
if resp.Body != nil {

View File

@@ -41,14 +41,14 @@ func TestGetDialers(t *testing.T) {
preferWS bool
want []string
}{
{name: "auto races quic and ws", mode: "auto", mtu: iface.DefaultMTU, want: []string{"quic", "WS"}},
{name: "ws pinned", mode: "ws", mtu: iface.DefaultMTU, want: []string{"WS"}},
{name: "auto races quic and ws", mode: "auto", mtu: iface.DefaultMTU, want: []string{"quic", "ws"}},
{name: "ws pinned", mode: "ws", mtu: iface.DefaultMTU, want: []string{"ws"}},
{name: "quic pinned", mode: "quic", mtu: iface.DefaultMTU, want: []string{"quic"}},
{name: "prefer-quic orders quic first", mode: "prefer-quic", mtu: iface.DefaultMTU, want: []string{"quic", "WS"}},
{name: "prefer-ws orders ws first", mode: "prefer-ws", mtu: iface.DefaultMTU, want: []string{"WS", "quic"}},
{name: "mtu above default forces ws", mode: "auto", mtu: iface.DefaultMTU + 100, want: []string{"WS"}},
{name: "sticky fallback forces ws in auto", mode: "auto", mtu: iface.DefaultMTU, preferWS: true, want: []string{"WS"}},
{name: "sticky fallback forces ws in prefer-quic", mode: "prefer-quic", mtu: iface.DefaultMTU, preferWS: true, want: []string{"WS"}},
{name: "prefer-quic orders quic first", mode: "prefer-quic", mtu: iface.DefaultMTU, want: []string{"quic", "ws"}},
{name: "prefer-ws orders ws first", mode: "prefer-ws", mtu: iface.DefaultMTU, want: []string{"ws", "quic"}},
{name: "mtu above default forces ws", mode: "auto", mtu: iface.DefaultMTU + 100, want: []string{"ws"}},
{name: "sticky fallback forces ws in auto", mode: "auto", mtu: iface.DefaultMTU, preferWS: true, want: []string{"ws"}},
{name: "sticky fallback forces ws in prefer-quic", mode: "prefer-quic", mtu: iface.DefaultMTU, preferWS: true, want: []string{"ws"}},
{name: "quic pin overrides sticky fallback", mode: "quic", mtu: iface.DefaultMTU, preferWS: true, want: []string{"quic"}},
}
@@ -91,11 +91,11 @@ func TestStickyFallbackAfterDatagramTooLarge(t *testing.T) {
}
// First dial races both transports.
assert.Equal(t, []string{"quic", "WS"}, protocols(c.getDialers(transportModeFromEnv())))
assert.Equal(t, []string{"quic", "ws"}, protocols(c.getDialers(transportModeFromEnv())))
// An oversized datagram records the fallback for this server.
c.onDatagramTooLarge(&closeTrackingConn{}, netErr.ErrDatagramTooLarge)
// The reconnect now sticks to WebSocket.
assert.Equal(t, []string{"WS"}, protocols(c.getDialers(transportModeFromEnv())))
assert.Equal(t, []string{"ws"}, protocols(c.getDialers(transportModeFromEnv())))
}

View File

@@ -2,6 +2,7 @@ package client
import (
"context"
"sync/atomic"
"time"
"github.com/cenkalti/backoff/v4"
@@ -20,6 +21,10 @@ type Guard struct {
// maxBackoffInterval caps the exponential backoff between reconnect
// attempts.
maxBackoffInterval time.Duration
// lastErr is the error from the most recent failed reconnect attempt,
// surfaced as the home relay status while disconnected.
lastErr atomic.Pointer[error]
}
// NewGuard creates a new guard for the relay client. A non-positive
@@ -37,6 +42,15 @@ func NewGuard(sp *ServerPicker, maxBackoffInterval time.Duration) *Guard {
return g
}
// LastError returns the error from the most recent failed reconnect attempt, or
// nil if reconnection last succeeded.
func (g *Guard) LastError() error {
if p := g.lastErr.Load(); p != nil {
return *p
}
return nil
}
// StartReconnectTrys is called when the relay client is disconnected from the relay server.
// It attempts to reconnect to the relay server. The function first tries a quick reconnect
// to the same server that was used before, if the server URL is still valid. If the quick
@@ -63,6 +77,7 @@ func (g *Guard) StartReconnectTrys(ctx context.Context, relayClient *Client) {
case <-ticker.C:
if err := g.retry(ctx); err != nil {
log.Errorf("failed to pick new Relay server: %s", err)
g.setLastError(err)
continue
}
return
@@ -72,6 +87,10 @@ func (g *Guard) StartReconnectTrys(ctx context.Context, relayClient *Client) {
}
}
func (g *Guard) setLastError(err error) {
g.lastErr.Store(&err)
}
func (g *Guard) tryToQuickReconnect(parentCtx context.Context, rc *Client) bool {
if rc == nil {
return false
@@ -89,6 +108,7 @@ func (g *Guard) tryToQuickReconnect(parentCtx context.Context, rc *Client) bool
if err := rc.Connect(parentCtx); err != nil {
log.Errorf("failed to reconnect to relay server: %s", err)
g.setLastError(err)
return false
}
return true
@@ -100,6 +120,7 @@ func (g *Guard) retry(ctx context.Context) error {
if err != nil {
return err
}
g.setLastError(nil)
// prevent to work with a deprecated Relay client instance
g.drainRelayClientChan()
@@ -125,6 +146,7 @@ func (g *Guard) isServerURLStillValid(rc *Client) bool {
}
func (g *Guard) notifyReconnected() {
g.setLastError(nil)
select {
case g.OnReconnected <- struct{}{}:
default:

View File

@@ -43,6 +43,17 @@ type OnServerCloseListener func()
// ManagerOption configures a Manager at construction time.
type ManagerOption func(*Manager)
// RelayConnState is the connection state of a single relay server.
type RelayConnState struct {
// URL is the server's instance address when connected, otherwise the
// configured server URL.
URL string
// Transport is the negotiated transport, empty if not connected.
Transport string
// Err is set when the relay is not connected.
Err error
}
// WithMaxBackoffInterval caps the exponential backoff between reconnect
// attempts to the home relay. A non-positive value keeps the default.
func WithMaxBackoffInterval(d time.Duration) ManagerOption {
@@ -130,6 +141,9 @@ func (m *Manager) Serve() error {
client, err := m.serverPicker.PickServer(m.ctx)
if err != nil {
// record the initial failure so status shows the real reason before
// the guard's first retry tick
m.reconnectGuard.setLastError(err)
go m.reconnectGuard.StartReconnectTrys(m.ctx, nil)
} else {
m.storeClient(client)
@@ -242,6 +256,56 @@ func (m *Manager) ServerURLs() []string {
return m.serverPicker.ServerURLs.Load().([]string)
}
// RelayConnectError returns the error from the most recent failed home relay
// reconnect attempt, or nil if the relay last connected successfully.
func (m *Manager) RelayConnectError() error {
return m.reconnectGuard.LastError()
}
// RelayStates returns the connection state of the home relay and every foreign
// relay the manager currently tracks.
func (m *Manager) RelayStates() []RelayConnState {
var states []RelayConnState
m.relayClientMu.RLock()
home := m.relayClient
m.relayClientMu.RUnlock()
if home != nil {
st := relayConnState(home)
// The home relay reconnects through the guard, so the real failure
// reason lives there rather than on the (stale) client.
if st.Err != nil {
if gErr := m.reconnectGuard.LastError(); gErr != nil {
st.Err = gErr
}
}
states = append(states, st)
}
// Snapshot the tracks, then query each outside the map lock: a track can be
// held by an in-progress Connect, and blocking on it must not stall other
// relay operations.
m.relayClientsMutex.RLock()
tracks := make([]*RelayTrack, 0, len(m.relayClients))
for _, rt := range m.relayClients {
tracks = append(tracks, rt)
}
m.relayClientsMutex.RUnlock()
// Only connected foreign relays carry state; a failed connect is evicted
// immediately (openConnVia), so there is no error state to surface.
for _, rt := range tracks {
rt.RLock()
rc := rt.relayClient
rt.RUnlock()
if rc != nil {
states = append(states, relayConnState(rc))
}
}
return states
}
// HasRelayAddress returns true if the manager is serving. With this method can check if the peer can communicate with
// Relay service.
func (m *Manager) HasRelayAddress() bool {
@@ -460,3 +524,11 @@ func (m *Manager) notifyOnDisconnectListeners(serverAddress string) {
}
delete(m.onDisconnectedListeners, serverAddress)
}
func relayConnState(c *Client) RelayConnState {
addr, err := c.ServerInstanceURL()
if err != nil {
return RelayConnState{URL: c.connectionURL, Err: err}
}
return RelayConnState{URL: addr, Transport: c.Transport()}
}

View File

@@ -40,6 +40,7 @@ func (sp *ServerPicker) PickServer(parentCtx context.Context) (*Client, error) {
connResultChan := make(chan connResult, totalServers)
successChan := make(chan connResult, 1)
errChan := make(chan error, 1)
concurrentLimiter := make(chan struct{}, maxConcurrentServers)
log.Debugf("pick server from list: %v", sp.ServerURLs.Load().([]string))
@@ -54,17 +55,17 @@ func (sp *ServerPicker) PickServer(parentCtx context.Context) (*Client, error) {
}(url)
}
go sp.processConnResults(connResultChan, successChan)
go sp.processConnResults(connResultChan, successChan, errChan)
select {
case cr, ok := <-successChan:
if !ok {
return nil, errors.New("failed to connect to any relay server: all attempts failed")
return nil, <-errChan
}
log.Infof("chosen home Relay server: %s", cr.Url)
return cr.RelayClient, nil
case <-ctx.Done():
return nil, fmt.Errorf("failed to connect to any relay server: %w", ctx.Err())
return nil, fmt.Errorf("connect to relay server: %w", ctx.Err())
}
}
@@ -80,12 +81,14 @@ func (sp *ServerPicker) startConnection(ctx context.Context, resultChan chan con
}
}
func (sp *ServerPicker) processConnResults(resultChan chan connResult, successChan chan connResult) {
func (sp *ServerPicker) processConnResults(resultChan chan connResult, successChan chan connResult, errChan chan error) {
var hasSuccess bool
var errs []error
for numOfResults := 0; numOfResults < cap(resultChan); numOfResults++ {
cr := <-resultChan
if cr.Err != nil {
log.Tracef("failed to connect to Relay server: %s: %v", cr.Url, cr.Err)
errs = append(errs, cr.Err)
continue
}
log.Infof("connected to Relay server: %s", cr.Url)
@@ -101,5 +104,16 @@ func (sp *ServerPicker) processConnResults(resultChan chan connResult, successCh
hasSuccess = true
successChan <- cr
}
if !hasSuccess {
errChan <- pickErr(errs)
}
close(successChan)
}
// pickErr combines per-server connection failures into a single error.
func pickErr(errs []error) error {
if len(errs) == 0 {
return errors.New("no relay server available")
}
return errors.Join(errs...)
}

View File

@@ -2,9 +2,11 @@ package client
import (
"context"
"errors"
"fmt"
"io"
"sync"
"sync/atomic"
"time"
"github.com/cenkalti/backoff/v4"
@@ -23,7 +25,23 @@ import (
"github.com/netbirdio/netbird/util/wsproxy"
)
const healthCheckTimeout = 5 * time.Second
const (
// receiveInactivityThreshold is how long the receive stream may be silent
// before the watchdog actively probes it. The gRPC transport can stay
// healthy (keepalive satisfied) while the server stops delivering messages,
// which the transport layer cannot detect.
receiveInactivityThreshold = 30 * time.Second
// receiveProbeTimeout is how long the watchdog waits for its self-addressed
// probe to round-trip back on the stream before declaring the receive
// direction dead.
receiveProbeTimeout = 10 * time.Second
// receiveWatchdogInterval is how often the watchdog evaluates the stream.
receiveWatchdogInterval = 10 * time.Second
)
// errReceiveStreamStalled is reported when the receive stream is transport-alive
// but no longer delivering messages, so the stream is torn down to reconnect.
var errReceiveStreamStalled = errors.New("signal receive stream stalled")
// ConnStateNotifier is a wrapper interface of the status recorder
type ConnStateNotifier interface {
@@ -52,6 +70,14 @@ type GrpcClient struct {
decryptionWorker *Worker
decryptionWorkerCancel context.CancelFunc
decryptionWg sync.WaitGroup
// lastReceived holds the Unix-nano timestamp of the last message read from
// the receive stream, used by the receive watchdog.
lastReceived atomic.Int64
// receiveStalled is set by the receive watchdog when the stream is
// transport-alive but no longer delivering messages. It is the source of
// truth IsHealthy reads, and is cleared once any frame is received again.
receiveStalled atomic.Bool
}
// NewClient creates a new Signal client
@@ -148,9 +174,9 @@ func (c *GrpcClient) Receive(ctx context.Context, msgHandler func(msg *proto.Mes
// connect to Signal stream identifying ourselves with a public WireGuard key
// todo once the key rotation logic has been implemented, consider changing to some other identifier (received from management)
ctx, cancelStream := context.WithCancel(ctx)
streamCtx, cancelStream := context.WithCancel(ctx)
defer cancelStream()
stream, err := c.connect(ctx, c.key.PublicKey().String())
stream, err := c.connect(streamCtx, c.key.PublicKey().String())
if err != nil {
log.Warnf("disconnected from the Signal Exchange due to an error: %v", err)
return err
@@ -164,9 +190,16 @@ func (c *GrpcClient) Receive(ctx context.Context, msgHandler func(msg *proto.Mes
// Start worker pool if not already started
c.startEncryptionWorker(msgHandler)
// Guard the receive direction: the transport can stay healthy while the
// server stops delivering messages. The watchdog reconnects via cancelStream.
c.markReceived()
go c.watchReceiveStream(streamCtx, cancelStream)
// start receiving messages from the Signal stream (from other peers through signal)
err = c.receive(stream)
if err != nil {
// Check the parent context, not streamCtx: a watchdog-triggered
// cancelStream must reconnect, only a parent cancel is shutdown.
if ctx.Err() != nil {
log.Debugf("signal connection context has been canceled, this usually indicates shutdown")
return nil
@@ -252,7 +285,10 @@ func (c *GrpcClient) Ready() bool {
return c.signalConn.GetState() == connectivity.Ready || c.signalConn.GetState() == connectivity.Idle
}
// IsHealthy probes the gRPC connection and returns false on errors
// IsHealthy reports whether the Signal connection is usable, based on the
// transport state plus the receive watchdog's verdict, and updates the status
// recorder accordingly. It does not actively probe: the watchdog
// (watchReceiveStream) owns probing the receive path and reconnecting.
func (c *GrpcClient) IsHealthy() bool {
switch c.signalConn.GetState() {
case connectivity.TransientFailure:
@@ -265,16 +301,8 @@ func (c *GrpcClient) IsHealthy() bool {
case connectivity.Ready:
}
ctx, cancel := context.WithTimeout(c.ctx, healthCheckTimeout)
defer cancel()
_, err := c.realClient.Send(ctx, &proto.EncryptedMessage{
Key: c.key.PublicKey().String(),
RemoteKey: "dummy",
Body: nil,
})
if err != nil {
c.notifyDisconnected(err)
log.Warnf("health check returned: %s", err)
if c.receiveStalled.Load() {
c.notifyDisconnected(errReceiveStreamStalled)
return false
}
c.notifyConnected()
@@ -398,6 +426,68 @@ func (c *GrpcClient) Send(msg *proto.Message) error {
return err
}
// markReceived records that a frame was just read from the receive stream and
// clears the stalled flag.
func (c *GrpcClient) markReceived() {
c.lastReceived.Store(time.Now().UnixNano())
c.receiveStalled.Store(false)
}
// idleSinceReceive returns how long the receive stream has been silent.
func (c *GrpcClient) idleSinceReceive() time.Duration {
return time.Since(time.Unix(0, c.lastReceived.Load()))
}
// watchReceiveStream guards against a receive stream that is transport-alive but
// no longer delivering messages. While the stream is idle past
// receiveInactivityThreshold it sends a self-addressed probe that the Signal
// server routes back to this client. If the probe does not round-trip within
// receiveProbeTimeout the receive direction is considered dead and cancelStream
// is called so the retry loop reconnects.
func (c *GrpcClient) watchReceiveStream(ctx context.Context, cancelStream context.CancelFunc) {
ticker := time.NewTicker(receiveWatchdogInterval)
defer ticker.Stop()
var probeSentAt time.Time
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
if c.idleSinceReceive() < receiveInactivityThreshold {
probeSentAt = time.Time{}
continue
}
if !probeSentAt.IsZero() && time.Since(probeSentAt) >= receiveProbeTimeout {
log.Warnf("signal receive stream stalled: no messages for %s and probe did not return, reconnecting", c.idleSinceReceive().Round(time.Second))
c.receiveStalled.Store(true)
c.notifyDisconnected(errReceiveStreamStalled)
cancelStream()
return
}
if probeSentAt.IsZero() {
if err := c.sendReceiveProbe(); err != nil {
log.Debugf("failed to send signal receive probe: %v", err)
}
probeSentAt = time.Now()
}
}
}
}
// sendReceiveProbe sends a self-addressed heartbeat. The Signal server routes it
// back to this client, exercising the exact receive path the watchdog guards.
func (c *GrpcClient) sendReceiveProbe() error {
self := c.key.PublicKey().String()
return c.Send(&proto.Message{
Key: self,
RemoteKey: self,
Body: &proto.Body{Type: proto.Body_HEARTBEAT},
})
}
// receive receives messages from other peers coming through the Signal Exchange
// and distributes them to worker threads for processing
func (c *GrpcClient) receive(stream proto.SignalExchange_ConnectStreamClient) error {
@@ -419,6 +509,9 @@ func (c *GrpcClient) receive(stream proto.SignalExchange_ConnectStreamClient) er
return err
}
// Any frame from the server proves the receive direction is alive.
c.markReceived()
if msg == nil {
continue
}

View File

@@ -0,0 +1,84 @@
package client
import (
"context"
"net"
"testing"
"time"
"github.com/stretchr/testify/require"
"go.opentelemetry.io/otel"
"golang.zx2c4.com/wireguard/wgctrl/wgtypes"
"google.golang.org/grpc"
sigProto "github.com/netbirdio/netbird/shared/signal/proto"
"github.com/netbirdio/netbird/signal/server"
)
func startTestSignalServer(t *testing.T) string {
t.Helper()
lis, err := net.Listen("tcp", "127.0.0.1:0")
require.NoError(t, err)
s := grpc.NewServer()
srv, err := server.NewServer(context.Background(), otel.Meter(""))
require.NoError(t, err)
sigProto.RegisterSignalExchangeServer(s, srv)
go func() {
_ = s.Serve(lis)
}()
t.Cleanup(s.Stop)
return lis.Addr().String()
}
// TestReceiveProbeRoundTrips verifies that the watchdog's self-addressed heartbeat
// is routed back to the same client through the signal server. This round-trip is
// what lets the watchdog confirm the receive direction is still delivering.
func TestReceiveProbeRoundTrips(t *testing.T) {
addr := startTestSignalServer(t)
key, err := wgtypes.GenerateKey()
require.NoError(t, err)
ctx, cancel := context.WithCancel(context.Background())
t.Cleanup(cancel)
client, err := NewClient(ctx, addr, key, false)
require.NoError(t, err)
t.Cleanup(func() { _ = client.Close() })
received := make(chan struct{}, 1)
go func() {
_ = client.Receive(ctx, func(msg *sigProto.Message) error {
if msg.GetBody().GetType() == sigProto.Body_HEARTBEAT && msg.GetKey() == key.PublicKey().String() {
select {
case received <- struct{}{}:
default:
}
}
return nil
})
}()
streamReady := make(chan struct{})
go func() {
client.WaitStreamConnected()
close(streamReady)
}()
select {
case <-streamReady:
case <-time.After(5 * time.Second):
t.Fatal("signal stream did not connect within timeout")
}
require.NoError(t, client.sendReceiveProbe())
select {
case <-received:
case <-time.After(3 * time.Second):
t.Fatal("self-addressed heartbeat did not round-trip back through the signal server")
}
}

View File

@@ -30,6 +30,7 @@ const (
Body_CANDIDATE Body_Type = 2
Body_MODE Body_Type = 4
Body_GO_IDLE Body_Type = 5
Body_HEARTBEAT Body_Type = 6
)
// Enum value maps for Body_Type.
@@ -40,6 +41,7 @@ var (
2: "CANDIDATE",
4: "MODE",
5: "GO_IDLE",
6: "HEARTBEAT",
}
Body_Type_value = map[string]int32{
"OFFER": 0,
@@ -47,6 +49,7 @@ var (
"CANDIDATE": 2,
"MODE": 4,
"GO_IDLE": 5,
"HEARTBEAT": 6,
}
)
@@ -463,7 +466,7 @@ var file_signalexchange_proto_rawDesc = []byte{
0x52, 0x09, 0x72, 0x65, 0x6d, 0x6f, 0x74, 0x65, 0x4b, 0x65, 0x79, 0x12, 0x28, 0x0a, 0x04, 0x62,
0x6f, 0x64, 0x79, 0x18, 0x04, 0x20, 0x01, 0x28, 0x0b, 0x32, 0x14, 0x2e, 0x73, 0x69, 0x67, 0x6e,
0x61, 0x6c, 0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x42, 0x6f, 0x64, 0x79, 0x52,
0x04, 0x62, 0x6f, 0x64, 0x79, 0x22, 0xc3, 0x04, 0x0a, 0x04, 0x42, 0x6f, 0x64, 0x79, 0x12, 0x2d,
0x04, 0x62, 0x6f, 0x64, 0x79, 0x22, 0xd2, 0x04, 0x0a, 0x04, 0x42, 0x6f, 0x64, 0x79, 0x12, 0x2d,
0x0a, 0x04, 0x74, 0x79, 0x70, 0x65, 0x18, 0x01, 0x20, 0x01, 0x28, 0x0e, 0x32, 0x19, 0x2e, 0x73,
0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x42, 0x6f,
0x64, 0x79, 0x2e, 0x54, 0x79, 0x70, 0x65, 0x52, 0x04, 0x74, 0x79, 0x70, 0x65, 0x12, 0x18, 0x0a,
@@ -491,38 +494,39 @@ var file_signalexchange_proto_rawDesc = []byte{
0x52, 0x09, 0x73, 0x65, 0x73, 0x73, 0x69, 0x6f, 0x6e, 0x49, 0x64, 0x88, 0x01, 0x01, 0x12, 0x29,
0x0a, 0x0d, 0x72, 0x65, 0x6c, 0x61, 0x79, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x49, 0x50, 0x18,
0x0b, 0x20, 0x01, 0x28, 0x0c, 0x48, 0x02, 0x52, 0x0d, 0x72, 0x65, 0x6c, 0x61, 0x79, 0x53, 0x65,
0x72, 0x76, 0x65, 0x72, 0x49, 0x50, 0x88, 0x01, 0x01, 0x22, 0x43, 0x0a, 0x04, 0x54, 0x79, 0x70,
0x72, 0x76, 0x65, 0x72, 0x49, 0x50, 0x88, 0x01, 0x01, 0x22, 0x52, 0x0a, 0x04, 0x54, 0x79, 0x70,
0x65, 0x12, 0x09, 0x0a, 0x05, 0x4f, 0x46, 0x46, 0x45, 0x52, 0x10, 0x00, 0x12, 0x0a, 0x0a, 0x06,
0x41, 0x4e, 0x53, 0x57, 0x45, 0x52, 0x10, 0x01, 0x12, 0x0d, 0x0a, 0x09, 0x43, 0x41, 0x4e, 0x44,
0x49, 0x44, 0x41, 0x54, 0x45, 0x10, 0x02, 0x12, 0x08, 0x0a, 0x04, 0x4d, 0x4f, 0x44, 0x45, 0x10,
0x04, 0x12, 0x0b, 0x0a, 0x07, 0x47, 0x4f, 0x5f, 0x49, 0x44, 0x4c, 0x45, 0x10, 0x05, 0x42, 0x15,
0x0a, 0x13, 0x5f, 0x72, 0x65, 0x6c, 0x61, 0x79, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x41, 0x64,
0x64, 0x72, 0x65, 0x73, 0x73, 0x42, 0x0c, 0x0a, 0x0a, 0x5f, 0x73, 0x65, 0x73, 0x73, 0x69, 0x6f,
0x6e, 0x49, 0x64, 0x42, 0x10, 0x0a, 0x0e, 0x5f, 0x72, 0x65, 0x6c, 0x61, 0x79, 0x53, 0x65, 0x72,
0x76, 0x65, 0x72, 0x49, 0x50, 0x4a, 0x04, 0x08, 0x09, 0x10, 0x0a, 0x22, 0x2e, 0x0a, 0x04, 0x4d,
0x6f, 0x64, 0x65, 0x12, 0x1b, 0x0a, 0x06, 0x64, 0x69, 0x72, 0x65, 0x63, 0x74, 0x18, 0x01, 0x20,
0x01, 0x28, 0x08, 0x48, 0x00, 0x52, 0x06, 0x64, 0x69, 0x72, 0x65, 0x63, 0x74, 0x88, 0x01, 0x01,
0x42, 0x09, 0x0a, 0x07, 0x5f, 0x64, 0x69, 0x72, 0x65, 0x63, 0x74, 0x22, 0x6d, 0x0a, 0x0f, 0x52,
0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61, 0x73, 0x73, 0x43, 0x6f, 0x6e, 0x66, 0x69, 0x67, 0x12, 0x28,
0x0a, 0x0f, 0x72, 0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61, 0x73, 0x73, 0x50, 0x75, 0x62, 0x4b, 0x65,
0x79, 0x18, 0x01, 0x20, 0x01, 0x28, 0x0c, 0x52, 0x0f, 0x72, 0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61,
0x73, 0x73, 0x50, 0x75, 0x62, 0x4b, 0x65, 0x79, 0x12, 0x30, 0x0a, 0x13, 0x72, 0x6f, 0x73, 0x65,
0x6e, 0x70, 0x61, 0x73, 0x73, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x41, 0x64, 0x64, 0x72, 0x18,
0x02, 0x20, 0x01, 0x28, 0x09, 0x52, 0x13, 0x72, 0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61, 0x73, 0x73,
0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x41, 0x64, 0x64, 0x72, 0x32, 0xb9, 0x01, 0x0a, 0x0e, 0x53,
0x69, 0x67, 0x6e, 0x61, 0x6c, 0x45, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x12, 0x4c, 0x0a,
0x04, 0x53, 0x65, 0x6e, 0x64, 0x12, 0x20, 0x2e, 0x73, 0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78,
0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45, 0x6e, 0x63, 0x72, 0x79, 0x70, 0x74, 0x65, 0x64,
0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x1a, 0x20, 0x2e, 0x73, 0x69, 0x67, 0x6e, 0x61, 0x6c,
0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45, 0x6e, 0x63, 0x72, 0x79, 0x70, 0x74,
0x65, 0x64, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x22, 0x00, 0x12, 0x59, 0x0a, 0x0d, 0x43,
0x6f, 0x6e, 0x6e, 0x65, 0x63, 0x74, 0x53, 0x74, 0x72, 0x65, 0x61, 0x6d, 0x12, 0x20, 0x2e, 0x73,
0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45, 0x6e,
0x63, 0x72, 0x79, 0x70, 0x74, 0x65, 0x64, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x1a, 0x20,
0x2e, 0x73, 0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e,
0x45, 0x6e, 0x63, 0x72, 0x79, 0x70, 0x74, 0x65, 0x64, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65,
0x22, 0x00, 0x28, 0x01, 0x30, 0x01, 0x42, 0x08, 0x5a, 0x06, 0x2f, 0x70, 0x72, 0x6f, 0x74, 0x6f,
0x62, 0x06, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x33,
0x04, 0x12, 0x0b, 0x0a, 0x07, 0x47, 0x4f, 0x5f, 0x49, 0x44, 0x4c, 0x45, 0x10, 0x05, 0x12, 0x0d,
0x0a, 0x09, 0x48, 0x45, 0x41, 0x52, 0x54, 0x42, 0x45, 0x41, 0x54, 0x10, 0x06, 0x42, 0x15, 0x0a,
0x13, 0x5f, 0x72, 0x65, 0x6c, 0x61, 0x79, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x41, 0x64, 0x64,
0x72, 0x65, 0x73, 0x73, 0x42, 0x0c, 0x0a, 0x0a, 0x5f, 0x73, 0x65, 0x73, 0x73, 0x69, 0x6f, 0x6e,
0x49, 0x64, 0x42, 0x10, 0x0a, 0x0e, 0x5f, 0x72, 0x65, 0x6c, 0x61, 0x79, 0x53, 0x65, 0x72, 0x76,
0x65, 0x72, 0x49, 0x50, 0x4a, 0x04, 0x08, 0x09, 0x10, 0x0a, 0x22, 0x2e, 0x0a, 0x04, 0x4d, 0x6f,
0x64, 0x65, 0x12, 0x1b, 0x0a, 0x06, 0x64, 0x69, 0x72, 0x65, 0x63, 0x74, 0x18, 0x01, 0x20, 0x01,
0x28, 0x08, 0x48, 0x00, 0x52, 0x06, 0x64, 0x69, 0x72, 0x65, 0x63, 0x74, 0x88, 0x01, 0x01, 0x42,
0x09, 0x0a, 0x07, 0x5f, 0x64, 0x69, 0x72, 0x65, 0x63, 0x74, 0x22, 0x6d, 0x0a, 0x0f, 0x52, 0x6f,
0x73, 0x65, 0x6e, 0x70, 0x61, 0x73, 0x73, 0x43, 0x6f, 0x6e, 0x66, 0x69, 0x67, 0x12, 0x28, 0x0a,
0x0f, 0x72, 0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61, 0x73, 0x73, 0x50, 0x75, 0x62, 0x4b, 0x65, 0x79,
0x18, 0x01, 0x20, 0x01, 0x28, 0x0c, 0x52, 0x0f, 0x72, 0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61, 0x73,
0x73, 0x50, 0x75, 0x62, 0x4b, 0x65, 0x79, 0x12, 0x30, 0x0a, 0x13, 0x72, 0x6f, 0x73, 0x65, 0x6e,
0x70, 0x61, 0x73, 0x73, 0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x41, 0x64, 0x64, 0x72, 0x18, 0x02,
0x20, 0x01, 0x28, 0x09, 0x52, 0x13, 0x72, 0x6f, 0x73, 0x65, 0x6e, 0x70, 0x61, 0x73, 0x73, 0x53,
0x65, 0x72, 0x76, 0x65, 0x72, 0x41, 0x64, 0x64, 0x72, 0x32, 0xb9, 0x01, 0x0a, 0x0e, 0x53, 0x69,
0x67, 0x6e, 0x61, 0x6c, 0x45, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x12, 0x4c, 0x0a, 0x04,
0x53, 0x65, 0x6e, 0x64, 0x12, 0x20, 0x2e, 0x73, 0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78, 0x63,
0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45, 0x6e, 0x63, 0x72, 0x79, 0x70, 0x74, 0x65, 0x64, 0x4d,
0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x1a, 0x20, 0x2e, 0x73, 0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65,
0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45, 0x6e, 0x63, 0x72, 0x79, 0x70, 0x74, 0x65,
0x64, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x22, 0x00, 0x12, 0x59, 0x0a, 0x0d, 0x43, 0x6f,
0x6e, 0x6e, 0x65, 0x63, 0x74, 0x53, 0x74, 0x72, 0x65, 0x61, 0x6d, 0x12, 0x20, 0x2e, 0x73, 0x69,
0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45, 0x6e, 0x63,
0x72, 0x79, 0x70, 0x74, 0x65, 0x64, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x1a, 0x20, 0x2e,
0x73, 0x69, 0x67, 0x6e, 0x61, 0x6c, 0x65, 0x78, 0x63, 0x68, 0x61, 0x6e, 0x67, 0x65, 0x2e, 0x45,
0x6e, 0x63, 0x72, 0x79, 0x70, 0x74, 0x65, 0x64, 0x4d, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x22,
0x00, 0x28, 0x01, 0x30, 0x01, 0x42, 0x08, 0x5a, 0x06, 0x2f, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x62,
0x06, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x33,
}
var (

View File

@@ -48,6 +48,7 @@ message Body {
CANDIDATE = 2;
MODE = 4;
GO_IDLE = 5;
HEARTBEAT = 6;
}
Type type = 1;
string payload = 2;

View File

@@ -1,4 +1,5 @@
FROM gcr.io/distroless/base:debug
ENTRYPOINT [ "/go/bin/netbird-signal","run" ]
CMD ["--log-file", "console"]
COPY netbird-signal /go/bin/netbird-signal
ARG TARGETPLATFORM
COPY ${TARGETPLATFORM}/netbird-signal /go/bin/netbird-signal

View File

@@ -26,6 +26,10 @@ type Peer struct {
// a gRpc connection stream to the Peer
Stream proto.SignalExchange_ConnectStreamServer
// sendMu serializes writes to Stream. gRPC forbids concurrent SendMsg on
// the same ServerStream, and a peer can be the target of many senders at
// once.
sendMu sync.Mutex
// registration time
RegisteredAt time.Time
@@ -33,6 +37,13 @@ type Peer struct {
Cancel context.CancelFunc
}
// Send writes a message to the peer's stream, serializing concurrent senders.
func (p *Peer) Send(msg *proto.EncryptedMessage) error {
p.sendMu.Lock()
defer p.sendMu.Unlock()
return p.Stream.Send(msg)
}
// NewPeer creates a new instance of a connected Peer
func NewPeer(id string, stream proto.SignalExchange_ConnectStreamServer, cancel context.CancelFunc) *Peer {
return &Peer{

View File

@@ -0,0 +1,67 @@
package server
import (
"context"
"sync"
"sync/atomic"
"testing"
"time"
"github.com/stretchr/testify/require"
"go.opentelemetry.io/otel"
"github.com/netbirdio/netbird/shared/signal/proto"
"github.com/netbirdio/netbird/signal/peer"
)
// concurrencyCheckStream records the maximum number of Send calls in flight at
// once. gRPC forbids concurrent SendMsg on the same ServerStream, so a correct
// server must never have more than one in flight per peer.
type concurrencyCheckStream struct {
proto.SignalExchange_ConnectStreamServer
ctx context.Context
inflight atomic.Int32
maxSeen atomic.Int32
}
func (s *concurrencyCheckStream) Send(*proto.EncryptedMessage) error {
n := s.inflight.Add(1)
for {
old := s.maxSeen.Load()
if n <= old || s.maxSeen.CompareAndSwap(old, n) {
break
}
}
// Widen the window so overlapping callers are reliably observed.
time.Sleep(time.Millisecond)
s.inflight.Add(-1)
return nil
}
func (s *concurrencyCheckStream) Context() context.Context { return s.ctx }
// TestForwardMessageToPeerSerializesSend verifies that concurrent forwards to the
// same peer never call Stream.Send concurrently, which would violate the gRPC
// ServerStream contract.
func TestForwardMessageToPeerSerializesSend(t *testing.T) {
s, err := NewServer(context.Background(), otel.Meter(""))
require.NoError(t, err)
const peerID = "peerX"
stream := &concurrencyCheckStream{ctx: context.Background()}
_, cancel := context.WithCancel(context.Background())
t.Cleanup(cancel)
require.NoError(t, s.registry.Register(peer.NewPeer(peerID, stream, cancel)))
var wg sync.WaitGroup
for i := 0; i < 50; i++ {
wg.Add(1)
go func() {
defer wg.Done()
s.forwardMessageToPeer(context.Background(), &proto.EncryptedMessage{Key: "sender", RemoteKey: peerID})
}()
}
wg.Wait()
require.Equal(t, int32(1), stream.maxSeen.Load(), "Stream.Send must never run concurrently on the same peer stream")
}

View File

@@ -179,7 +179,7 @@ func (s *Server) forwardMessageToPeer(ctx context.Context, msg *proto.EncryptedM
sendResultChan := make(chan error, 1)
go func() {
select {
case sendResultChan <- dstPeer.Stream.Send(msg):
case sendResultChan <- dstPeer.Send(msg):
return
case <-dstPeer.Stream.Context().Done():
return

View File

@@ -1,3 +1,4 @@
FROM gcr.io/distroless/base:debug
ENTRYPOINT [ "/go/bin/netbird-upload" ]
COPY netbird-upload /go/bin/netbird-upload
ARG TARGETPLATFORM
COPY ${TARGETPLATFORM}/netbird-upload /go/bin/netbird-upload

View File

@@ -140,7 +140,12 @@ func newRotatedOutput(logPath string) io.Writer {
func setGRPCLibLogger(logger *log.Logger) {
logOut := logger.Writer()
if os.Getenv("GRPC_GO_LOG_SEVERITY_LEVEL") != "info" {
grpclog.SetLoggerV2(grpclog.NewLoggerV2(io.Discard, logOut, logOut))
// Discard grpc info AND warning logs by default — the warning stream is
// dominated by benign connection-retry noise ("addrConn.createTransport
// failed", "transport is closing") that surfaces e.g. when the CLI dials
// a daemon that is still starting or already gone. Errors are kept. Set
// GRPC_GO_LOG_SEVERITY_LEVEL=info to get the full verbose grpc logging.
grpclog.SetLoggerV2(grpclog.NewLoggerV2(io.Discard, io.Discard, logOut))
return
}