Files
netbird/proxy/server_test.go
Maycon Santos fa1e241aea [management, client, proxy] Follow-up fixes for private reverse-proxy services (#6268)
* fix(proxy): gate tunnel-peer fast-path on inbound listener marker

forwardWithTunnelPeer previously accepted any RFC1918 / ULA / CGNAT
source IP, so a public client whose address happened to fall in those
ranges could bypass the configured operator auth scheme by colliding
with a known tunnel IP. The fast-path is now gated on
TunnelLookupFromContext(r.Context()) being present — that context value
is attached only by the per-account inbound (overlay) listener, so the
host-facing listener never enters this branch.

Tests updated to reflect the new requirement: requests that don't
carry the inbound marker now fall through to the regular auth flow.

* fix(proxy): harden inbound listener resource + startup-ctx handling

Three correctness fixes on the per-account inbound path, with tests:

- Close the logrus ErrorLog PipeWriter on tearDown. WriterLevel hands
  back an *io.PipeWriter backed by a pipe + scanner goroutine that the
  caller owns; the two writers per account (https + plain) were never
  closed, leaking the pipe and goroutine on every teardown.
- Run the post-Start hooks on context.Background(). runClientStartup
  is launched in a goroutine from AddPeer and was inheriting the
  caller's request-scoped ctx, so a cancelled request could abort the
  inbound bring-up or fail the management status notification. The
  tail is split into notifyClientReady so the contract is testable.

Tests cover the PipeWriter close behaviour and assert the readyHandler
+ NotifyStatus calls receive a non-cancelled background context.

* feat(proxy): short-circuit peer-own-target loops with 421

When a peer that hosts the target of a private service dials its own
service URL the request was being looped through the proxy and back
over WireGuard to the same peer — twice the WG round-trip for no
benefit, with no signal to the caller that something was wrong.

Add isSelfTargetLoop to ReverseProxy.ServeHTTP: when the request
arrived on the per-account overlay listener (IsOverlayOrigin) and the
source tunnel IP matches the target host, refuse the request with 421
Misdirected Request and a body pointing the operator at the backend
directly.

The gate is scoped to overlay origin so requests on the public
listener that happen to share a source IP with the target host are
forwarded normally.

* fix(management): private-service validation + tunnel-IP lookup semantics

- Require an explicit port for L4 cluster targets. validateL4Target
  exempted TargetTypeCluster from the port check, but buildPathMappings
  serializes every L4 target via net.JoinHostPort(host, port) — port=0
  shipped a ":0" upstream. Cluster targets use the same Host/Port
  fields, so the same requirement applies.
- GetPeerByIP returns NotFound on a tunnel-IP miss instead of mapping
  every error to Internal. The proxy's ValidateTunnelPeer probes IPs
  that legitimately aren't in the roster; the miss is expected and now
  distinguishable from a real store failure.
- Thread ctx into getClusterCapability's gorm query so a cancelled
  request doesn't keep the store busy.

Tests updated for the L4-cluster port requirement and the GetPeerByIP
NotFound path.

* fix(client): include offlinePeers in PeerStateByIP lookup

ReplaceOfflinePeers moves peers into d.offlinePeers but PeerStateByIP
only scanned d.peers. Callers (the local DNS filter via
localPeerConnectivity, embed.Client.IdentityForIP used by the
proxy's tunnel-peer validator) were treating known-but-offline peers
as unknown, which:

- causes the DNS filter to keep returning records pointing at peers
  that have no live tunnel, AND
- makes the proxy's local-roster check deny a request from such a
  peer rather than letting the cached management RPC carry the
  authorisation decision.

Search both slices in PeerStateByIP. Adds a unit test for the IPv4
and IPv6 offline-match paths.

* fix(rest): reject empty Delete path params in reverse-proxy clients

ReverseProxyClustersAPI.Delete and ReverseProxyTokensAPI.Delete passed
the path parameter into url.PathEscape without an empty check.
PathEscape("") returns "" which collapses the request onto the
collection endpoint ("/api/reverse-proxies/clusters/" /
"/api/reverse-proxies/proxy-tokens/"), so a caller bug delete with no
id reached a routable URL with surprising semantics (typically 405).

Short-circuit with a typed error before the request is built. Tests
mount a handler on the collection path that fails the test if hit, so
the regression is impossible to reintroduce silently.

* chore(api,ci,docs,test): private-service schema, proto-check, fixups

Non-functional cleanups and contract/CI hardening around the
private-service work:

API schema (openapi.yml):
- Require a non-empty access_groups and mode=http when private=true,
  on both Service and ServiceRequest, mirroring
  validatePrivateRequirements. mode stays optional-but-constrained
  (empty defaults to http server-side), matching runtime.

CI (proto-version-check.yml):
- Cover renamed .pb.go files (read base via previous_filename).
- Match protoc-gen-go-grpc version headers (optional "- " prefix and
  -gen-go-grpc suffix) so grpc-generated files are in scope.

Docs / comments:
- Reword Config field docs to say defaults are applied at Server.Start
  (initDefaults), not New.
- Rename the obsolete --private-inbound flag to --private across
  comments and the proto doc.

Pre-existing test fixups surfaced by review:
- Repair the integration-tagged validate_session_test.go (SignToken
  signature growth + new Manager interface methods).
- Fix the CI-skip boolean precedence so Windows isn't skipped
  unconditionally.
- Guard the router.HTTPListener type assertion with comma-ok.

* fix(proxy): background ctx for already-started AddPeer notification

The earlier ctx fix covered the async runClientStartup path but missed
the synchronous branch: when a service is added to an already-started
client, AddPeer called NotifyStatus with the caller's request-scoped
ctx. A cancelled request/stream could drop the connected notification
to management. Use context.Background() here too, matching
notifyClientReady.

Extends TestNetBird_AddPeer_ExistingStartedClient_NotifiesStatus to
pass a pre-cancelled caller ctx and assert the notification still ran
on a non-cancelled context.

* use the cmd context for roundtripper
2026-06-02 13:40:09 +02:00

205 lines
6.3 KiB
Go

package proxy
import (
"context"
"errors"
"io"
"testing"
"time"
log "github.com/sirupsen/logrus"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"github.com/netbirdio/netbird/shared/management/proto"
)
func TestDebugEndpointDisabledByDefault(t *testing.T) {
s := &Server{}
assert.False(t, s.DebugEndpointEnabled, "debug endpoint should be disabled by default")
}
func TestDebugEndpointAddr(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{
name: "empty defaults to localhost",
input: "",
expected: "localhost:8444",
},
{
name: "explicit localhost preserved",
input: "localhost:9999",
expected: "localhost:9999",
},
{
name: "explicit address preserved",
input: "0.0.0.0:8444",
expected: "0.0.0.0:8444",
},
{
name: "127.0.0.1 preserved",
input: "127.0.0.1:8444",
expected: "127.0.0.1:8444",
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
result := debugEndpointAddr(tc.input)
assert.Equal(t, tc.expected, result)
})
}
}
// quietLifecycleLogger keeps lifecycle tests from spamming the test output.
func quietLifecycleLogger() *log.Logger {
l := log.New()
l.SetOutput(io.Discard)
l.SetLevel(log.PanicLevel)
return l
}
func TestStopBeforeStartIsNoOp(t *testing.T) {
srv := New(t.Context(), Config{Logger: quietLifecycleLogger()})
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
err := srv.Stop(ctx)
assert.NoError(t, err, "Stop on an unstarted server must succeed without error")
err = srv.Stop(ctx)
assert.NoError(t, err, "Stop must remain idempotent across repeated calls")
}
func TestStartFailsWithoutManagement(t *testing.T) {
srv := New(t.Context(), Config{
Logger: quietLifecycleLogger(),
ListenAddr: "127.0.0.1:0",
ManagementAddress: "://broken-url",
})
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
err := srv.Start(ctx)
require.Error(t, err, "Start must surface management dial failures")
assert.True(t, srv.started, "started flag is set before any dial attempt so a second Start fails fast")
err = srv.Start(ctx)
require.Error(t, err, "second Start must reject")
assert.Contains(t, err.Error(), "already started", "error must explain why the call was rejected")
}
func TestStopIsIdempotent(t *testing.T) {
srv := &Server{
Logger: quietLifecycleLogger(),
started: true,
runErrCh: make(chan struct{}),
runCancel: func() {},
}
srv.recordRunErr(errors.New("synthetic"))
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
err := srv.Stop(ctx)
require.Error(t, err, "Stop must surface the recorded background error")
assert.Contains(t, err.Error(), "synthetic", "error must round-trip recordRunErr's value")
err = srv.Stop(ctx)
require.Error(t, err, "second Stop must still report the same error")
assert.Contains(t, err.Error(), "synthetic", "idempotent Stop must return the cached error")
}
func TestRecordRunErrPreservesFirstFailure(t *testing.T) {
srv := &Server{
Logger: quietLifecycleLogger(),
runErrCh: make(chan struct{}),
}
srv.recordRunErr(errors.New("first"))
srv.recordRunErr(errors.New("second"))
require.Error(t, srv.runErr, "first failure must be retained")
assert.Contains(t, srv.runErr.Error(), "first", "second call must not overwrite the cached error")
select {
case <-srv.runErrCh:
default:
t.Fatal("recordRunErr must close runErrCh so waitAndStop unblocks")
}
}
func TestStopSkipsShutdownWhenNeverStarted(t *testing.T) {
srv := New(t.Context(), Config{Logger: quietLifecycleLogger()})
ctx, cancel := context.WithCancel(context.Background())
cancel()
err := srv.Stop(ctx)
assert.NoError(t, err, "Stop on an unstarted server should not block on the cancelled ctx")
}
func TestRedactMappingForLog_ScrubsSensitiveFields(t *testing.T) {
original := &proto.ProxyMapping{
Id: "svc-1",
Domain: "example.com",
AuthToken: "super-secret-token",
Auth: &proto.Authentication{
SessionKey: "pubkey-not-secret",
HeaderAuths: []*proto.HeaderAuth{
{Header: "Authorization", HashedValue: "argon2-hash-1"},
{Header: "X-Api-Key", HashedValue: "argon2-hash-2"},
},
},
Path: []*proto.PathMapping{
{
Path: "/api",
Target: "10.0.0.1:8080",
Options: &proto.PathTargetOptions{
CustomHeaders: map[string]string{
"Authorization": "Bearer upstream-token",
"X-Tenant": "acme",
},
},
},
},
}
redacted := redactMappingForLog(original)
assert.Equal(t, "super-secret-token", original.AuthToken, "original must not be mutated")
assert.Equal(t, "argon2-hash-1", original.Auth.HeaderAuths[0].HashedValue, "original header hash must not be mutated")
assert.Equal(t, "Bearer upstream-token", original.Path[0].Options.CustomHeaders["Authorization"], "original custom header must not be mutated")
assert.Equal(t, "[REDACTED]", redacted.AuthToken, "auth_token must be redacted")
require.Len(t, redacted.Auth.HeaderAuths, 2, "header auths must be preserved in count")
assert.Equal(t, "Authorization", redacted.Auth.HeaderAuths[0].Header, "header name must be preserved")
assert.Equal(t, "[REDACTED]", redacted.Auth.HeaderAuths[0].HashedValue, "hashed_value must be redacted")
assert.Equal(t, "[REDACTED]", redacted.Auth.HeaderAuths[1].HashedValue, "hashed_value must be redacted for every header auth")
assert.Equal(t, "pubkey-not-secret", redacted.Auth.SessionKey, "session_key (public) must be preserved")
headers := redacted.Path[0].Options.CustomHeaders
require.Len(t, headers, 2, "custom header keys must be preserved")
assert.Equal(t, "[REDACTED]", headers["Authorization"], "custom header values must be redacted")
assert.Equal(t, "[REDACTED]", headers["X-Tenant"], "every custom header value must be redacted")
assert.Equal(t, "svc-1", redacted.Id, "non-sensitive fields must round-trip")
assert.Equal(t, "example.com", redacted.Domain, "non-sensitive fields must round-trip")
}
func TestRedactMappingForLog_HandlesEmptyOrNilFields(t *testing.T) {
empty := &proto.ProxyMapping{Id: "svc-empty"}
redacted := redactMappingForLog(empty)
assert.Equal(t, "", redacted.AuthToken, "empty auth_token must remain empty (no placeholder)")
assert.Nil(t, redacted.Auth, "nil Auth must remain nil")
assert.Empty(t, redacted.Path, "empty Path must remain empty")
}