Files
netbird/proxy/lifecycle.go
Maycon Santos fa1e241aea [management, client, proxy] Follow-up fixes for private reverse-proxy services (#6268)
* fix(proxy): gate tunnel-peer fast-path on inbound listener marker

forwardWithTunnelPeer previously accepted any RFC1918 / ULA / CGNAT
source IP, so a public client whose address happened to fall in those
ranges could bypass the configured operator auth scheme by colliding
with a known tunnel IP. The fast-path is now gated on
TunnelLookupFromContext(r.Context()) being present — that context value
is attached only by the per-account inbound (overlay) listener, so the
host-facing listener never enters this branch.

Tests updated to reflect the new requirement: requests that don't
carry the inbound marker now fall through to the regular auth flow.

* fix(proxy): harden inbound listener resource + startup-ctx handling

Three correctness fixes on the per-account inbound path, with tests:

- Close the logrus ErrorLog PipeWriter on tearDown. WriterLevel hands
  back an *io.PipeWriter backed by a pipe + scanner goroutine that the
  caller owns; the two writers per account (https + plain) were never
  closed, leaking the pipe and goroutine on every teardown.
- Run the post-Start hooks on context.Background(). runClientStartup
  is launched in a goroutine from AddPeer and was inheriting the
  caller's request-scoped ctx, so a cancelled request could abort the
  inbound bring-up or fail the management status notification. The
  tail is split into notifyClientReady so the contract is testable.

Tests cover the PipeWriter close behaviour and assert the readyHandler
+ NotifyStatus calls receive a non-cancelled background context.

* feat(proxy): short-circuit peer-own-target loops with 421

When a peer that hosts the target of a private service dials its own
service URL the request was being looped through the proxy and back
over WireGuard to the same peer — twice the WG round-trip for no
benefit, with no signal to the caller that something was wrong.

Add isSelfTargetLoop to ReverseProxy.ServeHTTP: when the request
arrived on the per-account overlay listener (IsOverlayOrigin) and the
source tunnel IP matches the target host, refuse the request with 421
Misdirected Request and a body pointing the operator at the backend
directly.

The gate is scoped to overlay origin so requests on the public
listener that happen to share a source IP with the target host are
forwarded normally.

* fix(management): private-service validation + tunnel-IP lookup semantics

- Require an explicit port for L4 cluster targets. validateL4Target
  exempted TargetTypeCluster from the port check, but buildPathMappings
  serializes every L4 target via net.JoinHostPort(host, port) — port=0
  shipped a ":0" upstream. Cluster targets use the same Host/Port
  fields, so the same requirement applies.
- GetPeerByIP returns NotFound on a tunnel-IP miss instead of mapping
  every error to Internal. The proxy's ValidateTunnelPeer probes IPs
  that legitimately aren't in the roster; the miss is expected and now
  distinguishable from a real store failure.
- Thread ctx into getClusterCapability's gorm query so a cancelled
  request doesn't keep the store busy.

Tests updated for the L4-cluster port requirement and the GetPeerByIP
NotFound path.

* fix(client): include offlinePeers in PeerStateByIP lookup

ReplaceOfflinePeers moves peers into d.offlinePeers but PeerStateByIP
only scanned d.peers. Callers (the local DNS filter via
localPeerConnectivity, embed.Client.IdentityForIP used by the
proxy's tunnel-peer validator) were treating known-but-offline peers
as unknown, which:

- causes the DNS filter to keep returning records pointing at peers
  that have no live tunnel, AND
- makes the proxy's local-roster check deny a request from such a
  peer rather than letting the cached management RPC carry the
  authorisation decision.

Search both slices in PeerStateByIP. Adds a unit test for the IPv4
and IPv6 offline-match paths.

* fix(rest): reject empty Delete path params in reverse-proxy clients

ReverseProxyClustersAPI.Delete and ReverseProxyTokensAPI.Delete passed
the path parameter into url.PathEscape without an empty check.
PathEscape("") returns "" which collapses the request onto the
collection endpoint ("/api/reverse-proxies/clusters/" /
"/api/reverse-proxies/proxy-tokens/"), so a caller bug delete with no
id reached a routable URL with surprising semantics (typically 405).

Short-circuit with a typed error before the request is built. Tests
mount a handler on the collection path that fails the test if hit, so
the regression is impossible to reintroduce silently.

* chore(api,ci,docs,test): private-service schema, proto-check, fixups

Non-functional cleanups and contract/CI hardening around the
private-service work:

API schema (openapi.yml):
- Require a non-empty access_groups and mode=http when private=true,
  on both Service and ServiceRequest, mirroring
  validatePrivateRequirements. mode stays optional-but-constrained
  (empty defaults to http server-side), matching runtime.

CI (proto-version-check.yml):
- Cover renamed .pb.go files (read base via previous_filename).
- Match protoc-gen-go-grpc version headers (optional "- " prefix and
  -gen-go-grpc suffix) so grpc-generated files are in scope.

Docs / comments:
- Reword Config field docs to say defaults are applied at Server.Start
  (initDefaults), not New.
- Rename the obsolete --private-inbound flag to --private across
  comments and the proto doc.

Pre-existing test fixups surfaced by review:
- Repair the integration-tagged validate_session_test.go (SignToken
  signature growth + new Manager interface methods).
- Fix the CI-skip boolean precedence so Windows isn't skipped
  unconditionally.
- Guard the router.HTTPListener type assertion with comma-ok.

* fix(proxy): background ctx for already-started AddPeer notification

The earlier ctx fix covered the async runClientStartup path but missed
the synchronous branch: when a service is added to an already-started
client, AddPeer called NotifyStatus with the caller's request-scoped
ctx. A cancelled request/stream could drop the connected notification
to management. Use context.Background() here too, matching
notifyClientReady.

Extends TestNetBird_AddPeer_ExistingStartedClient_NotifiesStatus to
pass a pre-cancelled caller ctx and assert the notification still ran
on a non-cancelled context.

* use the cmd context for roundtripper
2026-06-02 13:40:09 +02:00

172 lines
6.9 KiB
Go

package proxy
import (
"context"
"net/netip"
"time"
log "github.com/sirupsen/logrus"
"github.com/netbirdio/netbird/client/embed"
"github.com/netbirdio/netbird/proxy/internal/acme"
)
// Config bundles every knob the proxy reads at construction time. It mirrors
// the public fields on Server so library callers don't have to learn the
// internal struct layout. Zero values mean "feature off" or "fall back to the
// internal default" depending on the field — see the per-field doc.
//
// The standalone binary continues to populate Server fields directly, so
// adding fields here must not change the zero-value behaviour of Server.
type Config struct {
// ListenAddr is the TCP address the main listener binds. Required.
ListenAddr string
// ID identifies this proxy instance to management. Empty values are
// replaced with a timestamped default at Server.Start time (see
// initDefaults), not in New.
ID string
// Logger is the logrus logger used everywhere. Empty values fall
// back to log.StandardLogger() at Server.Start time (see
// initDefaults), not in New.
Logger *log.Logger
// Version is the build version string reported to management. Empty
// values are replaced with "dev" at Server.Start time (see
// initDefaults), not in New.
Version string
// ProxyURL is the public address operators use to reach this proxy.
ProxyURL string
// ManagementAddress is the gRPC URL of the management server.
ManagementAddress string
// ProxyToken authenticates this proxy with the management server.
ProxyToken string
// CertificateDirectory is the directory holding TLS certificate
// material (static or ACME-provisioned).
CertificateDirectory string
// CertificateFile is the certificate filename within
// CertificateDirectory.
CertificateFile string
// CertificateKeyFile is the private key filename within
// CertificateDirectory.
CertificateKeyFile string
// GenerateACMECertificates toggles ACME certificate provisioning.
GenerateACMECertificates bool
// ACMEChallengeAddress is the listen address for HTTP-01 challenges.
ACMEChallengeAddress string
// ACMEDirectory is the ACME directory URL (Let's Encrypt by default).
ACMEDirectory string
// ACMEEABKID is the External Account Binding Key ID for CAs that
// require EAB (e.g. ZeroSSL).
ACMEEABKID string
// ACMEEABHMACKey is the External Account Binding HMAC key for CAs
// that require EAB.
ACMEEABHMACKey string
// ACMEChallengeType is the ACME challenge type ("tls-alpn-01" or
// "http-01"). Empty defaults to "tls-alpn-01".
ACMEChallengeType string
// CertLockMethod controls how ACME certificate locks are coordinated
// across replicas.
CertLockMethod acme.CertLockMethod
// WildcardCertDir is an optional directory containing static wildcard
// certificates that override ACME for matching domains.
WildcardCertDir string
// DebugEndpointEnabled toggles the debug HTTP endpoint.
DebugEndpointEnabled bool
// DebugEndpointAddress is the bind address for the debug endpoint.
DebugEndpointAddress string
// HealthAddr is the bind address for the health probe and metrics
// surface. Empty disables the health probe entirely (library callers
// can attach their own).
HealthAddr string
// ForwardedProto overrides the X-Forwarded-Proto value sent to
// backends. Valid values: "auto", "http", "https".
ForwardedProto string
// TrustedProxies is a list of IP prefixes for trusted upstream
// proxies that may set forwarding headers.
TrustedProxies []netip.Prefix
// WireguardPort is the UDP port for the embedded NetBird tunnel.
// Zero asks the OS for a random port.
WireguardPort uint16
// ProxyProtocol enables PROXY protocol (v1/v2) on TCP listeners.
ProxyProtocol bool
// PreSharedKey is the WireGuard pre-shared key used between the
// proxy's embedded clients and peers.
PreSharedKey string
// Performance configures the tunnel pool/batch sizes for every
// embedded client this proxy creates. Zero values fall back to
// upstream defaults.
Performance embed.Performance
// SupportsCustomPorts indicates whether the proxy can bind arbitrary
// ports for TCP/UDP/TLS services.
SupportsCustomPorts bool
// RequireSubdomain forces accounts to use a subdomain in front of
// the proxy's cluster domain.
RequireSubdomain bool
// Private flags this proxy as embedded in a netbird client and
// serving exclusively over the WireGuard tunnel. Also enables
// per-account inbound listeners on each embedded client's netstack.
Private bool
// MaxDialTimeout caps the per-service backend dial timeout.
MaxDialTimeout time.Duration
// MaxSessionIdleTimeout caps the per-service session idle timeout.
MaxSessionIdleTimeout time.Duration
// GeoDataDir is the directory containing GeoLite2 MMDB files.
GeoDataDir string
// CrowdSecAPIURL is the CrowdSec LAPI URL. Empty disables CrowdSec.
CrowdSecAPIURL string
// CrowdSecAPIKey is the CrowdSec bouncer API key. Empty disables
// CrowdSec.
CrowdSecAPIKey string
}
// New builds a Server from cfg without performing any I/O. No goroutines
// are spawned, no network connections are dialed, and no listeners are
// bound — call Start to bring the proxy up. Returning a fully-formed
// Server keeps the standalone code path (which still constructs Server
// directly) byte-for-byte equivalent.
func New(ctx context.Context, cfg Config) *Server {
return &Server{
ctx: ctx,
ListenAddr: cfg.ListenAddr,
ID: cfg.ID,
Logger: cfg.Logger,
Version: cfg.Version,
ProxyURL: cfg.ProxyURL,
ManagementAddress: cfg.ManagementAddress,
ProxyToken: cfg.ProxyToken,
CertificateDirectory: cfg.CertificateDirectory,
CertificateFile: cfg.CertificateFile,
CertificateKeyFile: cfg.CertificateKeyFile,
GenerateACMECertificates: cfg.GenerateACMECertificates,
ACMEChallengeAddress: cfg.ACMEChallengeAddress,
ACMEDirectory: cfg.ACMEDirectory,
ACMEEABKID: cfg.ACMEEABKID,
ACMEEABHMACKey: cfg.ACMEEABHMACKey,
ACMEChallengeType: cfg.ACMEChallengeType,
CertLockMethod: cfg.CertLockMethod,
WildcardCertDir: cfg.WildcardCertDir,
DebugEndpointEnabled: cfg.DebugEndpointEnabled,
DebugEndpointAddress: cfg.DebugEndpointAddress,
HealthAddress: cfg.HealthAddr,
ForwardedProto: cfg.ForwardedProto,
TrustedProxies: cfg.TrustedProxies,
WireguardPort: cfg.WireguardPort,
ProxyProtocol: cfg.ProxyProtocol,
PreSharedKey: cfg.PreSharedKey,
Performance: cfg.Performance,
SupportsCustomPorts: cfg.SupportsCustomPorts,
RequireSubdomain: cfg.RequireSubdomain,
Private: cfg.Private,
MaxDialTimeout: cfg.MaxDialTimeout,
MaxSessionIdleTimeout: cfg.MaxSessionIdleTimeout,
GeoDataDir: cfg.GeoDataDir,
CrowdSecAPIURL: cfg.CrowdSecAPIURL,
CrowdSecAPIKey: cfg.CrowdSecAPIKey,
}
}