* [client] Fix flow client Receive retry loop not stopping after Close
Use backoff.Permanent for canceled gRPC errors so Receive returns
immediately instead of retrying until context deadline when the
connection is already closed. Add TestNewClient_PermanentClose to
verify the behavior.
The connectivity.Shutdown check was meaningless because when the connection is
shut down, c.realClient.Events(ctx, grpc.WaitForReady(true)) on the nex line
already fails with codes.Canceled — which is now handled as a permanent error.
The explicit state check was just duplicating what gRPC already reports
through its normal error path.
* [client] remove WaitForReady from stream open call
grpc.WaitForReady(true) parks the RPC call internally until the
connection reaches READY, only unblocking on ctx cancellation.
This means the external backoff.Retry loop in Receive() never gets
control back during a connection outage — it cannot tick, log, or
apply its retry intervals while WaitForReady is blocking.
Removing it restores fail-fast behaviour: Events() returns immediately
with codes.Unavailable when the connection is not ready, which is
exactly what the backoff loop expects. The backoff becomes the single
authority over retry timing and cadence, as originally intended.
* [client] Add connection recreation and improve flow client error handling
Store gRPC dial options on the client to enable connection recreation
on Internal errors (RST_STREAM/PROTOCOL_ERROR). Treat Unauthenticated,
PermissionDenied, and Unimplemented as permanent failures. Unify mutex
usage and add reconnection logging for better observability.
* [client] Remove Unauthenticated, PermissionDenied, and Unimplemented from permanent error handling
* [client] Fix error handling in Receive to properly re-establish stream and improve reconnection messaging
* Fix test
* [client] Add graceful shutdown handling and test for concurrent Close during Receive
Prevent reconnection attempts after client closure by tracking a `closed` flag. Use `backoff.Permanent` for errors caused by operations on a closed client. Add a test to ensure `Close` does not block when `Receive` is actively running.
* [client] Fix connection swap to properly close old gRPC connection
Close the old `gRPC.ClientConn` after successfully swapping to a new connection during reconnection.
* [client] Reset backoff
* [client] Ensure stream closure on error during initialization
* [client] Add test for handling server-side stream closure and reconnection
Introduce `TestReceive_ServerClosesStream` to verify the client's ability to recover and process acknowledgments after the server closes the stream. Enhance test server with a controlled stream closure mechanism.
* [client] Add protocol error simulation and enhance reconnection test
Introduce `connTrackListener` to simulate HTTP/2 RST_STREAM with PROTOCOL_ERROR for testing. Refactor and rename `TestReceive_ServerClosesStream` to `TestReceive_ProtocolErrorStreamReconnect` to verify client recovery on protocol errors.
* [client] Update Close error message in test for clarity
* [client] Fine-tune the tests
* [client] Adjust connection tracking in reconnection test
* [client] Wait for Events handler to exit in RST_STREAM reconnection test
Ensure the old `Events` handler exits fully before proceeding in the reconnection test to avoid dropped acknowledgments on a broken stream. Add a `handlerDone` channel to synchronize handler exits.
* [client] Prevent panic on nil connection during Close
* [client] Refactor connection handling to use explicit target tracking
Introduce `target` field to store the gRPC connection target directly, simplifying reconnections and ensuring consistent connection reuse logic.
* [client] Rename `isCancellation` to `isContextDone` and extend handling for `DeadlineExceeded`
Refactor error handling to include `DeadlineExceeded` scenarios alongside `Canceled`. Update related condition checks for consistency.
* [client] Add connection generation tracking to prevent stale reconnections
Introduce `connGen` to track connection generations and ensure that stale `recreateConnection` calls do not override newer connections. Update stream establishment and reconnection logic to incorporate generation validation.
* [client] Add backoff reset condition to prevent short-lived retry cycles
Refine backoff reset logic to ensure it only occurs for sufficiently long-lived stream connections, avoiding interference with `MaxElapsedTime`.
* [client] Introduce `minHealthyDuration` to refine backoff reset logic
Add `minHealthyDuration` constant to ensure stream retries only reset the backoff timer if the stream survives beyond a minimum duration. Prevents unhealthy, short-lived streams from interfering with `MaxElapsedTime`.
* [client] IPv6 friendly connection
parsedURL.Hostname() strips IPv6 brackets. For http://[::1]:443, this turns it into ::1:443, which is not a valid host:port target for gRPC. Additionally, fmt.Sprintf("%s:%s", hostname, port) produces a trailing colon when the URL has no explicit port—http://example.com becomes example.com:. Both cases break the initial dial and reconnect paths. Use parsedURL.Host directly instead.
* [client] Add `handlerStarted` channel to synchronize stream establishment in tests
Introduce `handlerStarted` channel in the test server to signal when the server-side handler begins, ensuring robust synchronization between client and server during stream establishment. Update relevant test cases to wait for this signal before proceeding.
* [client] Replace `receivedAcks` map with atomic counter and improve stream establishment sync in tests
Refactor acknowledgment tracking in tests to use an `atomic.Int32` counter instead of a map. Replace fixed sleep with robust synchronization by waiting on `handlerStarted` signal for stream establishment.
* [client] Extract `handleReceiveError` to simplify receive logic
Refactor error handling in `receive` to a dedicated `handleReceiveError` method. Streamlines the main logic and isolates error recovery, including backoff reset and connection recreation.
* [client] recreate gRPC ClientConn on every retry to prevent dual backoff
The flow client had two competing retry loops: our custom exponential
backoff and gRPC's internal subchannel reconnection. When establishStream
failed, the same ClientConn was reused, allowing gRPC's internal backoff
state to accumulate and control dial timing independently.
Changes:
- Consolidate error handling into handleRetryableError, which now
handles context cancellation, permanent errors, backoff reset,
and connection recreation in a single path
- Call recreateConnection on every retryable error so each retry
gets a fresh ClientConn with no internal backoff state
- Remove connGen tracking since Receive is sequential and protected
by a new receiving guard against concurrent calls
- Reduce RandomizationFactor from 1 to 0.5 to avoid near-zero
backoff intervals
- Move `util/grpc` and `util/net` to `client` so `internal` packages can be accessed
- Add methods to return the next best interface after the NetBird interface.
- Use `IP_UNICAST_IF` sock opt to force the outgoing interface for the NetBird `net.Dialer` and `net.ListenerConfig` to avoid routing loops. The interface is picked by the new route lookup method.
- Some refactoring to avoid import cycles
- Old behavior is available through `NB_USE_LEGACY_ROUTING=true` env var
adds NetFlow functionality to track and log network traffic information between peers, with features including:
- Flow logging for TCP, UDP, and ICMP traffic
- Integration with connection tracking system
- Resource ID tracking in NetFlow events
- DNS and exit node collection configuration
- Flow API and Redis cache in management
- Memory-based flow storage implementation
- Kernel conntrack counters and userspace counters
- TCP state machine improvements for more accurate tracking
- Migration from net.IP to netip.Addr in the userspace firewall