Deduplicate STUN package sending.
Originally, because every peer shared the same UDP address, the library could not distinguish which STUN message was associated with which candidate. As a result, the Pion library responded from all candidates for every STUN message.
In this PR, speed up the GRPC message processing, force the recreation of the ICE agent when getting a new, remote offer (do not wait for local STUN timeout).
This will allow running netbird commands (including debugging) against the daemon and provide a flow similar to non-container usages.
It will by default both log to file and stderr so it can be handled more uniformly in container-native environments.
Avoid invalid disconnection notifications in case the closed race dials.
In this PR resolve multiple race condition questions. Easier to understand the fix based on commit by commit.
- Remove store dependency from notifier
- Enforce the notification orders
- Fix invalid disconnection notification
- Ensure the order of the events on the consumer side
- Clients now subscribe to peer status changes.
- The server manages and maintains these subscriptions.
- Replaced raw string peer IDs with a custom peer ID type for better type safety and clarity.
This PR introduces a new inactivity package responsible for monitoring peer activity and notifying when peers become inactive.
Introduces a new Signal message type to close the peer connection after the idle timeout is reached.
Periodically checks the last activity of registered peers via a Bind interface.
Notifies via a channel when peers exceed a configurable inactivity threshold.
Default settings
DefaultInactivityThreshold is set to 15 minutes, with a minimum allowed threshold of 1 minute.
Limitations
This inactivity check does not support kernel WireGuard integration. In kernel–user space communication, the user space side will always be responsible for closing the connection.
- Removed separate thread execution of GetStates during notifications.
- Updated notification handler to rely on state data included in the notification payload.
* Refactor peer state change subscription mechanism
Because the code generated new channel for every single event, was easy to miss notification.
Use single channel.
* Fix lint
* Avoid potential deadlock
* Fix test
* Add context
* Fix test
* Fix HA router switch.
- Simplify the notification filter logic.
Always send notification if a state has been changed
- Remove IP changes check because we never modify
* Notify only the proper listeners
* Fix test
* Fix TestGetPeerStateChangeNotifierLogic test
* Before lazy connection, when the peer disconnected, the status switched to disconnected.
After implementing lazy connection, the peer state is connecting, so we did not decrease the reference counters on the routes.
* When switch to idle notify the route mgr
With the lazy connection feature, the peer will connect to target peers on-demand. The trigger can be any IP traffic.
This feature can be enabled with the NB_ENABLE_EXPERIMENTAL_LAZY_CONN environment variable.
When the engine receives a network map, it binds a free UDP port for every remote peer, and the system configures WireGuard endpoints for these ports. When traffic appears on a UDP socket, the system removes this listener and starts the peer connection procedure immediately.
Key changes
Fix slow netbird status -d command
Move from engine.go file to conn_mgr.go the peer connection related code
Refactor the iface interface usage and moved interface file next to the engine code
Add new command line flag and UI option to enable feature
The peer.Conn struct is reusable after it has been closed.
Change connection states
Connection states
Idle: The peer is not attempting to establish a connection. This typically means it's in a lazy state or the remote peer is expired.
Connecting: The peer is actively trying to establish a connection. This occurs when the peer has entered an active state and is continuously attempting to reach the remote peer.
Connected: A successful peer-to-peer connection has been established and communication is active.
fixes the Rosenpass preshared key handling to enable successful WireGuard handshakes when one side is in permissive mode. Key changes include:
Updating field accesses from RosenpassPubKey/RosenpassAddr to RosenpassConfig.PubKey/RosenpassConfig.Addr.
Modifying the preshared key computation logic to account for permissive mode.
Revising peer configuration in the Engine to use the new RosenpassConfig struct.
updates the logging implementation to use the netbird logger for both ICE and gRPC components. The key changes include:
- Introducing a gRPC logger configuration in util/log.go that integrates with the netbird logging setup.
- Updating the log hook in formatter/hook/hook.go to ensure a default caller is used when not set.
- Refactoring ICE agent and UDP multiplexers to use a unified logger via the new getLogger() method.
This PR fixes issues with closing the WireGuard watcher by adjusting its asynchronous invocation and synchronization.
Update tests in wg_watcher_test.go to launch the watcher in a goroutine and add a delay for timing.
Modify wg_watcher.go to run the periodic handshake check synchronously by removing the waitGroup and goroutine.
Enhance conn.go to wait on the watcher wait group during connection close and add a note for potential further synchronization
Prevent calling the onDisconnected callback if the ICE connection has never been established
If call onDisconnected without onConnected then overwrite the relayed status in the conn priority variable.