Commit Graph

160 Commits

Author SHA1 Message Date
Zoltan Papp
d9118eb239 [client] Fix WASM peer connection to lazy peers (#5097)
WASM peers now properly initiate relay connections instead of waiting for offers that lazy peers won't send.
2026-01-13 13:33:15 +01:00
Zoltan Papp
9ba067391f [client] Fix semaphore slot leaks (#5018)
- Remove WaitGroup, make SemaphoreGroup a pure semaphore
- Make Add() return error instead of silently failing on context cancel
- Remove context parameter from Done() to prevent slot leaks
- Fix missing Done() call in conn.go error path
2026-01-03 09:10:02 +01:00
Zoltan Papp
537151e0f3 Remove redundant lock in peer update logic to avoid deadlock with exported functions (#4953) 2025-12-17 13:55:33 +01:00
Viktor Liu
d71a82769c [client,management] Rewrite the SSH feature (#4015) 2025-11-17 17:10:41 +01:00
Viktor Liu
9cc9462cd5 [client] Use stdnet with a context to avoid DNS deadlocks (#4781) 2025-11-13 20:16:45 +01:00
Viktor Liu
27957036c9 [client] Fix shutdown blocking on stuck ICE agent close (#4780) 2025-11-13 13:24:51 +01:00
Zoltan Papp
c28275611b Fix agent reference (#4776) 2025-11-11 13:59:32 +01:00
Viktor Liu
c92e6c1b5f [client] Block on all subsystems on shutdown (#4709) 2025-11-05 12:15:37 +01:00
Zoltan Papp
9021bb512b [client] Recreate agent when receive new session id (#4564)
When an ICE agent connection was in progress, new offers were being ignored. This was incorrect logic because the remote agent could be restarted at any time.
In this change, whenever a new session ID is received, the ongoing handshake is closed and a new one is started.
2025-10-08 17:14:24 +02:00
Zoltan Papp
4d33567888 [client] Remove endpoint address on peer disconnect, retain status for activity recording (#4228)
* When a peer disconnects, remove the endpoint address to avoid sending traffic to a non-existent address, but retain the status for the activity recorder.
2025-10-08 03:12:16 +02:00
Zoltan Papp
5e1a40c33f [client] Order the list of candidates for proper comparison (#4561)
Order the list of candidates for proper comparison
2025-09-30 23:40:46 +02:00
Zoltan Papp
e8d301fdc9 [client] Fix/pkg loss (#3338)
The Relayed connection setup is optimistic. It does not have any confirmation of an established end-to-end connection. Peers start sending WireGuard handshake packets immediately after the successful offer-answer handshake.
Meanwhile, for successful P2P connection negotiation, we change the WireGuard endpoint address, but this change does not trigger new handshake initiation. Because the peer switched from Relayed connection to P2P, the packets from the Relay server are dropped and must wait for the next WireGuard handshake via P2P.

To avoid this scenario, the relayed WireGuard proxy no longer drops the packets. Instead, it rewrites the source address to the new P2P endpoint and continues forwarding the packets.

We still have one corner case: if the Relayed server negotiation chooses a server that has not been used before. In this case, one side of the peer connection will be slower to reach the Relay server, and the Relay server will drop the handshake packet.

If everything goes well we should see exactly 5 seconds improvements between the WireGuard configuration time and the handshake time.
2025-09-30 15:31:18 +02:00
Zoltan Papp
bd23ab925e [client] Fix ICE latency handling (#4501)
The GetSelectedCandidatePair() does not carry the latency information.
2025-09-15 15:08:53 +02:00
Zoltan Papp
9e81e782e5 [client] Fix/v4 stun routing (#4430)
Deduplicate STUN package sending.
Originally, because every peer shared the same UDP address, the library could not distinguish which STUN message was associated with which candidate. As a result, the Pion library responded from all candidates for every STUN message.
2025-09-11 10:08:54 +02:00
Zoltan Papp
7aef0f67df [client] Implement environment variable handling for Android (#4440)
Some features can only be manipulated via environment variables. With this PR, environment variables can be managed from Android.
2025-09-08 18:42:42 +02:00
Zoltan Papp
69d87343d2 [client] Debug information for connection (#4439)
Improve logging

Print the exact time when the first WireGuard handshake occurs
Print the steps for gathering system information
2025-09-08 14:51:34 +02:00
Zoltan Papp
786ca6fc79 Do not block Offer processing from relay worker (#4435)
- do not miss ICE offers when relay worker busy
- close p2p connection before recreate agent
2025-09-05 11:02:29 +02:00
Zoltan Papp
21368b38d9 [client] Update Pion ICE to the latest version (#4388)
- Update Pion version
- Update protobuf version
2025-09-01 10:42:01 +02:00
Zoltan Papp
f425870c8e [client] Avoid duplicated agent close (#4383) 2025-08-20 18:50:51 +02:00
Zoltan Papp
12cad854b2 [client] Fix/ice handshake (#4281)
In this PR, speed up the GRPC message processing, force the recreation of the ICE agent when getting a new, remote offer (do not wait for local STUN timeout).
2025-08-18 20:09:50 +02:00
Viktor Liu
1022a5015c [client] Eliminate upstream server strings in dns code (#4267) 2025-08-11 11:57:21 +02:00
Viktor Liu
1d5e871bdf [misc] Move shared components to shared directory (#4286)
Moved the following directories:

```
  - management/client → shared/management/client
  - management/domain → shared/management/domain
  - management/proto → shared/management/proto
  - signal/client → shared/signal/client
  - signal/proto → shared/signal/proto
  - relay/client → shared/relay/client
  - relay/auth → shared/relay/auth
```

and adjusted import paths
2025-08-05 15:22:58 +02:00
Krzysztof Nazarewski (kdn)
af8687579b client: container: support CLI with entrypoint addition (#4126)
This will allow running netbird commands (including debugging) against the daemon and provide a flow similar to non-container usages.

It will by default both log to file and stderr so it can be handled more uniformly in container-native environments.
2025-07-25 11:44:30 +02:00
Louis Li
3f82698089 [client] make ICE failed timeout configurable (#4211) 2025-07-25 10:36:11 +02:00
Zoltan Papp
86c16cf651 [server, relay] Fix/relay race disconnection (#4174)
Avoid invalid disconnection notifications in case the closed race dials.
In this PR resolve multiple race condition questions. Easier to understand the fix based on commit by commit.

- Remove store dependency from notifier
- Enforce the notification orders
- Fix invalid disconnection notification
- Ensure the order of the events on the consumer side
2025-07-21 19:58:17 +02:00
Viktor Liu
d6ed9c037e [client] Fix bind exclusion routes (#4154) 2025-07-21 12:13:21 +02:00
Zoltan Papp
0dab03252c [client, relay-server] Feature/relay notification (#4083)
- Clients now subscribe to peer status changes.
- The server manages and maintains these subscriptions.
- Replaced raw string peer IDs with a custom peer ID type for better type safety and clarity.
2025-07-15 10:43:42 +02:00
Zoltan Papp
fbb1b55beb [client] refactor lazy detection (#4050)
This PR introduces a new inactivity package responsible for monitoring peer activity and notifying when peers become inactive.
Introduces a new Signal message type to close the peer connection after the idle timeout is reached.
Periodically checks the last activity of registered peers via a Bind interface.
Notifies via a channel when peers exceed a configurable inactivity threshold.
Default settings
DefaultInactivityThreshold is set to 15 minutes, with a minimum allowed threshold of 1 minute.

Limitations
This inactivity check does not support kernel WireGuard integration. In kernel–user space communication, the user space side will always be responsible for closing the connection.
2025-07-04 19:52:27 +02:00
Viktor Liu
2a51609436 [client] Handle lazy routing peers that are part of HA groups (#3943)
* Activate new lazy routing peers if the HA group is active
* Prevent lazy peers going to idle if HA group members are active (#3948)
2025-06-20 18:07:19 +02:00
Viktor Liu
d4a800edd5 [client] Fix status recorder panic (#3988) 2025-06-17 01:20:26 +02:00
Zoltan Papp
9d11257b1a [client] Carry the peer's actual state with the notification. (#3929)
- Removed separate thread execution of GetStates during notifications.
- Updated notification handler to rely on state data included in the notification payload.
2025-06-11 13:33:38 +02:00
Viktor Liu
3c535cdd2b [client] Add lazy connections to routed networks (#3908) 2025-06-08 14:10:34 +02:00
Zoltan Papp
9424b88db2 [client] Add output similar to wg show to the debug package (#3922) 2025-06-05 11:51:39 +02:00
Zoltan Papp
af27aaf9af [client] Refactor peer state change subscription mechanism (#3910)
* Refactor peer state change subscription mechanism

Because the code generated new channel for every single event, was easy to miss notification.
Use single channel.

* Fix lint

* Avoid potential deadlock

* Fix test

* Add context

* Fix test
2025-06-03 09:20:33 +02:00
Zoltan Papp
f16f0c7831 [client] Fix HA router switch (#3889)
* Fix HA router switch.

- Simplify the notification filter logic.
Always send notification if a state has been changed

- Remove IP changes check because we never modify

* Notify only the proper listeners

* Fix test

* Fix TestGetPeerStateChangeNotifierLogic test

* Before lazy connection, when the peer disconnected, the status switched to disconnected.
After implementing lazy connection, the peer state is connecting, so we did not decrease the reference counters on the routes.

* When switch to idle notify the route mgr
2025-06-01 16:08:27 +02:00
Zoltan Papp
aa07b3b87b Fix deadlock (#3904) 2025-05-30 23:38:02 +02:00
Zoltan Papp
0492c1724a [client, android] Fix/notifier threading (#3807)
- Fix potential deadlocks
- When adding a listener, immediately notify with the last known IP and fqdn.
2025-05-27 17:12:04 +02:00
Zoltan Papp
daa8380df9 [client] Feature/lazy connection (#3379)
With the lazy connection feature, the peer will connect to target peers on-demand. The trigger can be any IP traffic.

This feature can be enabled with the NB_ENABLE_EXPERIMENTAL_LAZY_CONN environment variable.

When the engine receives a network map, it binds a free UDP port for every remote peer, and the system configures WireGuard endpoints for these ports. When traffic appears on a UDP socket, the system removes this listener and starts the peer connection procedure immediately.

Key changes
Fix slow netbird status -d command
Move from engine.go file to conn_mgr.go the peer connection related code
Refactor the iface interface usage and moved interface file next to the engine code
Add new command line flag and UI option to enable feature
The peer.Conn struct is reusable after it has been closed.
Change connection states
Connection states
Idle: The peer is not attempting to establish a connection. This typically means it's in a lazy state or the remote peer is expired.

Connecting: The peer is actively trying to establish a connection. This occurs when the peer has entered an active state and is continuously attempting to reach the remote peer.

Connected: A successful peer-to-peer connection has been established and communication is active.
2025-05-21 11:12:28 +02:00
Viktor Liu
4a9049566a [client] Set up firewall rules for dns routes dynamically based on dns response (#3702) 2025-04-24 17:37:28 +02:00
Zoltan Papp
c38e07d89a [client] Fix Rosenpass permissive mode handling (#3689)
fixes the Rosenpass preshared key handling to enable successful WireGuard handshakes when one side is in permissive mode. Key changes include:

Updating field accesses from RosenpassPubKey/RosenpassAddr to RosenpassConfig.PubKey/RosenpassConfig.Addr.
Modifying the preshared key computation logic to account for permissive mode.
Revising peer configuration in the Engine to use the new RosenpassConfig struct.
2025-04-16 16:04:43 +02:00
hakansa
7cb366bc7d [client] Remove logrus writer assignment in pion logging (#3684) 2025-04-15 18:15:52 +03:00
hakansa
1ba1e092ce [client] Enhance DNS forwarder to track resolved IPs with resource IDs on routing peers (#3620)
[client] Enhance DNS forwarder to track resolved IPs with resource IDs on routing peers (#3620)
2025-04-07 15:16:12 +08:00
Maycon Santos
fbd783ad58 [client] Use the netbird logger for ice and grpc (#3603)
updates the logging implementation to use the netbird logger for both ICE and gRPC components. The key changes include:

- Introducing a gRPC logger configuration in util/log.go that integrates with the netbird logging setup.
- Updating the log hook in formatter/hook/hook.go to ensure a default caller is used when not set.
- Refactoring ICE agent and UDP multiplexers to use a unified logger via the new getLogger() method.
2025-04-04 18:30:47 +02:00
Zoltan Papp
21464ac770 [client] Fix close WireGuard watcher (#3598)
This PR fixes issues with closing the WireGuard watcher by adjusting its asynchronous invocation and synchronization.

Update tests in wg_watcher_test.go to launch the watcher in a goroutine and add a delay for timing.
Modify wg_watcher.go to run the periodic handshake check synchronously by removing the waitGroup and goroutine.
Enhance conn.go to wait on the watcher wait group during connection close and add a note for potential further synchronization
2025-03-28 20:12:31 +01:00
Zoltan Papp
ed5647028a [client] Prevent calling the onDisconnected callback in incorrect state (#3582)
Prevent calling the onDisconnected callback if the ICE connection has never been established

If call onDisconnected without onConnected then overwrite the relayed status in the conn priority variable.
2025-03-28 18:08:26 +01:00
hakansa
fceb3ca392 [client] fix route handling for local peer state (#3586) 2025-03-27 19:31:04 +08:00
Maycon Santos
bd8f0c1ef3 [client] add profiling dumps to debug package (#3517)
enhances debugging capabilities by adding support for goroutine, mutex, and block profiling while updating state dump tracking and refining test and release settings.

- Adds pprof-based profiling for goroutine, mutex, and block profiles in the debug bundle.
- Updates state dump functionality by incorporating new status and key fields.
- Adjusts test validations and default flag/retention settings.
2025-03-23 13:46:09 +01:00
Maycon Santos
c02e236196 [client,management] add netflow support to client and update management (#3414)
adds NetFlow functionality to track and log network traffic information between peers, with features including:

- Flow logging for TCP, UDP, and ICMP traffic
- Integration with connection tracking system
- Resource ID tracking in NetFlow events
- DNS and exit node collection configuration
- Flow API and Redis cache in management
- Memory-based flow storage implementation
- Kernel conntrack counters and userspace counters
- TCP state machine improvements for more accurate tracking
- Migration from net.IP to netip.Addr in the userspace firewall
2025-03-20 17:05:48 +01:00
Viktor Liu
0ef476b014 [client] Fix state dump panic (#3519) 2025-03-16 15:13:04 +01:00
Zoltan Papp
6f82e96d6a [client] Set info logs (#3504)
collect and log connection stats per peer every 10 minutes
2025-03-14 22:34:41 +01:00