Files
netbird-docs/src/pages/help/troubleshooting-client.mdx
Jack Carter 607f4af66e docs: consolidate and refine DNS troubleshooting guide (#516)
* docs: consolidate DNS troubleshooting links and remove redundancies

* rewrote to improve readability

* Update recommended version for fleet upgrade

---------

Co-authored-by: Ashley Mensah <ashley@netbird.io>
2025-12-15 11:37:31 +01:00

634 lines
24 KiB
Plaintext

# Troubleshooting client issues
This document offers practical tips and insights to help you debug various problems, ensuring a seamless user
experience.
## NetBird agent status
The netbird agent is a daemon service that runs in the background; it provides information about peers connected and
about the NetBird control services. You can check the status of the agent with the following command:
```shell
netbird status --detail
````
This will output the following information:
```shell
Peers detail:
server-a.netbird.cloud:
NetBird IP: 100.75.232.118/32
Public key: kndklnsakldvnsld+XeRF4CLr/lcNF+DSdkd/t0nZHDqmE=
Status: Connected
-- detail --
Connection type: P2P
Direct: true
ICE candidate (Local/Remote): host/host
ICE candidate endpoints (Local/Remote): 10.128.0.35:51820/10.128.0.54:51820
Last connection update: 20 seconds ago
Last Wireguard handshake: 19 seconds ago
Transfer status (received/sent) 6.1 KiB/20.6 KiB
Quantum resistance: false
Routes: 10.0.0.0/24
Latency: 37.503682ms
server-b.netbird.cloud:
NetBird IP: 100.75.226.48/32
Public key: Mi6jtrK5Tokndklnsakldvnsld+XeRF4CLr/lcNF+DSdkd=
Status: Connected
-- detail --
Connection type: Relayed
Direct: false
ICE candidate (Local/Remote): relay/host
ICE candidate endpoints (Local/Remote): 108.54.10.33:60434/10.128.0.12:51820
Last connection update: 20 seconds ago
Last Wireguard handshake: 18 seconds ago
Transfer status (received/sent) 6.1 KiB/20.6 KiB
Quantum resistance: false
Routes: -
Latency: 37.503682ms
OS: darwin/amd64
Daemon version: 0.27.4
CLI version: 0.27.4
Management: Connected to https://api.netbird.io:443
Signal: Connected to https://signal.netbird.io:443
Relays:
[stun:turn.netbird.io:5555] is Available
[turns:turn.netbird.io:443?transport=tcp] is Available
Nameservers:
[8.8.8.8:53, 8.8.4.4:53] for [.] is Available
FQDN: maycons-mbp-2.netbird.cloud
NetBird IP: 100.75.143.239/16
Interface type: Kernel
Quantum resistance: false
Routes: -
Peers count: 2/2 Connected
```
As you can see, the output shows the peers connected, the NetBird IP address, the public key, the connection status, and
the connection type. The status will also report if there is an issue connecting to the relay servers,
the management server, or the signal server.
As for Peers, the status will show the following information:
* `Connection type`: P2P, Relayed, where relayed connections indicate a limitation in the network that prevents a direct
connection between the peers.
* `Direct`: true/false, where true indicates a direct connection between the peers without a local proxy. This case is
common when the local peer is allocating the relay connection.
* `ICE candidate (Local/Remote)`: relay/host, where relay indicates that the local peer is using a relay connection and
host indicates that the remote peer is using a direct connection.
* `Last Wireguard handshake`: Indicating the last time the Wireguard handshake was performed. Usually, this is performed
every 2 minutes, and if you don\'t see an update here or if the value is empty, that indicates that the connection
wasn\'t possible yet.
* `Transfer status (received/sent)`: Indicating the amount of data received and sent by the peer. This is useful to
check if the connection is being used.
See more details about the status command [here](/get-started/cli#status).
## Getting client logs
By default, client logs are located in the `/var/log/netbird/client.log` file on macOS and Linux and in the
`C:\ProgramData\netbird\client.log` file on Windows.
You can analyze the logs to identify the root cause of the problem. If you need help, open
a [github issue](https://github.com/netbirdio/netbird/issues/new/choose) and attach the logs.
### Debug bundle
A debug archive containing the recent logs and the status at the time of execution can be generated with the following
command.
Adding the `--anonymize (-A)` flag will anonymize the logs, removing sensitive information such as public IP addresses
and domain
names. In case you have tunneling issues, omitting the `--anonymize` flag might help our analysis.
Adding the `--system-info (-S)` flag will add system information like network routes and interfaces
```shell
netbird debug bundle --anonymize --system-info
```
This will output the path of the generated file. The output file is owned by and can only be accessed by the user
NetBird is running as, by default it is: `Administrator` on Windows, `root` on MacOS/Linux or the operating system\'s
equivalent.
### Debug for a specific time
To capture logs for a specific time period, you can use the `debug for` command. This will generate a debug bundle after
the specified time has elapsed.
```shell
netbird debug for 5m --system-info
```
<Note>
The flag `--anonymize (-A)` can be used to anonymize IP addresses and non-netbird.io domains in logs and status output when needed.
</Note>
To capture any issues arising during the `up` and `down` processes, this will set the log level to `TRACE` and bring
netbird `up` and `down` up to a few times.
After 5 minutes the netbird status will be restored to the previous state and the debug bundle will be generated.
### Debug bundle uploads
Since version `0.43.1`, you can share debug bundle with the NetBird development team without local administrative
privileges
by using the `--upload-bundle (-U)` flag.
It will securely generate and upload the debug bundle to our servers for access by the NetBird development team. See
examples below:
Run debug for a specific time and upload the bundle:
```shell
netbird debug for 1m --system-info --upload-bundle
```
To generate a bundle without restarting the client and then uploading:
```shell
netbird debug bundle --system-info --upload-bundle
```
This will output an `Upload file key`, which is effectively a random filename in our internal storage system
and can be safely shared with us through public channels such as GitHub Issues or Slack.
```shell
netbird debug bundle --system-info --upload-bundle
Local file:
/tmp/netbird.debug.2611377582.zip
Upload file key:
1234567890ab27fb37c88b3b4be7011e22aa2e5ca6f38ffa9c4481884941f726/12345678-90ab-cdef-1234-567890abcdef
```
<Note>
The flag `--anonymize` can be used to anonymize IP addresses and non-netbird.io domains in logs and status output when needed.
</Note>
### Debug bundle uploads with GUI
Since version `0.43.2` users can upload their debug bundle via the GUI client.
To generate a bundle via GUI, you can access the application then go to `Settings` > `Create Debug Bundle` and follow
the wizard to upload the bundle:
<p>
<img src="/docs-static/img/help/troubleshooting-client/ui-settings.png" alt="service-user-overview" className="imagewrapper-big"/>
</p>
<Note>
If needed, you can update the upload URL and select to anonymize sensitive information like IP addresses and non-netbird.io domains in logs and status output.
</Note>
<p>
<img src="/docs-static/img/help/troubleshooting-client/ui-bundle-wizard.png" alt="service-user-overview" className="imagewrapper-big"/>
</p>
By default running with trace log enable before generating the bundle is selected. This will restart the client connections and provide a `disconnect to connected` information for our engineers.
If you uncheck this option, a bundle will be generated without running this step. Which is very useful when you have an
issue that recovers when restarting the client.
<p>
<img src="/docs-static/img/help/troubleshooting-client/ui-bundle-success.png" alt="service-user-overview" className="imagewrapper-big"/>
</p>
Once the bundle generation is complete, you can click on `Copy Key` to get the uploaded key and share with NetBird\'s team.
## Enabling debug logs on agent
Logs can be temporarily set using the following command.
```shell
netbird debug log level debug
```
or
```shell
netbird debug log level trace
```
The next time the daemon is restarted, the log level will return to the configured level.
Using `netbird down` and `netbird up` will not reset the log level.
To permanently set the log level, see the following sections.
<Note>
The default logging level is `info`. To revert back to the original state, you can repeat the procedure with `info` instead of `debug` or `trace`.
</Note>
### On Linux with systemd
The default systemd unit file reads a set of environment variables from the path `/etc/sysconfig/netbird`.
You can add the following line to the file to enable debug logs:
```shell
sudo mkdir -p /etc/sysconfig
echo 'NB_LOG_LEVEL=debug' | sudo tee -a /etc/sysconfig/netbird
sudo systemctl restart netbird
```
### On Other Linux and MacOS
```shell
sudo netbird service stop
sudo netbird service uninstall
sudo netbird service install --log-level debug # or trace
sudo netbird service start
```
### On Windows
You need to run the following commands with an elevated PowerShell or `cmd.exe` window.
```shell
[Environment]::SetEnvironmentVariable("NB_LOG_LEVEL", "debug", "Machine")
netbird service restart
```
### On Docker
You can set the environment variable `NB_LOG_LEVEL` to `debug` to enable debug logs.
```shell
docker run --rm --name PEER_NAME --hostname PEER_NAME --cap-add=NET_ADMIN --cap-add=SYS_ADMIN --cap-add=SYS_RESOURCE -d \
-e NB_SETUP_KEY=<SETUP KEY> -e NB_LOG_LEVEL=debug -v netbird-client:/var/lib/netbird netbirdio/netbird:latest
```
### On Android
Enable the ADB in the developer menu on the Android device.
In the app set the the Trace log level setting - it is a checkbox in the advanced menu.
With the ADB tool, you can get the logs from your device. The ADB is part of the SDK platform tools pack (zip file).
You can download it from [here](https://developer.android.com/tools/releases/platform-tools).
Please extract it and run the next command in the case of Linux:
```shell
sudo adb logcat -v time | grep GoLog
```
## Running the agent in foreground mode
You can run the agent in foreground mode to see the logs in the terminal. This is useful to debugging issues with the
agent.
### Linux and MacOS
```shell
sudo netbird service stop
sudo netbird up -F
```
### Windows
On Windows, the agent depends on the Wireguard's `wintun.dll` and can only be executed as a system account.
To run the agent in foreground mode, you need to use a tool
called [PSExec](https://learn.microsoft.com/en-us/sysinternals/downloads/psexec).
Once you have downloaded and extracted `psexec` open an elevated Powershell window:
```shell
netbird service stop
.\PsExec64.exe -s cmd.exe /c "netbird up -F --log-level debug > c:\windows\temp\netbird.out.log 2>&1"
```
In case you need to configure environment variables, you need to add them as system variables so they get picked up by
the agent on the next psexec run:
```shell
[Environment]::SetEnvironmentVariable("PIONS_LOG_DEBUG", "all", "Machine")
````
## Enabling WireGuard in user space
Sometimes, you want to test NetBird running on userspace mode instead of a kernel module. That can be a check to see if
there is a problem with NetBird's firewall management in kernel mode.
You must run the agent in foreground mode and set the environment variable `NB_WG_KERNEL_DISABLED` to `true`.
```shell
sudo netbird service stop
sudo bash -c 'NB_WG_KERNEL_DISABLED=true netbird up -F' > /tmp/netbird.log
```
## Debugging GRPC
The NetBird agent communicates with the Management and Signal servers using the GRPC framework. With these parameters,
you can
set verbose logging for this service.
```shell
sudo netbird service stop
sudo bash -c 'GRPC_GO_LOG_VERBOSITY_LEVEL=99 GRPC_GO_LOG_SEVERITY_LEVEL=info netbird up -F' > /tmp/netbird.log
```
## Debugging ICE connections
The Netbird agent communicates with other peers through the Interactive Connectivity Establishment (ICE) protocol
described in the [RFC 8445](https://datatracker.ietf.org/doc/html/rfc8445). To debug the connection procedure,
set verbose logging for the the [Pion/ICE](https://github.com/pion/ice) library with the `PIONS_LOG_DEBUG` or
`PIONS_LOG_TRACE` variable.
```shell {{ title: 'Environment variable' }}
PIONS_LOG_DEBUG=all
NB_LOG_LEVEL=debug
```
```shell
sudo netbird service stop
sudo bash -c 'PIONS_LOG_DEBUG=all NB_LOG_LEVEL=debug netbird up -F' > /tmp/netbird.log
```
## Client login failures
A single machine can only connect to one NetBird account as the same user/login method throughout the lifetime of
the `config.json` file:
- `/var/lib/netbird/default.json` for Linux/MacOS
- `C:\ProgramData\netbird\default.json` for Windows
You might get errors like below when trying to use Setup Key/different SSO user account during login:
```
2025-04-08T15:03:04+01:00 ERRO management/client/grpc.go:351: failed to login to Management Service: rpc error: code = PermissionDenied desc = peer login has expired, please log in once more
2025-04-08T15:03:04+01:00 ERRO management/client/grpc.go:351: failed to login to Management Service: rpc error: code = PermissionDenied desc = invalid user
2025-04-08T15:03:04+01:00 ERRO client/internal/login.go:145: failed registering peer rpc error: code = PermissionDenied desc = invalid user,00000000-0000-0000-0000-000000000000
2025-04-08T15:03:04+01:00 WARN client/server/server.go:267: failed login: rpc error: code = PermissionDenied desc = invalid user
```
<Note>
Starting with the release `0.50.0` the `invalid user` message is more descriptive:
`peer is already registered by a different User or a Setup Key`
</Note>
The most notable examples of encountering the issue are:
- a shared machine and/or machine previously logged in by somebody else,
- a machine that was previously logged in using Setup Key, but now attempts SSO login,
- the user makes a mistake and selects
- the user uses different browser/profile or selects the wrong account during SSO login at the start of the workday,
If you know the exact previous Peer which was logged in, you can just delete it from Dashboard without doing anything
else and attempt login again.
Otherwise, to resolve the issue, you will need to remove the file manually to use the machine as a different user/Setup
Key while the NetBird client daemon is stopped:
1. `netbird service stop`
2. `sudo rm /var/lib/netbird/default.json` (*nix) or `rm C:\ProgramData\netbird\config.json` (Windows)
3. `netbird service start`
## Debugging access to network resources
In this section we will be presenting methodology of troubleshooting access issues involving Netbird.
We will start by presenting a glossary of all machines and services involved.
A sub-section will describe a specific use case.
Each will start with a concise summary of usual troubleshooting steps then expand into more detailed step-by-step
guides.
### Glossary
We will be using the following names for resources outside the Netbird network:
- `int-net1`: an internal network `10.123.45.0/24`,
- `srv-c`: an internal HTTP server running at `10.123.45.17`,
- `int-dns1`: an internal DNS server running at `10.123.45.6`,
- `int-dns2`: an internal DNS server nunning at `10.7.8.9`,
- `cf-dns`: an Internet-accessible CloudFlare DNS server at `1.1.1.1` and `1.0.0.1`,
and following Netbird network resources:
- `peer-a`: end user's device running Netbird Client,
- `peer-b`: a linux server inside the internal network running Netbird Client,
- it has direct access to the whole `int-net1` IP range,
- `users:employees`: a Netbird Group containing `peer-a`,
- `routers:int-net1`: a Netbird Group containing `peer-b`,
- `access:srv-c`: a Netbird Groups used as a target of ACL rules for `srv-c` only,
- `access:int-net1`: a Netbird Groups used as a target of ACL rules for the whole subnet,
- `net-a`: a Netbird Network
- `net-a:srv-c`: a Network Resource handling traffic to `10.123.45.17/32` (`srv-c`),
- `net-a:int-net1`: a Network Resource handling traffic to `10.123.45.0/24` (`int-net1`),
- `route:int-net1`: a Netbird Network Route handling traffic to `10.123.45.0/24` (`int-net1`),
- `route:srv-c`: a Netbird Network Route handling traffic to `10.123.45.17/32` (`srv-c`),
### Access from `peer-a` to `srv-c`
In short:
1. Does `peer-b` have direct access to `srv-c`'s port `80`?
2. Can a routing peer `peer-b` forward traffic to `srv-c`?
3. Are Netbird's network routing resources configured?
4. Do Netbird's Access Control rules allow access from `peer-a` to the target's ACL Group?
5. Is `peer-a`'s operating system configured to use the route?
Access Control rule is not required for connectivity from `peer-a` to `peer-b`
#### Does `peer-b` have direct access to `srv-c`'s port `80`?
After logging in to `peer-b` you can confirm/troubleshoot the HTTP port `80` connection by issuing any of the following
commands:
```shell
curl -v "http://10.123.45.17"
curl --fail -v --max-time=2 "http://10.123.45.17:80"
wget -O - --timeout=2 "http://10.123.45.17:80"
nc -nvz -w 2 10.123.45.17 80
```
You can also try `ping` (an ICMP packet), but the firewall might have a different configuration for ICMP and TCP ports:
```shell
ping --numeric --count=1 --timeout=2 10.123.45.17
```
#### Can a routing peer `peer-b` forward traffic to `srv-c`?
This is more complicated to test, but usually boils down to confirming `net.ipv4.ip_forward` is set to `1` on `peer-b`'s
Linux operating system:
```shell
> sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
```
It should be set up automatically by the Netbird client unless it runs inside a container (which would not be able
to modify `sysctl`), then it requires manual setup.
For setting up the value persistently (across reboots) please consult your operating system's documentation.
It is often handled by either `/etc/sysctl.conf` or `/etc/sysctl.d/*.conf` files.
Testing the functionality in practice involves:
- connecting to another machine with direct access to `peer-b`,
- adding a routing table entry to route `int-net1` (`10.123.45.0/24`) traffic through it,
- trying to at least `ping 10.123.45.17` (`srv-c`)
#### Are Netbird's network routing resources configured?
For Netbird network routing resources configurations you can use either (new) _Networks_ or (old) _Network Routes_.
A Network `net-a` should have at minimum:
- _Network Resource_: `net-a:srv-c` with either of:
- an _Address_ set to `10.123.45.17/32` to configure route to `srv-c` exclusively and nothing else,
- _Assigned Groups_ set to `access:srv-c`
- _Routing Peer Group_ assigned to `routers:int-net1`
A _Network Route_ `route:srv-c` should have at least:
- a _Network Range_ set to `10.123.45.17/32` to configure route to `srv-c` exclusively and nothing else,
- _Routing Peer Group_ assigned to `routers:int-net1`,
- _Distribution Group_ assigned to `users:employees`,
- (optional) _Access Control Groups_ assigned to `access:srv-c`,
You can loosen the rules and replace following to grant access to the whole `int-net1` network range:
- _Address_: `10.123.45.17/32` -> `10.123.45.0/24`,
- _Assigned Groups_ / _Access Control Groups_: `access:srv-c` -> `access:int-net1`
#### Do Netbird's Access Control rules allow access from `peer-a` to the target's ACL Group?
You can skip this check, when you are using (old) Network Route feature without filling in _Access Control Groups (
optional)_ section.
Otherwise, there should be an _Access Control Policy_ present allowing traffic from one of `peer-a`'s Groups to:
- _Networks Resource_'s _Assigned Groups_: `access:srv-c` or `access:int-net1`,
- _Network Route_'s _Access Control Groups_: `access:srv-c` or `access:int-net1`,
You can confirm the _Policy_ is working by:
1. logging in to `peer-a`,
2. issuing `netbird status -d` command,
3. finding `peer-b.netbird.cloud` under `Peers detail`,
4. finding `10.123.45.0/24` or `10.123.45.17/32` under `peer-b.netbird.cloud`'s _Networks_ field,
In the most specific setup it should have at:
- have `TCP` protocol selected,
- a blue arrow should point from left to right and a second right-to-left arrow should be greyed out,
- a _Source group_ set to `users:employees`,
- a _Destination group_ set to `access:srv-c`,
- have `80` in the Ports section,
Just like with the previous section you can loosen the above example by:
- replacing `access:srv-c` _Group_ with `access:int-net1` _Group_,
- allowing `ALL` protocol, _Ports_ will become greyed out because all traffic will be allowed,
- creating a bidirectional rule (both arrows should be green), always true for the protocol `ALL`,
- selecting a different source group from the pool assigned to `peer-a`,
- it could be built-in `All` group, but it is discouraged,
- selecting a different destination group from the pool assigned to `peer-b`,
- it could be built-in `All` group, but it is discouraged,
#### Is `peer-a`'s operating system configured to use the route?
After all resources are configured in the Netbird management you should check whether they are
properly registered with your operating system.
You can start by checking Netbird client's configuration with `netbird status -d` command:
```shell
% netbird status -d
Peers detail:
brys-vm-nbt-ubuntu-isolated-01.netbird.cloud:
...
Status: Connected
-- detail --
Connection type: P2P
...
Networks: 10.123.45.0/24
...
Peers count: 1/1 Connected
```
You should be primarily looking for _Networks_ section under each _Peers detail_, but you can also check:
- _Peer_'s name,
- _Peer_'s _Status_: it should be `Connected`,
- _Peer_'s _Connection type_: it can be either `P2P` (direct) or `Relayed` (over the Internet),
- _Peers count_ near the end of the output,
If it's missing you can search for clues with `netbird networks ls` command:
```shell
% netbird networks ls
Available Networks:
...
- ID: net-a:int-net1
Network: 10.123.45.0/24
Status: Selected
...
```
The _Status_ could be `Not Selected`, which you can fix with `netbird networks select <ID>` or
`netbird networks select all`
##### Verifying routing configuration on the Windows operating system
Below commands assume running a PowerShell prompt with administrator's privileges.
The easiest way is to read output of `Get-NetRoute` command:
```shell
PS C:\Users\user> Get-NetRoute
ifIndex DestinationPrefix NextHop RouteMetric ifMetric PolicyStore
------- ----------------- ------- ----------- -------- -----------
...
17 10.123.45.255/32 0.0.0.0 256 5 ActiveStore
17 10.123.45.0/24 0.0.0.0 1 5 ActiveStore
...
17 100.83.255.255/32 0.0.0.0 256 5 ActiveStore
17 100.83.183.133/32 0.0.0.0 256 5 ActiveStore
17 100.83.0.0/16 0.0.0.0 256 5 ActiveStore
...
```
You should be looking for your specific subnet's IP ranges (`10.123.45.0/24` in case of `int-net1`) and anything from
`100.*.0.0/16` range.
Some other alternatives are `route print` & `Get-NetIPConfiguration`.
##### Verifying routing configuration on the MacOS operating system
The easiest way to verify system configuration is `netstat -nr` command:
```shell
% netstat -nr
Routing tables
Internet:
Destination Gateway Flags Netif Expire
...
100.83/16 utun100 USc utun100
100.83.19.63 100.83.19.63 UH utun100
...
10.123.45 utun100 USc utun100
...
Internet6:
Destination Gateway Flags Netif Expire
...
```
You should be looking for `utun*` interface in 4th column and searching the rows for
your specific subnet's clamped IP ranges (`10.123.45` in case of `int-net1`) and anything from `100.*/16` range.
##### Verifying routing configuration on the Linux operating system
Depending on specifics of your Linux distribution (or even your configuration of it) you should be able to use either
`iproute2` or `net-tools` family of network commands.
Netbird client stores it's custom routes in the routing table `7120` (or `0x1BD0`) when it's available (through
`iproute2` interface).
For `iproute2` (`ip`, `ss` tools):
- `ip route` to find built-in `100.*.0.0/16` route,
- `ip route show table 7120` or `ip route show table all` to find the specific routed networks,
For `net-tools` (`ifconfig`, `route`, `netstat` tools):
- `route -n` to find built-in `100.*.0.0/16` route,
- neither `route` nor `netstat` support viewing content of custom routing tables,