Commit Graph

2850 Commits

Author SHA1 Message Date
Ashley Mensah
5bf17023fb feat(ci): sort by oldest updated and bump MAX_ISSUES to 100 2026-04-30 11:53:40 +02:00
Ashley Mensah
42e1f19689 fix(ci): remove unwritable fields, add Status single-select support
- Remove Linked PRs and Repository field writes (built-in, auto-populated)
- Add setSingleSelectField for Status field
- Set items to "Needs Review" when added to project board
- Clean up workflow env vars to only include writable fields
2026-04-30 11:34:14 +02:00
Ashley Mensah
8195aa43d5 fix(ci): skip linked_pull_requests field (not writable via API) 2026-04-29 18:36:36 +02:00
Ashley Mensah
6c4ce8df87 fix(ci): use number mutation for Confidence project field 2026-04-29 18:29:12 +02:00
Ashley Mensah
ca481b0ffc feat(ci): use PROJECT_PAT for project board, populate board in dry-run
- Add PROJECT_PAT secret for project board GraphQL calls, falling
  back to GITHUB_TOKEN if not set
- Dry-run now populates the project board with decisions (confidence,
  reason, evidence fields) while still skipping labels/comments/closing
- Extract addToProjectWithFields helper to reduce duplication
2026-04-29 17:45:46 +02:00
Ashley Mensah
6d118d9c99 feat(ci): trust LLM decisions and feed it PR merge status
- Remove pre_score override from enforcePolicy — policy now only gates
  AUTO_CLOSE, otherwise trusts the model's decision
- Pass pre_score evidence (hard signals, contradictions) to LLM as
  context instead of using it as a decision override
- Fetch linked PR merge status (MERGED/OPEN/CLOSED) in fetch step
  and include in LLM prompt so it can distinguish merged fixes from
  open proposals
2026-04-29 15:49:09 +02:00
Ashley Mensah
222d498bb6 fix(ci): distinguish workarounds from actual fixes in system prompt 2026-04-28 17:48:43 +02:00
Ashley Mensah
52cd104f1e chore(ci): switch back to gpt-4o-mini for higher quota 2026-04-28 17:44:05 +02:00
Ashley Mensah
92f666f652 fix(ci): cap retry-after and handle quota exhaustion gracefully 2026-04-28 17:42:43 +02:00
Ashley Mensah
4fc0cb7ec4 fix(ci): pace API requests to avoid rate limit thrashing 2026-04-28 17:30:18 +02:00
Ashley Mensah
695614834e fix(ci): fix policy logic and add message truncation
- enforcePolicy: respect KEEP_OPEN when model is confident and
  pre_score is low. Only promote to MANUAL_REVIEW when model suggests
  resolution or pre_score has hard signals
- Truncate user messages to 24k chars (issue body capped at 4k) to
  stay within GitHub Models 8000 token input limit
2026-04-28 16:38:36 +02:00
Ashley Mensah
d75fa6ad45 feat(ci): switch to gpt-4o and add rate limit retry
- Upgrade model from gpt-4o-mini to gpt-4o for better classification
- Add retry loop with backoff on 429 responses (up to 5 retries)
- Respect Retry-After header from GitHub Models API
2026-04-28 16:23:31 +02:00
Ashley Mensah
7ce7f322eb increase max number of dry run issues to 100 2026-04-28 16:13:38 +02:00
Ashley Mensah
22bcf70b6e fix(ci): add additionalProperties to nested schema objects 2026-04-28 16:11:57 +02:00
Ashley Mensah
fe8aa21245 feat(ci): wire up gpt-4o-mini for issue classification
- Replace stub callGitHubModel() with real GitHub Models API call
  using gpt-4o-mini with structured JSON output
- Build detailed user messages from issue body, comments, and timeline
- Add per-issue decision logging to classify step
- Upload candidates.json and decisions.json as workflow artifacts
2026-04-28 16:11:01 +02:00
Ashley Mensah
29f211e51c fix(ci): guard all decisions behind dry-run check
Move the dry-run check to the top of the loop so it applies to all
decision types, not just AUTO_CLOSE. In dry-run mode the workflow now
only logs what it would do without touching any issues.
2026-04-28 16:08:16 +02:00
Ashley Mensah
6df3580bd3 fix(ci): handle project API permission errors gracefully
GITHUB_TOKEN cannot access org-level Projects V2. Make addToProject
return null on failure instead of crashing, and skip setTextField
calls when project access is unavailable. A PAT with project scope
is needed for full project board integration.
2026-04-28 15:55:12 +02:00
Ashley Mensah
9ff735dd52 fix(ci): fix workflow by adding working-directory, fetch script, and removing npm ci
- Set defaults.run.working-directory to .github/issue-resolution so
  scripts resolve from the correct path
- Remove npm ci step (no npm dependencies needed)
- Add fetch-candidates.mjs to gather open issues with comments and
  timeline events via GitHub REST API
- Add minimal package.json with type: module
2026-04-28 15:53:17 +02:00
Ashley Mensah
7c43973bc9 fix typo in workflow 2026-04-28 15:48:35 +02:00
Ashley Mensah
01e53d07b9 fix typo in workflow 2026-04-28 15:46:57 +02:00
Ashley Mensah
797dce1631 dummy commit to test workflow 2026-04-28 15:43:37 +02:00
Ashley Mensah
2877fcbbf6 add push trigger to workflow 2026-04-28 15:42:50 +02:00
Ashley Mensah
5d8201fcd0 Merge branch 'main' into github-issue-resolver 2026-04-28 15:36:54 +02:00
Ashley Mensah
09595bd0c2 enable dry run, add project field id values 2026-04-28 15:36:21 +02:00
Zoltan Papp
8fc4265995 [relay] evict foreign client cache on disconnect (#6015)
* [relay] evict foreign client cache on disconnect

When a foreign relay's TCP connection drops, the manager's
onServerDisconnected handler only triggered reconnect logic for the
home server; the disconnected foreign entry stayed in the relayClients
cache. Subsequent OpenConn calls reused the closed client until the
60-second cleanup tick evicted it, breaking peer connectivity through
that relay for up to a minute.

Evict the foreign entry from the cache on disconnect so the next
OpenConn dials a fresh client.

Also:
- Make the reconnect backoff cap configurable via WithMaxBackoffInterval
  ManagerOption; the previous hard-coded 60s constant forced
  TestAutoReconnect to sleep ~61s. Test now polls Ready() and finishes
  in ~2s.
- Add NB_HOME_RELAY_SERVERS env var that overrides the relay URL list
  received from management, so a peer can be pinned to a specific home
  relay (used by the netbird-conn-lab Edge 4 reproducer).

* [client] treat empty NB_HOME_RELAY_SERVERS as unset

Returning (urls=[], ok=true) when the env var contained only separators or
whitespace caused callers to wipe the mgmt-provided relay list, leaving the
peer with no relays. Treat a parsed-empty result the same as an unset env.
2026-04-28 15:04:48 +02:00
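The second bullet of 8fc4265995 (treat a parsed-empty NB_HOME_RELAY_SERVERS as unset) can be sketched as below. This is not the project's code: parseRelayOverride and effectiveRelays are hypothetical names, the comma separator is an assumption because the commit does not name it, and the raw value is passed in as a parameter instead of being read from the environment.

```go
package main

import (
	"fmt"
	"strings"
)

// parseRelayOverride splits a raw NB_HOME_RELAY_SERVERS value into URLs.
// The comma separator is an assumption. ok is false when nothing usable
// was parsed, so a value made only of separators or whitespace behaves
// like an unset variable.
func parseRelayOverride(raw string) (urls []string, ok bool) {
	for _, part := range strings.Split(raw, ",") {
		if u := strings.TrimSpace(part); u != "" {
			urls = append(urls, u)
		}
	}
	return urls, len(urls) > 0
}

// effectiveRelays applies the override only when it parsed to something;
// otherwise the management-provided list is kept untouched.
func effectiveRelays(raw string, fromMgmt []string) []string {
	if urls, ok := parseRelayOverride(raw); ok {
		return urls
	}
	return fromMgmt
}

func main() {
	mgmt := []string{"rels://relay.example.com:443"}
	fmt.Println(effectiveRelays(" , ,", mgmt))
	fmt.Println(effectiveRelays("rels://pinned.example.com:443", mgmt))
}
```

Tying ok to the parsed length, not to whether the variable exists, is what prevents the "no relays" state the commit describes.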
Zoltan Papp
9c50819f20 Don't mark management disconnected on transient job stream errors (#6005)
The JOB stream is a separate channel from the SYNC stream. Server-side
EOF or other transient errors on the JOB stream do not indicate that
the management connection is unhealthy — the SYNC stream remains the
authoritative state signal.

Previously, a JOB stream EOF would call notifyDisconnected and the
client would emit OnConnecting to the UI. The backoff retry would
reconnect the JOB stream, but handleJobStream never calls notifyConnected
on success, so the UI was stuck on "Connecting" until the next SYNC
event or health check.

Keep notifyDisconnected for codes.PermissionDenied since IsLoginRequired
relies on managementError to detect expired auth.
2026-04-28 15:04:41 +02:00
Bethuel Mmbaga
6f0eff3ba0 [management] Handle single-string JWT group claim from IdPs (#6014) 2026-04-28 14:48:28 +03:00
Bethuel Mmbaga
f8745723fc [management] Add Microsoft AD FS support for embedded Dex identity providers (#6008) 2026-04-28 12:42:19 +03:00
Vlad
154b81645a [management] removed legacy network map code (#5565) 2026-04-27 16:02:54 +02:00
Maycon Santos
34167c8a16 [misc] Update release pipeline version (#5995)
v0.70.0
2026-04-27 10:55:38 +02:00

Maycon Santos
d6f08e4840 [misc] Update sign pipeline version (#5981) 2026-04-24 13:13:27 +02:00
Zoltan Papp
f732b01a05 [management] unify peer-update test timeout via constant (#5952)
peerShouldReceiveUpdate waited 500ms for the expected update message,
and every outer wrapper across the management/server test suite paired
it with a 1s goroutine-drain timeout. Both were too tight for slower
CI runners (MySQL, FreeBSD, loaded sqlite), producing intermittent
"Timed out waiting for update message" failures in tests like
TestDNSAccountPeersUpdate, TestPeerAccountPeersUpdate, and
TestNameServerAccountPeersUpdate.

Introduce peerUpdateTimeout (5s) next to the helper and use it both in
the helper and in every outer wrapper so the two timeouts stay in sync.
The timeout only runs down on failure; passing tests return as soon as
the channel delivers, so there is no slowdown on green runs.
2026-04-23 21:19:21 +02:00
Ashley Mensah
6b8e40f78d initial commit - workflow yaml, prompts and schemas 2026-04-23 18:48:38 +02:00
alsruf36
c07c726ea7 [proxy] Set session cookie path to root (#5915) 2026-04-23 18:20:54 +02:00
Pascal Fischer
fa0d58d093 [management] exclude peers for expiration job that have already been marked expired (#5970) 2026-04-23 16:01:54 +02:00
Vlad
b6038e8acd [management] refactor: changeable pat rate limiting (#5946) 2026-04-23 15:13:22 +02:00
Zoltan Papp
5da05ecca6 [client] increase gRPC health check timeout to 5s (#5961)
Bump the IsHealthy() context timeout from 1s to 5s for both the
management and signal gRPC clients to reduce false negatives on
slower or congested connections.
2026-04-22 20:54:18 +02:00
Viktor Liu
801de8c68d [client] Add TTL-based refresh to mgmt DNS cache via handler chain (#5945) 2026-04-22 15:10:14 +02:00
Viktor Liu
a822a33240 [self-hosted] Use cscli lapi status for CrowdSec readiness in installer (#5949) 2026-04-22 10:35:22 +02:00
Bethuel Mmbaga
57b23c5b25 [management] Propagate context changes to upstream middleware (#5956) 2026-04-21 23:06:52 +03:00
Zoltan Papp
1165058fad [client] fix port collision in TestUpload (#5950)
* [debug] fix port collision in TestUpload

TestUpload hardcoded :8080, so it failed deterministically when anything
was already on that port and collided across concurrent test runs.
Bind a :0 listener in the test to get a kernel-assigned free port, and
add Server.Serve so tests can hand the listener in without reaching
into unexported state.

* [debug] drop test-only Server.Serve, use SERVER_ADDRESS env

The previous commit added a Server.Serve method on the upload-server,
used only by TestUpload. That left production with an unused function.
Reserve an ephemeral loopback port in the test, release it, and pass
the address through SERVER_ADDRESS (which the server already reads).
A small wait helper ensures the server is accepting connections before
the upload runs, so the close/rebind gap does not cause a false failure.
2026-04-21 19:07:20 +02:00
Zoltan Papp
703353d354 [flow] fix goroutine leak in TestReceive_ProtocolErrorStreamReconnect (#5951)
The Receive goroutine could outlive the test and call t.Logf after
teardown, panicking with "Log in goroutine after ... has completed".
Register a cleanup that waits for the goroutine to exit; ordering is
LIFO so it runs after client.Close, which is what unblocks Receive.
2026-04-21 19:06:47 +02:00
Zoltan Papp
2fb50aef6b [client] allow UDP packet loss in TestICEBind_HandlesConcurrentMixedTraffic (#5953)
The test writes 500 packets per family and asserted exact-count
delivery within a 5s window, even though its own comment says "Some
packet loss is acceptable for UDP". On FreeBSD/QEMU runners the writer
loops cannot always finish all 500 before the 5s deadline closes the
readers (we have seen 411/500 in CI).

The real assertion of this test is the routing check — IPv4 peer only
gets v4- packets, IPv6 peer only gets v6- packets — which remains
strict. Replace the exact-count assertions with a >=80% delivery
threshold so runner speed variance no longer causes false failures.
2026-04-21 19:05:58 +02:00
Vlad
eb3aa96257 [management] check policy for changes before actual db update (#5405) 2026-04-21 18:37:04 +02:00
Viktor Liu
064ec1c832 [client] Trust wg interface in firewalld to bypass owner-flagged chains (#5928) 2026-04-21 17:57:16 +02:00
Viktor Liu
75e408f51c [client] Prefer systemd-resolved stub over file mode regardless of resolv.conf header (#5935) 2026-04-21 17:56:56 +02:00
Zoltan Papp
5a89e6621b [client] Suppress ICE signaling (#5820)
* [client] Suppress ICE signaling and periodic offers in force-relay mode

When NB_FORCE_RELAY is enabled, skip WorkerICE creation entirely,
suppress ICE credentials in offer/answer messages, disable the
periodic ICE candidate monitor, and fix isConnectedOnAllWay to
only check relay status so the guard stops sending unnecessary offers.

* [client] Dynamically suppress ICE based on remote peer's offer credentials

Track whether the remote peer includes ICE credentials in its
offers/answers. When remote stops sending ICE credentials, skip
ICE listener dispatch, suppress ICE credentials in responses, and
exclude ICE from the guard connectivity check. When remote resumes
sending ICE credentials, re-enable all ICE behavior.

* [client] Fix nil SessionID panic and force ICE teardown on relay-only transition

Fix nil pointer dereference in signalOfferAnswer when SessionID is nil
(relay-only offers). Close stale ICE agent immediately when remote peer
stops sending ICE credentials to avoid traffic black-hole during the
ICE disconnect timeout.

* [client] Add relay-only fallback check when ICE is unavailable

Ensure the relay connection is supported with the peer when ICE is disabled to prevent connectivity issues.

* [client] Add tri-state connection status to guard for smarter ICE retry (#5828)

Refactor isConnectedOnAllWay to return a ConnStatus enum (Connected,
Disconnected, PartiallyConnected) instead of a boolean. When relay is
up but ICE is not (PartiallyConnected), limit ICE offers to 3 retries
with exponential backoff then fall back to hourly attempts, reducing
unnecessary signaling traffic. Fully disconnected peers continue to
retry aggressively. External events (relay/ICE disconnect, signal/relay
reconnect) reset retry state to give ICE a fresh chance.

* [client] Clarify guard ICE retry state and trace log trigger

Split iceRetryState.attempt into shouldRetry (pure predicate) and
enterHourlyMode (explicit state transition) so the caller in
reconnectLoopWithRetry reads top-to-bottom. Restore the original
trace-log behavior in isConnectedOnAllWay so it only logs on full
disconnection, not on the new PartiallyConnected state.

* [client] Extract pure evalConnStatus and add unit tests

Split isConnectedOnAllWay into a thin method that snapshots state and
a pure evalConnStatus helper that takes a connStatusInputs struct, so
the tri-state decision logic can be exercised without constructing
full Worker or Handshaker objects. Add table-driven tests covering
force-relay, ICE-unavailable and fully-available code paths, plus
unit tests for iceRetryState budget/hourly transitions and reset.

* [client] Improve grammar in logs and refactor ICE credential checks
2026-04-21 15:52:08 +02:00
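The tri-state evaluation described in 5a89e6621b can be sketched as a pure function. The names ConnStatus, Connected, Disconnected, PartiallyConnected, evalConnStatus and connStatusInputs appear in the commit message; the struct fields, the ordering of the constants and the handling of "ICE up, relay down" are assumptions made for illustration and may differ from the real code.

```go
package main

import "fmt"

// ConnStatus mirrors the three states named in the commit message.
type ConnStatus int

const (
	Disconnected ConnStatus = iota
	PartiallyConnected
	Connected
)

func (s ConnStatus) String() string {
	return [...]string{"Disconnected", "PartiallyConnected", "Connected"}[s]
}

// connStatusInputs is a snapshot of connection state.
// All field names are hypothetical.
type connStatusInputs struct {
	forceRelay     bool // force-relay mode is enabled
	iceUnavailable bool // remote peer sends no ICE credentials
	relayConnected bool
	iceConnected   bool
}

// evalConnStatus has no side effects, so it can be table-tested without
// building full worker objects.
func evalConnStatus(in connStatusInputs) ConnStatus {
	if in.forceRelay || in.iceUnavailable {
		// Only the relay matters when ICE is not in play.
		if in.relayConnected {
			return Connected
		}
		return Disconnected
	}
	switch {
	case in.relayConnected && in.iceConnected:
		return Connected
	case in.relayConnected:
		return PartiallyConnected
	default:
		// Assumption: without a relay the peer counts as disconnected.
		return Disconnected
	}
}

func main() {
	fmt.Println(evalConnStatus(connStatusInputs{relayConnected: true}))
	fmt.Println(evalConnStatus(connStatusInputs{forceRelay: true, relayConnected: true}))
	fmt.Println(evalConnStatus(connStatusInputs{}))
}
```

A caller could then retry aggressively on Disconnected and apply the limited ICE retry budget only on PartiallyConnected, as the commit describes.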
Misha Bragin
06dfa9d4a5 [management] replace mailru/easyjson with netbirdio/easyjson fork (#5938) 2026-04-21 13:59:35 +02:00
Misha Bragin
45d9ee52c0 [self-hosted] add reverse proxy retention fields to combined YAML (#5930) 2026-04-21 10:21:11 +02:00
Zoltan Papp
3098f48b25 [client] fix ios network addresses mac filter (#5906)
* fix(client): skip MAC address filter for network addresses on iOS

iOS does not expose hardware (MAC) addresses due to Apple's privacy
restrictions (since iOS 14), causing networkAddresses() to return an
empty list because all interfaces are filtered out by the HardwareAddr
check. Move networkAddresses() to platform-specific files so iOS can
skip this filter.
v0.69.0
2026-04-20 11:49:38 +02:00