chore: release 0.29.0.rc0 (#1600)

Jan-Otto Kröpke
2024-09-11 00:34:10 +02:00
committed by GitHub
parent 83b0aa8f62
commit f712c07c38
119 changed files with 5113 additions and 2255 deletions


@@ -2,47 +2,24 @@
The service collector exposes metrics about Windows Services.
The collector exists in two versions. Version 1 uses WMI to query all services and can provide additional
information. Version 2 is more efficient because it connects directly to the service manager, but it cannot
provide additional information such as `run_as` or the start configuration.
## Flags
### `--collector.service.services-where`
A WMI filter that selects which services to include. Recommended to keep the number of returned metrics down.
Example: `--collector.service.services-where="Name='windows_exporter'"`
Example config win_exporter.yml for multiple services: `services-where: Name='SQLServer' OR Name='Couchbase' OR Name='Spooler' OR Name='ActiveMQ'`
### `--collector.service.use-api`
Uses API calls instead of WMI for better performance. **Note:** the previous flag (`--collector.service.services-where`) has no effect in this mode.
### `--collector.service.v2`
Version 2 of the service collector. It uses API calls for better performance. **Note:** the `--collector.service.services-where` flag has no effect in this mode.
For performance reasons, it also omits additional information such as `run_as` or the start configuration.
# collector V1
|||
-|-
Metric name prefix | `service`
Classes | [`Win32_Service`](https://msdn.microsoft.com/en-us/library/aa394418(v=vs.85).aspx)
Classes | none
Enabled by default? | Yes
## Flags
None
## Metrics
Name | Description | Type | Labels
-----|-------------|------|-------
`windows_service_info` | Contains service information in labels, constant 1 | gauge | name, display_name, process_id, run_as
`windows_service_state` | The state of the service, 1 if the current state, 0 otherwise | gauge | name, state
`windows_service_start_mode` | The start mode of the service, 1 if the current start mode, 0 otherwise | gauge | name, start_mode
`windows_service_status` | The status of the service, 1 if the current status, 0 otherwise | gauge | name, status
For the values of the `state`, `start_mode`, `status` and `run_as` labels, see below.
| Name | Description | Type | Labels |
|------------------------------|-----------------------------------------------------------------------------------------------|-------|---------------------------------------|
| `windows_service_info` | Contains service information, including the run-as user, in labels; constant 1 | gauge | name, display_name, path_name, run_as |
| `windows_service_start_mode` | The start mode of the service, 1 if the current start mode, 0 otherwise | gauge | name, start_mode |
| `windows_service_state` | The state of the service, 1 if the current state, 0 otherwise | gauge | name, state |
| `windows_service_process` | Process of started service. The value is the creation time of the process as a unix timestamp | gauge | name, process_id |
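
Because the value of `windows_service_process` is a creation time rather than a 0/1 flag, it usually needs a conversion before it is useful. The following is a minimal Python sketch, not part of the exporter, that turns a scraped value into a UTC start time and a process age. The helper name `process_age_seconds` and the sample value are assumptions for illustration.

```python
from datetime import datetime, timezone


def process_age_seconds(creation_ts: float, now_ts: float) -> float:
    """Return the seconds elapsed since the service's process was created.

    Both arguments are unix timestamps in seconds. The result is clamped
    at zero in case the clocks of exporter and consumer disagree.
    """
    return max(0.0, now_ts - creation_ts)


# Illustrative sample value, as it could appear in a scrape of
# windows_service_process{name="...",process_id="..."}.
creation_ts = 1.7244891e+09

# Interpret the unix timestamp as a timezone-aware UTC datetime.
started_at = datetime.fromtimestamp(creation_ts, tz=timezone.utc)
print("process started at:", started_at.isoformat())

# Age of the process one hour after its creation (hypothetical "now").
print("age in seconds:", process_age_seconds(creation_ts, creation_ts + 3600.0))
```

A change in this value between two scrapes for the same `name` suggests that the service's process was restarted.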
### States
@@ -65,81 +42,50 @@ A service can have the following start modes:
- `manual`
- `disabled`
### Status (not available in API mode)
A service can have any of the following statuses:
- `ok`
- `error`
- `degraded`
- `unknown`
- `pred fail`
- `starting`
- `stopping`
- `service`
- `stressed`
- `nonrecover`
- `no contact`
- `lost comm`
Note that there is some overlap with service state.
### Run As
Account name under which a service runs. Depending on the service type, the account name may be in the form of "DomainName\Username" or UPN format ("Username@DomainName").
It corresponds to the `StartName` attribute of the `Win32_Service` class.
The `StartName` attribute can be NULL, in which case the label is reported as an empty string. Note that if the attribute is NULL, the service is logged on as the `LocalSystem` account or, for kernel or system-level drivers, it runs with a default object name created by the I/O system based on the service name, for example, DWDOM\Admin.
### Example metric
Lists the services that have a 'disabled' start mode.
```
windows_service_start_mode{exported_name=~"(mssqlserver|sqlserveragent)",start_mode="disabled"}
```
## Useful queries
Counts the number of running Microsoft SQL Server and SQL Server Agent services
```
count(windows_service_state{exported_name=~"(sqlserveragent|mssqlserver)",state="running"})
```
# collector V2
|||
-|-
Metric name prefix | `service`
Classes | none
Enabled by default? | No
## Metrics
Name | Description | Type | Labels
-----|-------------|------|-------
`windows_service_state` | The state of the service, 1 if the current state, 0 otherwise | gauge | name, display_name, state
### States
A service can be in the following states:
- `stopped`
- `start pending`
- `stop pending`
- `running`
- `continue pending`
- `pause pending`
- `paused`
- `unknown`
### Example metric
```
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="continue pending"} 0
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="pause pending"} 0
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="paused"} 0
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="running"} 0
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="start pending"} 0
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="stop pending"} 0
windows_service_state{display_name="Declared Configuration(DC) service",name="dcsvc",status="stopped"} 1
# HELP windows_service_info A metric with a constant '1' value labeled with service information
# TYPE windows_service_info gauge
windows_service_info{display_name="Declared Configuration(DC) service",name="dcsvc",path_name="C:\\WINDOWS\\system32\\svchost.exe -k netsvcs -p",run_as="LocalSystem"} 1
windows_service_info{display_name="Designs",name="Themes",path_name="C:\\WINDOWS\\System32\\svchost.exe -k netsvcs -p",run_as="LocalSystem"} 1
# HELP windows_service_process Process of started service. The value is the creation time of the process as a unix timestamp.
# TYPE windows_service_process gauge
windows_service_process{name="Themes",process_id="2856"} 1.7244891e+09
# HELP windows_service_start_mode The start mode of the service (StartMode)
# TYPE windows_service_start_mode gauge
windows_service_start_mode{name="Themes",start_mode="auto"} 1
windows_service_start_mode{name="Themes",start_mode="boot"} 0
windows_service_start_mode{name="Themes",start_mode="disabled"} 0
windows_service_start_mode{name="Themes",start_mode="manual"} 0
windows_service_start_mode{name="Themes",start_mode="system"} 0
windows_service_start_mode{name="dcsvc",start_mode="auto"} 0
windows_service_start_mode{name="dcsvc",start_mode="boot"} 0
windows_service_start_mode{name="dcsvc",start_mode="disabled"} 0
windows_service_start_mode{name="dcsvc",start_mode="manual"} 1
windows_service_start_mode{name="dcsvc",start_mode="system"} 0
# HELP windows_service_state The state of the service (State)
# TYPE windows_service_state gauge
windows_service_state{name="Themes",state="continue pending"} 0
windows_service_state{name="Themes",state="pause pending"} 0
windows_service_state{name="Themes",state="paused"} 0
windows_service_state{name="Themes",state="running"} 1
windows_service_state{name="Themes",state="start pending"} 0
windows_service_state{name="Themes",state="stop pending"} 0
windows_service_state{name="Themes",state="stopped"} 0
windows_service_state{name="dcsvc",state="continue pending"} 0
windows_service_state{name="dcsvc",state="pause pending"} 0
windows_service_state{name="dcsvc",state="paused"} 0
windows_service_state{name="dcsvc",state="running"} 0
windows_service_state{name="dcsvc",state="start pending"} 0
windows_service_state{name="dcsvc",state="stop pending"} 0
windows_service_state{name="dcsvc",state="stopped"} 1
```
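
The state and start mode metrics use one series per possible value, where only the series for the current value is 1. The following Python sketch shows one way a consumer could collapse such series back into a single value per service. It is a simplified line parser written for this illustration, not a complete implementation of the Prometheus exposition format. The function name `current_values` and the shortened sample scrape are assumptions.

```python
import re
from typing import Dict

# One sample line: metric name, label set in braces, then the value.
SAMPLE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
# One label pair: key="value", allowing backslash escapes in the value.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def current_values(text: str, metric: str, label: str) -> Dict[str, str]:
    """Map each service name to the `label` value whose sample equals 1."""
    result: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group(2)))
        # Only the series carrying the current value is set to 1.
        if float(match.group(3)) == 1.0 and "name" in labels and label in labels:
            result[labels["name"]] = labels[label]
    return result


# Shortened, illustrative scrape output.
SCRAPE = """\
# TYPE windows_service_start_mode gauge
windows_service_start_mode{name="Themes",start_mode="auto"} 1
windows_service_start_mode{name="Themes",start_mode="manual"} 0
windows_service_start_mode{name="dcsvc",start_mode="auto"} 0
windows_service_start_mode{name="dcsvc",start_mode="manual"} 1
"""

print(current_values(SCRAPE, "windows_service_start_mode", "start_mode"))
# prints {'Themes': 'auto', 'dcsvc': 'manual'}
```

For anything beyond a quick check, a full exposition-format parser is the safer choice, since this sketch ignores timestamps, samples without labels, and other parts of the format.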
## Useful queries
@@ -163,8 +109,8 @@ groups:
labels:
severity: high
annotations:
summary: "Service {{ $labels.exported_name }} down"
description: "Service {{ $labels.exported_name }} on instance {{ $labels.instance }} has been down for more than 3 minutes."
summary: "Service {{ $labels.name }} down"
description: "Service {{ $labels.name }} on instance {{ $labels.instance }} has been down for more than 3 minutes."
# Sends an alert when the 'mssqlserver' service is not in the running state for 3 minutes.
- alert: SQL Server DOWN
@@ -173,7 +119,7 @@ groups:
labels:
severity: high
annotations:
summary: "Service {{ $labels.exported_name }} down"
description: "Service {{ $labels.exported_name }} on instance {{ $labels.instance }} has been down for more than 3 minutes."
summary: "Service {{ $labels.name }} down"
description: "Service {{ $labels.name }} on instance {{ $labels.instance }} has been down for more than 3 minutes."
```
In this example, `instance` is the target label of the host. Each alert is therefore evaluated per host, and the host is named in the alert description.