---
title: "Health Checks"
description: "Configure automated health monitoring and failover for high availability"
---

<iframe
  className="w-full aspect-video rounded-xl"
  src="https://www.youtube.com/embed/Xdme_2-AMas"
  title="YouTube video player"
  frameBorder="0"
  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
  allowFullScreen
></iframe>

## Overview

Pangolin provides automated health checking for [targets](./resources/targets.mdx) to ensure traffic is routed only to healthy services. Health checks are essential for building highly available services, as they automatically remove unhealthy targets from traffic routing and load balancing.

## How Health Checks Work

### Monitoring Process

Health checks operate continuously in the background:

1. **Periodic Checks**: Pangolin sends requests to your target endpoints at configured intervals
2. **Status Evaluation**: Responses are evaluated against your configured criteria
3. **Traffic Management**: Healthy targets receive traffic; unhealthy targets are excluded
4. **Automatic Recovery**: Targets are automatically re-enabled when they become healthy again
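
The four steps above can be sketched as a single check cycle. This is a minimal illustration and not Pangolin's implementation: the `Target` class, the `probe` callable, and `run_check_cycle` are hypothetical names, and the probe is stubbed so the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Target:
    """Hypothetical stand-in for a Pangolin target."""
    name: str
    status: str = "unknown"  # "unknown", "healthy", or "unhealthy"


def run_check_cycle(targets: List[Target], probe: Callable[[Target], bool]) -> List[str]:
    """Probe every target once, update its status, and return routable names."""
    for target in targets:
        # Steps 1-2: send a check and evaluate the result.
        target.status = "healthy" if probe(target) else "unhealthy"
    # Steps 3-4: unhealthy targets are left out; a target that passes a
    # later cycle is included again without manual intervention.
    return [t.name for t in targets if t.status != "unhealthy"]


# Stubbed probe results (assumed): web-02 is down during the first cycle.
down = {"web-02"}
targets = [Target("web-01"), Target("web-02")]

first = run_check_cycle(targets, lambda t: t.name not in down)
down.clear()  # web-02 recovers before the second cycle
second = run_check_cycle(targets, lambda t: t.name not in down)
```

In this sketch `first` contains only `web-01`, and `second` contains both targets again once the stubbed outage ends.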

### Health Check vs Target Endpoint

<Card title="Flexible Monitoring">
  The health check endpoint can be the same as your target, but you can also monitor a different endpoint. This allows you to create dedicated health check endpoints that provide more detailed service status information.
</Card>
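
A dedicated health endpoint on your own service might look like the following sketch, built on Python's standard `http.server` module. The `/health` path, the `check_dependencies` helper, the JSON body shape, and the choice of 503 for a degraded service are all assumptions for illustration; adapt them to whatever criteria you configure for the health check.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple


def check_dependencies() -> Dict[str, bool]:
    """Hypothetical checks; replace with real probes of your database, cache, etc."""
    return {"database": True, "cache": True}


def health_response(checks: Dict[str, bool]) -> Tuple[int, bytes]:
    """Return 200 when every dependency is up, otherwise 503, plus a JSON body."""
    ok = all(checks.values())
    body = json.dumps({"status": "ok" if ok else "degraded", "checks": checks})
    return (200 if ok else 503), body.encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Only the assumed /health path is served; everything else is a 404.
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_response(check_dependencies())
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the example server; the port is an arbitrary example value."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Keeping the status decision in a plain function (`health_response`) makes it easy to unit test without starting a server.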

## Target Health States

Targets can exist in three distinct states that determine how traffic is routed:

<CardGroup cols={3}>
  <Card title="Unknown" icon="question" color="#gray">
    **Initial State**: Targets start in this state before the first health check

    **Traffic Behavior**: Traffic is still routed to unknown targets normally

    **Duration**: Until the first health check completes
  </Card>

  <Card title="Unhealthy" icon="x" color="#red">
    **Failed Checks**: Target has failed health check criteria

    **Traffic Behavior**: No traffic is routed to unhealthy targets

    **Load Balancing**: Excluded from load balancing rotation
  </Card>

  <Card title="Healthy" icon="check" color="#green">
    **Passing Checks**: Target is responding correctly to health checks

    **Traffic Behavior**: Receives traffic according to load balancing rules

    **Load Balancing**: Included in load balancing rotation
  </Card>
</CardGroup>
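
The routing rule these three states imply can be written as a small predicate. The `HealthState` enum and `is_routable` function are illustrative names, not part of Pangolin's API.

```python
from enum import Enum


class HealthState(Enum):
    UNKNOWN = "unknown"
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"


def is_routable(state: HealthState) -> bool:
    """Only unhealthy targets are excluded; unknown targets still get traffic."""
    return state is not HealthState.UNHEALTHY


# Map each state name to whether traffic may be sent to a target in that state.
routable = {state.value: is_routable(state) for state in HealthState}
```

The point worth noticing is that the unknown state is treated like healthy for routing purposes, so a newly added target can receive traffic before its first check completes.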

## Configuring Health Checks

<Steps>
  <Step title="Access Target Settings">
    In the Pangolin dashboard, navigate to your resource and locate the target in the table.
  </Step>

  <Step title="Open Health Check Configuration">
    Click the settings wheel (⚙️) next to the health check endpoint column.
  </Step>

  <Step title="Configure Health Check Parameters">
    Fill out the health check configuration with your desired parameters.
  </Step>

  <Step title="Save Configuration">
    Save your settings to enable health checking for the target.
  </Step>
</Steps>

## Health Check Parameters

### Endpoint Configuration

<Card title="Health Check Target">
  **Target Endpoint**: The URL or address to monitor for health status

  **Default Behavior**: Usually the same as your target endpoint

  **Custom Endpoints**: Can monitor different endpoints (e.g., `/health`, `/status`)
</Card>

### Timing Configuration

<CardGroup cols={2}>
  <Card title="Healthy Interval">
    **Purpose**: How often to check targets that are currently healthy

    **Typical Range**: 30-60 seconds

    **Consideration**: Less frequent checks reduce overhead
  </Card>

  <Card title="Unhealthy Interval">
    **Purpose**: How often to check targets that are currently unhealthy

    **Typical Range**: 10-30 seconds

    **Consideration**: More frequent checks enable faster recovery
  </Card>
</CardGroup>
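
To make the trade-off between the two intervals concrete, here is a rough sketch of how a checker might pick its next delay, along with a back-of-the-envelope bound on detection time. The function names are hypothetical. The sketch assumes that targets in the unknown state use the healthy interval and that a single failed check marks a target unhealthy; neither is confirmed behavior. The 5-second timeout is an example value.

```python
def next_check_delay(state: str, healthy_interval: float, unhealthy_interval: float) -> float:
    """Pick the delay before the next check based on the target's current state."""
    # Assumption: anything not currently unhealthy uses the healthy interval.
    return unhealthy_interval if state == "unhealthy" else healthy_interval


def worst_case_detection(healthy_interval: float, timeout: float) -> float:
    """Upper bound on noticing a failure that begins right after a passing check.

    Assumes one failed check is enough to mark the target unhealthy.
    """
    return healthy_interval + timeout


delay_ok = next_check_delay("healthy", healthy_interval=30, unhealthy_interval=10)
delay_bad = next_check_delay("unhealthy", healthy_interval=30, unhealthy_interval=10)
detect = worst_case_detection(healthy_interval=30, timeout=5)
```

Under these assumptions, a 30-second healthy interval with a 5-second timeout means a hung target could keep receiving traffic for up to about 35 seconds, which is the cost of the reduced checking overhead.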

### Response Configuration

<Card title="Timeout Settings">
  **Request Timeout**: Maximum time to wait for a health check response

  **Default Behavior**: Requests exceeding timeout are considered failed

  **Recommended**: Set based on your service's typical response time
</Card>

<Card title="HTTP Response Codes">
  **Healthy Codes**: Which HTTP status codes indicate a healthy target

  **Common Settings**: 200, 201, 202, 204

  **Custom Codes**: Configure based on your service's health endpoint behavior
</Card>
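
Taken together, the timeout and the healthy status codes define a pass/fail rule for each check. The sketch below shows one way to express that rule; `check_passed` is a hypothetical helper, and the default code set and the 5-second timeout are example values, not Pangolin defaults.

```python
from typing import FrozenSet, Optional

# Example values only; configure both to match your service.
HEALTHY_CODES: FrozenSet[int] = frozenset({200, 201, 202, 204})
TIMEOUT_SECONDS = 5.0


def check_passed(
    status_code: Optional[int],
    elapsed_seconds: float,
    timeout_seconds: float = TIMEOUT_SECONDS,
    healthy_codes: FrozenSet[int] = HEALTHY_CODES,
) -> bool:
    """Return True only for a timely response with an accepted status code.

    status_code is None when no response arrived at all
    (for example, a refused connection).
    """
    if status_code is None or elapsed_seconds > timeout_seconds:
        return False
    return status_code in healthy_codes
```

For example, `check_passed(200, 0.2)` passes, while a 200 that arrives after 6 seconds or a prompt 503 both count as failures.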

## Failover Behavior

### Automatic Traffic Exclusion

When a target becomes unhealthy:

<Steps>
  <Step title="Health Check Failure">
    Target fails to meet health check criteria (response code, timeout, etc.)
  </Step>

  <Step title="Status Update">
    Target status changes from "Healthy" to "Unhealthy"
  </Step>

  <Step title="Traffic Removal">
    Target is immediately removed from traffic routing configuration
  </Step>

  <Step title="Load Balancer Update">
    Load balancing configuration is updated to exclude the unhealthy target
  </Step>

  <Step title="Continued Monitoring">
    Health checks continue at the unhealthy interval for recovery detection
  </Step>
</Steps>

### Automatic Recovery

When an unhealthy target recovers:

<Steps>
  <Step title="Successful Health Check">
    Target begins responding correctly to health checks
  </Step>

  <Step title="Status Update">
    Target status changes from "Unhealthy" to "Healthy"
  </Step>

  <Step title="Traffic Restoration">
    Target is automatically added back to traffic routing
  </Step>

  <Step title="Load Balancer Update">
    Load balancing resumes including the recovered target
  </Step>
</Steps>
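
The exclusion and recovery sequences can be viewed as status changes driven by a stream of check results. The following sketch logs those changes for one target. The `transitions` function is a hypothetical helper, and it assumes a single result is enough to flip the status (no failure or success threshold), which may not match the real behavior.

```python
from typing import Iterable, List


def transitions(results: Iterable[bool], initial: str = "unknown") -> List[str]:
    """Turn a sequence of pass/fail check results into status-change events."""
    # Assumption: one result is enough to change state (no threshold).
    state = initial
    events: List[str] = []
    for passed in results:
        new_state = "healthy" if passed else "unhealthy"
        if new_state != state:
            # A status change is the point where routing would be updated.
            events.append(f"{state} -> {new_state}")
            state = new_state
    return events


# Two passes, two failures, then a recovery.
events = transitions([True, True, False, False, True])
```

Repeated identical results produce no event, so routing would only need to be touched on an actual status change.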

## High Availability Strategies

### Multi-Target Redundancy

<Card title="Service Redundancy">
  Deploy multiple instances of your service across different targets to ensure availability even when some targets fail.
</Card>

```
Resource: web-application
├── Target 1: web-01.local:8080 (Site A) - Healthy ✅
├── Target 2: web-02.local:8080 (Site A) - Unhealthy ❌
└── Target 3: web-03.local:8080 (Site B) - Healthy ✅

Traffic routes to: Target 1 & Target 3 only
```
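
The rotation in the diagram above can be reproduced with a few lines of Python. The statuses are copied from the diagram; round-robin is an assumed load balancing strategy used only for illustration, since the balancing algorithm is not specified here.

```python
from itertools import cycle, islice

# Statuses copied from the diagram above.
targets = {
    "web-01.local:8080": "healthy",
    "web-02.local:8080": "unhealthy",
    "web-03.local:8080": "healthy",
}

# Exclude unhealthy targets from the rotation.
rotation = [addr for addr, status in targets.items() if status != "unhealthy"]

# Round-robin (assumed strategy): hand out the next four requests in turn.
first_four = list(islice(cycle(rotation), 4))
```

Here the four requests alternate between `web-01.local:8080` and `web-03.local:8080`, and `web-02.local:8080` never appears.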

### Cross-Site Failover

<Card title="Geographic Distribution">
  Distribute targets across multiple sites to protect against site-level failures.
</Card>

```
Resource: api-service
├── Primary Site Targets
│   ├── api-01.primary:8443 - Healthy ✅
│   └── api-02.primary:8443 - Healthy ✅
└── Backup Site Targets
    ├── api-01.backup:8443 - Healthy ✅
    └── api-02.backup:8443 - Healthy ✅

All targets receive traffic via load balancing
```
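
One quick way to sanity-check a layout like this is to ask whether at least one target would remain if any single site went offline. The helper below is a hypothetical audit script, not a Pangolin feature; the site labels are taken from the diagram, and the single-site counterexample is invented for contrast.

```python
from typing import Dict


def survives_any_site_outage(target_sites: Dict[str, str]) -> bool:
    """Return True if at least one target remains when any single site goes down."""
    # With targets in two or more sites, losing one site always leaves
    # the targets hosted in the others.
    return len(set(target_sites.values())) >= 2


# Layout from the diagram above: target address -> site label.
api_service = {
    "api-01.primary:8443": "primary",
    "api-02.primary:8443": "primary",
    "api-01.backup:8443": "backup",
    "api-02.backup:8443": "backup",
}

# Invented counterexample: two targets that share one site.
single_site = {"web-01.local:8080": "site-a", "web-02.local:8080": "site-a"}
```

The `api_service` layout passes the check, while `single_site` does not: adding more targets to one site protects against instance failures but not against losing the site itself.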