---
title: "Health Checks"
description: "Configure automated health monitoring and failover for high availability"
---

<iframe
  className="w-full aspect-video rounded-xl"
  src="https://www.youtube.com/embed/Xdme_2-AMas"
  title="YouTube video player"
  frameBorder="0"
  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
  allowFullScreen
></iframe>

## Overview

Pangolin provides automated health checks for [targets](./resources/targets.mdx) so that traffic is routed only to healthy services. Health checks are essential for building highly available services because they automatically remove unhealthy targets from traffic routing and load balancing.

## How Health Checks Work

### Monitoring Process

Health checks operate continuously in the background:

1. **Periodic Checks**: Pangolin sends requests to your target endpoints at configured intervals
2. **Status Evaluation**: Responses are evaluated against your configured criteria
3. **Traffic Management**: Healthy targets receive traffic; unhealthy targets are excluded
4. **Automatic Recovery**: Targets are automatically re-enabled when they become healthy again

### Health Check vs Target Endpoint

<Card title="Flexible Monitoring">
  The health check endpoint can be the same as your target, but you can also monitor a different endpoint. This allows you to create dedicated health check endpoints that provide more detailed service status information.
</Card>
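
As an illustration of a dedicated health check endpoint, here is a minimal sketch using only the Python standard library. The `/health` path matches the example used later on this page; the JSON shape, the `collect_status` helper, and the port are hypothetical choices for this sketch, not something Pangolin requires.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_status() -> dict:
    """Hypothetical dependency checks; replace with real probes of your service."""
    return {"database": "ok", "cache": "ok"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        checks = collect_status()
        healthy = all(value == "ok" for value in checks.values())
        body = json.dumps({"healthy": healthy, "checks": checks}).encode()
        # Answer 200 when every dependency is fine, 503 otherwise.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the example quiet


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Build the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), HealthHandler)
```

Running `make_server().serve_forever()` starts the endpoint. Returning a non-2xx code when a dependency is down lets the health check mark the target unhealthy even though the process itself still answers requests.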

## Target Health States

A target is always in one of three states, and its state determines whether it receives traffic:

<CardGroup cols={3}>
<Card title="Unknown" icon="question" color="#gray">
  **Initial State**: Targets start in this state before the first health check

  **Traffic Behavior**: Unknown targets still receive traffic normally

  **Duration**: Until the first health check completes
</Card>

<Card title="Unhealthy" icon="x" color="#red">
  **Failed Checks**: Target has failed health check criteria

  **Traffic Behavior**: No traffic is routed to unhealthy targets

  **Load Balancing**: Excluded from load balancing rotation
</Card>

<Card title="Healthy" icon="check" color="#green">
  **Passing Checks**: Target is responding correctly to health checks

  **Traffic Behavior**: Receives traffic according to load balancing rules

  **Load Balancing**: Included in load balancing rotation
</Card>
</CardGroup>
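
The routing rule for the three states can be summarized in a few lines. This is a sketch of the behavior described above, not Pangolin's implementation; the `HealthState` and `receives_traffic` names are made up for illustration.

```python
from enum import Enum


class HealthState(Enum):
    UNKNOWN = "unknown"      # no health check has completed yet
    HEALTHY = "healthy"      # passing health checks
    UNHEALTHY = "unhealthy"  # failing health checks


def receives_traffic(state: HealthState) -> bool:
    """Unknown and healthy targets are routable; unhealthy targets are not."""
    return state is not HealthState.UNHEALTHY
```

The point to notice is that `UNKNOWN` is treated like `HEALTHY` for routing purposes, so a newly added target serves traffic before its first check completes.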

## Configuring Health Checks

<Steps>
<Step title="Access Target Settings">
  In the Pangolin dashboard, navigate to your resource and locate the target in the table.
</Step>

<Step title="Open Health Check Configuration">
  Click the settings wheel (⚙️) next to the health check endpoint column.
</Step>

<Step title="Configure Health Check Parameters">
  Fill out the health check configuration with your desired parameters.
</Step>

<Step title="Save Configuration">
  Save your settings to enable health checking for the target.
</Step>
</Steps>

## Health Check Parameters

### Endpoint Configuration

<Card title="Health Check Target">
  **Target Endpoint**: The URL or address to monitor for health status

  **Default Behavior**: Usually the same as your target endpoint

  **Custom Endpoints**: Can monitor different endpoints (e.g., `/health`, `/status`)
</Card>

### Timing Configuration

<CardGroup cols={2}>
<Card title="Healthy Interval">
  **Purpose**: How often to check targets that are currently healthy

  **Typical Range**: 30-60 seconds

  **Consideration**: Less frequent checks reduce overhead
</Card>

<Card title="Unhealthy Interval">
  **Purpose**: How often to check targets that are currently unhealthy

  **Typical Range**: 10-30 seconds

  **Consideration**: More frequent checks enable faster recovery
</Card>
</CardGroup>
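
The two intervals trade overhead against reaction time. The sketch below is illustrative only: the function names are hypothetical, the default values are picked from the typical ranges above rather than from Pangolin's actual defaults, and the worst-case estimate assumes a single failed check is enough to mark a target unhealthy.

```python
def next_check_delay(
    is_healthy: bool,
    healthy_interval_s: float = 30.0,
    unhealthy_interval_s: float = 10.0,
) -> float:
    """Pick the wait before the next check based on the current state."""
    return healthy_interval_s if is_healthy else unhealthy_interval_s


def worst_case_detection_s(healthy_interval_s: float, timeout_s: float) -> float:
    """Upper bound on how long a failure can go unnoticed.

    Worst case: the target fails right after a passing check, so the next
    check starts one full interval later and then waits out the timeout.
    """
    return healthy_interval_s + timeout_s


print(worst_case_detection_s(healthy_interval_s=30, timeout_s=5))  # 35
```

Under these assumptions, a 30-second healthy interval with a 5-second timeout means a failed target could keep receiving traffic for up to 35 seconds. Shortening the healthy interval reduces that window at the cost of more check requests.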

### Response Configuration

<Card title="Timeout Settings">
  **Request Timeout**: Maximum time to wait for a health check response

  **Default Behavior**: Requests that exceed the timeout are considered failed

  **Recommended**: Set based on your service's typical response time
</Card>

<Card title="HTTP Response Codes">
  **Healthy Codes**: Which HTTP status codes indicate a healthy target

  **Common Settings**: 200, 201, 202, 204

  **Custom Codes**: Configure based on your service's health endpoint behavior
</Card>
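
To test a health endpoint against a timeout and a set of healthy codes before entering them in the dashboard, a probe along these lines can help. It is a standalone sketch built on Python's `urllib`, not Pangolin's checker; the `probe` function, its 5-second default timeout, and the `DEFAULT_HEALTHY_CODES` name are assumptions for this example.

```python
import urllib.error
import urllib.request

# The common settings listed above.
DEFAULT_HEALTHY_CODES = frozenset({200, 201, 202, 204})


def probe(url: str, timeout_s: float = 5.0, healthy_codes=DEFAULT_HEALTHY_CODES) -> bool:
    """Return True when the endpoint answers in time with a healthy status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status in healthy_codes
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; still compare the code.
        return error.code in healthy_codes
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False
```

Note that `urlopen` follows redirects, so a 3xx answer is judged by the final response rather than by the redirect itself.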

## Failover Behavior

### Automatic Traffic Exclusion

When a target becomes unhealthy:

<Steps>
<Step title="Health Check Failure">
  The target fails to meet the health check criteria (for example, an unexpected response code or a timeout)
</Step>

<Step title="Status Update">
  Target status changes from "Healthy" to "Unhealthy"
</Step>

<Step title="Traffic Removal">
  Target is immediately removed from traffic routing configuration
</Step>

<Step title="Load Balancer Update">
  Load balancing configuration is updated to exclude the unhealthy target
</Step>

<Step title="Continued Monitoring">
  Health checks continue at the unhealthy interval for recovery detection
</Step>
</Steps>

### Automatic Recovery

When an unhealthy target recovers:

<Steps>
<Step title="Successful Health Check">
  Target begins responding correctly to health checks
</Step>

<Step title="Status Update">
  Target status changes from "Unhealthy" to "Healthy"
</Step>

<Step title="Traffic Restoration">
  Target is automatically added back to traffic routing
</Step>

<Step title="Load Balancer Update">
  Load balancing resumes with the recovered target included
</Step>
</Steps>
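
The exclusion and recovery sequences above amount to a small state machine. The following sketch models it under the assumption that a single check result flips the state, since this page describes no consecutive-failure threshold; the function names and the action labels are hypothetical.

```python
from typing import List, Optional, Tuple


def apply_check_result(state: str, passed: bool) -> Tuple[str, Optional[str]]:
    """Return the new state and the routing action, if any, that it triggers."""
    new_state = "healthy" if passed else "unhealthy"
    was_routable = state != "unhealthy"  # "unknown" targets already receive traffic
    is_routable = new_state != "unhealthy"
    if was_routable and not is_routable:
        return new_state, "remove_from_routing"
    if is_routable and not was_routable:
        return new_state, "restore_to_routing"
    return new_state, None


def replay(results: List[bool], state: str = "unknown") -> List[Tuple[str, Optional[str]]]:
    """Feed a series of check results through the state machine."""
    history = []
    for passed in results:
        state, action = apply_check_result(state, passed)
        history.append((state, action))
    return history


for step in replay([True, False, False, True]):
    print(step)
# ('healthy', None)
# ('unhealthy', 'remove_from_routing')
# ('unhealthy', None)
# ('healthy', 'restore_to_routing')
```

A routing action is produced only when routability changes, which is why the first passing check on an unknown target and the repeated failure in the middle both yield `None`.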

## High Availability Strategies

### Multi-Target Redundancy

<Card title="Service Redundancy">
  Deploy multiple instances of your service across different targets to ensure availability even when some targets fail.
</Card>

```
Resource: web-application
├── Target 1: web-01.local:8080 (Site A) - Healthy ✅
├── Target 2: web-02.local:8080 (Site A) - Unhealthy ❌
└── Target 3: web-03.local:8080 (Site B) - Healthy ✅

Traffic routes to: Target 1 & Target 3 only
```
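
The effect on request distribution can be sketched as follows. Round-robin selection is an assumption made for this example, as this page does not specify the load balancing algorithm, and `pick_targets` is a hypothetical helper.

```python
from itertools import cycle, islice
from typing import Dict, List


def pick_targets(states: Dict[str, str], request_count: int) -> List[str]:
    """Distribute requests round-robin over targets that are not unhealthy."""
    pool = [address for address, state in states.items() if state != "unhealthy"]
    if not pool:
        return []  # nothing routable
    return list(islice(cycle(pool), request_count))


states = {
    "web-01.local:8080": "healthy",
    "web-02.local:8080": "unhealthy",
    "web-03.local:8080": "healthy",
}
print(pick_targets(states, 4))
# ['web-01.local:8080', 'web-03.local:8080', 'web-01.local:8080', 'web-03.local:8080']
```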

### Cross-Site Failover

<Card title="Geographic Distribution">
  Distribute targets across multiple sites to protect against site-level failures.
</Card>

```
Resource: api-service
├── Primary Site Targets
│   ├── api-01.primary:8443 - Healthy ✅
│   └── api-02.primary:8443 - Healthy ✅
└── Backup Site Targets
    ├── api-01.backup:8443 - Healthy ✅
    └── api-02.backup:8443 - Healthy ✅

All targets receive traffic via load balancing
```
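
To see why spreading targets across sites helps, the sketch below simulates a primary-site outage for the layout above. The data structure and the `serving_targets_by_site` helper are hypothetical and only model which targets would remain eligible for traffic.

```python
from typing import Dict, List, Tuple

# (site, address) -> is the target passing its health check?
HEALTH: Dict[Tuple[str, str], bool] = {
    ("primary", "api-01.primary:8443"): True,
    ("primary", "api-02.primary:8443"): True,
    ("backup", "api-01.backup:8443"): True,
    ("backup", "api-02.backup:8443"): True,
}


def serving_targets_by_site(health: Dict[Tuple[str, str], bool]) -> Dict[str, List[str]]:
    """Group the targets that still pass their checks by site."""
    by_site: Dict[str, List[str]] = {}
    for (site, address), passing in health.items():
        if passing:
            by_site.setdefault(site, []).append(address)
    return by_site


# Simulate the whole primary site failing its health checks.
outage = {key: passing and key[0] != "primary" for key, passing in HEALTH.items()}
print(serving_targets_by_site(outage))
# {'backup': ['api-01.backup:8443', 'api-02.backup:8443']}
```

With every primary target marked unhealthy, the backup targets are the only ones left in rotation, so the resource stays reachable without manual intervention.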