Health Monitoring

Rindexer includes a comprehensive health monitoring system that provides real-time insights into the status of your indexing infrastructure. This built-in monitoring helps you ensure your indexers are running smoothly and quickly identify issues when they occur.

Overview

The health monitoring system tracks the status of key components:

  • Database connectivity - PostgreSQL connection health
  • Indexing status - Whether the indexer is running and how many indexing tasks are active
  • Sync status - Data synchronization health across storage backends
  • Overall system health - Aggregated status across all components

Health Server

The health monitoring server runs automatically alongside your rindexer instance on a separate port. By default, it runs on port 8080, but this can be configured.

Starting the Health Server

The health server starts automatically when you run rindexer with indexing enabled. No additional configuration is required.

# Health server starts automatically with indexing
rindexer start indexer
rindexer start all

Health Endpoints

GET /health

Returns the complete health status of your rindexer instance.

Response Format:
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "services": {
    "database": "healthy",
    "indexing": "healthy",
    "sync": "healthy"
  },
  "indexing": {
    "active_tasks": 2,
    "is_running": true
  }
}

HTTP Status Codes:
  • 200 OK - System is healthy
  • 503 Service Unavailable - System has issues
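
For a quick manual check you can query the endpoint with curl (assuming the default port 8080 and a locally running instance; jq is only used here for pretty-printing):

# Fetch the full health report
curl -s http://localhost:8080/health | jq .

# Fetch only the HTTP status code
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/health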

Health Status Types

The health endpoint returns different status types for each service:

Status           Description
healthy          Service is functioning normally
unhealthy        Service has encountered an error
unknown          Status cannot be determined
not_configured   Service is not set up
disabled         Service is intentionally disabled
no_data          Service is working but no data is available
stopped          Service is not running

Service Health Checks

Database Health Check

The database health check verifies PostgreSQL connectivity and functionality:

  • healthy: PostgreSQL is enabled and a simple SELECT 1 query succeeds
  • unhealthy: PostgreSQL is enabled but the connection fails or query errors occur
  • not_configured: PostgreSQL is enabled but no database client is available
  • disabled: PostgreSQL is not enabled in the configuration

What it checks: Basic database connectivity by executing SELECT 1 against the PostgreSQL instance.
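
If this check reports unhealthy, you can reproduce it by hand with the same trivial query (a minimal sketch; DATABASE_URL is a placeholder for whatever connection string your rindexer instance uses):

# Run the same query the health check executes
psql "$DATABASE_URL" -c "SELECT 1;"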

Indexing Health Check

The indexing health check monitors the indexer process state:

  • healthy: The indexer is currently running (system state flag is set)
  • stopped: The indexer is not running (system state flag is not set)

What it checks: The global IS_RUNNING flag that tracks whether the indexer process is active.
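
You can read this state back through the health endpoint itself, for example (assuming the default port and jq installed):

# Show the indexing section of the report: is_running and active_tasks
curl -s http://localhost:8080/health | jq '.indexing'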

Sync Health Check

The sync health check verifies data synchronization status based on your storage configuration:

For PostgreSQL storage:
  • healthy: Database has event tables (excluding system tables like latest_block, *_last_known_*, *_last_run_*)
  • no_data: No event tables exist yet (acceptable for new deployments)
  • unhealthy: Database query fails or connection issues
  • not_configured: No database client available

For CSV storage:
  • healthy: CSV directory exists and contains .csv files
  • no_data: CSV directory doesn't exist or contains no .csv files
  • unhealthy: CSV directory exists but cannot be read
  • not_configured: CSV storage not configured

What it checks:

  • PostgreSQL: Queries information_schema.tables to find user-created event tables
  • CSV: Checks if the CSV directory exists and contains CSV files
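
To see roughly what the PostgreSQL side of this check inspects, you can list the non-system tables yourself (a sketch, not the exact query rindexer runs; DATABASE_URL and the CSV path are placeholders):

# Event tables created by the indexer should show up here
psql "$DATABASE_URL" -c "SELECT table_schema, table_name FROM information_schema.tables WHERE table_schema NOT IN ('pg_catalog', 'information_schema');"

# For CSV storage, confirm the output directory actually contains .csv files
ls ./generated_csv/*.csv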

Overall Health Status

The overall health status is determined by combining all service checks:

  • healthy: All critical services are healthy, or sync shows no_data (acceptable for new deployments)
  • unhealthy: Any critical service is unhealthy, not_configured, or indexing is stopped

Critical services: Database, Indexing, and Sync (when enabled)
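
A small script can turn this aggregated status into an exit code, which is convenient for cron jobs or deployment gates (a sketch assuming the default port):

# Fail unless the overall status is healthy
status=$(curl -s http://localhost:8080/health | jq -r '.status')
if [ "$status" != "healthy" ]; then
  echo "rindexer reports status: $status" >&2
  exit 1
fi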

Health Server Lifecycle

The health server's lifecycle depends on which services you start:

rindexer start indexer (with end_block set)

  • Short-lived: Health server starts with the indexer and dies when indexing completes
  • Use case: Historical data indexing that has a defined end point
  • Health monitoring: Only available during the indexing process

rindexer start indexer (no end_block set)

  • Long-lived: Health server starts with the indexer and stays alive for live indexing
  • Use case: Continuous live indexing that runs indefinitely
  • Health monitoring: Available continuously while the indexer is running

rindexer start graphql

  • No health server: Health server is not started in GraphQL-only mode
  • Use case: Running only the GraphQL API without indexing
  • Health monitoring: Not available (health server requires indexing to be enabled)

rindexer start all

  • Long-lived: Health server starts with the indexer and follows the GraphQL server lifecycle
  • Use case: Running both indexing and GraphQL API together
  • Health monitoring: Available as long as the GraphQL server is running
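
If you script deployments, you can wait for the health server to report healthy before routing traffic to the instance (a minimal sketch assuming the default port 8080):

# Start indexing plus GraphQL, then block until /health returns 200
rindexer start all &
until curl -sf http://localhost:8080/health > /dev/null; do
  sleep 2
done
echo "rindexer is healthy"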

Configuration

Custom Health Port

You can configure the health server port using the health_override_port setting in your rindexer.yaml file:

global:
  health_override_port: 8081
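
With this setting in place, the health endpoint is served on the overridden port rather than 8080, for example:

# Query the health server on the custom port
curl -s http://localhost:8081/health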

Production Monitoring

Load Balancer Health Checks

Configure your load balancer to use the health endpoint for health checks:

Health Check URL: http://your-rindexer-instance:8080/health
Expected Status: 200 OK
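
If your load balancer or orchestrator expects a command rather than a URL, a small probe script can wrap the same check (a sketch; adjust host and port to your deployment):

#!/usr/bin/env bash
# Exits 0 only when /health returns HTTP 200
code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:8080/health")
[ "$code" = "200" ]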

Monitoring Tools

You can integrate with monitoring tools like Prometheus, Grafana, or DataDog to track health metrics and set up alerts based on HTTP status codes and response times.
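
One lightweight way to get this into Prometheus is to translate the HTTP check into a gauge for the node_exporter textfile collector (a sketch; the collector directory and metric name are assumptions, not something rindexer provides):

# Run periodically, e.g. from cron
code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health)
value=$([ "$code" = "200" ] && echo 1 || echo 0)
echo "rindexer_healthy ${value}" > /var/lib/node_exporter/textfile_collector/rindexer.prom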

Troubleshooting

Common Issues

  • Health server not starting: Check whether the port is already in use and verify your rindexer.yaml configuration
  • Database health failing: Verify PostgreSQL connection and permissions
  • Sync health issues: Check storage configuration and file permissions
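
For the port conflict case, standard tools can show whether something else is already bound to the health port (examples assume the default port 8080):

# Either command shows the process currently listening on 8080, if any
ss -ltnp | grep ':8080'
lsof -iTCP:8080 -sTCP:LISTEN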

Debugging

Enable debug logging for detailed health information:

RUST_LOG=debug rindexer start indexer

Best Practices

  • Set up continuous monitoring of the health endpoint
  • Configure appropriate alert thresholds
  • Keep health check logs for troubleshooting
  • Monitor multiple instances if running in a cluster

API Reference

Health Endpoint

  • URL: GET /health
  • Response: JSON with health status and service information
  • Status Codes: 200 (healthy), 503 (unhealthy)