Health Checks

The /health command runs diagnostics across all system components and reports status.

Running Health Checks

# All checks
uv run gaius-cli --cmd "/health" --format json

# Specific category
uv run gaius-cli --cmd "/health gpu" --format json
uv run gaius-cli --cmd "/health endpoints" --format json
uv run gaius-cli --cmd "/health infrastructure" --format json
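
To sweep every category in one pass, the commands above can be wrapped in a small loop. This is a minimal sketch, not part of gaius-cli: the function names run_health and run_all_health and the DRY_RUN variable are hypothetical, introduced only so the loop can be previewed without touching the system.

```shell
# Hypothetical wrapper around the documented /health command.
# With DRY_RUN=1 it prints the command instead of running it.
run_health() {
  local category="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'uv run gaius-cli --cmd "/health %s" --format json\n' "$category"
  else
    uv run gaius-cli --cmd "/health $category" --format json
  fi
}

# Run the three categories shown above, one after another.
run_all_health() {
  local category
  for category in gpu endpoints infrastructure; do
    run_health "$category"
  done
}
```

Calling `DRY_RUN=1 run_all_health` prints the three commands. Without DRY_RUN, each category's JSON report is written to standard output in turn.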

Interpreting Results

Each check reports a status:

Status  Meaning
PASS    Component is healthy
WARN    Component has issues but is functional
FAIL    Component is unhealthy
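
When a script needs a single verdict from several checks, the three statuses can be reduced to the most severe one. The sketch below assumes the ordering FAIL > WARN > PASS and takes the status strings as arguments, because the JSON layout of the /health output is not shown here. The function names and the exit-code mapping are hypothetical.

```shell
# Print the most severe status among the arguments.
# Assumed ordering: FAIL > WARN > PASS. Unrecognized values count as FAIL.
worst_status() {
  local worst="PASS" status
  for status in "$@"; do
    case "$status" in
      PASS) ;;
      WARN) if [ "$worst" = "PASS" ]; then worst="WARN"; fi ;;
      *)    worst="FAIL" ;;
    esac
  done
  printf '%s\n' "$worst"
}

# Map a status to an exit code for use in scripts (illustrative mapping).
status_exit_code() {
  case "$1" in
    PASS) return 0 ;;
    WARN) return 1 ;;
    *)    return 2 ;;
  esac
}
```

For example, `worst_status PASS WARN PASS` prints WARN, and `status_exit_code "$(worst_status PASS FAIL)"` returns 2.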

Applying Fixes

When checks fail, use /health fix:

# Fix a specific service
uv run gaius-cli --cmd "/health fix engine" --format json

# Available services
# engine, dataset, nifi, postgres, qdrant, minio, endpoints, evolution

Always try /health fix before manual intervention. This exercises the self-healing system and helps it improve over time.
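
A small guard can catch a mistyped service name before the fix command is sent. This is a sketch under stated assumptions: the service list is copied from the comment above, while is_known_service and fix_service are hypothetical helpers, not gaius-cli features.

```shell
# Service names copied from the list above.
KNOWN_SERVICES="engine dataset nifi postgres qdrant minio endpoints evolution"

# Succeed only when the argument is one of the known service names.
is_known_service() {
  local candidate="$1" service
  for service in $KNOWN_SERVICES; do
    if [ "$service" = "$candidate" ]; then return 0; fi
  done
  return 1
}

# Hypothetical wrapper: reject unknown names, otherwise run the documented fix.
fix_service() {
  if ! is_known_service "$1"; then
    echo "unknown service: $1" >&2
    return 2
  fi
  uv run gaius-cli --cmd "/health fix $1" --format json
}
```

With this in place, `fix_service engine` runs the same command shown above, while `fix_service redis` prints an error and returns 2 without calling the CLI. The unquoted `$KNOWN_SERVICES` relies on word splitting as done by bash and POSIX sh.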

Manual Fallback

If /health fix does not resolve the failure, fall back to these manual commands:

# Full clean restart
just restart-clean

# GPU-specific cleanup
just gpu-cleanup
just gpu-deep-cleanup

FMEA Diagnostics

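For deeper analysis, the /fmea command reports failure modes (FMEA: failure mode and effects analysis) together with their risk priority number (RPN) scores: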

# FMEA summary with RPN scores
uv run gaius-cli --cmd "/fmea" --format json

# Failure mode details
uv run gaius-cli --cmd "/fmea detail GPU_001" --format json
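
As background for reading RPN scores: in conventional FMEA practice, the risk priority number is the product of three ratings (severity, occurrence, detection), each commonly scored from 1 to 10. Whether /fmea computes its scores exactly this way is an assumption. The rpn helper below is hypothetical, and the ratings in the example are made up, not values for GPU_001.

```shell
# rpn SEVERITY OCCURRENCE DETECTION
# Conventional FMEA formula: RPN = severity * occurrence * detection.
# Each rating must be an integer from 1 to 10 (a common scale, assumed here).
rpn() {
  local rating
  for rating in "$1" "$2" "$3"; do
    case "$rating" in
      [1-9]|10) ;;
      *) echo "ratings must be integers from 1 to 10" >&2; return 1 ;;
    esac
  done
  echo $(( $1 * $2 * $3 ))
}

rpn 8 3 5   # prints 120
```

On this scale the score ranges from 1 to 1000, and a higher score marks a failure mode that deserves attention sooner.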