Web-based server hardware monitoring via IPMI and Redfish
Free, self-hosted IPMI/BMC monitoring for your server fleet.
Collect System Event Logs (SEL), monitor sensors, track ECC errors, gather SSH system logs, and get alerts, all from a beautiful web dashboard.
| Guide | Description |
|---|---|
| User Guide | Complete documentation for using IPMI Monitor |
| IPMI SEL Reference | Decode BMC event logs and troubleshoot hardware issues |
| Developer Guide | Git workflow, releases, CI/CD |
Deploy everything with a single command using a config file:
```shell
# Install from dev branch (latest features)
pipx install git+https://github.com/cryptolabsza/ipmi-monitor.git@dev

# Deploy with config file (no prompts)
# If sudo cannot find the command, use the full path: ~/.local/bin/ipmi-monitor
sudo ipmi-monitor quickstart -c /path/to/config.yaml -y
```
See examples/ipmi-config.yaml for a complete config template.
```shell
# Install pipx (prerequisite)
apt install pipx -y && pipx ensurepath
source ~/.bashrc

# Install the CLI tool
pipx install ipmi-monitor

# Run the quickstart wizard (use full path since pipx bin isn't in sudo PATH)
sudo ~/.local/bin/ipmi-monitor quickstart
```
That's it! The wizard will:
```shell
docker run -d \
  --name ipmi-monitor \
  -p 5000:5000 \
  -v ipmi_data:/app/data \
  -e IPMI_USER=admin \
  -e IPMI_PASS=YOUR_BMC_PASSWORD \
  -e ADMIN_PASS=YOUR_ADMIN_PASSWORD \
  -e SECRET_KEY=YOUR_RANDOM_SECRET_KEY \
  ghcr.io/cryptolabsza/ipmi-monitor:latest
```
Then open http://localhost:5000 and add your servers!
See User Guide for Docker Compose setup.
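The `SECRET_KEY` placeholder in the `docker run` example needs a random value. The sketch below shows one way to generate it with common Unix tools; the helper name `generate_secret_key` is made up for illustration, and the 32-byte length is an assumption rather than a documented requirement.

```shell
# generate_secret_key: print 32 random bytes as a 64-character hex string.
# Hypothetical helper; `openssl rand -hex 32` is an alternative if openssl is installed.
generate_secret_key() {
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

SECRET_KEY="$(generate_secret_key)"
echo "Generated SECRET_KEY of length ${#SECRET_KEY}"
```

Pass the result to the container with `-e SECRET_KEY="$SECRET_KEY"`, and keep the same value across restarts, since changing a Flask session secret generally invalidates existing sessions.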
After installation, use the ipmi-monitor CLI:
| Command | Description |
|---|---|
| `sudo ipmi-monitor quickstart` | One-command Docker deployment (recommended) |
| `ipmi-monitor status` | Show container status |
| `ipmi-monitor logs [-f]` | View container logs |
| `ipmi-monitor start` | Start containers |
| `ipmi-monitor stop` | Stop containers |
| `ipmi-monitor restart` | Restart containers |
| `ipmi-monitor upgrade` | Pull latest image & restart |
| `ipmi-monitor add-server` | Add a server interactively |
| `ipmi-monitor list-servers` | List configured servers |
| `ipmi-monitor setup-ssl` | Set up HTTPS reverse proxy |
| `ipmi-monitor uninstall` | Uninstall IPMI Monitor (with options) |
| `ipmi-monitor version` | Show detailed version info |
| `ipmi-monitor setup-ssl` | Retry Let's Encrypt SSL setup |
Main dashboard showing 39 servers with real-time status
Additional screenshots: Event Log (SEL events), Live Sensors, Hardware Inventory, SSH System Logs.
| Feature | Description |
|---|---|
| SEL Collection | Parallel IPMI event collection (32 workers) |
| Real-time Dashboard | Auto-refreshing server status cards |
| Sensor Monitoring | Temperature, fan, voltage, power readings |
| ECC Tracking | Identify which DIMM has memory errors |
| GPU Health | Detect NVIDIA Xid errors via SSH |
| SSH System Logs | Collect dmesg, journalctl, syslog, mcelog |
| Platform Logs | Collect Vast.ai daemon and RunPod agent logs |
| Hardware Errors | AER, PCIe, ECC errors parsed automatically |
| Alerts | Email, Telegram, webhook notifications |
| Alert Resolution | Notify when issues clear |
| Prometheus | Native `/metrics` endpoint for Grafana |
| User Management | Admin and read-only access levels |
| Backup/Restore | Export everything for disaster recovery |
| BMC Reset | Cold/warm reset without affecting host OS |
| Docker Ready | Multi-arch images (amd64/arm64) |
| Auto-Updates | Watchtower keeps containers updated |
| Feature | Description |
|---|---|
| Quickstart Wizard | One-command Docker deployment with CryptoLabs Proxy, SSL, Watchtower |
| CryptoLabs Proxy | Unified reverse proxy with Fleet Management landing page at `/` |
| DC Overview Import | Auto-detect DC Overview installation and import servers/SSH keys |
| SSH Key Management | Auto-detect keys, paste content, or generate new ED25519 keys |
| SSH Log Collection | Optional SSH log collection (dmesg, syslog, GPU errors) during setup |
| Initial Data Collection | Fresh installs auto-collect sensors/events with progress modal |
| Auto SSL Renewal | Certbot container automatically obtains/renews Let's Encrypt certs |
| Subpath Routing | Deploy at `/ipmi/` alongside other CryptoLabs services |
| Site Name Branding | Configure site name via DC Overview for consistent branding |
| Vast.ai/RunPod Logs | Auto-collects daemon logs when deployed via DC Overview with exporters |
| Watchtower Integration | Automatic container updates every 5 minutes |
| Read-Write Role | New role with settings access but no user management |
| Fixed Export/Import | Alert rules now export/import correctly |
| SEL Management | Enable/disable event logging, view SEL info, get SEL time |
| Sensor Highlighting | Changed sensor values pulse green after refresh |
| Diagnostics Loading States | Download buttons show progress to prevent double-clicks |
| Grafana Config | `prometheus.yml` example and endpoint documentation |
| Uninstall Options | Choose to remove containers, config, or both |
Upgrade with AI-powered insights from CryptoLabs:
| Feature | Description |
|---|---|
| Daily Summaries | AI-generated fleet health with GPU focus |
| Maintenance Tasks | Auto-generated from events |
| Predictions | Failure warnings before they happen |
| Root Cause Analysis | AI explains what went wrong |
| AI Chat | Ask questions about your servers |
| Recovery Agent | Autonomous GPU recovery with escalation |
| Multi-Site | One account, multiple datacenters |
| Task Queue | AI sends recovery tasks for execution |
Start your free trial: Settings → AI Features → Start Free Trial
| Variable | Default | Description |
|---|---|---|
| `APP_NAME` | IPMI Monitor | Displayed in header |
| `IPMI_USER` | admin | Default BMC username |
| `IPMI_PASS` | (required) | Default BMC password |
| `ADMIN_PASS` | changeme | Dashboard admin password |
| `SECRET_KEY` | (auto) | Flask session secret (set this!) |
| `POLL_INTERVAL` | 300 | Seconds between collections |
| `SSH_LOG_INTERVAL` | (disabled) | Minutes between SSH log collection |
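Before starting the container, it can help to check the variables from the table for unsafe defaults. The following is a minimal sketch under that assumption; `check_env` is a hypothetical helper, not part of the `ipmi-monitor` CLI, and the rules it applies are only the ones the table implies (required password, default `changeme`, numeric interval).

```shell
# check_env: hypothetical pre-flight check for the variables listed in the table.
# Returns 0 when the settings look safe, 1 otherwise.
check_env() {
    status=0
    if [ -z "${IPMI_PASS:-}" ]; then
        echo "IPMI_PASS is required" >&2
        status=1
    fi
    if [ "${ADMIN_PASS:-changeme}" = "changeme" ]; then
        echo "ADMIN_PASS is unset or still the default" >&2
        status=1
    fi
    if [ -z "${SECRET_KEY:-}" ]; then
        echo "SECRET_KEY is not set" >&2
        status=1
    fi
    case "${POLL_INTERVAL:-300}" in
        *[!0-9]*) echo "POLL_INTERVAL must be a whole number of seconds" >&2; status=1 ;;
    esac
    return "$status"
}
```

A typical use is `check_env && docker run ...`, so the container only starts when the check passes.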
IPMI Monitor is designed for production datacenter environments:
IPMI Monitor runs as Docker containers behind CryptoLabs Proxy, which acts as a unified reverse proxy:
```
┌──────────────────────────────────────────────────────────────────┐
│                           Your Server                            │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ cryptolabs-proxy       Port 80/443 (HTTP/HTTPS)            │  │
│  │  ├── /       → Fleet Management Landing Page               │  │
│  │  ├── /ipmi/  → IPMI Monitor                                │  │
│  │  └── /dc/    → DC Overview (if installed)                  │  │
│  └────────────────────────────────────────────────────────────┘  │
│                              │                                   │
│  ┌───────────────────────────▼────────────────────────────────┐  │
│  │ ipmi-monitor           Port 5000 (internal)                │  │
│  │  • Flask web application with SQLite                       │  │
│  │  • Background workers (IPMI polling, SSH log collection)   │  │
│  │  • Initial data collection on first start                  │  │
│  └────────────────────────────────────────────────────────────┘  │
│                              │                                   │
│  ┌───────────────────────────▼────────────────────────────────┐  │
│  │ certbot                Auto SSL renewal (every 12h)        │  │
│  │ watchtower             Auto container updates (every 5m)   │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
                 │                                │
                 ▼                                ▼
        ┌─────────────────┐              ┌─────────────────┐
        │    BMC/IPMI     │              │    Server OS    │
        │   (port 623)    │              │  (SSH port 22)  │
        └─────────────────┘              └─────────────────┘
```
Live Example: dc.cryptolabs.co.za - Fleet Management at /, IPMI Monitor at /ipmi/
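With subpath routing, each service sits under its own prefix on the proxy. The sketch below builds a full URL from a base address, a service prefix, and a path. `service_url` and `dc.example.com` are placeholders, and it is an assumption that IPMI Monitor's API paths are served under the `/ipmi/` prefix as well.

```shell
# service_url BASE SERVICE PATH
# Hypothetical helper: joins the proxy base URL, a service prefix (e.g. "ipmi"
# or "dc"), and a path so that exactly one "/" separates the parts.
service_url() {
    base="${1%/}"                     # drop one trailing slash from the base URL
    service="${2#/}"                  # drop one leading slash from the prefix
    service="${service%/}"            # drop one trailing slash from the prefix
    path="${3#/}"                     # drop one leading slash from the path
    printf '%s/%s/%s\n' "$base" "$service" "$path"
}

service_url "https://dc.example.com/" "/ipmi/" "/api/servers"
```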
IPMI Monitor exposes 150+ REST API endpoints. Here are the most commonly used:
| Endpoint | Description |
|---|---|
| `GET /` | Web dashboard |
| `GET /api/servers` | List all servers with status |
| `GET /api/events` | Get events (supports filters) |
| `GET /api/stats` | Dashboard statistics |
| `GET /api/maintenance` | Maintenance tasks |
| `GET /api/recovery-logs` | Recovery action history |
| `GET /api/uptime` | Server uptime data |
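The read-only endpoints above can be queried with `curl`. This is a minimal sketch, assuming the instance is reachable at `http://localhost:5000` as in the `docker run` example; `events_url` and `api_get` are hypothetical helpers. Whether a login session is required, and which filter parameters `/api/events` accepts, are not covered here, so the query string is left to the caller.

```shell
# Base URL of the instance; override by exporting IPMI_MONITOR_URL.
IPMI_MONITOR_URL="${IPMI_MONITOR_URL:-http://localhost:5000}"

# events_url [QUERY]: build the URL for GET /api/events, optionally with a
# caller-supplied query string (filter names are not documented in this README).
events_url() {
    if [ -n "${1:-}" ]; then
        printf '%s/api/events?%s\n' "${IPMI_MONITOR_URL%/}" "$1"
    else
        printf '%s/api/events\n' "${IPMI_MONITOR_URL%/}"
    fi
}

# api_get URL: fetch a URL, failing on HTTP errors, with a 10 second timeout.
api_get() {
    curl --fail --silent --show-error --max-time 10 "$1"
}

# Example (requires a running instance):
#   api_get "$(events_url)"
```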
| Endpoint | Description |
|---|---|
| `GET /api/servers/managed` | All configured servers |
| `POST /api/servers/add` | Add new server |
| `PUT /api/servers/{bmc_ip}` | Update server config |
| `DELETE /api/servers/{bmc_ip}` | Remove server |
| `POST /api/servers/import` | Bulk import servers |
| `GET /api/servers/export` | Export server list |
| Endpoint | Description |
|---|---|
| `GET /server/{bmc_ip}` | Server detail page |
| `GET /api/server/{bmc_ip}/events` | Server's events |
| `GET /api/sensors/{bmc_ip}` | Live sensor readings |
| `GET /api/server/{bmc_ip}/ssh-logs` | SSH system logs |
| `POST /api/servers/{bmc_ip}/inventory` | Collect inventory |
| `POST /api/server/{bmc_ip}/power/{action}` | Power control (on/off/reset) |
| `POST /api/server/{bmc_ip}/bmc/{action}` | BMC reset (cold/warm) |
| `POST /api/server/{bmc_ip}/investigate` | Post-recovery investigation |
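Because the power and BMC endpoints change hardware state, it is worth validating the `{action}` segment before sending a request. The sketch below does that for the values named in the table. `power_path` and `bmc_reset_path` are hypothetical helpers, `192.0.2.10` is a documentation address, and the exact spelling of the action strings in the URL is assumed from the table.

```shell
# power_path BMC_IP ACTION: print the path for a power action, or fail on an
# action other than the three listed in the table (on, off, reset).
power_path() {
    case "$2" in
        on|off|reset) printf '/api/server/%s/power/%s\n' "$1" "$2" ;;
        *) echo "unsupported power action: $2" >&2; return 1 ;;
    esac
}

# bmc_reset_path BMC_IP ACTION: same idea for BMC resets (cold or warm).
bmc_reset_path() {
    case "$2" in
        cold|warm) printf '/api/server/%s/bmc/%s\n' "$1" "$2" ;;
        *) echo "unsupported BMC reset type: $2" >&2; return 1 ;;
    esac
}

power_path 192.0.2.10 reset
bmc_reset_path 192.0.2.10 warm
```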
| Endpoint | Description |
|---|---|
| `GET /api/ssh-keys` | List stored SSH keys |
| `POST /api/ssh-keys` | Add SSH key |
| `POST /api/test/bmc` | Test BMC connection |
| `POST /api/test/ssh` | Test SSH connection |
| `POST /api/ssh-logs/collect-now` | Trigger SSH log collection |
| Endpoint | Description |
|---|---|
| `GET /api/alerts/rules` | Alert rules |
| `POST /api/alerts/rules` | Create alert rule |
| `GET /api/alerts/history` | Fired alerts |
| `GET /api/alerts/notifications` | Notification channels |
| `POST /api/alerts/notifications/{type}/test` | Test notification |
| Endpoint | Description |
|---|---|
| `GET /metrics` | Prometheus metrics |
| `GET /health` | Health check |
| `GET /api/version` | Version info |
| `GET /api/version/check` | Check for updates |
| `POST /api/collect` | Trigger IPMI collection |
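The `/metrics` endpoint serves the Prometheus text exposition format, where lines starting with `#` are `HELP`/`TYPE` comments and the rest are samples. The sketch below filters a response down to its sample lines for a quick manual check. `metric_samples` is a hypothetical helper, and the metric name in the sample input is made up, not taken from IPMI Monitor.

```shell
# metric_samples: read Prometheus text format on stdin and print only the
# sample lines, dropping "#" comment lines and blank lines.
metric_samples() {
    grep -v -e '^#' -e '^[[:space:]]*$'
}

# Made-up sample input; the metric name is hypothetical.
metric_samples <<'EOF'
# HELP example_temperature_celsius Example gauge
# TYPE example_temperature_celsius gauge
example_temperature_celsius{sensor="cpu1"} 54

EOF
```

Against a running instance, the same filter can be applied with `curl -fsS http://localhost:5000/metrics | metric_samples`.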
| Endpoint | Description |
|---|---|
| `GET /api/ai/status` | AI sync status |
| `GET /api/ai/config` | AI configuration |
| `POST /api/ai/sync` | Trigger AI sync |
| `GET /api/ai/results` | Cached AI results |
See User Guide for complete endpoint documentation.
MIT License · Made with ❤️ by CryptoLabs