Web-based server hardware monitoring via IPMI and Redfish
IPMI System Event Log reference for Lenovo ThinkSystem servers with XClarity Controller (XCC).
Platforms: ThinkSystem SR655 V3, SR675 V3, SR680a V3, SR780a V3. BMC: XClarity Controller (XCC).
Lenovo ThinkSystem servers use the XClarity Controller (XCC) as their baseboard management controller. XCC provides IPMI 2.0 compatibility with additional Lenovo-specific features.
| Model | Form Factor | CPU | GPU Support | Max GPUs |
|---|---|---|---|---|
| SR655 V3 | 1U | AMD EPYC Genoa | Limited | 2 |
| SR675 V3 | 3U | AMD EPYC Genoa | Yes | 8 |
| SR680a V3 | 8U | Intel Xeon Scalable (5th Gen) | Yes | 8 |
| SR780a V3 | 5U | Intel Xeon Scalable (5th Gen) | Yes | 8 |
| Version | Notes |
|---|---|
| 5.10 | Initial V3 release |
| 8.10 - 9.20 | Stability improvements |
| 12.10 - 14.11 | Latest with enhanced GPU support |
| Sensor Type | Description |
|---|---|
| System Event | General system status changes |
| Boot/POST | Power-on self-test events |
| OS Status | Operating system watchdog |
| Firmware | Firmware update events |
Lenovo uses descriptive temperature sensor names:
| Sensor | Location | Warning | Critical |
|---|---|---|---|
| Ambient Temp | Air inlet | 35°C | 42°C |
| CPU1 Temp | Processor 1 | 85°C | 95°C |
| CPU2 Temp | Processor 2 | 85°C | 95°C |
| GPU1-8 Temp | GPU modules | 80°C | 90°C |
| DIMM Temp | Memory | 75°C | 85°C |
| PCH Temp | Platform Controller Hub | 85°C | 95°C |
| VRM Temp | Voltage Regulators | 95°C | 105°C |
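
The thresholds in the table above can be applied to a reading with a small shell helper. This is a minimal sketch, not part of XCC: the function name `classify_temp` is hypothetical, it assumes whole-degree readings, it assumes "GPU1-8 Temp" stands for sensors named `GPU1 Temp` through `GPU8 Temp`, and it assumes a reading equal to a threshold already counts as crossing it.

```shell
# Classify a temperature reading (whole degrees C) against the warning and
# critical thresholds listed in the table. Prints ok, warning, critical,
# or unknown (for a sensor name that is not in the table).
classify_temp() {
    sensor=$1
    reading=$2
    case $sensor in
        "Ambient Temp")                     warn=35; crit=42 ;;
        "CPU1 Temp"|"CPU2 Temp"|"PCH Temp") warn=85; crit=95 ;;
        GPU[12345678]" Temp")               warn=80; crit=90 ;;
        "DIMM Temp")                        warn=75; crit=85 ;;
        "VRM Temp")                         warn=95; crit=105 ;;
        *) echo unknown; return 1 ;;
    esac
    # Assumption: reaching the threshold value counts as crossing it.
    if [ "$reading" -ge "$crit" ]; then
        echo critical
    elif [ "$reading" -ge "$warn" ]; then
        echo warning
    else
        echo ok
    fi
}
```

For example, `classify_temp "CPU1 Temp" 90` falls between the 85°C warning and 95°C critical values for that sensor.
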
| Sensor | Description |
|---|---|
| PSU1 Status | Power supply 1 health |
| PSU2 Status | Power supply 2 health |
| PSU Redundancy | Redundancy status |
| Power Cap | Power capping events |
| Event | Severity | Meaning |
|---|---|---|
| AC Lost | 🔴 Critical | Power supply lost AC input |
| Failure Predicted | 🟡 Warning | PSU degrading |
| Failure | 🔴 Critical | PSU has failed |
| Redundancy Lost | 🟡 Warning | Single PSU mode |
| Redundancy Restored | 🟢 Info | Both PSUs online |
Lenovo servers include Predictive Failure Analysis (PFA) sensors that flag components likely to fail before they actually do:
| Sensor | Description | Action |
|---|---|---|
| PFA Memory | Memory degradation predicted | Plan DIMM replacement |
| PFA HDD | Drive failure predicted | Plan drive replacement |
| PFA CPU | Processor issue predicted | Contact support |
| PFA Fan | Fan degradation predicted | Replace fan |
| Sensor | Description |
|---|---|
| Lightpath Log | System diagnostic log entries |
| Lightpath Reminder | Maintenance reminder |
| Sensor | Description |
|---|---|
| GPU1-8 Status | Individual GPU health |
| GPU Power | GPU power draw |
| NVLink Status | GPU interconnect health |
| GPU Memory ECC | GPU memory errors |
System Event | Timestamp Clock Sync | Asserted
Power Unit | Power on | Asserted
Processor | Presence detected | Asserted
Memory | Presence detected | Asserted
System Event | OEM System boot | Asserted
Power Supply PSU1 | AC Lost | Asserted
Power Supply | Redundancy Lost | Asserted
# If both PSUs lose power:
Power Unit | Power off/down | Asserted
Temperature GPU3 | Upper Non-critical going high | Asserted
# System may throttle GPU
Temperature GPU3 | Upper Non-critical going high | Deasserted
Lenovo reports ECC errors with the affected DIMM slot in the sensor name:
Memory DIMM A1 | Correctable ECC | Asserted
Memory DIMM A1 | Correctable ECC logging limit reached | Asserted
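
To see which DIMMs are accumulating correctable errors, the SEL text can be tallied per sensor. The sketch below is an illustration under assumptions: the function name `count_correctable_ecc` is hypothetical, and it assumes each SEL line ends with the three pipe-separated fields shown above (sensor, event, state), with or without a leading record ID, date, and time. Actual sensor naming in `ipmitool` output may differ by firmware version.

```shell
# Count asserted "Correctable ECC" events per sensor from SEL text on stdin.
# Only the last three pipe-separated fields of each line are inspected, so
# leading ID/date/time columns are ignored if present.
count_correctable_ecc() {
    awk -F'|' '
        NF >= 3 {
            sensor = $(NF-2); event = $(NF-1); state = $NF
            # Trim surrounding whitespace from each field.
            gsub(/^[ \t]+|[ \t]+$/, "", sensor)
            gsub(/^[ \t]+|[ \t]+$/, "", event)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (event == "Correctable ECC" && state == "Asserted")
                count[sensor]++
        }
        END { for (s in count) print count[s], s }
    ' | sort -rn
}

# Usage (placeholders as in the rest of this page):
#   ipmitool -I lanplus -H <bmc_ip> -U USERID -P <password> sel list | count_correctable_ecc
```

The "logging limit reached" event is deliberately not counted, since it signals that further correctable errors on that DIMM are no longer being logged rather than a new error.
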
| CPU | Channel A | Channel B | Channel C | Channel D |
|---|---|---|---|---|
| CPU1 | A1-A4 | B1-B4 | C1-C4 | D1-D4 |
| CPU2 | E1-E4 | F1-F4 | G1-G4 | H1-H4 |
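
The slot layout above makes it possible to map a DIMM label from an SEL entry back to its processor. A minimal sketch, assuming labels follow the letter-plus-digit scheme in the table (the helper name `dimm_to_cpu` is hypothetical):

```shell
# Map a DIMM slot label (e.g. "A1" or "Memory DIMM F3") to its CPU,
# following the slot table: letters A-D belong to CPU1, E-H to CPU2.
dimm_to_cpu() {
    slot=${1##* }    # keep only the last space-separated word
    case $slot in
        [ABCD][1234]) echo CPU1 ;;
        [EFGH][1234]) echo CPU2 ;;
        *) echo unknown; return 1 ;;
    esac
}
```

Calling `dimm_to_cpu "Memory DIMM A1"` resolves the sensor from the ECC example to CPU1.
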
| Sensor | Normal Range | Description |
|---|---|---|
| Planar 3.3V | 3.135V - 3.465V | Main 3.3V rail |
| Planar 5V | 4.75V - 5.25V | Main 5V rail |
| Planar 12V | 11.4V - 12.6V | Main 12V rail |
| VBAT | 2.7V - 3.3V | CMOS battery |
| CPU VCore | Per spec | Processor core voltage |
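
A reading can be checked against the normal ranges above as follows. This is a sketch only: `check_voltage` is a hypothetical helper, the range endpoints are treated as in range, and CPU VCore is left out because the table gives no numeric range for it.

```shell
# Compare a voltage reading against the normal range from the table.
# Prints low, ok, high, or unknown (sensor without a numeric range).
check_voltage() {
    sensor=$1
    reading=$2
    case $sensor in
        "Planar 3.3V") lo=3.135; hi=3.465 ;;
        "Planar 5V")   lo=4.75;  hi=5.25 ;;
        "Planar 12V")  lo=11.4;  hi=12.6 ;;
        "VBAT")        lo=2.7;   hi=3.3 ;;
        *) echo unknown; return 1 ;;
    esac
    # The shell has no floating-point arithmetic, so awk does the comparison.
    awk -v v="$reading" -v lo="$lo" -v hi="$hi" 'BEGIN {
        if (v + 0 < lo + 0)      print "low"
        else if (v + 0 > hi + 0) print "high"
        else                     print "ok"
    }'
}
```
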
| Sensor | Description | Critical Threshold |
|---|---|---|
| Fan 1-8 | System fans | < 1000 RPM |
| PSU Fan | Power supply fans | Reported by PSU |
| Event | Description | Action |
|---|---|---|
| Firmware Update Started | XCC firmware update in progress | Do not power off |
| Firmware Update Completed | XCC firmware update successful | May require BMC reset |
| Firmware Corruption | Firmware image corrupted | Recover via Lenovo tools |
The XClarity Controller provides a web interface for management:
https://<bmc_ip>/

| Feature | Description |
|---|---|
| Remote Console | HTML5-based KVM |
| Virtual Media | ISO mounting |
| Power Control | Power on/off/restart |
| SEL Viewer | System Event Log browser |
| Sensor Dashboard | Real-time sensor readings |
| Firmware Update | In-band and out-of-band updates |
# Get sensor readings
ipmitool -I lanplus -H <bmc_ip> -U USERID -P <password> sensor
# Get SEL entries
ipmitool -I lanplus -H <bmc_ip> -U USERID -P <password> sel list
# Get FRU info
ipmitool -I lanplus -H <bmc_ip> -U USERID -P <password> fru
# BMC cold reset
ipmitool -I lanplus -H <bmc_ip> -U USERID -P <password> mc reset cold
# Get system info
curl -k -u USERID:<password> https://<bmc_ip>/redfish/v1/Systems/1
# Get thermal status
curl -k -u USERID:<password> https://<bmc_ip>/redfish/v1/Chassis/1/Thermal
# Get power status
curl -k -u USERID:<password> https://<bmc_ip>/redfish/v1/Chassis/1/Power
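
The three Redfish requests above differ only in the resource path, so they can be wrapped in a pair of helpers. This is a hedged sketch: the function names `redfish_url` and `redfish_get` and the environment variables `XCC_USER` and `XCC_PASS` are hypothetical, chosen here to keep the password out of the script itself.

```shell
# Build the URL of a Redfish resource: redfish_url <bmc_ip> <resource_path>
redfish_url() {
    printf 'https://%s/redfish/v1/%s\n' "$1" "$2"
}

# Fetch one resource. Credentials are read from XCC_USER (default USERID)
# and XCC_PASS (required); both variable names are assumptions.
# -k skips TLS certificate verification, as in the commands above.
redfish_get() {
    curl -k -sS -u "${XCC_USER:-USERID}:${XCC_PASS:?set XCC_PASS first}" \
        "$(redfish_url "$1" "$2")"
}

# Usage:
#   export XCC_PASS='...'
#   for r in Systems/1 Chassis/1/Thermal Chassis/1/Power; do
#       redfish_get <bmc_ip> "$r"
#   done
```
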
Last updated: December 2025