A SCADA system can emit thousands of alarms in a morning, and most are the same fault told ten different ways. An inverter trips, a string voltage sags, a comms link flaps, and the control room sees a wall of red that says nothing. The work is not collecting alarms. It is turning that raw flood into a handful of structured events an operator, or another system, can act on.
If an operator cannot tell a tripped breaker from a flapping comms link at a glance, the alarm system has failed, however much it logs.
From flood to signal
Most alarm noise comes from treating every edge as an event. A signal that chatters around a threshold, say a transformer temperature sitting at the trip point, fires a fresh alarm on each transition. The first step is to debounce: require a condition to hold for a defined dwell time, or apply hysteresis with separate set and clear thresholds, before anything propagates. This removes the duplicates caused by chatter, while a real fault that persists past the dwell time still gets through.
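A minimal sketch of that debounce step, combining hysteresis with a dwell time for one analogue signal. The class name, the thresholds and the five-second dwell are illustrative assumptions, not values from a real site.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebouncedAlarm:
    """Hysteresis plus dwell time for one analogue signal (illustrative sketch)."""

    set_threshold: float    # alarm becomes a candidate at or above this value
    clear_threshold: float  # alarm may clear only at or below this value
    dwell_s: float          # condition must hold this long before it propagates
    active: bool = False
    _pending_since: Optional[float] = None

    def update(self, value: float, t: float) -> Optional[str]:
        """Feed one sample taken at time t (seconds); return "raised", "cleared" or None."""
        # While inactive we wait for the set condition, while active for the clear one.
        if self.active:
            condition = value <= self.clear_threshold
        else:
            condition = value >= self.set_threshold

        if not condition:
            # The condition broke before the dwell elapsed: restart the timer.
            self._pending_since = None
            return None

        if self._pending_since is None:
            self._pending_since = t
        if t - self._pending_since < self.dwell_s:
            return None

        # The condition has held for the full dwell time: flip state and report once.
        self.active = not self.active
        self._pending_since = None
        return "raised" if self.active else "cleared"


# Hypothetical transformer temperature with a 95 degree trip point.
alarm = DebouncedAlarm(set_threshold=95.0, clear_threshold=90.0, dwell_s=5.0)
chatter = [(t, 95.1 if t % 2 else 94.9) for t in range(10)]
chatter_events = [e for t, v in chatter if (e := alarm.update(v, float(t)))]
```

In this sketch the ten chattering samples produce no events at all, because the condition never holds for the full dwell. The cost is latency: a real fault is reported one dwell time after it starts, so the dwell has to be chosen per signal.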
The second step is to model state, not edges. An alarm has a lifecycle: raised, acknowledged, cleared. What matters downstream is the current condition of an asset, not the number of transitions it took to get there. We normalise raw Modbus registers and OPC UA nodes into one canonical state per asset, then emit an event only when that state changes. Deduplication and suppression of standing alarms fall out of a clean state model.
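The state model can be sketched as a tracker that keeps one canonical state per asset and returns an event only when that state changes. The status-word bit layout below is hypothetical, since real Modbus register maps are vendor-specific, and the class and function names are ours for illustration.

```python
from enum import Enum
from typing import Dict, NamedTuple, Optional


class AssetState(Enum):
    CLEARED = "cleared"
    RAISED = "raised"
    ACKNOWLEDGED = "acknowledged"


class StateChange(NamedTuple):
    asset_id: str
    previous: AssetState
    current: AssetState


def state_from_status_word(word: int) -> AssetState:
    """Normalise a raw status register into a canonical state.

    Hypothetical layout: bit 0 = fault active, bit 1 = acknowledged.
    """
    if not word & 0x1:
        return AssetState.CLEARED
    return AssetState.ACKNOWLEDGED if word & 0x2 else AssetState.RAISED


class StateTracker:
    """Holds the current state of each asset and reports only changes."""

    def __init__(self) -> None:
        self._states: Dict[str, AssetState] = {}

    def observe(self, asset_id: str, state: AssetState) -> Optional[StateChange]:
        # Assets we have not seen before are assumed to start in CLEARED.
        previous = self._states.get(asset_id, AssetState.CLEARED)
        if state == previous:
            return None  # standing alarm or repeated reading: nothing to emit
        self._states[asset_id] = state
        return StateChange(asset_id, previous, state)


tracker = StateTracker()
readings = [0b01, 0b01, 0b01, 0b11, 0b00, 0b00]  # six polls of one inverter
changes = [
    c for word in readings
    if (c := tracker.observe("inv-01", state_from_status_word(word)))
]
```

Six polls yield three events here (raised, acknowledged, cleared); the repeated readings are dropped because the state did not move.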
Correlate, prioritise, route
A single root cause shows up as many symptoms. A lost comms link to a string combiner can raise undervoltage, zero-power and stale-data alarms at once. Correlation groups these by asset topology and time window, so the operator sees one event, the comms fault, with the rest attached as context rather than as peers. Each event carries a priority derived from severity and impact, so a grid-export trip ranks above a nuisance reading on a single sensor.
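One way to sketch that grouping: treat certain alarm kinds as root causes, then attach any alarm on the same asset or on an asset below it in the topology, within a time window, as context. The `ROOT_KINDS` set, the 30-second window, the asset names and the severity-times-impact score are all assumptions made for the example; a real system would tune each of them.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional

ROOT_KINDS = {"comms_loss"}  # assumption: kinds treated as likely root causes


@dataclass
class Alarm:
    asset_id: str
    kind: str
    t: float       # seconds
    severity: int  # 1 (low) to 3 (high), illustrative scale
    impact: int    # 1 (single sensor) to 3 (grid export), illustrative scale


@dataclass
class Event:
    root: Alarm
    context: List[Alarm] = field(default_factory=list)

    @property
    def priority(self) -> int:
        # Simple illustrative score; any monotonic combination would do.
        return self.root.severity * self.root.impact


def lineage(asset_id: str, parent_of: Dict[str, Optional[str]]) -> Iterator[str]:
    """Yield the asset, then each parent up the topology (assumed to be a tree)."""
    current: Optional[str] = asset_id
    while current is not None:
        yield current
        current = parent_of.get(current)


def correlate(alarms: List[Alarm], parent_of: Dict[str, Optional[str]],
              window_s: float = 30.0) -> List[Event]:
    roots = [Event(root=a) for a in alarms if a.kind in ROOT_KINDS]
    events = list(roots)
    for alarm in alarms:
        if alarm.kind in ROOT_KINDS:
            continue
        chain = set(lineage(alarm.asset_id, parent_of))
        match = next((e for e in roots
                      if e.root.asset_id in chain
                      and abs(alarm.t - e.root.t) <= window_s), None)
        if match is not None:
            match.context.append(alarm)        # a symptom, shown as context
        else:
            events.append(Event(root=alarm))   # stands alone as its own event
    return sorted(events, key=lambda e: e.priority, reverse=True)


parent_of = {"string-1": "combiner-A", "string-2": "combiner-A"}
alarms = [
    Alarm("combiner-A", "comms_loss", 100.0, severity=2, impact=2),
    Alarm("string-1", "undervoltage", 101.0, severity=2, impact=1),
    Alarm("string-2", "zero_power", 102.0, severity=2, impact=1),
    Alarm("string-1", "stale_data", 105.0, severity=1, impact=1),
    Alarm("export-breaker", "grid_export_trip", 400.0, severity=3, impact=3),
    Alarm("sensor-7", "nuisance_reading", 410.0, severity=1, impact=1),
]
events = correlate(alarms, parent_of)
```

Six raw alarms collapse into three events in this example: the grid-export trip first, the comms fault with its three symptoms attached, and the single-sensor reading last.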
With events structured and prioritised, routing is mechanical. We publish them to a message bus, over MQTT or a similar protocol, where consumers subscribe to what they care about: a ticketing system for maintenance, an on-call notifier for safety-critical trips, a time-series store such as TimescaleDB for trending and post-event analysis. Routing rules live in one place, so the same event can update a dashboard, open a work order and stay off the pager when it does not warrant one.
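A sketch of what "routing rules in one place" can look like: a single table of rules, each naming a destination and the condition under which an event goes there. The destination names, the priority cut-off and the `safety_critical` flag are hypothetical, and the actual publish to a broker is left out so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class StructuredEvent:
    asset_id: str
    kind: str
    priority: int
    safety_critical: bool = False


@dataclass(frozen=True)
class Rule:
    destination: str                           # e.g. a topic or consumer name
    matches: Callable[[StructuredEvent], bool]


# The whole routing policy lives in this one table (illustrative values).
RULES: Sequence[Rule] = (
    Rule("timeseries", lambda e: True),             # everything is stored for trending
    Rule("dashboard", lambda e: True),              # everything updates the dashboard
    Rule("ticketing", lambda e: e.priority >= 4),   # work orders for significant faults
    Rule("pager", lambda e: e.safety_critical),     # on-call only for safety-critical trips
)


def route(event: StructuredEvent, rules: Sequence[Rule] = RULES) -> List[str]:
    """Return every destination whose rule matches, in table order."""
    return [rule.destination for rule in rules if rule.matches(event)]


comms_fault = StructuredEvent("combiner-A", "comms_loss", priority=4)
export_trip = StructuredEvent("export-breaker", "grid_export_trip", priority=9,
                              safety_critical=True)
nuisance = StructuredEvent("sensor-7", "nuisance_reading", priority=1)

comms_routes = route(comms_fault)
trip_routes = route(export_trip)
nuisance_routes = route(nuisance)
```

Under these rules the comms fault updates the dashboard and opens a work order but never reaches the pager, while the export trip goes to every destination. Changing who gets what means editing the table, not the consumers.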
Alarm fatigue is not solved by better dashboards over the same flood. It is solved upstream, by debouncing, modelling state, correlating root cause and routing on priority, so what reaches a person is rare and meaningful. We build this event layer between SCADA and the systems that depend on it. If your control room is drowning in red, talk to us about it.