In search of reliable reliability data

Poring over thousands of reliability reports and metrics makes it clear that utilities have vastly different ways of collecting and calculating data. While it is fine for utilities to decide independently how to make their data collection and analysis work, the differences make these measures difficult to benchmark.

Before the American Public Power Association puts out a report or uses a measure for national benchmarking, we carefully check through the data and flag any irregularities. Some common issues we find within data include missing information or unknowns, rounding, and differing definitions. 

Don’t get me wrong: most utility data are pretty good. But some simple checks can make them even better, which can help you make more informed decisions.

The best performers often have few unknown causes of outages. Sometimes you are not going to be able to figure out what the cause of an outage is, but when you have an unknown, you don’t have enough information to make any decisions or take any action. If you look at your outages and see that you have an unknown cause rate that’s higher than five percent, that is an opportunity to educate and train people in the field on how this information might be able to help the utility invest better in the future. 
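As a rough illustration of the five percent check described above, here is a minimal Python sketch. The cause labels, the sample records, and the function name `unknown_cause_rate` are all hypothetical; your outage management system will have its own cause codes and export format.

```python
def unknown_cause_rate(causes, unknown_labels=("unknown", "")):
    """Return the fraction of outage records whose cause is unknown or blank.

    `unknown_labels` is an assumption: adjust it to the codes your system uses.
    """
    if not causes:
        return 0.0
    labels = {label.lower() for label in unknown_labels}
    unknown = sum(1 for c in causes if (c or "").strip().lower() in labels)
    return unknown / len(causes)


# Hypothetical cause field from ten outage records.
causes = ["Tree", "Unknown", "Equipment", "Animal", "unknown",
          "Weather", "Tree", "Equipment", "", "Vehicle"]

rate = unknown_cause_rate(causes)
needs_attention = rate > 0.05  # the five percent threshold from the text
print(f"Unknown cause rate: {rate:.0%}, above threshold: {needs_attention}")
```

Blank entries are counted as unknown here, which is a design choice: a missing cause tells you no more than one explicitly marked unknown.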

Another common issue is rounding. For example, outage start and end times should fall at arbitrary minutes, but data sets often show clear spikes at regular intervals within the hour. The same rounding happens with the number of customers affected, where we see spikes at counts ending in 50 and 20. That indicates a human process that might not be giving you a fully accurate picture.
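One simple way to look for this kind of rounding is to compare how often values land on round numbers with how often they would if they were spread evenly. The sketch below is only an illustration: the sample timestamps and customer counts are made up, the helper name `multiple_share` is hypothetical, and the choice of 5-minute and 10-customer steps is an assumption you should adapt to your own data.

```python
from datetime import datetime


def multiple_share(values, step):
    """Fraction of integer values that are exact multiples of `step`."""
    if not values:
        return 0.0
    return sum(1 for v in values if v % step == 0) / len(values)


# Hypothetical outage start times and customers-affected counts.
starts = [
    datetime(2023, 1, 1, 8, 0), datetime(2023, 1, 1, 9, 15),
    datetime(2023, 1, 2, 14, 30), datetime(2023, 1, 3, 2, 45),
    datetime(2023, 1, 4, 11, 7), datetime(2023, 1, 5, 17, 30),
    datetime(2023, 1, 6, 6, 0), datetime(2023, 1, 7, 20, 52),
]
customers_out = [50, 120, 37, 200, 20, 150, 8, 50]

minute_share = multiple_share([t.minute for t in starts], step=5)
count_share = multiple_share(customers_out, step=10)

# If minutes were spread evenly, 12 of the 60 minutes (20%) are multiples of 5.
expected_minute_share = 12 / 60

print(f"Start times on a 5-minute mark: {minute_share:.0%} "
      f"(about {expected_minute_share:.0%} expected without rounding)")
print(f"Customer counts that are multiples of 10: {count_share:.0%}")
```

A share far above the even-spread figure suggests that times or counts are being estimated by hand rather than recorded as they occurred. With a real data set you would want many more records than this toy sample before drawing conclusions.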

Smart meters will make some rounding go away, but they can create errors, too. Meter mis-mapping can occur, and the people who wrote your outage management system software might not understand how you want things to be calculated or connected. Comparing customer records against the protective device that operated, the customers recorded as out, and the recorded outage times and minutes can give you some insight into how well your meters are mapped to locations, circuits, and protective devices.

The calculation of major events, or including major events in general reliability data, can also be problematic. This is because utilities don’t have similar definitions of what constitutes a major event and don’t calculate it in the same way. A major event is really a way to describe something that is unusual. But this description is not helpful in explaining to the public how the next major storm could impact the utility or planning how you should invest to mitigate one. 

After major events, the best we can do in terms of benchmarking is to look at restoration curves as a group relative to the other segments of the industry. Following Hurricane Irma in Florida in 2017, we compared the curves for public power and other utilities and saw that public power restored 90 percent of customers, on average, nearly two days faster than other utilities and reached 50 percent of customers restored one day ahead of other utilities. 

Moving ahead, we have to think more critically about what information is useful and what is useless. Getting distracted by everything we could possibly collect about customers or equipment doesn’t help us be more reliable. We need to learn how to separately focus on what we collect for accounting purposes, what allows us to solve operations problems, and what helps us to make the best investments in reliability. 

A number of utilities are looking at reliability from an economic standpoint. This can include taking a backward look at trends in key indices (e.g., SAIDI) by outage cause to see which investments, such as tree trimming, have had an impact, or looking at estimated outage costs by circuit to prioritize system upgrades. For this latter measure, we’ve adapted the interruption cost estimator from Lawrence Berkeley National Laboratory in the eReliability Tracker to make it easy for public power utilities to estimate what outages truly cost. We’ve also worked with the Department of Energy to use outage cost data to allow utilities to simulate the cost of a systemwide outage caused by a cyberattack.
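To make the idea of SAIDI by outage cause concrete, here is a minimal sketch. It uses the standard definition of SAIDI (total customer minutes interrupted divided by total customers served) and simply splits the numerator by cause. The outage records, the customer total, and the function name `saidi_by_cause` are hypothetical, and this is not how the eReliability Tracker computes anything; it is only an illustration of the arithmetic.

```python
from collections import defaultdict


def saidi_by_cause(outages, customers_served):
    """Split SAIDI (minutes per customer served) by outage cause.

    `outages` is an iterable of (cause, customers_interrupted, duration_minutes).
    """
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    customer_minutes = defaultdict(float)
    for cause, customers_out, minutes in outages:
        customer_minutes[cause] += customers_out * minutes
    return {cause: cmi / customers_served
            for cause, cmi in customer_minutes.items()}


# Hypothetical outages for one year on a system serving 10,000 customers.
outages = [
    ("Tree", 100, 60),
    ("Equipment", 50, 120),
    ("Tree", 200, 30),
    ("Unknown", 10, 90),
]

by_cause = saidi_by_cause(outages, customers_served=10_000)
total_saidi = sum(by_cause.values())

for cause, minutes in sorted(by_cause.items(), key=lambda kv: -kv[1]):
    print(f"{cause}: {minutes:.2f} minutes per customer")
print(f"Total SAIDI: {total_saidi:.2f} minutes")
```

Running the same breakdown for each year gives a trend by cause, which is the backward look described above: if tree-related minutes fall after a trimming program, that is at least suggestive evidence the investment had an effect.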

We care about quality reliability metrics because we know how integral they are to your operations. And it makes me proud of the industry to see how intensely everyone is thinking about these problems.