Key points for power systems managers
Introduction
The Uptime Institute has released its Annual Global Data Center Survey 2022 – the twelfth in its annual series. It presents a snapshot of the practices, trends and challenges shaping the mission-critical digital infrastructure industry, and is the most comprehensive and longest running of its kind.
The survey examines the state of the industry in terms of performance, resiliency, efficiency and sustainability, staffing and innovative technologies. Its subtitle is “Resiliency remains critical in a volatile world”, and two themes run throughout the survey: the causes, levels and effects of resiliency issues in data centres, and the growing need to both achieve and demonstrate efficiency and, with it, sustainability. Good power system design can contribute to improving both.
Here are the key takeaways from the report that we consider useful for businesses to understand.
Industry benchmarks
Data centres need benchmarks to track their levels of efficiency; the two major metrics used are power usage effectiveness (PUE) and rack power density. PUE is the ratio of a data centre’s total power intake to the power consumed by its IT hardware, so it shows how much extra goes to supporting infrastructure such as cooling and power conditioning systems; a PUE of 1.0 would mean no overhead at all. However, the IT function itself also suffers inefficiencies due to cooling fans, internal power distribution system losses and semiconductor power losses – yet there is no clear way to quantify these.
The difficulty of adopting composite IT efficiency metrics makes it likely that most operators will instead rely on a collection of simpler indicators to promote IT efficiency gains, such as server utilization, hardware age or the application of power management tools.
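To make the PUE metric concrete, here is a minimal Python sketch of the standard calculation (total facility energy divided by IT equipment energy). The meter readings are hypothetical, chosen only so that the result lands on the 2022 survey average; the function names are ours and do not come from the report.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy divided by IT energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Fraction of total facility energy that does not reach the IT hardware."""
    return 1.0 - 1.0 / pue_value


# Hypothetical annual meter readings (assumed values, not survey data).
annual_total_kwh = 15_500_000
annual_it_kwh = 10_000_000

site_pue = pue(annual_total_kwh, annual_it_kwh)
print(f"PUE: {site_pue:.2f}")
print(f"Share of energy spent on overhead: {overhead_share(site_pue):.1%}")
```

Note that this says nothing about losses inside the IT equipment itself (fans, internal power distribution, semiconductors), which is exactly the gap described above.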
Slowdown in PUE reduction: Survey respondents’ average annual PUE in 2022 was 1.55. The figure initially dropped steeply from 2.5 in 2007 – when Uptime started tracking it – to 1.65 in 2014 as facilities adopted the easier, less expensive efficiency measures. However, progress has slowed markedly since then.
Nevertheless, opportunities still exist. New data centre builds routinely outperform the average, achieving PUEs of 1.3 and below by using facility designs and more advanced equipment that are optimized for lower energy use.
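The scale of the slowdown can be checked with simple arithmetic on the three figures quoted above. The sketch below assumes a straight-line change between the quoted years, which is our simplification rather than anything stated in the survey.

```python
def annual_pue_change(start_year: int, start_pue: float,
                      end_year: int, end_pue: float) -> float:
    """Average PUE change per year, assuming a straight line between two points."""
    return (end_pue - start_pue) / (end_year - start_year)


# Survey averages quoted in the text.
early = annual_pue_change(2007, 2.5, 2014, 1.65)
late = annual_pue_change(2014, 1.65, 2022, 1.55)

print(f"2007-2014: {early:+.4f} per year")
print(f"2014-2022: {late:+.4f} per year")
print(f"Early improvement was about {early / late:.0f} times faster")
```

On these figures, average PUE fell by roughly 0.12 per year up to 2014 and by only about 0.01 per year afterwards.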
One caveat is that upcoming server processor chips may run hotter within a few years, causing average PUE to rise before it falls.
Rack power density: Increases in rack power density, after being slow for many years, are now accelerating; over a third of respondents reported rapid growth over the last three years.
Server refresh cycles are slowing, due to ongoing semiconductor shortages, higher prices, extended delivery times, and reduced buying power for smaller organisations. The trend may also reflect a slowdown in server power efficiency gains.
Sustainability and accountability
Sustainability has been a serious consideration for infrastructure operators for many years, but only since 2020 has a data centre’s environmental footprint rivalled its resiliency as a major concern. In the coming years, legislators and other authorities will force operators to report significantly more data and to demonstrate a commitment to good environmental stewardship.
Most operators collect data that relates to power efficiency, which is as much about saving money as it is about reducing environmental impact. In 2022, 85% said they report their overall data centre power use and 73% report PUE (for either internal or external use). But when it comes to carbon emissions, the proportion who collect this data is still low (37%). This may quickly become an area of concern for businesses, as most organizations and/or their customers will be required to report this data under new laws, initiatives and rules that are being implemented around the world.
Environmental reporting: Although many are ill-prepared, most data centre operators surveyed (63%) think that authorities in their region will require data centres to publicly report environmental data in the next five years. Given that much of this legislation is already in motion (even in less regulated countries, public financial reporting is likely to require mandatory sustainability reporting), many of the remaining 37% will be forced to revise their opinion and instigate preparations.
Part of the solution rather than the problem: Although the industry is often singled out as a major energy consumer and large carbon emitter, many in the industry see this as unfair and view themselves as more of a solution than a problem. Data centres are estimated to account for only 2% to 3% of electricity use, which represents about 0.4% to 0.75% of global CO2 emissions. Also, IT enables significant efficiencies and fuel savings elsewhere (for example, more advanced engineering and less business travel) that more than offset its consumption.
Making data centres more sustainable
Over several decades, efforts to improve data centre sustainability have spanned a large number of technologies, directives, strategies and initiatives. While many of these have played a role, respondents highlighted two in particular: more options for buying renewable energy, and improved cooling. Improved IT utilization, an issue that Uptime argues needs more attention, was selected by only one in six.
Nuclear power: Uptime’s 2022 annual survey shows that data centre operators and owners in major data centre economies around the world are cautiously in favour of nuclear power. 53% of respondents in North America, and 42% in Europe, think that nuclear power should play a core role in reducing carbon footprint, while a further 23% in North America and 28% in Europe believe it should play a temporary or transitional role.
Resiliency and outages
Power systems can make key contributions to resiliency as well as sustainability. Operators striving to deliver services with resilient data centres continue to invest in redundancy, but outages remain an issue.
IT outages themselves have become less binary: failures are now often partial, distributed and dependent on user configurations. Overall, Uptime’s data suggests that the number of outages globally increases year on year as the industry itself expands, but outage frequency is not growing as fast as the global data centre footprint. In 2022, 60% of operators surveyed said they had an outage in the past three years, down from 69% in 2021 and 78% in 2020.
These figures may not yet represent a strong trend and, although they are improving, the level is still high. There are also signs that the impact of at least some outages is decreasing, helped by the fact that outages are increasingly caused by partial rather than total failures of systems or equipment.
As a counterpoint to this, outages are becoming more expensive. A quarter of respondents say an outage cost them more than $1 million in direct and indirect costs, a significant increase from 2021 and the continuation of a clear trend.
Power is still the main cause of outages
Uptime’s 2022 annual survey findings are remarkably consistent with previous years. They show that on-site power problems remain the single biggest cause of significant site outages by a large margin.
Deeper analysis in separate Uptime research identifies the biggest causes of power-related outages to be uninterruptible power supply failures, followed less commonly by transfer switch (generator / grid) and generator failures. While Uptime never attributes utility grid failures as a primary cause of outages, the slight increase in power-related failures in recent years may correlate with degrading grid reliability that lays bare substandard maintenance and training at some data centre sites.
Most outages are preventable: Human error is a contributory factor in 60% to 80% of outages. This supports Uptime’s frequently made point: the most impactful and cost-effective way to reduce outage occurrences is to improve management, planning and training.
More operators are increasing data centre resiliency: The growing use of public cloud infrastructures and cloud-style enterprise IT has been accompanied by the broader adoption of multisite resiliency, which was once exclusive to mission-critical applications. New software development techniques help make distributed redundancy easier because network traffic and workloads can be dynamically diverted to, and lost service easily recovered at, other sites.
Yet, operators continue to invest in increasing the resiliency of their physical infrastructure. About 40% of respondents say that they have increased the redundancy levels of their primary data centres in the past three to five years. Power and cooling systems have received similar attention, with about a third of operators surveyed upgrading either or both.
Users unprepared for inevitable cloud outages: Organizations are becoming more confident in using the cloud for mission-critical workloads, partly due to a perception of improved visibility into operational resiliency. However, this confidence may be misplaced. Cloud providers recommend that users distribute their workloads across multiple availability zones, as zone outages are relatively common. Each availability zone has separate power, networking and connectivity to help prevent more than one zone failing simultaneously.
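The reasoning behind spreading workloads across zones can be illustrated with a back-of-envelope calculation. The sketch below assumes that zone failures are fully independent and uses a hypothetical per-zone availability of 99.9%; neither assumption comes from the survey, and real zones can share failure causes, so treat the results as an optimistic upper bound.

```python
HOURS_PER_YEAR = 8760


def combined_availability(zone_availability: float, zones: int) -> float:
    """Probability that at least one of `zones` independent zones is up."""
    if not 0.0 <= zone_availability <= 1.0:
        raise ValueError("zone_availability must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    return 1.0 - (1.0 - zone_availability) ** zones


# Hypothetical per-zone availability (an assumed figure, not a provider's SLA).
ASSUMED_ZONE_AVAILABILITY = 0.999

for zone_count in (1, 2, 3):
    availability = combined_availability(ASSUMED_ZONE_AVAILABILITY, zone_count)
    downtime_hours = (1.0 - availability) * HOURS_PER_YEAR
    print(f"{zone_count} zone(s): expected downtime {downtime_hours:.4f} hours/year")
```

The benefit only materialises if the workload can actually fail over between zones, which is a matter of application design and testing rather than something the cloud provider supplies automatically.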
Vendors and supply chains
Thrown off-balance by the pandemic, supply chains remain stretched by continued demand for new data centre capacity and facility upgrades. Three-quarters of vendors surveyed, including equipment makers, engineering services firms and consultants, project their revenues to increase in 2022, compared with 2021.
Staffing shortfalls
Attracting and retaining qualified data centre staff has been a challenge for operators for more than a decade. Continuously escalating demand for data centre capacity has driven an increase in the number and size of facilities, and a proliferation of job openings that still outpaces recruitment.
Innovation and impact
The industry continues to seek ways to cut back on new build capital needs and to improve energy efficiency. Several technologies have been proposed as the next step in data centre evolution and these are currently undergoing widespread testing or proof of concept deployments.
Half of data centre operators surveyed say software-defined power is one of the technological innovations they think most likely to deliver significant improvements in energy efficiency in the next five years. This category includes various capabilities that rely on software interacting with electrical systems, including load shedding, load balancing and server throttling.
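As an illustration of what load shedding under software control might look like, here is a minimal Python sketch that switches off the least critical loads until total demand fits a power budget. The `Load` type, the priority convention, the load names and the figures are all hypothetical; this is not any vendor’s API or a method described in the survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    name: str
    power_kw: float
    priority: int  # hypothetical convention: 1 = most critical


def select_loads_to_shed(loads: list[Load], budget_kw: float) -> list[Load]:
    """Pick loads to switch off, least critical first, until demand fits the budget."""
    total_kw = sum(load.power_kw for load in loads)
    shed: list[Load] = []
    # Least critical first; among equal priorities, larger loads first.
    for load in sorted(loads, key=lambda item: (-item.priority, -item.power_kw)):
        if total_kw <= budget_kw:
            break
        shed.append(load)
        total_kw -= load.power_kw
    return shed


# Hypothetical loads totalling 110 kW against an 80 kW budget.
loads = [
    Load("payments-db", 40.0, 1),
    Load("web-frontend", 30.0, 2),
    Load("batch-analytics", 25.0, 3),
    Load("dev-test", 15.0, 4),
]

to_shed = select_loads_to_shed(loads, budget_kw=80.0)
print("Shed:", [load.name for load in to_shed])
```

A production system would also need to consider restart order, minimum run times and the electrical topology, none of which this sketch attempts.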
Artificial intelligence (AI) for data centre operations, the development of multisite resiliency as an alternative to equipment redundancy and the adoption of direct liquid cooling (DLC) are among other efficiency technologies being closely observed.
Conclusion
Companies continue to use PUE to track their drive towards better efficiency, yet the metric is not perfect. Inefficiencies within the IT function must also be understood, but quantification is difficult. However, operators will work on this; in addition to countering rising energy costs, they must satisfy more stringent legislative demands to achieve and prove better sustainability.
Outages – the other major concern for data centres – although reducing, remain too frequent, while becoming increasingly expensive. Separate Uptime research found the biggest causes of power-related failures to be uninterruptible power supply failures, followed less commonly by transfer switch (generator / grid) and generator failures. While many companies are turning to multisite data centre operation to improve resiliency, the Institute reiterates its long-held belief that the most impactful and cost-effective way to reduce outage occurrences is to improve management, planning and training.
To find out more about KUP’s range of UPS systems and UPS maintenance programmes, please get in touch by calling 01256 386700.