Software Self-Diagnoses Server Power Supply Replacement

March 23, 2016 by Power Pulse

Fujitsu Laboratories Ltd. has developed a self-diagnostic technology to determine when a power supply unit needs to be replaced. This is software that can run on board the microcontroller of a digitally-controlled power supply, such as those used in servers and other information and communication technology (ICT) devices. Details of this technology were presented at this week's Applied Power Electronics Conference and Exposition 2016 (APEC2016). Fujitsu is aiming for a commercial launch of the technology in 2018.

The power supply units used in ICT hardware have limited lifespans, and maintaining them efficiently is a serious issue in large installations such as datacenters. Using a proprietary model-based development system for power supply units, Fujitsu Laboratories analyzed how power supply degradation changes the signals propagating through the control circuits. From this it developed a new method for evaluating the degradation of power supply units based solely on information already used by the microcontroller that controls the power supply.

The newly developed method was implemented as software in an evaluation environment built on the model-based development systems for power supply units. Testing confirmed that it could automatically diagnose when a power supply unit needs to be replaced, without adding any hardware components to monitor power supply degradation. In datacenter operations, for example, this technology enables power supply units to be replaced on a planned basis, thereby lowering maintenance costs and improving the reliability of operations.

Key features of the technology are as follows:

1. Signal analysis of the power-supply control circuits

It has been difficult to observe the internal conditions of the control circuitry, because simply connecting evaluation equipment to observe the internal signals used to control the power supply introduces noise that affects the operation of the power supply unit.

By building an analytic environment using the model-based development systems for power supply units, Fujitsu Laboratories succeeded in directly monitoring the inner workings of the circuitry. By analyzing data captured while varying the degree of degradation in the electrolytic capacitors, the company discovered that it could evaluate capacitor degradation from the behavior of the output voltage during sudden changes in the power supply's output.

2. New method for determining when to replace power supply units

Based on the data gathered from the microcontroller, Fujitsu Laboratories plotted the relationship between the amount of voltage fluctuation relative to the amount of load fluctuation under normal ICT-device operation and the degree of electrolytic-capacitor degradation, creating a new way to determine when power supply units need to be replaced.

This technology can be added to digitally controlled power supply units used in servers and other ICT hardware to automatically diagnose replacement times without requiring any additional hardware components. Because this technology can evaluate power supply unit degradation even with load increases during overnight backup jobs, for example, it can diagnose power supply unit replacement times as a part of everyday operations. This makes it possible to replace power supplies on a planned basis, reducing maintenance costs and increasing reliability for datacenters.
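The replacement decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Fujitsu's implementation: the article does not disclose the actual mapping from voltage and load fluctuation to capacitor degradation. The sketch assumes that the mapping can be summarized by a single sensitivity (voltage deviation per unit of load step), fitted by least squares through the origin and compared with a baseline recorded when the unit was new. The function names, the sample format, and the `limit_ratio` threshold of 2.0 are all hypothetical.

```python
from typing import Sequence, Tuple

# One observation: (load step in amperes, resulting output-voltage deviation in volts).
# This sample format is an assumption made for illustration.
Sample = Tuple[float, float]


def fit_sensitivity(samples: Sequence[Sample]) -> float:
    """Least-squares slope through the origin of |dV| versus |dI| (volts per ampere)."""
    numerator = sum(abs(di) * abs(dv) for di, dv in samples)
    denominator = sum(di * di for di, _ in samples)
    if denominator == 0:
        raise ValueError("need at least one sample with a non-zero load step")
    return numerator / denominator


def needs_replacement(
    samples: Sequence[Sample],
    baseline_slope: float,
    limit_ratio: float = 2.0,  # hypothetical threshold, not taken from the article
) -> bool:
    """Flag the unit once its sensitivity has grown by limit_ratio over the baseline."""
    if baseline_slope <= 0:
        raise ValueError("baseline_slope must be positive")
    return fit_sensitivity(samples) / baseline_slope >= limit_ratio


if __name__ == "__main__":
    # Made-up numbers: a new unit, then the same unit after capacitor aging.
    new_unit = [(10.0, 0.10), (20.0, 0.20)]
    aged_unit = [(10.0, 0.25), (20.0, 0.50)]
    baseline = fit_sensitivity(new_unit)
    print("replace new unit: ", needs_replacement(new_unit, baseline))
    print("replace aged unit:", needs_replacement(aged_unit, baseline))
```

Because the samples come from ordinary load changes, such as the overnight backup jobs mentioned above, a check of this kind could run during normal operation rather than in a dedicated test mode.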

The power supply units used in servers and other ICT hardware have limited lifespans, and redundant configurations are used to improve reliability, with units arrayed in parallel so that if one fails, operations can continue. But in large-scale installations such as datacenters, increasing the number of power supply units brings with it added maintenance costs related to power-supply failure and replacement. For that reason, there has been a need for technology that can visualize the lifespan and degradation of power supply units, and thereby make maintenance more efficient.

Among the various parts that make up a power supply unit, electrolytic capacitors, which stabilize the input/output flow, are known to be most susceptible to degradation. Since servers are typically in continuous operation, one way to monitor the degradation of electrolytic capacitors without interfering in operations would be to monitor "ripple," the variations in output voltage that accompany the on/off cycling in switching-mode power supplies. But because ripple is only a few percent of the output voltage, it could only be monitored with additional high-precision equipment, making this approach impractical for power supplies in ICT hardware, which have severe constraints on cost and installed surface area.
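To make the ripple argument concrete, a common textbook approximation is shown here, assuming a buck-type output filter (the article does not specify the converter topology). The symbols are introduced for illustration and do not come from Fujitsu's description: $\Delta I_L$ is the inductor ripple current, $f_{sw}$ the switching frequency, $C$ the output capacitance, and $R_{\mathrm{ESR}}$ the capacitor's equivalent series resistance.

```latex
\Delta V_{\mathrm{ripple}} \approx \Delta I_L \left( R_{\mathrm{ESR}} + \frac{1}{8 f_{sw} C} \right)
```

As an electrolytic capacitor ages, $C$ falls and $R_{\mathrm{ESR}}$ rises, so the ripple grows. Since the ripple is only a few percent of the output voltage to begin with, tracking that growth requires resolving a small change in an already small signal, which is why this approach calls for additional high-precision equipment.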

Using its proprietary model-based development systems for digitally controlled power supply units, Fujitsu Laboratories analyzed the signal changes in the control circuitry that are caused by degradation in the power supply unit. The resulting method makes it possible to determine when a power supply unit needs to be replaced based solely on information already used in the power supply controls.

Power supply control software must control the circuit with precise timing, which has made it difficult to add new functions to the software and test them. Here too, use of the model-based development systems for digitally controlled power supply units made the development process more efficient.

Fujitsu Laboratories plans to continue operational testing of this technology in actual power supplies used in servers, with the goal of a practical implementation in 2018. It also plans to expand the range of components that can be automatically diagnosed, and to add functions that would allow pending operations on a server needing a power supply unit replacement to be shunted to a different server, to further reduce the costs of server maintenance and further increase reliability.