"Free-cooled" datacenters use ambient outside air instead of air conditioning. That lets us see how the environment affects system components. Biggest surprise: temperature is not the disk-drive-killing monster we thought. Here's what is.
At last month's USENIX FAST '16 conference, in the Best Paper award winner "Environmental Conditions and Disk Reliability in Free-cooled Datacenters," researchers Ioannis Manousakis and Thu D. Nguyen of Rutgers, Sriram Sankar of GoDaddy, and Gregg McKnight and Ricardo Bianchini of Microsoft studied how the higher and more variable temperatures and humidity of free-cooling affect hardware components. They reached three key conclusions:
- Relative humidity, not higher or more variable temperatures, has a dominant impact on disk failures.
- High relative humidity causes disk failures largely due to controller/adapter malfunction.
- Despite the higher failure rates, free-cooling combined with software that masks failures is a huge money-saver.
Background
Datacenters are energy hogs. A web-scale datacenter can use more than 30 megawatts, and collectively datacenters are estimated to use 2 percent of US electricity production.
Moreover, the chillers for water cooling, and the backup power required to keep them running in a blackout, are costly too. As the use of cloud services has grown, the cost of hyperscale datacenters has led to more experimentation with techniques such as free-cooling and higher operating temperatures.
But to fully optimize these techniques, operators also need to understand their impact on the equipment. If lower energy costs are offset by higher hardware costs and downtime, it isn't a win.
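To make that trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the researchers' model. The function names and every dollar figure are hypothetical placeholders chosen only to show the arithmetic: yearly cooling savings minus the cost of the extra drive failures.

```python
def net_annual_savings(cooling_savings_usd, extra_failed_drives, cost_per_failure_usd):
    """Yearly cooling savings minus the cost of the additional drive failures.

    cost_per_failure_usd is assumed to bundle replacement hardware,
    labor, and any downtime into one per-drive figure.
    """
    return cooling_savings_usd - extra_failed_drives * cost_per_failure_usd


def break_even_extra_failures(cooling_savings_usd, cost_per_failure_usd):
    """Number of extra failures per year at which the savings are wiped out."""
    return cooling_savings_usd / cost_per_failure_usd


if __name__ == "__main__":
    # All three inputs are made-up placeholders, not figures from the study.
    savings = 1_000_000      # USD saved per year by skipping chillers
    extra_failures = 500     # additional failed drives per year
    cost_each = 300          # USD per failed drive, all-in

    print(net_annual_savings(savings, extra_failures, cost_each))  # 850000
    print(break_even_extra_failures(savings, cost_each))
```

With these placeholder inputs free-cooling still comes out ahead. The point is only that the win depends on where the extra failure count sits relative to the break-even number.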
The study
The researchers looked at nine Microsoft datacenters around the world for periods ranging from 1.5 to 4 years, covering over 1 million drives. They gathered environmental data, including temperature and relative humidity and the variation of each.
Being good scientists, they took the data and built a model to analyze the results. They quantified the trade-offs between energy, environment, reliability, and cost. Finally, they have some suggestions for datacenter design...