Thermal design exploration for the Open Compute Project

January 06, 2016 // By Tom Gregory
The Open Compute Project Foundation is tackling a big challenge: how to scale computing infrastructure in the most efficient and economical way possible.

Founded by Facebook, the Foundation is fostering a rapidly growing community of engineers around the world whose mission is to design, and enable the delivery of, the most efficient server, storage and data centre hardware for scalable computing.

To achieve this aim, the Open Compute Project Foundation provides a structure in which individuals and organisations can share their intellectual property with Open Compute Projects.

The University of Texas at Arlington (UTA) is one organisation involved in the Open Compute Project. UTA wanted to investigate new cooling strategies to improve the thermal design of the Open Compute Project's Intel-based servers, and the team turned to thermal simulation to help find a solution.

Two different methods were chosen to improve the server’s thermal design: one improved the ducting inside the server, while the other utilised warm water cooling. To assess these options UTA used the 6SigmaET software throughout their project to create, simulate and fine-tune their proposed solutions to the server’s thermal design issues.

Solution 1: Improved Ducting

The first server had a removable chassis cover with an integrated air ducting system. However, this ducting covered only the CPU1 region, which caused excessive flow bypass in the CPU0 region and resulted in warm air entering heat sink 1. The university decided to investigate whether modifications to the server’s ducting system would improve its thermal performance.

Physical experiments were conducted on the server to determine its system impedance, flow rates, total server power consumption, fan speeds and fan power consumption at various power levels. This experimental data was used to generate and calibrate a detailed CFD model of the server in 6SigmaET. The model was solved using the k-epsilon (KE) turbulence model, and its predicted temperatures were matched against those obtained from testing. The CFD model and the experimental data showed good agreement (see figure 1), with a maximum error of 12%.
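The article does not say how the maximum error was calculated. As a rough illustration only, the sketch below assumes it is the largest relative difference between simulated and measured temperature across the test points. The function name and the temperature values are hypothetical placeholders, not UTA's data.

```python
def max_percent_error(measured, simulated):
    """Largest |simulated - measured| / |measured| over all points, in percent.

    Assumption: error is taken relative to the measured value; the
    article does not state which definition was used.
    """
    if len(measured) != len(simulated):
        raise ValueError("measured and simulated must have the same length")
    if not measured:
        raise ValueError("at least one data point is required")
    return max(abs(s - m) / abs(m) * 100.0 for m, s in zip(measured, simulated))


# Placeholder CPU temperatures in degrees C at three power levels (not real data).
measured_c = [50.0, 60.0, 70.0]
simulated_c = [52.0, 57.0, 77.0]

worst_case = max_percent_error(measured_c, simulated_c)
print(f"Maximum error: {worst_case:.1f}%")
```

A check like this is typically run after each calibration pass, so the model is only used for design changes once the worst-case disagreement is acceptable.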

Fig. 1: Comparison of experimental and CFD temperature results for CPU0

The university then used 6SigmaET to improve the server’s ducting system parametrically. The key goal was to reduce flow bypass in the CPU0 region without raising the temperature in the CPU1 region or increasing fan power, which would increase total server power consumption. The size and location of the duct were parameterised in the calibrated CFD model, which was solved for each new design iteration to determine how the processor temperatures would be affected. This process led to a final design for the improved ducting system. The design was prototyped, then tested in the same way as the original server.
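A parametric study of this kind can be sketched as a simple sweep over duct size and location. The code below is not 6SigmaET's interface: `solve` stands in for one CFD run, and `DuctDesign`, `SolveResult`, `sweep` and `toy_solve` are hypothetical names invented for illustration. It also assumes CPU0 temperature is used as a proxy for reduced flow bypass, which the article does not state.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DuctDesign:
    width_mm: float   # hypothetical duct size parameter
    offset_mm: float  # hypothetical duct location parameter


@dataclass(frozen=True)
class SolveResult:
    cpu0_temp_c: float
    cpu1_temp_c: float
    fan_power_w: float


def sweep(widths, offsets, solve, baseline):
    """Try every size/location pair and keep the coolest CPU0 design
    that does not raise CPU1 temperature or fan power above baseline."""
    best = None
    for width, offset in product(widths, offsets):
        design = DuctDesign(width, offset)
        result = solve(design)  # stand-in for one CFD solve
        if result.cpu1_temp_c > baseline.cpu1_temp_c:
            continue  # reject: CPU1 region got warmer
        if result.fan_power_w > baseline.fan_power_w:
            continue  # reject: fan power went up
        if best is None or result.cpu0_temp_c < best[1].cpu0_temp_c:
            best = (design, result)
    return best


def toy_solve(design):
    """Made-up linear response so the sketch runs; not a physical model."""
    return SolveResult(
        cpu0_temp_c=80.0 - 0.1 * design.width_mm,
        cpu1_temp_c=70.0 + 0.05 * design.offset_mm,
        fan_power_w=10.0,
    )


baseline = SolveResult(cpu0_temp_c=80.0, cpu1_temp_c=71.0, fan_power_w=10.0)
best = sweep([50.0, 100.0], [0.0, 40.0], toy_solve, baseline)
print(best)
```

Passing the solver in as a function keeps the selection logic separate from whichever simulation tool produces the temperatures and fan power.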

Fig. 2: Temperature plots of the original server (top) and the server with improved ducting (bottom).

The test results were positive: fan power consumption was reduced by 23.4-40%, fan speeds by 22-26% and flow rate by 31.3-37.3%, while the server’s temperature stayed within the recommended range (see figure 2).