An Exaflop Under 20 Megawatts by 2019? Not Likely
Kirk Cameron is obsessed with power efficiency for servers. The Virginia Tech professor, who created benchmarks for the Department of Energy's Energy Star program, co-founded the Green 500, and is a co-founder of the power-management software startup MiserWare, says that no matter how well the industry thinks it's improving power efficiency, it's losing the battle.
In a talk at SC12 on Monday, he asserted that exascale computing faces an almost intractable energy-consumption problem. The Department of Energy had set a goal of getting under 20 MW for a single exaflop by 2019. Cameron says word now is that the time frame is slipping to 2022. He doesn't think it will happen by then either.
"The power consumption of systems is increasing exponentially," he said. "We can claim that we're being more energy efficient on the one hand, because we're getting more megaflops per watt, but we're still on a trajectory of using more power for the newer machines than we did in the previous years. That doesn't really seem like green to me."
He noted that from 2007 to 2012, the HPC industry saw a six-fold increase in flops per watt, but still saw a 2.5x increase in power consumption. If the industry stays on that path, it will reach about 15,000 megaflops per watt by around 2020. That comes out to 66 MW per exaflop, more than three times the goal DOE had set.
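That figure is easy to check: an exaflop is 10^18 flops, so dividing by the projected efficiency gives the power bill. A quick back-of-envelope sketch in Python, using only the numbers quoted from the talk:

```python
# Back-of-envelope check of the projection above (numbers from Cameron's talk).
exaflop = 1e18                    # one exaflop, in flops
efficiency = 15_000 * 1e6         # projected ~15,000 megaflops per watt, in flops/W

watts = exaflop / efficiency      # power needed to sustain an exaflop at that efficiency
print(f"{watts / 1e6:.1f} MW per exaflop")   # -> 66.7 MW, versus the DOE's 20 MW target
```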
What's worse, the efficiency gained over the last five years has been driven in large part by highly energy-efficient GPUs, and he doesn't see anything on the horizon that will deliver the same boost in performance per watt. "We have to have something at least equivalent to the efficiency jump we got from GPUs to happen again in the next five to seven years," he said. "We need a lot of help. We need disruptive things. We need another thing like the GPU or we're not going to make it. But a silver bullet is unlikely."
For one thing, he doesn't believe microprocessor technology will improve enough. What about the growing interest in low-power ARM chips, for example? "We've seen this before. IBM did it in Blue Gene, using a low-power embedded chip. The problem is you need 130,000 of them to approach the same performance. You don't solve the problem, you just change where the problem manifests itself. Now you have 130,000 things to program around. We all know how that worked out."
The only potential solution, he argues, is not to rely on better processors but to get behind aggressive power management techniques: getting a CPU to drop into a lower power state when its tasks don't need full speed, and to ramp its power back up as the compute load increases. (Of course, that's just the kind of thing he's working on.) But it's hard to get the HPC community to make that kind of mindset change.
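What he's describing is, in essence, load-driven frequency and power-state scaling. A minimal sketch of that feedback loop in Python, with made-up power states and a simulated load reading rather than anything from MiserWare's actual software:

```python
import random
import time

# Illustrative power-state table: (label, relative clock, relative power draw).
# Real tables come from the hardware and OS (e.g. Linux cpufreq); these values are made up.
P_STATES = [
    ("low",    0.50, 0.35),
    ("medium", 0.75, 0.60),
    ("high",   1.00, 1.00),
]

def read_cpu_utilization():
    """Stand-in for a real load measurement (e.g. deltas from /proc/stat on Linux)."""
    return random.random()

def pick_state(utilization):
    """Map observed load to a power state: stay low unless the work demands more."""
    if utilization < 0.40:
        return P_STATES[0]
    if utilization < 0.80:
        return P_STATES[1]
    return P_STATES[2]

def governor_loop(cycles=10, interval_s=1.0):
    """Periodically re-evaluate load and request the matching power state."""
    current = None
    for _ in range(cycles):
        util = read_cpu_utilization()
        target = pick_state(util)
        if target is not current:
            # A real governor would program the frequency through a driver;
            # here we just report the decision.
            print(f"util={util:.0%} -> {target[0]} state "
                  f"({target[1]:.0%} clock, ~{target[2]:.0%} power)")
            current = target
        time.sleep(interval_s)

if __name__ == "__main__":
    governor_loop()
```

The sketch leaves out the hard part, of course: deciding when to downshift without hurting time-to-solution.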
Given the unlikelihood of a silver bullet, he has another suggestion for what it will take to get people behind new techniques: "What we really need is a major catastrophe."
Intelligence agencies understand the urgency of the problem, he says. "I don't have a problem telling large scale intelligence agencies that they need power management. They've had brownouts. They've had these catastrophes that they've had to recover from. We haven't really seen that in HPC yet. Those catastrophes are what motivates people to use power management."
Still, being an optimist, he suggests that perhaps the HPC industry will see just such a catastrophe in the near future, and adds: "We can only hope."