Parallelism, performance, and memory

Extreme Computing with Top 500 Supercomputers

Moore’s Law

For many years computers got faster primarily because the clock cycle was shortened, which made the CPU faster. In 1965, Gordon Moore (co-founder of Intel) predicted that the number of transistors (or the transistor density, and hence the speed) on integrated chips would keep doubling at a regular interval for the foreseeable future; the figure usually quoted is every 18 months. This is known as Moore’s law. It proved remarkably accurate for more than 40 years; see the graphs at [wikipedia-moores-law]. Note that doubling every 18 months means an increase by a factor of \(2^{12} = 4096\) every 18 years.
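The factor of 4096 is 12 doublings, which at 18 months per doubling takes 18 years. A minimal sketch of that arithmetic, where the function name `moore_growth_factor` is hypothetical and the 18-month doubling period is the value assumed in the text:

```python
def moore_growth_factor(years, doubling_period_years=1.5):
    """Growth factor after `years`, assuming one doubling every
    `doubling_period_years` (18 months by default, as in the text)."""
    doublings = years / doubling_period_years
    return 2.0 ** doublings


# 18 years is exactly 12 doubling periods of 18 months.
print(moore_growth_factor(18))   # 4096.0
# 14 years is only about 9.3 doublings, a much smaller factor.
print(moore_growth_factor(14))
```

Changing `doubling_period_years` to 2.0 shows how sensitive the long-term factor is to the assumed doubling period.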

Unfortunately, the days of just waiting for a faster computer in order to do larger calculations have come to an end. Two primary considerations are:

  • The limit has nearly been reached of how densely transistors can be packed and how fast a single processor can be made.

  • Even current processors can generally do computations much more quickly than sufficient quantities of data can be moved between memory and the CPU. If you are doing 1 billion meaningful multiplies per second you need to move lots of data around.
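To get a feel for the second point, here is a rough back-of-the-envelope estimate. It assumes, purely for illustration, that every multiply reads two 8-byte floating-point operands from memory and writes one 8-byte result, with no reuse from cache; the function name and parameters are hypothetical.

```python
def required_bandwidth_gb_per_s(ops_per_sec, values_read=2,
                                values_written=1, bytes_per_value=8):
    """Memory traffic in GB/s if every operation moves all of its
    operands to and from main memory (an illustrative worst case)."""
    bytes_per_op = (values_read + values_written) * bytes_per_value
    return ops_per_sec * bytes_per_op / 1.0e9


# One billion multiplies per second under the assumptions above.
print(required_bandwidth_gb_per_s(1.0e9))   # 24.0
```

Real codes reuse data held in registers and cache, so the actual traffic is usually lower, which is exactly why memory hierarchies matter.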

There is a hard limit imposed by the speed of light. A 2 GHz processor has a clock cycle of \(0.5\times 10^{-9}\) seconds. The speed of light is about \(3\times 10^8\) meters per second. So in one clock cycle, information cannot possibly travel more than 0.15 meters, since \(0.15 \mbox{ m} = (0.5\times 10^{-9} \mbox{ sec}) \times (3\times 10^8 \mbox{ m/sec})\). In other words, a light year is a long distance, but a light nanosecond is only about 0.3 meters (1 foot), and in half a nanosecond light covers only about 0.15 meters (0.5 feet). If you’re trying to move billions of bits of information each second then you have a problem.
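The same calculation can be repeated for other clock rates. A minimal sketch, using the rounded speed of light from the text; the constant and function names are hypothetical:

```python
SPEED_OF_LIGHT_M_PER_S = 3.0e8   # rounded value used in the text


def distance_per_cycle_m(clock_hz):
    """Upper bound on how far a signal can travel in one clock cycle."""
    cycle_time_s = 1.0 / clock_hz
    return SPEED_OF_LIGHT_M_PER_S * cycle_time_s


for ghz in (1.0, 2.0, 4.0):
    meters = distance_per_cycle_m(ghz * 1.0e9)
    print(f"{ghz:.0f} GHz: at most {meters:.3f} m per cycle")
```

Signals in real wires travel slower than light in vacuum, so these figures are optimistic upper bounds.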

Another major problem is power consumption. Doubling the clock speed of a processor takes roughly 8 times as much power and also produces much more heat. By contrast, doubling the number of processors only takes twice as much power.
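The factor of 8 is consistent with a common rule of thumb: dynamic power grows like frequency times voltage squared, and the voltage has to rise roughly in proportion to the frequency, giving approximately cubic growth. The sketch below simply compares that assumed cubic model with the linear cost of adding processors; the function names and the exponent are illustrative assumptions, not measured values.

```python
def relative_power_faster_clock(clock_factor, exponent=3):
    """Relative power when the clock speed is scaled by `clock_factor`,
    assuming power grows like clock speed cubed (rule of thumb)."""
    return clock_factor ** exponent


def relative_power_more_processors(processor_factor):
    """Relative power when the number of processors is scaled by
    `processor_factor`, assuming power grows linearly with the count."""
    return processor_factor


print(relative_power_faster_clock(2))      # 8
print(relative_power_more_processors(2))   # 2
```

Under these assumptions, both options double the peak throughput, but the second does so at a quarter of the power, provided the work can actually be split across processors.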

There are ways to continue improving computing power in the future, but they must include two things:

  • Increasing the number of cores or CPUs that can be used simultaneously (i.e., parallel computing)

    • Alternatively, moving towards heterogeneous computing (e.g., CPU+GPU)

  • Using memory hierarchies to keep large amounts of data available to the CPU when it needs them.

Other Resources and Presentations on HPC

A list of presentations from the ANL Training Program on Extreme Scale Computing.