3.5-D Blocking Optimization for Stencil Computations on Modern CPUs
January 1, 2012
Intel Corporation Engineering, 2011-12
Liaison(s): Jatin Chhugani PhD
Advisor(s): Sarah Harris
Student(s): Stanislas Sebag (TL-S), Meera Punjiya (TL-F), Johnathan Chai (S), Steven Hang (S), Max Korbel (F), Trevor Apple (F)
The Intel clinic team is exploiting the architectural features of Intel processors to decrease the execution time of a stencil operation. Stencil operations are used in simulations, such as seismic simulations, and in other computations over large data grids. Typical stencil algorithms are bound by the available memory bandwidth, so their performance does not scale with rapidly increasing processor speed. To take advantage of increasing processor capability, the team designed a 3.5D blocking algorithm that uses caching, data-level parallelism (SIMD), and multi-threading on a multi-core, multi-socket platform to reduce the execution time of the stencil operation.
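The idea behind blocking can be illustrated with a small sketch. It is not the team's implementation: it assumes that "3.5D" refers to spatial blocking in the X-Y plane combined with blocking across time steps, as the term is commonly used, and it uses a hypothetical 7-point averaging stencil with fixed boundary cells. The function names (`make_grid`, `step`, `naive`, `blocked`) and the tile size are illustrative only. SIMD and multi-threading are left out, and the full Z extent is kept per tile rather than streamed plane by plane.

```python
def make_grid(nx, ny, nz):
    """Deterministic test grid, stored as grid[z][y][x] (illustrative data)."""
    return [[[float((x * 7 + y * 3 + z * 5) % 11) for x in range(nx)]
             for y in range(ny)] for z in range(nz)]


def step(src):
    """One sweep of a 7-point stencil; cells on the outer faces stay fixed."""
    nz, ny, nx = len(src), len(src[0]), len(src[0][0])
    dst = [[row[:] for row in plane] for plane in src]
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                dst[z][y][x] = (src[z][y][x]
                                + src[z][y][x - 1] + src[z][y][x + 1]
                                + src[z][y - 1][x] + src[z][y + 1][x]
                                + src[z - 1][y][x] + src[z + 1][y][x]) / 7.0
    return dst


def naive(grid, steps):
    """Baseline: every time step sweeps the whole grid (one full pass each)."""
    for _ in range(steps):
        grid = step(grid)
    return grid


def blocked(grid, steps, tile):
    """Process X-Y tiles one at a time and run all time steps per tile.

    Each tile is loaded with a halo `steps` cells wide, so the tile core
    stays correct after `steps` local sweeps without touching the rest of
    the grid. The halo work is redundant; that is the price of the reuse.
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = [[row[:] for row in plane] for plane in grid]
    for y0 in range(0, ny, tile):
        for x0 in range(0, nx, tile):
            y1, x1 = min(y0 + tile, ny), min(x0 + tile, nx)
            # Expand the tile by the halo, clamped to the grid bounds.
            ey0, ey1 = max(0, y0 - steps), min(ny, y1 + steps)
            ex0, ex1 = max(0, x0 - steps), min(nx, x1 + steps)
            local = [[row[ex0:ex1] for row in plane[ey0:ey1]]
                     for plane in grid]
            for _ in range(steps):
                local = step(local)
            # Copy back only the tile core; the halo cells may be stale.
            for z in range(nz):
                for y in range(y0, y1):
                    out[z][y][x0:x1] = local[z][y - ey0][x0 - ex0:x1 - ex0]
    return out
```

In this sketch each tile is read from the full grid once and then reused for every time step, whereas the baseline rereads the whole grid on every step. On real hardware the tile would be sized to fit in cache, which is how blocking trades a little redundant computation for less memory traffic.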