GrateTile: Efficient Sparse Tensor Tiling for CNN Processing

IEEE Workshop on Signal Processing Systems (SiPS)


Yu-Sheng Lin, Hung Chang Lu, Yang-Bin Tsao, Yi-Min Chih, Wei-Chao Chen, Shao-Yi Chien




We propose GrateTile, an efficient, hardware friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized subtensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompressed sub-tensors on-the-fly in a tiled processing manner.

GrateTile is suitable for architectures that favor aligned, coalesced data access, and only requires minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average of 55% DRAM bandwidth reduction while using only 0.6% of feature map size for indexing storage.


  • Neural Network Hardware
  • Data Compression
  • Sparse Matrix