431
Design in Tiles: Automating GEMM Deployment on Tile-Based Many-PE Accelerators
arXiv:2512.13638v1 Announce Type: new
Abstract: Tile-based many-Processing Element (PE) accelerators can achieve competitive performance on General Matrix Multiplication (GEMM), but they are extremely hard to program, as their optimal software mapping is deeply coupled with hardware design which is…
Abstract: Tile-based many-Processing Element (PE) accelerators can achieve competitive performance on General Matrix Multiplication (GEMM), but they are extremely hard to program, as their optimal software mapping is deeply coupled with hardware design which is…