Collaboration with the University of Bordeaux on structured pruning co-design

Our team collaborated with researchers from the University of Bordeaux to analyze how configuration choices across the hardware and software stack affect performance metrics.

The results show that structured pruning on systems with systolic array acceleration can effectively increase performance while maintaining high quality of service (QoS).
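To illustrate why structured pruning pairs well with a systolic array, here is a minimal sketch that zeroes out whole square tiles of a weight matrix, so that an accelerator working on tiles of the same size could skip them entirely. It is an illustration only and not the method used in the paper: the function name `prune_tiles`, the L1-norm scoring rule, and the `tile` and `keep_ratio` parameters are all assumptions made for this example.

```python
def prune_tiles(weights, tile, keep_ratio):
    """Zero out the lowest-magnitude tile x tile blocks of a 2D weight matrix.

    Hypothetical helper for illustration; the scoring rule (L1 norm per tile)
    is an assumption, not the criterion used in the paper.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % tile or cols % tile:
        raise ValueError("matrix dimensions must be multiples of the tile size")

    # Score every tile by the sum of absolute values of its entries.
    scores = {}
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            scores[(r, c)] = sum(
                abs(weights[i][j])
                for i in range(r, r + tile)
                for j in range(c, c + tile)
            )

    # Keep the highest-scoring tiles; always keep at least one.
    n_keep = max(1, round(keep_ratio * len(scores)))
    kept = set(sorted(scores, key=scores.get, reverse=True)[:n_keep])

    # Copy the matrix and zero every tile that was not kept.
    pruned = [row[:] for row in weights]
    for (r, c) in scores:
        if (r, c) not in kept:
            for i in range(r, r + tile):
                for j in range(c, c + tile):
                    pruned[i][j] = 0.0
    return pruned, kept


# Made-up 4x4 example with 2x2 tiles, keeping half of the tiles.
example = [
    [1.0, 1.0, 0.1, 0.1],
    [1.0, 1.0, 0.1, 0.1],
    [0.2, 0.2, 2.0, 2.0],
    [0.2, 0.2, 2.0, 2.0],
]
pruned_example, kept_tiles = prune_tiles(example, tile=2, keep_ratio=0.5)
```

Because entire tiles become zero, the sparsity pattern follows the granularity of the hardware, which is the general motivation for choosing structured over unstructured pruning in this setting.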

We measured system-wide speedups of up to 44% from structured pruning and quantization, with a word error rate (WER) degradation of only 1.4% on the standard LibriSpeech dataset.
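For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below computes it with a standard dynamic-programming edit distance. The function name `word_error_rate` and the sample sentences are made up for illustration, and the example does not reproduce the reported figures; this post does not state whether the 1.4% is an absolute or a relative difference.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts: the hypothesis drops one of six reference words.
wer_example = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

WER degradation is then the difference between the WER of the pruned model and that of the unpruned baseline, evaluated on the same test set.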

This work has been accepted for Interspeech 2025.

Congratulations to the authors:

Jean-Luc Rouas, Charles Brazier, Leila Ben Letaifa, Rafael Medina Morillas, Pedro Palacios Almendros, David Atienza and Giovanni Ansaloni
Systolic Arrays and Structured Pruning Co-design for Efficient Transformers in Edge Systems
Association for Computing Machinery
GLSVLSI ’25: Proceedings of the Great Lakes Symposium on VLSI 2025