
2nd September

10:30 – 12:30 Registration
12:30 – 13:30 Lunch
13:30 – 17:00 TUTORIAL: Designing for the Neural Processing Unit on AMD Ryzen AI with Open-Source Tools
Chair: Dr. Mario Ruiz
13:30 – 14:00 Tutorial Welcome and General Introductions
Dr. Mario Ruiz
14:00 – 14:30 Introduction to Ryzen AI NPU and Riallto
Dr. Mario Ruiz
14:30 – 15:00 Explore NPU architectural features with Riallto
Dr. Mario Ruiz
15:00 – 15:30 Coffee Break
15:30 – 16:00 Write your own compute kernel and connectivity
Dr. Mario Ruiz
16:00 – 16:30 IRON AIE
Dr. Mario Ruiz
16:30 – 17:00 AMD Ryzen AI Software
Dr. Mario Ruiz

We regret to inform you that the Workshop on Security for Custom Computing Machines (SCCM) has been postponed to avoid overlap with the Conference on Cryptographic Hardware and Embedded Systems (CHES) and will not be held during FPL 2024.

IMPORTANT: To attend AMD’s Tutorial on “Co-Designing Compute Architectures That Can Accelerate Neural Networks Using FINN,” you must register at the following link to obtain access to the AWS instances used during the tutorial. LINK:

3rd September

08:00 – 08:30 Registration
08:30 – 12:30 TUTORIAL: SODA Synthesizer
Chair: Antonino Tumeo
TUTORIAL: Co-Designing Compute Architectures That Can Accelerate Neural Networks Using FINN (Part I)
Chairs: Thomas Preußer, Jakoba Petri-Koenig, Felix Jentzsch, Lukas Stasytis, Michaela Blott and Zaid Al-Ars
08:30 – 10:30 General introduction to FINN
In this tutorial, we present FINN, an open-source experimental framework by AMD Research to help the broader community explore quantized neural network (QNN) inference on FPGAs. FINN builds high-performance dataflow-style FPGA architectures tailored to each network while providing a full-stack solution from quantization-aware model training to bitfile generation.
10:30 – 11:00 Coffee Break
11:00 – 12:30 FINN community and Poster Session
12:30 – 13:30 Lunch
13:30 – 17:15 TUTORIAL: Co-Designing Compute Architectures That Can Accelerate Neural Networks Using FINN (Part II)
Chairs: Thomas Preußer, Jakoba Petri-Koenig, Felix Jentzsch, Lukas Stasytis, Michaela Blott and Zaid Al-Ars
13:30 – 17:15 Hands-on
  • Training a quantized MLP on the UNSW-NB15 dataset with Brevitas
  • Exporting the trained network to FINN-ONNX and verifying it in the FINN compiler
  • Performance estimation and bitfile generation with the FINN compiler
  • Using a simple CNV to explore the various options to configure the FINN builder tool
17:00 – 19:00 Welcome Reception

4th September

08:00 – 08:30 Registration
08:30 – 08:40 Opening
08:40 – 09:40 Keynote
Chair: Luciano Lavagno
08:40 – 09:40 AMD Vitis™ High-Level Synthesis (HLS) Tool: Principles and Evolution
Alain Darte
09:40 – 10:30 Session: Applications
Chair: Carsten Trinitis
09:40 – 10:00 Exploring the Versal AI Engines for Signal Processing in Radio Astronomy
Victor van Wijhe, Vincent Sprave, Daniele Passaretti, Nikolaos Alachiotis, Gerrit Grutzeck, Thilo Pionteck and Steven van der Vlugt
10:00 – 10:20 JSON-CooP: A JSON Decompression/Parsing Co-Design for FPGAs
Tobias Hahn, Stefan Wildermann and Jürgen Teich
10:20 – 10:30 KIT: Kernel Isotropic Transformation of Bilateral Filters for Image Denoising on FPGA
Fanny Spagnolo, Pasquale Corsonello, Fabio Frustaci and Stefania Perri
10:30 – 11:00 Coffee Break
11:00 – 12:20 Session: Placement & Routing
Chair: Dirk Stroobandt
11:00 – 11:20 DynaRapid: Fast-Tracking from C to Routed Circuits (*)
Andrea Guerrieri, Srijeet Guha, Chris Lavin, Eddie Hung, Lana Josipovic and Paolo Ienne
11:20 – 11:40 The Road Less Traveled: Congestion-Aware NoC Placement and Packet Routing for FPGAs (*)
Soheil Gholami Shahrouz and Vaughn Betz
11:40 – 12:00 Better Together: Combining Analytical and Annealing Methods for FPGA Placement
Rachel Selina Rajarathnam, Kate Thurmer, Vaughn Betz, Mahesh A. Iyer and David Z. Pan
12:00 – 12:20 A High-Performance Routing Engine for Large-Scale FPGAs
Timothy Martin, Dani Maarouf, Shawki Areibi and Gary Grewal
12:20 – 13:20 Lunch
13:20 – 14:40 Session: High-Bandwidth & Virtual Memory
Chair: Jeff Goeders
13:20 – 13:40 SERI: High-Throughput Streaming Acceleration of Electron Repulsion Integral Computation in Quantum Chemistry using HBM-based FPGAs (#)
Philip Stachura, Guanyu Li, Xin Wu, Christian Plessl and Zhenman Fang
13:40 – 14:00 H2PIPE: High throughput CNN Inference on FPGAs with High-Bandwidth Memory (#)
Mario Doumet, Marius Stan, Mathew Hall and Vaughn Betz
14:00 – 14:20 FlexiMem: Modular and Reconfigurable Virtual Memory
Canberk Sönmez, Mohamed Mahfouz Shahawy, Cemalettin Cem Belentepe and Paolo Ienne
14:20 – 14:30 SoGraph: A State-Aware Architecture for Out-of-Memory Graph Processing on HBM-Equipped FPGAs
Qianyu Cheng, Zhendong Zheng, Tianhao Jiang, Cheng Tang, Teng Wang, Lei Gong, Xianglan Chen, Chao Wang and Xuehai Zhou
14:30 – 14:40 Leveraging HBM2 for Accelerating k-mer Counting with oneAPI on FPGAs
Owen Lucas and Alan George
14:40 – 15:10 Coffee Break
15:10 – 16:20 Session: High-Level Synthesis & Simulation
Chair: Antonino Tumeo
15:10 – 15:30 StencilStream: A SYCL-based Stencil Simulation Framework Targeting FPGAs (*)
Jan-Oliver Opdenhövel, Christoph Alt, Christian Plessl and Tobias Kenter
15:30 – 15:50 Efficient Design Space Exploration for Dynamic & Speculative High-Level Synthesis
Dylan Leothaud, Jean-Michel Gorius, Simon Rokicki and Steven Derrien
15:50 – 16:00 Fast Switching Activity Estimation for HLS-Produced Dataflow Circuits
Jiantao Liu, Maksymilian Graczyk, Andrea Guerrieri and Lana Josipović
16:00 – 16:10 FlexWalker: An Efficient Multi-Objective Design Space Exploration Framework for HLS Design
Zheyuan Zou, Cheng Tang, Lei Gong, Chao Wang and Xuehai Zhou
16:10 – 16:20 Chimera: A co-simulation framework combining with gem5 and FPGA platform for efficient verification
Chao Fu, Zengshi Wang and Jun Han
16:20 – 19:00 Exhibition
16:20 – 17:00 Sponsor’s Talks
17:00 – 19:00 Exhibition Reception

5th September

08:00 – 08:30 Registration
08:30 – 09:30 Keynote
Chair: Fabrizio Ferrandi
08:30 – 09:30 The Data Center of the Future: Disaggregated, Serverless and Heterogeneous
Miriam Leeser
09:30 – 10:20 Session: Machine Learning 1
Chair: Thilo Pionteck
09:30 – 09:50 NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
Marta Andronic and George A. Constantinides
09:50 – 10:10 PolyLUT-Add: FPGA-based LUT Inference with Wide Inputs
Binglei Lou, Richard Rademacher, David Boland and Philip Leong
10:10 – 10:20 Kratos: An FPGA Benchmark for Unrolled Deep Neural Networks with Fine-Grained Sparsity and Mixed Precision
Xilai Dai, Yuzong Chen and Mohamed Abdelfattah
10:20 – 10:25 Poster Pitches
10:25 – 11:20 Coffee Break and Poster Session
11:20 – 12:20 Session: Cryptography & Security
Chair: David Andrews
11:20 – 11:40 UniGuard: A Unified Hardware-oriented Threat Detector for FPGA-based AI Accelerators (#)
Xiaobei Yan, Han Qiu and Tianwei Zhang
11:40 – 12:00 A Better Kyber Butterfly for FPGAs
Jonas Bertels, Quinten Norga and Ingrid Verbauwhede
12:00 – 12:20 Techniques for Exploring Fine-Grained LUT and Routing Aging on a 28nm FPGA
Hayden Cook and Jeffrey Goeders
12:20 – 13:20 Lunch
13:20 – 14:40 Session: Architectures
Chair: Suhaib Fahmy
13:20 – 13:40 A Software-Programmable Neural Processing Unit for Graph Neural Network Inference on FPGAs (#)
Taikun Zhang, Andrew Boutros, Sergey Gribok, Kwadwo Boateng and Vaughn Betz
13:40 – 14:00 Revealing Untapped DSP Optimization Potentials for FPGA-Based Systolic Matrix Engines
Jindong Li, Tenglong Li, Guobin Shen, Dongcheng Zhao, Qian Zhang and Yi Zeng
14:00 – 14:20 SA4: A Comprehensive Analysis and Optimization of Systolic Array Architecture for 4-bit Convolutions
Geng Yang, Jie Lei, Zhenman Fang, Jiaqing Zhang, Junrong Zhang, Weiying Xie and Yunsong Li
14:20 – 14:30 CFEACT: A CGRA-based Framework Enabling Agile CNN and Transformer Accelerator Design
Yiqing Mao, Xuchen Gao, Jiahang Lou, Yunhui Qiu, Wenbo Yin, Wai-Shing Luk and Lingli Wang
14:30 – 14:40 IMAGine: An In-Memory Accelerated GEMV Engine Overlay
Md Arafat Kabir, Tendayi Kamucheka, Nathaniel Fredricks, Joel Mandebi, Jason Bakos, Miaoqing Huang and David Andrews
14:40 – 15:10 Coffee Break
15:10 – 16:30 Session: Machine Learning 2
Chair: Vaughn Betz
15:10 – 15:30 AMA: An Analytical Approach to Maximizing the Efficiency of Deep Learning on Versal AI Engine
Xiaodong Deng, Shijie Wang, Tianyi Gao, Jing Liu, Longjun Liu and Nanning Zheng
15:30 – 15:50 A Heterogeneous Acceleration System for Attention-Based Multi-Agent Reinforcement Learning
Samuel Wiggins, Yuan Meng, Mahesh Iyer and Viktor Prasanna
15:50 – 16:10 Fitop-Trans: Maximizing Transformer Pipeline Efficiency through Fixed-Length Token Pruning on FPGA
Kejia Shi, Manting Zhang, Keqing Zhao, Xiaoxing Wu, Yang Liu, Jun Yu and Kun Wang
16:10 – 16:20 An Open-Source and Extensible Framework for Fast Prototyping and Benchmarking of Spiking Neural Network Hardware
Shadi Matinizadeh and Anup Das
16:20 – 16:30 HASS: Hardware-Aware Sparsity Search for Dataflow DNN Accelerator
Zhewen Yu, Sudarshan Sreeram, Krish Agrawal, Junyi Wu, Alexander Montgomerie-Corcoran, Cheng Zhang, Jianyi Cheng, Christos-Savvas Bouganis and Yiren Zhao
16:30 – 18:30 Special Session: FPGA Networking
Organized by Mario Baldi (AMD)
19:30 Social Dinner @ Esperia Restaurant

6th September

08:30 – 09:00 Registration
09:00 – 10:00 Keynote
Chair: Lana Josipovic
09:00 – 10:00 Fantastic Arithmetic Beasts and where to find them
Florent de Dinechin and Bogdan Pasca
10:00 – 10:40 Session: Edge & Low-Power Computing
Chair: Stefania Perri
10:00 – 10:20 SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs
Geng Yang, Yanyue Xie, Zhong Jia Xue, Sung-En Chang, Yanyu Li, Peiyan Dong, Jie Lei, Weiying Xie, Yanzhi Wang, Xue Lin and Zhenman Fang
10:20 – 10:30 E3HDC: Energy Efficient Encoding for Hyper-Dimensional Computing on Edge Devices
Mahboobe Sadeghipourrudsari, Jonas Krautter, Vincent Meyers and Mehdi Tahoori
10:30 – 10:40 Energy-Aware Synchronization of Hardware Tasks in Virtualized Embedded Systems
Cornelia Wulf, Gökhan Akgün, Mehdi Safarpour, Anastacia Grishchenko and Diana Goehringer
10:40 – 11:10 Coffee Break
11:10 – 12:20 Session: Arithmetic
Chair: Mario Porrmann
11:10 – 11:30 FPGA Modular Multipliers using Hybrid Reduction Techniques
Sergey Gribok, Martin Langhammer and Bogdan Pasca
11:30 – 11:50 Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
Shivam Aggarwal, Hans Jakob Damsgaard, Alessandro Pappalardo, Giuseppe Franco, Thomas B. Preußer, Michaela Blott and Tulika Mitra
11:50 – 12:10 Exploring FPGA designs for MX and beyond
Ebby Samson, Naveen Mellempudi, Wayne Luk and George Constantinides
12:10 – 12:20 Fast and Practical Strassen’s Matrix Multiplication using FPGAs
Afzal Ahmad, Linfeng Du and Wei Zhang
12:20 – 13:20 Lunch
13:20 – 14:40 Session: Accelerators
Chair: Andrea Guerrieri
13:20 – 13:40 FORC: A High-Throughput Streaming FPGA Accelerator for Optimized Row Columnar File Decoders in Big Data Engines
Abdul Wadood, Alec Lu, Ken Zhang and Zhenman Fang
13:40 – 14:00 BitBlender: Scalable Bloom Filter Acceleration on FPGAs with Dynamic Scheduling
Kenneth Liu, Alec Lu and Zhenman Fang
14:00 – 14:20 LORA: A Latency-Oriented Recurrent Architecture for GPT Model on Multi-FPGA Platform with Communication Optimization
Zhendong Zheng, Qianyu Cheng, Teng Wang, Lei Gong, Xianglan Chen, Cheng Tang, Chao Wang and Xuehai Zhou
14:20 – 14:30 DTrans: A Dataflow-transformation FPGA Accelerator with Nonlinear-operators fusion aiming for the Generative Model
Xuanzheng Wang, Shuo Miao, Peng Qu and Youhui Zhang
14:30 – 14:40 CFSA: An Efficient CPU-FPGA Synergies Accelerator for Neural Radiation Field Rendering
Shangrong Li, Kai Liu, Wei Liu, Zibo Guo and Chongyang Ding
14:40 – 14:50 Closing

(*) Michal Servit Best Paper Award Candidates
(#) Stamatis Vassiliadis Best Paper Award Candidates
