OUDRHIRI Ali
Team : ALSOC
Departure date : 12/20/2023
https://fr.linkedin.com/in/ali-oudrhiri-3124b415a
Supervision : Alix MUNIER
Co-supervision : Roselyne CHOTIN
Performance of a Neural Network Accelerator Architecture and its Optimization Using a Pipeline-Based Approach
In recent years, neural networks have gained widespread popularity for their versatility and effectiveness in solving a wide range of complex tasks. Their ability to learn and make predictions from large datasets has revolutionized various fields. However, as neural networks continue to find applications in an ever-expanding array of domains, their significant computational requirements become a pressing challenge. This computational demand is particularly problematic when deploying neural networks in resource-constrained embedded devices, especially within the context of edge computing for inference tasks.
Neural network accelerator chips are now emerging as the optimal choice for supporting neural networks at the edge. These chips offer remarkable efficiency thanks to their compact size, low power consumption, and reduced latency. Moreover, because they are integrated in the same on-chip environment, they also enhance security by minimizing external data communication. In the context of edge computing, diverse requirements have emerged, necessitating trade-offs among various performance aspects. This has led to the development of highly configurable accelerator architectures that can adapt to distinct performance demands.
In this context, the focus lies on Gemini, a configurable neural network inference accelerator designed with an imposed architecture and implemented using High-Level Synthesis techniques. Its design and implementation were driven by the need for configurable parallelization and performance optimization.
Once this accelerator was designed, demonstrating the power of its configurability became essential, so that users can select the most suitable architecture for their neural networks. To achieve this objective, this thesis contributed to the development of a performance prediction strategy operating at a high level of abstraction, which takes into account the chosen architecture and the neural network configuration. This tool assists clients in deciding on the appropriate architecture for their specific neural network applications.
During the research, we observed that a single accelerator has several limitations and that increasing its parallelism yields limited performance gains. Consequently, we adopted a new strategy for optimizing neural network acceleration. This time, we took a high-level approach that did not require fine-grained accelerator optimizations. We organized multiple Gemini instances into a pipeline and allocated layers to different accelerators to maximize performance. We proposed solutions for two scenarios. In the user scenario, the pipeline structure is predefined, with a fixed number of accelerators, accelerator configurations, and RAM sizes; here we proposed solutions for mapping the layers onto the different accelerators to optimize execution performance. In the designer scenario, the pipeline structure is not fixed: the number and configuration of the accelerators can be chosen to optimize both execution and hardware performance. This pipeline strategy has proven to be effective for the Gemini accelerator.
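To make the user scenario concrete, here is a minimal Python sketch of one possible way to map layers onto a fixed pipeline. It is not the method developed in the thesis. It assumes that layers are assigned to accelerators in contiguous blocks, that every accelerator receives at least one layer, that a latency estimate is available for each layer on each accelerator, and that the goal is to minimize the slowest pipeline stage. RAM size constraints are ignored. The function name `map_layers` and the latency values are hypothetical.

```python
def map_layers(latency, n_acc):
    """Split layers into n_acc contiguous stages, minimizing the slowest stage.

    latency[a][l] is a hypothetical latency estimate of layer l on
    accelerator a (accelerators may have different configurations).
    Returns (bottleneck, stages), where stages[a] lists the layer indices
    mapped to accelerator a.
    """
    n_layers = len(latency[0])
    if n_layers < n_acc:
        raise ValueError("need at least one layer per accelerator")

    # prefix[a][i] = total latency of layers 0..i-1 on accelerator a
    prefix = [[0] * (n_layers + 1) for _ in range(n_acc)]
    for a in range(n_acc):
        for l in range(n_layers):
            prefix[a][l + 1] = prefix[a][l] + latency[a][l]

    inf = float("inf")
    # best[a][i] = smallest bottleneck when the first i layers are
    # mapped onto the first a accelerators
    best = [[inf] * (n_layers + 1) for _ in range(n_acc + 1)]
    cut = [[0] * (n_layers + 1) for _ in range(n_acc + 1)]
    best[0][0] = 0

    for a in range(1, n_acc + 1):
        for i in range(a, n_layers + 1):
            for j in range(a - 1, i):
                # accelerator a-1 runs layers j..i-1
                stage = prefix[a - 1][i] - prefix[a - 1][j]
                candidate = max(best[a - 1][j], stage)
                if candidate < best[a][i]:
                    best[a][i] = candidate
                    cut[a][i] = j

    # Walk the recorded cut points backwards to rebuild the mapping
    stages = []
    i = n_layers
    for a in range(n_acc, 0, -1):
        j = cut[a][i]
        stages.append(list(range(j, i)))
        i = j
    stages.reverse()
    return best[n_acc][n_layers], stages


# Hypothetical example: five layers, two accelerators, the second one
# configured to be twice as fast as the first.
latency = [
    [4, 2, 3, 1, 5],
    [2, 1, 1.5, 0.5, 2.5],
]
bottleneck, stages = map_layers(latency, n_acc=2)
print(bottleneck, stages)
```

Under the same assumptions, a designer scenario could be explored by running such a mapping over several candidate pipeline structures and comparing the resulting bottleneck against hardware cost.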
Although this thesis originated from a specific industrial need, certain solutions developed during the research can be applied or adapted to other neural network accelerators. Notably, the performance prediction strategy and high-level optimization of NN processing through pipelining multiple instances offer valuable insights for broader application.
Defence : 12/20/2023
Jury members :
Angeliki KRITIKAKOU, Assoc.Prof, Inria, Univ Rennes, CNRS, IRISA, Rennes [Rapporteur]
Philippe COUSSY, Prof., Lab-STICC, Univ. de Bretagne-Sud, Lorient [Rapporteur]
Roselyne CHOTIN, Assoc.Prof, Sorbonne Univ., CNRS, LIP6, Paris
Maxime PELCAT, Assoc.Prof, IETR, INSA Rennes
Frédéric PETROT, Prof., UGA, CNRS, Grenoble INP, TIMA, Grenoble
Pascal URARD, Directeur innovation, STMicroelectronics, Crolles
Alix MUNIER KORDON, Prof., Sorbonne Univ., CNRS, LIP6, Paris
2023-2024 Publications
2024
- A. Oudrhiri, A. Munier : “Pipeline Configuration Methodology for Optimizing Neural Network Accelerators Utilization under a Throughput Constraint” (2024)
2023
- A. Oudrhiri : “Performance of a Neural Network Accelerator Architecture and its Optimization Using a Pipeline-Based Approach”, thesis, PhD defence 12/20/2023, supervision : Alix MUNIER, co-supervision : Roselyne CHOTIN (2023)
- A. Oudrhiri, E. Taly, N. Bain, A. Munier‑Kordon, R. Guizzetti, P. Urard : “Performance Modeling and Estimation of a Configurable Output Stationary Neural Network Accelerator” (2023)