CMC Microsystems is pleased to organize the 2nd workshop on accelerating AI, highlighting the challenges and opportunities of AI acceleration from the cloud to the edge.
This workshop brings together experts from industry and academia to share their latest achievements and innovations in AI and machine learning algorithms, software, and hardware, spanning the cloud to deeply embedded machine learning systems. Topics include:
- Inference in edge computing: applications for AI accelerators in consumer electronics, including cameras, robotics, and autonomous vehicles
- Software-Hardware Codesign and optimization for Machine Learning
- Benchmarking Machine Learning Workloads on Emerging Hardware
- RISC-V for Edge AI applications
- Latest trends in AI chip design and commercialization
- Applying AI for EDA and CAD
This workshop will:
- Promote innovation, adoption of, and early access to advanced technologies, including silicon and systems for accelerating AI workloads from the cloud to the edge.
- Share insights and experiences, explore collaboration opportunities, and connect industry leaders with AI researchers and start-ups.
- Influence technology selection (roadmaps) and development activities around emerging AI trends.
The workshop is open to professors, research associates and graduate students at Canadian universities as well as industrial attendees who wish to provide input and advice.
|May 4, 2021||1:00 pm to 5:00 pm EDT||Virtual|
|1:00 PM||Yassine Hariri||CMC Microsystems||Welcome and opening remarks|
|1:10 PM||Qian Wang||Huawei Technologies Canada Co., Ltd.||CANN: Unified Heterogeneous Computing Architecture to Unleash Ultimate Hardware Computing Power|
|1:30 PM||Griffin Lacey||Nvidia||Data Science on GPUs|
|1:50 PM||Davis Sawyer||DeepLite||Blackbox optimization framework for Constrained Deep Learning Models|
|2:10 PM||Pavel Sinha||Aarish Technologies||Low power AI system design|
|2:50 PM||George Shaker||University of Waterloo||Advances in Sensing Using AI & Low-cost Radars|
|3:10 PM||Paul Chow||University of Toronto||A Heterogeneous Platform for Large-Scale Machine Learning|
|3:30 PM||Mohammad Hossein Askari Hemmat||Polytechnique Montréal||BARVINN: Barrel RISC-V Neural Network Accelerator|
|3:50 PM||Rick O’Connor||OpenHW Group||CORE-V Cores: industrial grade, open-source RISC-V cores enabling AI accelerators at the edge|
|4:10 PM||Panel and Open Discussion|
|5:00 PM||Workshop close|
Pricing and Registration
Yassine Hariri, Hariri@cmc.ca, CMC Microsystems
Dr. Hariri has over 15 years of experience in advanced and hybrid computing systems from the cloud to the edge, with a focus on artificial intelligence; computer vision, video, image, and sensor-fusion workload acceleration; FPGA-based prototyping; software stacks; and domain-specific hardware architectures. He currently leads projects related to the specification, development, implementation, deployment, and support of the next generation of artificial intelligence and advanced hybrid computing infrastructure based on a combination of FPGAs, GPUs, and custom hardware accelerators. Dr. Hariri earned his B.A.Sc. in Computer Engineering from Ecole Marocaine des Sciences de l’ingénieur, Casablanca, Morocco, in 1998, and the M.S. and Ph.D. degrees from Ecole de Technologie Supérieure (ETS), Montreal, QC, Canada, in 2002 and 2008, respectively, all in electrical engineering.
|1:00 PM||Yassine Hariri||CMC Microsystems||Welcome and opening remarks||With Moore’s law losing steam and a growing reliance on the cloud, new innovations in computing demand revolutionary changes at every level of the computing stack: from compilers, applications, and algorithms to the architecture of the datacenter, processors, microarchitectures, and circuits. Grand challenges for the next decade include investigating non-von Neumann architectures, narrowing the gap between software and hardware development cycles, and, more broadly, empowering a wider community to leverage application-specific computing hardware, bringing inference and even training of machine learning models to edge devices. This presentation sets the context for the 2nd Workshop on Accelerating AI and highlights some of the most important challenges and opportunities facing AI deployment in the cloud and at the edge.|
|1:10 PM||Qian Wang||Huawei||CANN: Unified Heterogeneous Computing Architecture to Unleash Ultimate Hardware Computing Power||CANN is a computing architecture for neural networks developed by Huawei. It delivers advanced graph-compilation technology and a rich set of high-performance operators to help unlock the full computing power of Huawei’s Ascend AI processors. This keynote will introduce the architecture and the related AI processors and products.||Dr. Wang received his Ph.D. in Electrical and Computer Engineering from the University of Alberta. He is now a senior researcher at Huawei Canada, working on AI hardware, software, and application development.|
|1:30 PM||Griffin Lacey||Nvidia||Data Science on GPUs||This talk will discuss how RAPIDS and the open-source ecosystem are advancing data science. Learn how to get started leveraging these open-source libraries for faster performance and easier development on GPUs. See the latest engineering work and new release features, including benchmarks and roadmaps.||Griffin Lacey is a senior data scientist at NVIDIA. In his current role, he assists customers in designing and deploying their scientific compute infrastructure. Prior to NVIDIA, Griffin was a deep learning researcher at the University of Guelph, as well as at Google. He holds bachelor’s and master’s degrees in engineering from the University of Guelph, where his research focused on the efficiency of deep learning at both the hardware and software levels.|
|1:50 PM||Davis Sawyer||DeepLite||Blackbox optimization framework for Constrained Deep Learning Models||Designing deep learning-based solutions is becoming a race to train deeper models with ever more layers. While a large, deep model can provide competitive accuracy, it creates significant logistical challenges and unreasonable resource requirements during development and deployment. This has been one of the key reasons deep learning models are not extensively used in production environments, especially on edge devices. There is an immediate need to optimize and compress these models to enable on-device intelligence. In this research, we introduce a black-box framework, Deeplite Neutrino, for production-ready optimization of deep learning models. The framework provides an easy mechanism for end users to supply constraints, such as a tolerable drop in accuracy or a target size for the optimized model, to guide the whole optimization process. The framework is easy to include in an existing production pipeline and is available as a Python package supporting the PyTorch and ONNX frameworks. Performance is shown across multiple benchmark datasets and popular deep learning models. We will also discuss how the framework is currently used in production and share findings from these industry deployments.||Davis Sawyer is a Canadian tech entrepreneur and co-founder of Deeplite Inc., a Montreal-based AI software startup. As CPO, he focuses on product development and go-to-market strategy. Prior to Deeplite, Davis developed statistical models for pharmaceutical safety at Takeda Oncology. He is the local chair for tinyML Montreal, a C2 Montreal Emerging Entrepreneur, and is excited about the future of energy-efficient computing.|
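The constraint-guided search described in the abstract can be pictured with a toy black-box loop. This is a hypothetical sketch, not Deeplite Neutrino's actual API: the `optimize` helper, the candidate list, and the accuracy numbers are all invented for illustration. Given candidate compressed models and a tolerable accuracy drop, it returns the smallest model satisfying the constraint.

```python
# Toy illustration of constraint-guided, black-box model optimization
# (hypothetical stand-in, NOT Deeplite Neutrino's real API): among
# candidate compressed models, keep the smallest one whose accuracy
# drop from the baseline stays within a user-supplied tolerance.

def optimize(candidates, baseline_acc, max_drop):
    """candidates: list of (size_mb, accuracy) pairs.

    Returns the smallest feasible candidate, or None if none satisfy
    the accuracy constraint.
    """
    feasible = [c for c in candidates if baseline_acc - c[1] <= max_drop]
    return min(feasible, key=lambda c: c[0]) if feasible else None

# Hypothetical evaluation results for four compression levels.
candidates = [(44.6, 0.761), (11.2, 0.758), (5.6, 0.749), (2.8, 0.701)]
best = optimize(candidates, baseline_acc=0.761, max_drop=0.02)
print(best)  # (5.6, 0.749): smallest model within a 2% accuracy drop
```

A real framework would generate the candidates itself (via quantization, pruning, etc.) and evaluate each one, but the accept/reject decision against the user's constraint follows this shape.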
|2:10 PM||Pavel Sinha||Aarish Technologies||Low power AI system design||There is a lot of noise in the industry today about accelerated machine learning and ways of making convolutional neural networks (CNNs) run faster. Most CNN accelerators take the form of parallel processors programmed with an optimized compiler to perform the heavy lifting of CNN computation. Aarish’s approach is quite different, departing from the general principle of instruction-based, application-specific processors. This novel approach substantially reduces power consumption and significantly differentiates Aarish from the rest of the field. Aarish recognizes that the ultimate power-optimized solution will never emerge from merely tweaking hardware; it must be the outcome of a synergy that optimizes the hardware, the algorithm, and the overall system simultaneously. Aarish has developed a hardware computing solution that is fundamentally different from the parallel-processing paradigm built on conventional load-store instruction-set architectures. The hardware is highly scalable: multiple ASICs can be cascaded to achieve higher computational capability, or run in parallel to achieve extremely high throughput, giving customers a flexible way to meet their AI compute needs. We estimate a net reduction in greenhouse-gas emissions equivalent to 8.38 kilotonnes of CO2 per year in Quebec alone, and our solution provides the lowest power profile on the market. Further, Aarish has developed an algorithm that makes it possible to run a CNN with a 90+% reduction in computational cost (typical results are above 70%).||Pavel Sinha has over ten years of industry R&D experience. He worked in Qualcomm’s R&D division as a senior video ASIC architect, and at Cadence as a principal engineer in R&D, designing ultra-high-speed on-silicon emulation processors.
Pavel holds bachelor’s and master’s degrees in electrical and computer engineering. He is pursuing a Ph.D. at McGill University, Montreal, Canada, specializing in artificial intelligence/machine learning and VLSI. Pavel’s Ph.D. work led to Aarish Technologies, a startup where he is chief scientist, founder, and CEO. Aarish develops high-performance, low-power, and low-cost artificial intelligence (AI) accelerators, and was established in 2018 by founding members from Silicon Valley and the growing AI hub of Montreal. Pavel holds several patents and leads a cutting-edge research team.|
|2:50 PM||George Shaker||University of Waterloo||Advances in Sensing Using AI & Low-cost Radars||In this talk, we will present an overview of advanced sensing functionalities using low-cost radars combined with AI. We will demonstrate some applications in remote healthcare monitoring. We will discuss the design process, testing procedures, and implementation challenges.||Prof. George Shaker, Wireless Sensors & Devices Lab, University of Waterloo|
|3:10 PM||Paul Chow||University of Toronto||A Heterogeneous Platform for Large-Scale Machine Learning||AIgean, pronounced like the sea, is an open framework to build and deploy machine learning (ML) algorithms on a heterogeneous cluster of devices (CPUs and FPGAs). AIgean provides a full end-to-end multi-FPGA/CPU implementation of a neural network. The user supplies a high-level neural network description, and our tool flow is responsible for synthesizing the individual layers, partitioning layers across different nodes, and providing the bridging and routing required for these layers to communicate. We introduced AIgean last year at the 1st Workshop on Accelerating AI. In this talk, we will first review the goals and workings of AIgean, then discuss the past year's work to complete a pushbutton flow from network description to the multiple bitstreams that implement the network on FPGAs. We will present results from implementing ResNet-50 using nine and 12 FPGAs, along with any further results obtained between the writing of this abstract and the workshop.||Paul Chow is a professor in The Edward S. Rogers Sr. Department of Electrical and Computer Engineering at the University of Toronto, where he holds the Dusan and Anne Miklas Chair in Engineering Design. He was a major contributor to the early RISC processor technology developed at Stanford University that helped spawn the rapid rise of computing performance over the past 30 years. Paul helped establish the FPGA research group at UofT and did some of the early research in FPGA architectures, applications, and reconfigurable computing. Two of his papers are among the 25 selected as the most influential papers in the first 20 years of FCCM, the premier conference on reconfigurable computing.
His current research focuses on reconfigurable computing with an emphasis on programming models, middleware to support programming and portability, and scaling to large-scale, distributed FPGA deployments.|
|3:30 PM||Mohammad Hossein Askari Hemmat||Polytechnique Montréal||BARVINN: Barrel RISC-V Neural Network Accelerator||This talk presents a RISC-V processor designed to control an array of hardware accelerators for deep neural network models, called matrix-vector product units (MVUs). To control these MVUs, we designed a multithreaded barrel RISC-V processor: on every clock cycle, a different hardware thread (hart) is scheduled. By making the proposed barrel processor N-way threaded, we can assign one thread to control each of the MVUs. Each MVU is capable of arbitrary-precision general matrix-vector (GEMV) operations. To reduce implementation area, our processor implements the RV32I ISA augmented with a set of custom CSRs for controlling the MVUs. Our design passes all RISC-V tests written in assembly and compiled with RISC-V gcc. When implemented on a popular FPGA, our 8-hart barrel processor runs at 250 MHz with a CPI (cycles per instruction) of 1 while consuming 0.372 W. To demonstrate the capabilities of our design, we computed a GEMV operation with an 8-by-128 input matrix and a 128-by-128 weight matrix at two-bit precision in only 16 clock cycles.||MohammadHossein AskariHemmat is a second-year PhD student in the Electrical Engineering department of Ecole Polytechnique Montreal. In his doctoral research, he investigates methods for making deep neural network computation more efficient. Before pursuing a PhD, he worked for two years as an ASIC verification engineer at Microsemi and for a year as a software engineer at Tru Simulation.|
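As a rough illustration of the workload the MVUs accelerate, the 8-by-128 by 128-by-128 GEMV mentioned in the abstract can be modeled in NumPy with 2-bit-quantized operands. This is a software sketch of the arithmetic only, not BARVINN's actual implementation; the quantization scheme shown (signed values in {-2, -1, 0, 1}) is one assumed 2-bit format among several possibilities.

```python
import numpy as np

def quantize_2bit(x):
    """Map floats to signed 2-bit integers in {-2, -1, 0, 1} (assumed format)."""
    scale = float(np.max(np.abs(x))) / 2
    if scale == 0:
        scale = 1.0
    return np.clip(np.round(x / scale), -2, 1).astype(np.int8)

rng = np.random.default_rng(0)
A = quantize_2bit(rng.standard_normal((8, 128)))    # input matrix, 8 x 128
W = quantize_2bit(rng.standard_normal((128, 128)))  # weight matrix, 128 x 128

# Eight 128-element matrix-vector products, one per input row -- the
# operation the abstract reports completing in 16 clock cycles on the
# accelerator. Accumulate in int32 so the 128-term dot products don't
# overflow the 2-bit operand type.
Y = A.astype(np.int32) @ W.astype(np.int32)
print(Y.shape)  # (8, 128)
```

At two bits per operand, each MVU dot product reduces to additions and subtractions of tiny integers, which is what makes the hardware implementation so compact relative to floating-point GEMV.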
|3:50 PM||Rick O’Connor||OpenHW Group||CORE-V Cores: industrial grade, open-source RISC-V cores enabling AI accelerators at the edge||This talk will provide an overview of the CORE-V family, an OpenHW Group project to develop and deploy the CORE-V family of open-source RISC-V cores, execute pre-silicon functional verification, and deliver SoC-based evaluation kits. Written in SystemVerilog, the CORE-V open-source IP matches the quality of IP offered by established commercial providers and is verified with state-of-the-art, auditable flows. The talk will also cover OpenHW Accelerate, a $22.5M research grant in partnership with Mitacs to enable OpenHW-based research projects across a variety of computing applications.|
Rick is a proven industry executive with more than 25 years of experience across the semiconductor, compute, and telecom sectors, driving open-standards-based technology deployment. His work has focused on leading-edge product and business development, with emphasis on strategy formulation and execution, technology investments, marketing and business development, new product development and introduction, strategic relationship management, and M&A activity.
Specialties: Rick possesses a unique combination of business and technical skills, with particular strengths in market development; extensive technology and system-level knowledge; leading business teams and new product development from concept to deployment; and extensive contacts across the industry. Rick has strong interpersonal and presentation skills and is a regular speaker at key industry forums, with frequent interaction with industry and financial analysts.
|4:10 PM||Panel and Open Discussion|
|5:00 PM||Workshop close|
If you have any comments or questions regarding the contents of this workshop, please contact Yassine Hariri at Hariri@cmc.ca.
Workshop cancellations must be received in writing at least one (1) week before the start date of the workshop to receive a full refund of the registration fee. Cancellations made after the deadline will not receive a refund. CMC Microsystems makes no commitments on refunds for travel or accommodations.