CMC Microsystems is pleased to organize a one-day workshop highlighting the challenges and opportunities of AI acceleration from the cloud to the very edge. The same workshop is hosted twice, on different days and at different locations; you may attend either or both. If you are interested in another date, see the workshop in March.
This workshop aims to bring together experts from industry and academia to share their latest achievements and innovations targeting both training and inference from cloud to edge, with a focus on:
- new architectures and approaches to accelerate deep learning (DL) workloads,
- software stack and deep learning frameworks,
- the open-source RISC-V processor technology, customized with ultra-low-power, highly specialized computing engines for DL inference at the very edge, and
- the latest trends in AI chip design and commercialization.
Agenda
Abstract:
Deep-learning-based solutions for embedded vision have emerged as a key application of the growing class of artificial-intelligence-based solutions. Specialized accelerators for deep neural networks (DNNs) have been developed to achieve the highest performance at low cost and low power. Computational requirements for DNN accelerators continue to increase, driven in particular by autonomous driving applications.
This presentation introduces techniques for efficiently scaling DNN graph performance across multiple DNN accelerators, with a particular focus on bandwidth reduction technologies, including data compression, layer merging, and efficient data sharing across accelerators.
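As an illustration of why layer merging matters for bandwidth, the toy model below (function names and numbers are hypothetical, not from the talk) estimates DRAM traffic for a chain of elementwise DNN layers executed one at a time versus merged so that intermediate results stay on-chip:

```python
# Toy model of DRAM traffic for a chain of elementwise DNN layers,
# comparing layer-by-layer execution with a fused (merged) schedule.

def traffic_unfused(n_elems, n_layers, bytes_per_elem=2):
    """Each layer reads its input from DRAM and writes its output back."""
    return n_layers * 2 * n_elems * bytes_per_elem

def traffic_fused(n_elems, n_layers, bytes_per_elem=2):
    """Merged layers keep intermediates on-chip: one read, one write total."""
    return 2 * n_elems * bytes_per_elem

feature_map = 1 << 20  # 1M activations (hypothetical layer size), fp16
print(traffic_unfused(feature_map, 4))  # bytes moved layer-by-layer
print(traffic_fused(feature_map, 4))    # bytes moved when merged
```

Under these assumptions, merging a chain of four layers cuts off-chip traffic by 4x, which is the kind of saving that motivates the bandwidth reduction techniques discussed in the talk.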
Bio:
Dr. Pierre G. Paulin is Director of R&D for Embedded Vision at Synopsys. He is responsible for the application development, architecture design and S/W programming tools for embedded vision processors supporting classical computer vision and deep learning-based solutions. Prior to this, he was director of System-on-Chip Platform Automation at STMicroelectronics in Canada, working on platform programming tools for multi-processor systems-on-a-chip, targeting computer vision, video codecs and network processors.
This followed his previous positions as director of Embedded Systems Technologies for STMicroelectronics in Grenoble, France, and manager of Embedded Software and High-level synthesis tools with Nortel Networks in Canada. His interests include embedded vision, AI, video processing, multi-processor systems, and system-level design.
He obtained a Ph.D. from Carleton University, Ottawa, and B.Sc. and M.Sc. degrees from Laval University, Quebec. He won the best paper award at CODES+ISSS in 2004. He is a member of the IEEE.
Abstract:
Deep neural networks (DNNs) are a key enabler for a number of current and emerging technologies. Their computational requirements can be large, inspiring research on low-precision and mixed-precision arithmetic instead of full floating-point operations. Generic processors such as CPUs and GPUs are not necessarily optimized to compute neural networks efficiently, especially with low- and mixed-precision operations. This presentation reviews recent advances in using programmable logic devices such as FPGAs to build custom accelerators for DNNs, notably those exploiting low precision and quantization. It also covers a new FPGA-based DNN accelerator design that supports mixed-precision operations on low-bit numbers (down to 1 bit).
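As a rough sketch of the quantization idea (not the accelerator's actual scheme), the NumPy code below quantizes weights and activations to different integer widths and computes the dot product in integer arithmetic, rescaling once at the end; the `quantize` and `int_dot` helpers are hypothetical names:

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform quantization of a float vector to `bits`-bit ints."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax if np.max(np.abs(x)) else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def int_dot(w, a, w_bits, a_bits):
    """Mixed-precision dot product: weights and activations use different widths."""
    qw, sw = quantize(w, w_bits)
    qa, sa = quantize(a, a_bits)
    # Accumulate in integer arithmetic (cheap in FPGA fabric), rescale once.
    return int(np.dot(qw, qa)) * sw * sa

rng = np.random.default_rng(0)
w, a = rng.standard_normal(64), rng.standard_normal(64)
print(np.dot(w, a), int_dot(w, a, 4, 8))  # approximate agreement
```

The integer multiply-accumulate is where FPGAs shine: a 4-bit-by-8-bit multiplier uses far less logic than a floating-point unit, so many more of them fit on one device.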
Bio:
Sean Wagner is a research scientist with the IBM Canada Research and Development Centre. He earned his B.A.Sc. in Computer Engineering from the University of Waterloo, and M.A.Sc. and Ph.D. from the Electrical and Computer Engineering Department of the University of Toronto. Sean specializes in high-performance computer hardware and architecture (reconfigurable and heterogeneous systems in particular), and has carried out research in photonics, nanofabrication, and communications systems. In his work with SOSCIP (www.soscip.org), he provides technical leadership to the research consortium and its industry/academic collaborative research projects, helping researchers use SOSCIP's advanced IBM high-performance computing platforms. With research partners at SOSCIP member institutions, Dr. Wagner has helped develop numerous projects, including accelerating real-time fMRI analysis for neuroscience applications and accelerating photodynamic cancer therapy planning software.
Abstract:
DaVinci is an AI processor architecture invented by Huawei. It is a unified architecture for neural network acceleration with best-in-class power efficiency. This keynote will provide an introduction to this architecture and the AI processors/products based on it.
Bio:
Dr. Xu leads the research and development of IC architecture and algorithm design in the areas of AI, computer graphics and computer vision. He received his B.Sc. and M.Sc. in Computer Science from Tsinghua University and his Ph.D. from the University of Regina. Prior to joining Huawei, he was co-founder and CEO of Yunen Communication Inc., CTO of Reality Commerce Corp., and an IC algorithm architect at Teradici Inc., working on various video and graphics processing services and products.
Abstract:
The computational demands of modern deep learning algorithms are increasing at a tremendous rate. AMD Radeon Instinct GPUs and AMD’s open-source ROCm software infrastructure provide an efficient, high-performance platform for training a wide range of models used in deep learning. This session will cover AMD’s current product offerings, supported software environments, and machine learning workloads which are particularly suited to GPU acceleration. There will also be a discussion of unique capabilities possible with systems based on the combination of AMD CPU technology and AMD GPU technology.
Bio:
Niles Burbank leads the Solutions Architecture team in AMD’s Data Center GPU Business Unit. His team supports strategic customers in deploying AMD GPUs for machine learning, high-performance computing, cloud gaming, and virtual desktop infrastructure (VDI). Prior to his current position, Niles served in a variety of product planning and product management roles at AMD and ATI Technologies for more than twenty years. Niles holds a bachelor’s degree in engineering physics from the Royal Military College of Canada and a master’s degree in electrical engineering from the University of Toronto.
Bio:
Paul Chow is a professor in The Edward S. Rogers Sr. Department of Electrical and Computer Engineering at the University of Toronto, where he holds the Dusan and Anne Miklas Chair in Engineering Design. He was a major contributor to the early RISC processor technology developed at Stanford University that helped spawn the rapid rise of computing performance over the past 30 years. Paul helped establish the FPGA research group at UofT and did some of the early research in FPGA architectures, applications and reconfigurable computing. Two of his papers are among the 25 selected as the most influential papers from the first 20 years of FCCM, the premier conference on reconfigurable computing. His current research focuses on reconfigurable computing with an emphasis on programming models, middleware to support programming and portability, and scaling to large-scale, distributed FPGA deployments.
Abstract:
For several decades now, Moore's Law and Dennard scaling have allowed software developers to effortlessly satisfy the increasing demand for computing power. Recent years, however, have seen the dramatic slowdown of Moore's Law and the end of Dennard scaling. Chip designers have thus turned to application-specific accelerators to meet the processing demands imposed by IoT, big data and AI. Nowhere is this more prevalent than in machine learning, where a huge number of accelerators have been proposed recently. While these accelerators offer significant speed-ups, they are challenging for software developers to use: maximizing performance requires specialized knowledge of each chip's underlying hardware. The difficulty is further exacerbated by the large number of software frameworks available for ML. With new chips being released at a frequent cadence, programmers and framework developers must scramble to maximize performance on the latest ML accelerators, a situation that threatens to nullify the speed-up benefits these accelerators offer. A fundamental shift is needed in how programmers work with accelerators, so they can be programmed quickly and accurately for maximum performance. This is vital for keeping pace with the rising demand for ML compute in the years to come.
Abstract:
Hardware systems for accelerating Artificial Intelligence (AI) and Deep Learning (DL) models have been widely studied in recent years. Researchers and industry practitioners have proposed many approaches to accelerate the heavy computation of AI models and to keep pace with their continuous growth in size and complexity. Among these techniques, neuromorphic technology promises high performance, low cost, and real-time processing.
Neuromorphic technology is devoted to the design and development of computational hardware that mimics the characteristics and capabilities of neuro-biological systems. Such systems were mostly implemented in the analog domain before the emergence of hybrid and fully digital implementations using Field Programmable Gate Arrays (FPGAs). FPGAs improve the flexibility of the implemented models, enabling better accuracy and real-time processing, and they make it quick to build hybrid systems. FPGA boards can also implement realistic models that incorporate the non-linearity, plasticity, excitation, and extinction of biological neurons. Several neuromorphic platforms (SpiNNaker, Loihi, TrueNorth, and others) already exist and can implement large networks of thousands to millions of neurons.
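As a minimal sketch of the kind of spiking neuron model such platforms implement, the code below simulates a simple leaky integrate-and-fire (LIF) neuron with leak, threshold, and reset; the `lif_spikes` helper and all parameter values are illustrative, not taken from any of the platforms named above:

```python
def lif_spikes(current, dt=1.0, tau=20.0, v_th=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron: integrate input, spike at threshold, reset.

    `current` is a sequence of input values, one per time step of length `dt`.
    Returns a boolean spike train of the same length.
    """
    v, spikes = 0.0, []
    for i in current:
        v += dt / tau * (-v + i)  # leaky integration toward the input
        if v >= v_th:
            spikes.append(True)
            v = v_reset           # reset (extinction) after a spike
        else:
            spikes.append(False)
    return spikes

train = lif_spikes([1.5] * 100)  # constant drive above threshold
print(sum(train), "spikes in", len(train), "steps")
```

Digital neuromorphic hardware typically evaluates many such update rules in parallel, one per neuron, each step reducing to a handful of multiply-add and compare operations that map naturally onto FPGA logic.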
Bio:
Idir Mellal, Ph.D., is an industrial postdoctoral fellow and a specialist in hardware implementation and digital design, with more than 10 years of experience in FPGA implementations. His primary interest is building effective and robust neuromorphic platforms for neurocomputing and for accelerating AI models. He currently leads a project at the Krembil Research Institute, Toronto Western Hospital, building a digital neuromorphic platform that mimics biological neurons. Dr. Mellal earned all his degrees in Electrical Engineering: B.A.Sc. and M.S. degrees from Mouloud Mammeri University, Tizi Ouzou, Algeria, in 2008 and 2010, respectively, and a Ph.D. from the University of Quebec in Outaouais, Gatineau, Quebec, in 2018.
Abstract:
The democratization of DNA sequencing has been an important development for the genomics industry and the bioinformatics research field since the market introduction of small and cheap sequencers in 2014. This development is the result of decades-long research in nanosensors and the unabated advancement of semiconductor technology, which has yielded custom chips that integrate nanopore sensor arrays of thousands of channels per square centimetre with mixed-signal CMOS circuits and substantial computing power. However, these mobile sequencers still face serious computing challenges in terms of power, speed and memory, which are among the technological obstacles thwarting their transition into highly commoditized molecular sensors that can be economically deployed at large scale. This presentation highlights these challenges, shows our research results toward addressing them, and outlines our future work on an embedded solution that could potentially compete with existing sequencing devices.
Bio:
Karim Hammad received his B.Sc. and M.Sc. degrees in Electronics and Communications Engineering from the Arab Academy for Science, Technology and Maritime Transport (AASTMT), Cairo, Egypt, in 2005 and 2009, respectively, and his Ph.D. degree in Electrical and Computer Engineering from the University of Western Ontario, Canada, in 2016. Currently, he is an Assistant Professor in the Department of Electronics and Communications Engineering at the AASTMT, Cairo, Egypt, and a Postdoctoral Visitor at York University, Toronto, Ontario, Canada. His research interests include cross-layer design for wireless networks, physical layer security and digital circuit design.
Abstract:
This session will provide a brief overview of the RISC-V instruction set architecture (ISA) and describe the CORE-V family of open-source cores that implement it. RISC-V (pronounced "risk-five") is a free and open ISA enabling a new era of processor innovation through open standard collaboration. Born in academia and research, the RISC-V ISA delivers a new level of software and hardware architectural freedom, paving the way for the next 50 years of computing design and innovation. Based on the original PULP Platform development at ETH Zurich, CORE-V is a series of RISC-V-based open-source processor cores with associated processor subsystem IP, tools and software for electronic system designers. The CORE-V family provides quality core IP in line with industry best practices, in both silicon- and FPGA-optimized implementations. These cores can be used to facilitate rapid design innovation and ensure effective manufacturability of production SoCs. The session will also describe barriers to the adoption of open-source IP and opportunities to overcome them.
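To illustrate the fixed-field base encoding that makes the RISC-V ISA simple to decode, the sketch below splits a 32-bit RV32I R-type instruction word into its standard fields; the `decode_rtype` helper is a hypothetical name, but the field positions follow the published RV32I base encoding:

```python
def decode_rtype(insn):
    """Split a 32-bit RV32I R-type instruction word into its standard fields."""
    return {
        "opcode": insn & 0x7F,          # bits [6:0]
        "rd":     (insn >> 7) & 0x1F,   # destination register, bits [11:7]
        "funct3": (insn >> 12) & 0x07,  # operation selector, bits [14:12]
        "rs1":    (insn >> 15) & 0x1F,  # source register 1, bits [19:15]
        "rs2":    (insn >> 20) & 0x1F,  # source register 2, bits [24:20]
        "funct7": (insn >> 25) & 0x7F,  # operation selector, bits [31:25]
    }

# "add x3, x1, x2" encodes as 0x002081B3 in RV32I.
print(decode_rtype(0x002081B3))
```

Because every R-type instruction uses the same field layout, a hardware decoder is just fixed wiring from instruction bits to control signals, which is part of what makes small open-source cores like CORE-V tractable to build and verify.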
Bio:
Rick O’Connor is Founder and serves as President & CEO of the OpenHW Group, a not-for-profit, global organization driven by its members and individual contributors, where hardware and software designers collaborate on open-source cores, related IP, tools and software projects. The OpenHW Group CORE-V family is a series of RISC-V-based open-source cores with associated processor subsystem IP, tools and software for electronic system designers. Previously, Rick was Executive Director of the RISC-V Foundation. RISC-V (pronounced “risk-five”) is a free and open ISA enabling a new era of processor innovation through open standard collaboration. Founded by Rick in 2015 with the support of over 40 Founding Members, the RISC-V Foundation currently comprises more than 235 members building an open, collaborative community of software and hardware innovators powering processor innovation. Throughout his career, Rick has continued to be at the leading edge of technology and corporate strategy and has held executive positions in many industry standards bodies. With many years of executive-level management experience in semiconductor and systems companies, Rick possesses a unique combination of business and technical skills and was responsible for the development of dozens of products accounting for over $750 million in revenue. With very strong interpersonal skills, Rick is a regular speaker at key industry forums and has built a very strong professional network of key executives at many of the largest global technology firms including Altera (now part of Intel), AMD, ARM, Cadence, Dell, Ericsson, Facebook, Google, Huawei, HP, IBM, IDT, Intel, Microsoft, Nokia, NXP, RedHat, Synopsys, Texas Instruments, Western Digital, Xilinx and many more. Rick holds an Executive MBA degree from the University of Ottawa and is an honours graduate of the faculty of Electronics Engineering Technology at Algonquin College.
Abstract:
This session provides a high-level overview of the extensive products and services delivered by the CAD, FAB, and LAB business units, with a deeper dive into the CAD offering and the infrastructure that makes industry-grade CAD tools and resources available to users across Canada.
Bio:
Craig has been involved in engineering and IT infrastructure design and management for more than 20 years, with experience in both industrial and non-profit organizations. He will provide a high-level overview of the products and services CMC delivers to the research community, with additional detail about the industry-grade CAD tools and resources available to users across Canada.
- CAD tools and flows for processor design and prototyping for RISC-V and ASIPs
- FPGA/GPU cluster for machine learning
Why Attend
- To promote innovation, adoption and early access to advanced technologies, including silicon and systems for accelerating AI workloads from the cloud to the edge.
- To share insights and experiences with others, explore collaboration opportunities, and connect industry leaders with AI researchers and start-ups.
- To influence the technology selection (roadmap) and development activities of emerging AI trends.