The abstract submission deadline is May 1st, 2018 (AOE).
The last decade has seen the rapid rise of deep learning. With ever-larger problems to solve and ever-growing data sets to process, deep learning has become extremely computationally demanding. As a result, migrating deep learning algorithms and applications to modern supercomputers, especially heterogeneous supercomputers that incorporate accelerators such as GPUs, Xeon Phis, FPGAs, TPUs, and other ASICs, has become a trend.
This workshop will emphasize novel, disruptive research ideas over incremental advances. We invite novel and recently published work on topics including, but not limited to, the following:
As a brand-new workshop, we have decided to make it discussion-oriented this year. We invite 2-page, double-column submissions on novel or recently published work. We also welcome unfinished novel ideas. Please follow the ACM sigconf proceedings template (https://www.acm.org/publications/proceedings-template) using a 10-point font. Kindly note that submissions will not appear in the proceedings, so the work can be further developed and submitted to a formal conference or journal. Finished or published works will be given 25 minutes to fully describe their contributions, while unfinished works and novel ideas will be given 10 minutes to motivate the audience.
Submissions will be accepted through the EasyChair system at this link: https://easychair.org/conferences/?conf=dlmhs18
All dates are Anywhere on Earth (AOE).
13:30 to 14:15 | Keynote-1: Parallel and Distributed Deep Learning and HPC, Prof. Torsten Hoefler, ETH Zurich |
14:15 to 15:00 | Keynote-2: Challenges in Deep Learning from HPC Perspectives, Dr. Jiangming Jin, TuSimple |
15:00 to 15:30 | Coffee Break |
15:30 to 16:00 | Invited Talk-1: Efficient Allocation and Heterogeneous Composition of NVM Crossbars for Deep Learning Acceleration, Prof. Lide Duan, University of Texas at San Antonio |
15:30 to 16:00 | Invited Talk-2: ImageNet Training in Minutes, Dr. Yang You, UC Berkeley |
15:30 to 16:00 | Invited Talk-3: Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs, Dr. Xuechao Wei, Peking University |
15:30 to 16:00 | Invited Talk-4: Fflow: an FPGA extension for TensorFlow with device placement optimization based on reinforcement learning, Dr. Yongbiao Chen, Shanghai Jiao Tong University |
16:45 to 16:55 | Workshop Closing Comments |
Parallel and Distributed Deep Learning and HPC
Prof. Torsten Hoefler, ETH Zurich
Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this overview talk, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic gradient descent; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning.
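The synchronous data-parallel SGD mentioned in the abstract can be sketched in a few lines. This is an illustrative toy (the function names, the one-parameter linear model, and the in-process gradient averaging are our own assumptions, not taken from the talk); real systems perform the averaging with an MPI or NCCL allreduce across nodes rather than a Python loop:

```python
# Minimal sketch of synchronous data-parallel SGD (illustrative only;
# production systems replace the in-process average with an allreduce).

def local_gradient(w, shard):
    """Gradient of mean squared error for the toy model y = w * x
    computed on one worker's data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def sync_sgd_step(w, shards, lr=0.01):
    """Each worker computes a gradient on its own shard; the gradients
    are averaged (the synchronization point) and one shared update
    is applied, so all workers stay in lockstep."""
    grads = [local_gradient(w, s) for s in shards]  # parallel in practice
    avg_grad = sum(grads) / len(grads)              # the "allreduce"
    return w - lr * avg_grad

# Toy data from the true model y = 3x, split across two workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w = 0.0
for _ in range(200):
    w = sync_sgd_step(w, shards)
print(round(w, 3))  # converges toward 3.0
```

Asynchronous variants drop the averaging barrier: each worker pushes its gradient to a parameter store as soon as it is ready, trading gradient staleness for higher hardware utilization.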
Torsten is an Associate Professor of Computer Science at ETH Zurich, Switzerland. Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded Blue Waters project at NCSA/UIUC. He is also a key member of the Message Passing Interface (MPI) Forum where he chairs the “Collective Operations and Topologies” working group. Torsten won best paper awards at the ACM/IEEE Supercomputing Conference SC10, SC13, SC14, EuroMPI’13, HPDC’15, HPDC’16, IPDPS’15, and other conferences. He published numerous peer-reviewed scientific conference and journal articles and authored chapters of the MPI-2.2 and MPI-3.0 standards. He received the Latsis prize of ETH Zurich as well as an ERC starting grant in 2015. His research interests revolve around the central topic of “Performance-centric System Design” and include scalable networks, parallel programming techniques, and performance modeling. Additional information about Torsten can be found on his homepage at htor.inf.ethz.ch.
Challenges in Deep Learning from HPC Perspectives
Dr. Jiangming Jin, TuSimple
Deep learning has been widely discussed, from academia to industry and from infrastructure to applications. This talk examines the challenges in deep learning from an HPC perspective. Given the growing computing resources required to process deep learning workloads, it is attractive to apply HPC techniques to improve deep learning performance in both inference and training. On the deployment and inference side, with the blooming of edge devices such as GPUs, FPGAs, and ASICs, it is challenging to obtain performance gains because of varying architectural characteristics in terms of memory organizations, compute primitives, etc. This talk presents novel approaches drawn from kernel optimization, operator optimization, and graph optimization.
Dr. Jiangming Jin is the Director of the HPC Department at TuSimple, where he oversees the HPC R&D projects across TuSimple's autonomous driving system and deep learning framework. Prior to joining TuSimple, he worked as an HPC & Quantitative Research Engineer at JP Morgan (Singapore, Beijing). Jiangming received his bachelor's degree from the University of Electronic Science and Technology of China (UESTC) in 2008 and his PhD from Nanyang Technological University (NTU) in 2013.