BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230831T095746Z
LOCATION:Davos
DTSTART;TZID=Europe/Stockholm:20230627T101900
DTEND;TZID=Europe/Stockholm:20230627T102000
UID:submissions.pasc-conference.org_PASC23_sess110_pos156@linklings.com
SUMMARY:P47 - Parallel Training of Deep Neural Networks
DESCRIPTION:Poster\n\nSamuel Cruz (Università della Svizzera ital
 iana, UniDistance Suisse); Alena Kopanicakova (Brown University, Uni
 versità della Svizzera italiana); Hardik Kothari (Università della S
 vizzera italiana); and Rolf Krause (Università della Svizzera italia
 na, UniDistance Suisse)\n\nDeep neural networks (DNNs) are used in a wi
 de range of application areas and scientific fields. The accuracy an
 d expressivity of a DNN are tightly coupled to its number of paramet
 ers and to the amount of data used for training. As a consequence, b
 oth the networks and the amount of training data have grown consider
 ably over the last few years. Since this trend is expected to contin
 ue, developing novel distributed and highly scalable training method
 s has become essential. In this work, we propose two distributed tra
 ining strategies that leverage nonlinear domain-decomposition method
 s, which are well established in the field of numerical mathematics. Th
 e proposed training methods utilize decompositions of the parameter s
 pace and of the data space. We describe the necessary algorithmic in
 gredients for both training strategies. The convergence properties a
 nd scaling behavior of the training methods are demonstrated on seve
 ral benchmark problems. Moreover, we compare both proposed approache
 s with the widely used stochastic gradient optimizer, showing a sign
 ificant reduction in the number of iterations and in the execution t
 ime. Finally, we demonstrate the scalability of our PyTorch-based tr
 aining framework, which leverages CUDA and NCCL in the backend.\n\nS
 ession Chair: Jibonananda Sanyal (National Renewable Energy Laborato
 ry)
END:VEVENT
END:VCALENDAR
