Hierarchical Training of Deep Neural Networks Using Early Exiting.

dc.contributor.author: Sepehri, Yamin
dc.contributor.author: Pad, Pedram
dc.contributor.author: Yüzügüler, Ahmet Caner
dc.contributor.author: Frossard, Pascal
dc.contributor.author: Dunbar, L. Andrea
dc.date.accessioned: 2024-07-09T07:13:15Z
dc.date.available: 2024-07-09T07:13:15Z
dc.date.issued: 2024-05-14
dc.description: This article has been accepted for publication in IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2024.3396628, © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. https://ieeexplore.ieee.org/document/10530344
dc.description.abstract: Deep neural networks (DNNs) provide state-of-the-art accuracy for vision tasks, but they require significant resources for training. Thus, they are trained on cloud servers far from the edge devices that acquire the data. This increases communication cost, runtime, and privacy concerns. In this study, a novel hierarchical training method for DNNs is proposed that uses early exits in a divided architecture between edge and cloud workers to reduce the communication cost, training runtime, and privacy concerns. The method introduces a new use case for early exits: separating the backward pass of neural networks between the edge and the cloud during the training phase. We address the issues of most available methods that, due to the sequential nature of the training phase, cannot train the levels of hierarchy simultaneously or do so at the cost of compromising privacy. In contrast, our method can use both edge and cloud workers simultaneously, does not share the raw input data with the cloud, and does not require communication during the backward pass. Several simulations and on-device experiments for different neural network architectures demonstrate the effectiveness of this method. The proposed method reduces the training runtime for the VGG-16 and ResNet-18 architectures by 29% and 61% in CIFAR-10 classification and by 25% and 81% in Tiny ImageNet classification, respectively, when communication with the cloud is performed over a low-bit-rate channel. This runtime gain is achieved with a negligible drop in accuracy. The method is advantageous for online learning of high-accuracy DNNs on sensor-equipped, low-resource devices such as mobile phones or robots as part of an edge-cloud system, making them more flexible in facing new tasks and classes of data.
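For illustration, the following is a minimal, hypothetical PyTorch sketch of the general idea described in the abstract: the network is split into an edge part with an early-exit classification head and a cloud part. The edge computes its own loss from the early exit and runs its backward pass locally, while only detached intermediate features (not the raw inputs) are sent to the cloud, so no gradients cross the channel during the backward pass. All class names, layer sizes, and hyperparameters below are assumptions made for the sketch, not the architecture or training schedule used in the paper.

```python
import torch
import torch.nn as nn

class EdgeModel(nn.Module):
    """Front part of the network, running on the edge device (illustrative layers)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        # Early-exit head: lets the edge compute a local loss and run its
        # backward pass without any help from the cloud.
        self.exit_head = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 8 * 8, num_classes)
        )

    def forward(self, x):
        feats = self.features(x)
        return feats, self.exit_head(feats)

class CloudModel(nn.Module):
    """Back part of the network, running on the cloud worker (illustrative layers)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.tail = nn.Sequential(
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, feats):
        return self.tail(feats)

edge, cloud = EdgeModel(), CloudModel()
opt_edge = torch.optim.SGD(edge.parameters(), lr=0.01)
opt_cloud = torch.optim.SGD(cloud.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

def train_step(x, y):
    # Edge side: forward pass, local early-exit loss, local backward pass.
    feats, edge_logits = edge(x)
    loss_edge = criterion(edge_logits, y)
    opt_edge.zero_grad()
    loss_edge.backward()        # backward pass stays entirely on the edge
    opt_edge.step()

    # Only detached features and labels are sent to the cloud, never the raw
    # inputs; detaching also means no gradients flow back over the link.
    feats_to_cloud = feats.detach()

    # Cloud side: its own forward and backward pass, independent of the edge.
    cloud_logits = cloud(feats_to_cloud)
    loss_cloud = criterion(cloud_logits, y)
    opt_cloud.zero_grad()
    loss_cloud.backward()
    opt_cloud.step()
    return loss_edge.item(), loss_cloud.item()

if __name__ == "__main__":
    # Example usage with random data standing in for a real dataset.
    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    print(train_step(x, y))
```

In an actual edge-cloud deployment, the detached features and labels would travel over a network link (possibly compressed), and the edge and cloud updates could run concurrently since neither backward pass depends on the other.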
dc.identifier.citation: IEEE Transactions on Neural Networks and Learning Systems
dc.identifier.doi: 10.1109/TNNLS.2024.3396628
dc.identifier.issn: 2162-2388
dc.identifier.other: 38743536
dc.identifier.pmid: 38743536
dc.identifier.uri: https://hdl.handle.net/20.500.12839/1433
dc.identifier.url: https://ieeexplore.ieee.org/document/10530344
dc.language.iso: en
dc.source.country: United States
dc.source.journaltitle: IEEE transactions on neural networks and learning systems
dc.source.volume: PP
dc.title: Hierarchical Training of Deep Neural Networks Using Early Exiting.
dc.type: Journal Article
dc.type: Article
dc.type.csemdivisions: BU-M
dc.type.csemresearchareas: Data & AI
dc.type.csemresearchareas: IoT & Vision
Files
Original bundle
Name: Hierarchical_Training_of_DNNs.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.82 KB
Format: Item-specific license agreed upon to submission