Decentralized Distributed Deep Learning with Low-Bandwidth Consumption for Smart Constellations
For the space-based remote sensing system, onboard intelligent processing based on deep learning has become an inevitable trend. To adapt to the dynamic changes of the observation scenes, there is an urgent need to perform distributed deep learning onboard to fully utilize the plentiful real-time se...
Guardado en:
Autores principales: | , , , , |
---|---|
Formato: | article |
Lenguaje: | EN |
Publicado: |
American Association for the Advancement of Science (AAAS)
2021
|
Materias: | |
Acceso en línea: | https://doaj.org/article/afbd71e8a2d041fe99ee2012179f9d16 |
Etiquetas: |
Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
|
Sumario: | For the space-based remote sensing system, onboard intelligent processing based on deep learning has become an inevitable trend. To adapt to the dynamic changes of the observation scenes, there is an urgent need to perform distributed deep learning onboard to fully utilize the plentiful real-time sensing data of multiple satellites from a smart constellation. However, the network bandwidth of the smart constellation is very limited. Therefore, it is of great significance to carry out distributed training research in a low-bandwidth environment. This paper proposes a Randomized Decentralized Parallel Stochastic Gradient Descent (RD-PSGD) method for distributed training in a low-bandwidth network. To reduce the communication cost, each node in RD-PSGD just randomly transfers part of the information of the local intelligent model to its neighborhood. We further speed up the algorithm by optimizing the programming of random index generation and parameter extraction. For the first time, we theoretically analyze the convergence property of the proposed RD-PSGD and validate the advantage of this method by simulation experiments on various distributed training tasks for image classification on different benchmark datasets and deep learning network architectures. The results show that RD-PSGD can effectively save the time and bandwidth cost of distributed training and reduce the complexity of parameter selection compared with the TopK-based method. The method proposed in this paper provides a new perspective for the study of onboard intelligent processing, especially for online learning on a smart satellite constellation. |
---|