Web of Science: 55 citations; Scopus: 64 citations.
Multimodal end-to-end autonomous driving
Xiao, Yi (Centre de Visió per Computador (Bellaterra, Catalunya))
Codevilla Moraes, Felipe (Centre de Visió per Computador (Bellaterra, Catalunya))
Gurram, Akhil (Huawei Munich Research Center)
Urfalioglu, Onay (Huawei Munich Research Center)
López Peña, Antonio M. (Universitat Autònoma de Barcelona. Departament de Ciències de la Computació)

Date: 2022
Description: 11 p.
Abstract: A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) that is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception, maneuver planning, and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The latter are relatively less studied but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyzes whether combining RGB and depth modalities, i.e., using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid, and late fusion schemes, in both multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality.
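The early fusion scheme mentioned in the abstract combines the modalities at the input level: the RGB image and the depth map are stacked into a single RGBD tensor before the network's first layer, so all channels are processed jointly. A minimal sketch of this idea (the resolution and preprocessing here are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

def early_fusion(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Stack an HxWx3 RGB image and an HxW (or HxWx1) depth map into
    a single HxWx4 RGBD tensor, as in an early-fusion input scheme."""
    if depth.ndim == 2:  # allow plain HxW depth maps
        depth = depth[..., np.newaxis]
    return np.concatenate([rgb, depth], axis=-1)

# Hypothetical example inputs (CARLA-like camera resolution assumed)
rgb = np.zeros((88, 200, 3), dtype=np.float32)
depth = np.ones((88, 200), dtype=np.float32)
rgbd = early_fusion(rgb, depth)
print(rgbd.shape)  # (88, 200, 4)
```

Mid and late fusion differ only in where the combination happens: mid fusion merges per-modality feature maps inside the network, while late fusion combines the outputs of separate per-modality networks.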
Grants: Agencia Estatal de Investigación TIN2017-88709-R
Agència de Gestió d'Ajuts Universitaris i de Recerca 2017/FI-B1-00162
Note: Other grants: Antonio M. Lopez acknowledges the financial support by ICREA under the ICREA Academia Program. We also thank the Generalitat de Catalunya CERCA Program, as well as its ACCIO agency.
Rights: All rights reserved.
Language: English
Document: Article ; research ; Accepted version for publication
Subject: Semantics ; Task analysis ; Laser radar ; Autonomous vehicles ; Cameras
Published in: IEEE Transactions on Intelligent Transportation Systems, Vol. 23, issue 1 (Jan. 2022), p. 537-547, ISSN 1558-0016

DOI: 10.1109/TITS.2020.3013234


Postprint: 11 p., 1.7 MB

The record appears in the collections:
Articles > Research articles
Articles > Published articles

Record created 2023-05-16, last modified 2024-02-03
