3D scene modeling and understanding from image sequences

Tang, Hao

doi:10.5565/rev/elcvia.617

Bibliographic citation -- Permanent link: https://ddd.uab.cat/record/119269

Date:	2014
Abstract:	A new method for 3D modeling is proposed that generates a content-based 3D mosaic (CB3M) representation for long video sequences of dynamic 3D urban scenes captured by a camera on a mobile platform. In the first phase, a set of parallel-perspective (pushbroom) mosaics with varying viewing directions is generated to capture both the 3D and the dynamic aspects of the scene within the camera's coverage. In the second phase, a unified patch-based stereo matching algorithm is applied to extract parametric representations of the color, structure and motion of the dynamic and/or 3D objects in urban scenes, where planar surfaces are abundant. Multiple pairs of stereo mosaics are used to facilitate reliable stereo matching, occlusion handling, accurate 3D reconstruction and robust moving-target detection. The outcome of this phase is a CB3M representation, a highly compressed visual representation of a dynamic 3D scene whose object contents carry both 3D and motion information. In the third phase, a multi-layer scene understanding algorithm is proposed, resulting in a planar surface model for higher-level object representations. Experimental results are given for both simulated and several real video sequences of large-scale 3D scenes to show the accuracy and effectiveness of the representation. We also show that the patch-based stereo matching algorithm and the CB3M representation can be generalized to 3D modeling with perspective views, using either a single camera or a stereovision head carried by a ground mobile platform or a pedestrian. Applications of the proposed method include airborne or ground video surveillance, 3D urban scene modeling, traffic surveys, transportation planning, and visual aids for perception and navigation for blind people.
Note:	Advisors: Dr. Zhigang Zhu, Dr. Ioannis Stamos, Dr. Jizhong Xiao and Dr. Rakesh Kumar. Date and location of PhD thesis defense: 6 December 2012, The City University of New York
Rights:	This document is subject to a Creative Commons license. Total or partial reproduction and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
Language:	English
Document:	Other ; research ; Published version
Published in:	ELCVIA : Electronic Letters on Computer Vision and Image Analysis, Vol. 13, No. 2 (2014), p. 42-44, ISSN 1577-5097

Original address: https://elcvia.cvc.uab.es/article/view/v13-n3-tang
Alternative address: https://raco.cat/index.php/ELCVIA/article/view/281632
DOI: 10.5565/rev/elcvia.617

3 pages, 364.3 KB

Record created 2014-07-29, last modified 2023-01-11