DEEP LEARNING-BASED OBJECT DETECTION FOR AUTONOMOUS DRIVING: APPLICATIONS AND OPEN CHALLENGES

Authors

DOI:

https://doi.org/10.32782/IT/2024-3-1

Keywords:

object detection, autonomous driving, deep learning, transformers, attention mechanisms, occlusion handling, real-time performance

Abstract

Object detection is a critical component of autonomous driving systems, enabling accurate identification and localization of vehicles, pedestrians, cyclists, traffic signs, and other road objects. Deep learning has transformed this field, raising detection accuracy and robustness to unprecedented levels. This paper presents a survey of state-of-the-art deep learning-based object detection methods for autonomous driving applications using monocular camera input. The purpose of this work is to provide a unified perspective on modern deep learning approaches to object detection suited to the unique requirements of autonomous driving. The monocular camera modality is chosen for its cost-effectiveness, widespread availability, and compatibility with existing automotive hardware. The focus is solely on deep learning techniques because of their ability to learn rich feature representations directly from data. The methodology involves a systematic review of real-world applications and challenges, including pedestrian detection, traffic sign recognition, low-light conditions, and real-time performance requirements. The scientific novelty of this work lies in consolidating the latest developments in camera-based object detection for autonomous driving into a comprehensive and up-to-date resource for researchers and practitioners. The survey offers insights into emerging techniques, such as attention mechanisms, multi-scale feature fusion, and model compression, which address critical challenges like occlusion handling, small object detection, and computational efficiency. Furthermore, it explores the potential of explainable AI and meta-learning to enhance the transparency, interpretability, and generalization capabilities of object detectors in autonomous driving contexts. Conclusions. Deep learning-based object detection has made significant strides in recent years, enabling robust and accurate perception for autonomous vehicles. However, challenges persist in real-world deployment, including coping with diverse lighting conditions and adverse weather and ensuring reliable performance under occlusion. This survey highlights promising research directions, such as incorporating attention mechanisms, temporal information, and multi-scale architectures, to address these challenges and pave the way for safer and more reliable autonomous driving systems.
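
As an illustration of two of the techniques the survey highlights (multi-scale feature fusion and lightweight attention), the following PyTorch sketch shows a minimal FPN-style fusion block with a channel-attention gate. It is an illustrative sketch only, not code from any of the surveyed detectors; the module names, channel sizes, and strides are assumptions chosen for clarity.

```python
# Illustrative sketch (not from the surveyed papers): FPN-style multi-scale
# fusion with a lightweight channel-attention gate, as commonly used to help
# small-object detection and occlusion handling in camera-based detectors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gate that re-weights feature channels."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global average pool -> per-channel weights in [0, 1].
        w = self.fc(x.mean(dim=(2, 3)))
        return x * w[:, :, None, None]


class MultiScaleFusion(nn.Module):
    """Top-down fusion of backbone features at three scales (FPN-style)."""
    def __init__(self, in_channels=(256, 512, 1024), out_channels: int = 256):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        self.attn = nn.ModuleList(
            ChannelAttention(out_channels) for _ in in_channels
        )

    def forward(self, feats):
        # feats: feature maps ordered fine -> coarse, e.g. strides 8, 16, 32.
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        # Propagate coarse, semantically strong features down to finer levels.
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest"
            )
        # Attention re-weights each fused level before the detection heads.
        return [a(x) for a, x in zip(self.attn, laterals)]


if __name__ == "__main__":
    # Fake backbone outputs for a 640x640 input at strides 8, 16, 32.
    c3 = torch.randn(1, 256, 80, 80)
    c4 = torch.randn(1, 512, 40, 40)
    c5 = torch.randn(1, 1024, 20, 20)
    fused = MultiScaleFusion()([c3, c4, c5])
    print([f.shape for f in fused])  # three maps, each with 256 channels
```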

References

Lyssenko M. A safety-adapted loss for pedestrian detection in automated driving / M. Lyssenko, P. Pimplikar, M. Bieshaar, [et al.]. arXiv, 2024.

Mishra S. Real-time pedestrian detection using YOLO / S. Mishra, S. Jabin. 2023.

Zuo X. Pedestrian detection based on one-stage YOLO algorithm / X. Zuo, J. Li, J. Huang, [et al.]. Journal of Physics: Conference Series. 2021. Vol. 1871, No. 1. P. 012131.

Lan W. Pedestrian detection based on YOLO network model / W. Lan, J. Dang, Y. Wang, S. Wang. 2018.

Wei J. Infrared pedestrian detection using improved UNet and YOLO through sharing visible light domain information / J. Wei, S. Su, Z. Zhao, [et al.]. Measurement. 2023. Vol. 221. P. 113442.

Zhang Y. A lightweight vehicle-pedestrian detection algorithm based on attention mechanism in traffic scenarios / Y. Zhang, A. Zhou, F. Zhao, H. Wu. Sensors (Basel, Switzerland). 2022. Vol. 22, No. 21. P. 8480.

Chen Y. TF-YOLO: a transformer-fusion-based YOLO detector for multimodal pedestrian detection in autonomous driving scenes / Y. Chen, J. Ye, X. Wan. World Electric Vehicle Journal. 2023. Vol. 14, No. 12. P. 352.

Lin M. DETR for crowd pedestrian detection / M. Lin, C. Li, X. Bu, [et al.]. arXiv, 2021.

Han W. IDPD: improved Deformable-DETR for crowd pedestrian detection / W. Han, N. He, X. Wang, [et al.]. Signal, Image and Video Processing. 2024. Vol. 18, No. 3. P. 2243–2253.

Yuan J. Effectiveness of vision transformer for fast and accurate single-stage pedestrian detection / J. Yuan, P. Barmpoutis, T. Stathaki. Advances in Neural Information Processing Systems. 2022. Vol. 35. P. 27427–27440.

Deng S. Efficient dense pedestrian detection based on transformer / S. Deng, J. Li. 2024.

Siam M. MODNet: motion and appearance based moving object detection network for autonomous driving / M. Siam, H. Mahgoub, M. Zahran, [et al.]. 2018.

Yahiaoui M. FisheyeMODNet: moving object detection on surround-view cameras for autonomous driving / M. Yahiaoui, H. Rashed, L. Mariotti, [et al.]. arXiv, 2019.

Rashed H. BEV-MODNet: monocular camera based bird’s eye view moving object detection for autonomous driving / H. Rashed, M. Essam, M. Mohamed, [et al.]. 2021.

Rashed H. VM-MODNet: vehicle motion aware moving object detection for autonomous driving / H. Rashed, A. E. Sallab, S. Yogamani. 2021.

Hernandez A. E. G. Recognize moving objects around an autonomous vehicle considering a deep-learning detector model and dynamic Bayesian occupancy / A. E. G. Hernandez, O. Erkent, C. Laugier. Shenzhen, China: IEEE, 2020.

Ramzy M. RST-MODNet: real-time spatio-temporal moving object detection for autonomous driving / M. Ramzy, H. Rashed, A. El Sallab, S. Yogamani. 2019.

Jha S. Real time object detection and tracking system for video surveillance system / S. Jha, C. Seo, E. Yang, G. P. Joshi. Multimedia Tools and Applications. 2021. Vol. 80, No. 3. P. 3981–3996.

Zhou Z. RGB-event fusion for moving object detection in autonomous driving / Z. Zhou, Z. Wu, R. Boutteau, [et al.]. arXiv, 2023.

Liu D. Video object detection for autonomous driving: motion-aid feature calibration / D. Liu, Y. Cui, Y. Chen, [et al.]. Neurocomputing. 2020. Vol. 409. P. 1–11.

Wang J. Improved YOLOv5 network for real-time multi-scale traffic sign detection / J. Wang, Y. Chen, M. Gao, Z. Dong. arXiv, 2021.

Liu H. ETSR-YOLO: an improved multi-scale traffic sign detection algorithm based on YOLOv5 / H. Liu, K. Zhou, Y. Zhang, Y. Zhang. PLOS ONE. 2023. Vol. 18, No. 12. P. e0295807.

Chen T. MFL-YOLO: an object detection model for damaged traffic signs / T. Chen, J. Ren. arXiv, 2023.

You S. Traffic sign detection method based on improved SSD / S. You, Q. Bi, Y. Ji, [et al.]. Information. 2020. Vol. 11, No. 10. P. 475.

Wu J. Traffic sign detection based on SSD combined with receptive field module and path aggregation network / J. Wu, S. Liao. Computational Intelligence and Neuroscience. 2022. Vol. 2022. P. 4285436.

Greer R. Salient sign detection in safe autonomous driving: AI which reasons over full visual context / R. Greer, A. Gopalkrishnan, N. Deo, [et al.]. arXiv, 2023.

Greer R. Robust traffic light detection using salience-sensitive loss: computational framework and evaluations / R. Greer, A. Gopalkrishnan, J. Landgren, [et al.]. arXiv, 2023.

Farzipour A. Traffic sign recognition using local vision transformer / A. Farzipour, O. N. Manzari, S. B. Shokouhi. arXiv, 2023.

Xia J. DSRA-DETR: an improved DETR for multiscale traffic sign detection / J. Xia, M. Li, W. Liu, X. Chen. Sustainability. 2023. Vol. 15, No. 14. P. 10862.

Chen S. A semi-supervised learning framework combining CNN and multiscale transformer for traffic sign detection and recognition / S. Chen, Z. Zhang, L. Zhang, [et al.]. IEEE Internet of Things Journal. 2024. Vol. 11, No. 11. P. 19500–19519.

Hu S. PseudoProp: robust pseudo-label generation for semi-supervised object detection in autonomous driving systems / S. Hu, C.-H. Liu, J. Dutta, [et al.]. 2022.

Chen W. Robust object detection for autonomous driving based on semi-supervised learning / W. Chen, J. Yan, W. Huang, [et al.]. Security and Safety. 2024. Vol. 3. P. 2024002.

He Y. Pseudo-label correction and learning for semi-supervised object detection / Y. He, W. Chen, K. Liang, [et al.]. arXiv, 2023.

Shehzadi T. Sparse Semi-DETR: sparse learnable queries for semi-supervised object detection / T. Shehzadi, K. A. Hashmi, D. Stricker, M. Z. Afzal. arXiv, 2024.

Miraliev S. Real-time memory efficient multitask learning model for autonomous driving / S. Miraliev, S. Abdigapporov, V. Kakani, H. Kim. IEEE Transactions on Intelligent Vehicles. 2024. Vol. 9, No. 1. P. 247–258.

Mahaur B. An improved lightweight small object detection framework applied to real-time autonomous driving / B. Mahaur, K. K. Mishra, A. Kumar. Expert Systems with Applications. 2023. Vol. 234. P. 121036.

Zhou Q. DPNet: dual-path network for real-time object detection with lightweight attention / Q. Zhou, H. Shi, W. Xiang, [et al.]. arXiv, 2022.

Wang A. YOLOv10: real-time end-to-end object detection / A. Wang, H. Chen, L. Liu, [et al.]. arXiv, 2024.

Zhao Y. DETRs beat YOLOs on real-time object detection / Y. Zhao, W. Lv, S. Xu, [et al.]. arXiv, 2024.

Wang X. Low-light traffic objects detection for automated vehicles / X. Wang, D. Wang, S. Li, [et al.]. 2022.

Pham L. H. Low-light image enhancement for autonomous driving systems using DriveRetinex-Net / L. H. Pham, D. N.-N. Tran, J. W. Jeon. 2020.

Qiu Y. IDOD-YOLOv7: image-dehazing YOLOv7 for object detection in low-light foggy traffic environments / Y. Qiu, Y. Lu, Y. Wang, H. Jiang. Sensors. 2023. Vol. 23, No. 3. P. 1347.

Liu W. Image-adaptive YOLO for object detection in adverse weather conditions / W. Liu, G. Ren, R. Yu, [et al.]. arXiv, 2022.

Guo Z. HawkDrive: a transformer-driven visual perception system for autonomous driving in night scene / Z. Guo, S. Perminov, M. Konenkov, D. Tsetserukou. arXiv, 2024.

Ye L. VELIE: a vehicle-based efficient low-light image enhancement method for intelligent vehicles / L. Ye, D. Wang, D. Yang, [et al.]. Sensors. 2024. Vol. 24, No. 4. P. 1345.

Zhang S. Guided attention in CNNs for occluded pedestrian detection and re-identification / S. Zhang, D. Chen, J. Yang, B. Schiele. International Journal of Computer Vision. 2021. Vol. 129, No. 6. P. 1875–1892.

Zou T. Attention guided neural network models for occluded pedestrian detection / T. Zou, S. Yang, Y. Zhang, M. Ye. Pattern Recognition Letters. 2020. Vol. 131. P. 91–97.

Qi M. Exploring reliable infrared object tracking with spatio-temporal fusion transformer / M. Qi, Q. Wang, S. Zhuang, [et al.]. Knowledge-Based Systems. 2024. Vol. 284. P. 111234.

Huang T. Dense pedestrian detection based on multiple link feature pyramid networks / T. Huang, S. Yang, T. Yu, X. Fu. Hangzhou, China: SPIE, 2023.

Yahya M. Object detection and recognition in autonomous vehicles using fast region-convolutional neural network / M. Yahya, R. A. Reddy, K. Al-Attabi, [et al.]. 2023 International Conference on Integrated Intelligence and Communication Systems (ICIICS). 2023. P. 1–5.

Dang J. HA-FPN: hierarchical attention feature pyramid network for object detection / J. Dang, X. Tang, S. Li. Sensors. 2023. Vol. 23, No. 9. P. 4508.

Xie B. FocusTR: focusing on valuable feature by multiple transformers for fusing feature pyramid on object detection / B. Xie, L. Yang, Z. Yang, [et al.]. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 2022. P. 518–525.

Liu H. Flexi-compression: a flexible model compression method for autonomous driving / H. Liu, Y. He, F. R. Yu, J. James. Proceedings of the 11th ACM Symposium on Design and Analysis of Intelligent Vehicular Networks and Applications. 2021. P. 19–26.

Youn E. Compressing vision transformers for low-resource visual learning / E. Youn, S. M. J, S. Prabhu, S. Chen. 2023.

Lan Q. Instance, scale, and teacher adaptive knowledge distillation for visual detection in autonomous driving / Q. Lan, Q. Tian. IEEE Transactions on Intelligent Vehicles. 2023. Vol. 8, No. 3. P. 2358–2370.

Agand P. Knowledge distillation from single-task teachers to multi-task student for end-to-end autonomous driving / P. Agand. Proceedings of the AAAI Conference on Artificial Intelligence. 2024. Vol. 38, No. 21. P. 23375–23376.

Li Z. Quasar-ViT: hardware-oriented quantization-aware architecture search for vision transformers / Z. Li, A. Lu, Y. Xie, [et al.]. 2024.

Sun E. Transformer-based few-shot object detection in traffic scenarios / E. Sun, D. Zhou, Y. Tian, [et al.]. Applied Intelligence. 2024. Vol. 54, No. 1. P. 947–958.

Dong J. Why did the AI make that decision? Towards an explainable artificial intelligence (XAI) for autonomous driving systems / J. Dong, S. Chen, M. Miralinaghi, [et al.]. Transportation Research Part C: Emerging Technologies. 2023. Vol. 156. P. 104358.

Cultrera L. Visual attention and explainability in end-to-end autonomous driving / L. Cultrera. 2023.

Adom I. RB-XAI: relevance-based explainable AI for traffic detection in autonomous systems / I. Adom, M. N. Mahmoud. SoutheastCon 2024. P. 1358–1367.

Kolekar S. Explainable AI in scene understanding for autonomous vehicles in unstructured traffic environments on Indian roads using the Inception U-Net model with Grad-CAM visualization / S. Kolekar, S. Gite, B. Pradhan, A. Alamri. Sensors. 2022. Vol. 22, No. 24. P. 9677.

Published

2024-12-06