Preprint / Version 1

Improved Plane Detection in 3D Reconstruction from 2D Images

Authors

  • Joseph Quan, Polygence

DOI:

https://doi.org/10.58445/rars.3138

Keywords:

Plane Detection, 3D Reconstruction

Abstract

3D reconstruction is a fundamental technology with applications in autonomous driving, virtual reality, and game development. Plane detection is a critical component of 3D reconstruction because planes form the structural backbone of most environments. However, existing plane detection methods are limited in accuracy, and machine-learning-based plane detection is constrained by its training data. This research aims to improve plane detection in 3D reconstruction from 2D images by combining monocular depth estimation, point cloud generation, RANSAC, and clustering. In this project, I first generated depth maps from 2D images using monocular depth estimation and converted them into point clouds. I then used pyRANSAC-3D, a RANSAC-based shape-fitting library, to detect a single dominant plane. Next, I extended the algorithm to detect multiple planes and applied a clustering algorithm to remove the many artifacts that the multi-plane detection introduces. Finally, I compared the proposed method with existing approaches; the comparison indicates that it detects planes more robustly and more accurately.
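
To make the pipeline in the abstract concrete, the following is a minimal sketch of two of its stages: back-projecting a depth map into a point cloud and detecting multiple planes by running RANSAC repeatedly on the points that remain. It is not the author's implementation and does not use the pyRANSAC-3D API. It is written in pure Python, and the function names (backproject, ransac_plane, detect_planes), the pinhole intrinsics (fx, fy, cx, cy), the thresholds, and the synthetic depth map are all assumptions chosen for illustration. The clustering stage described in the abstract is omitted here.

```python
import random

def backproject(depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with depth z -> camera-frame (x, y, z)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:  # skip invalid (zero or negative) depth values
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def plane_from_points(p, q, r):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    return nx, ny, nz, -(nx * p[0] + ny * p[1] + nz * p[2])

def ransac_plane(points, threshold, iterations, rng):
    """Single-plane RANSAC: keep the candidate plane with the most inliers."""
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) < threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

def detect_planes(points, threshold=0.01, iterations=200,
                  min_inliers=20, max_planes=4, seed=0):
    """Sequential RANSAC: fit a plane, remove its inliers, repeat on the rest."""
    rng = random.Random(seed)
    remaining = list(points)
    planes = []
    while len(remaining) >= 3 and len(planes) < max_planes:
        plane, inliers = ransac_plane(remaining, threshold, iterations, rng)
        if plane is None or len(inliers) < min_inliers:
            break
        inlier_set = set(inliers)
        planes.append((plane, [remaining[i] for i in inliers]))
        remaining = [p for i, p in enumerate(remaining) if i not in inlier_set]
    return planes, remaining

# Synthetic 12x12 depth map (hypothetical data): the left half is a frontal
# surface at depth 2.0, the right half a frontal surface at depth 3.0.
depth = [[2.0 if u < 6 else 3.0 for u in range(12)] for v in range(12)]
points = backproject(depth, fx=10.0, fy=10.0, cx=6.0, cy=6.0)
planes, leftover = detect_planes(points)
print(len(planes), "planes detected,", len(leftover), "points unassigned")
```

On real depth maps, which are noisy, the inliers of one fitted plane can include spatially separate patches that merely share the same plane equation. That is the kind of artifact a density-based clustering step such as DBSCAN, applied to each plane's inliers, is intended to separate.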

References

L. Yang, H. Zhao, B. Kang, Z. Huang, Z. Zhao, X. Xu, and J. Feng, “Depth Anything V2,” arXiv preprint arXiv:2406.09414, 2024. doi:10.48550/arXiv.2406.09414

L. Mariga, “PyRANSAC-3D: A Python tool for fitting primitives 3D shapes in point clouds using RANSAC algorithm,” GitHub, 2022. [Online]. Available: https://github.com/leomariga/pyRANSAC-3D

“DBSCAN (clustering) from scikit-learn,” Scikit-learn documentation. [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html

Y. Xi, F. Shu, J. Rambach, A. Pagani, and D. Stricker, “PlaneRecNet: Multi-task learning with cross-task consistency for piece-wise plane detection and reconstruction from a single RGB image,” arXiv preprint arXiv:2110.11219, 2021. doi:10.48550/arXiv.2110.11219

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from RGBD images,” in Proc. ECCV, 2012. [Online]. Available: https://cs.nyu.edu/~fergus/datasets/nyu_depth_v2.html

“MiDaS: Monocular depth estimation,” GitHub, 2024. [Online]. Available: https://github.com/isl-org/MiDaS

Q.-Y. Zhou, J. Park, and V. Koltun, “Open3D: A modern library for 3D data processing,” arXiv preprint arXiv:1801.09847, 2018. [Online]. Available: https://arxiv.org/pdf/1801.09847

M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981. doi:10.1145/358669.358692

D. Borrmann, J. Elseberg, K. Lingemann, and A. Nüchter, “The 3D Hough transform for plane detection in point clouds: A review and a new accumulator design,” 3D Research, vol. 2, no. 2, Jun. 2011. doi:10.1007/3dres.02(2011)3

Ai4ce, “PEAC: Fast plane extraction using agglomerative hierarchical clustering (ICRA 2014),” GitHub, 2018. [Online]. Available: https://github.com/ai4ce/peac

NeurIPS, “Depth Anything V2 Poster,” 2024. [Online]. Available: https://neurips.cc/virtual/2024/poster/94431

“Hough Transforms,” ScienceDirect. [Online]. Available: https://www.sciencedirect.com/topics/computer-science/hough-transforms

M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proc. 2nd Int. Conf. Knowledge Discovery and Data Mining (KDD-96), 1996. [Online]. Available: https://www.dbs.ifi.lmu.de/Publikationen/Papers/KDD-96.final.frame.pdf

S. Liu, Y. Chen, T. Wu, S. Han, and Y. Furukawa, “PlaneRCNN: 3D plane detection and reconstruction from a single image,” arXiv preprint arXiv:1812.04072, 2018. [Online]. Available: https://arxiv.org/pdf/1812.04072

N. Silberman et al., “NYU Depth v2 dataset,” 2012. [Online]. Available: https://cs.nyu.edu/~fergus/datasets/nyu_depth_v2.html

M. Kholil et al., “Structure from Motion and Multi-View Stereo-based 3D reconstruction,” IOP Conf. Series: Materials Science and Engineering, vol. 1073, 2021. doi:10.1088/1757-899X/1073/1/012066

A. Geiger et al., “The KITTI Vision Benchmark Suite,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2012.

R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.

Posted

2025-09-27