Passage-Aware Structural Mapping for RGB-D Visual SLAM

TL;DR

Introduces a passage-aware structural mapping method for RGB-D Visual SLAM that detects doors and traversable openings.

cs.RO | Advanced | 2026-04-28

Ali Tourani, Miguel Fernandez-Cortizas, Saad Ejaz, David Pérez Saura, Asier Bikandi-Noya, Jose Luis Sanchez-Lopez, Holger Voos


Visual SLAM, Semantic SLAM, Structural Mapping, Indoor Navigation, BIM

Key Findings

Methodology

This paper presents a passage-aware structural mapping approach that detects doors and traversable openings by jointly fusing geometric, semantic, and topological cues. Doors are modeled as planar entities embedded within walls and classified as traversable or non-traversable based on their coplanarity with the supporting wall. Passages are inferred through two complementary strategies: traversal evidence accumulated from camera-wall interactions across consecutive keyframes, and geometric opening validation based on discontinuities in the mapped wall geometry. The proposed method is integrated into vS-Graphs as a proof of concept, enriching its scene graph with passage-level abstractions and improving room connectivity modeling.
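As a rough illustration of the coplanarity test described above, the sketch below compares a door plane with its supporting wall plane. The `Plane` type, the threshold values, and the mapping "coplanar means closed, hence non-traversable" are assumptions made for this example; the summary does not state how the paper implements the test.

```python
import math
from dataclasses import dataclass


@dataclass
class Plane:
    """Plane n . x + d = 0 with a unit-length normal (hypothetical helper type)."""
    normal: tuple  # (nx, ny, nz)
    d: float


def angle_between_deg(n1, n2):
    """Unsigned angle between two unit normals, ignoring which way they point."""
    dot = abs(sum(a * b for a, b in zip(n1, n2)))
    return math.degrees(math.acos(min(1.0, dot)))


def classify_door(door, wall, door_center, max_angle_deg=10.0, max_offset_m=0.10):
    """Return 'non-traversable' when the door lies in the wall plane, else 'traversable'.

    Assumption: a door flush with its wall is closed; a door swung out of the
    wall plane is open. Both thresholds are illustrative values.
    """
    angle = angle_between_deg(door.normal, wall.normal)
    # Distance from the door centroid to the wall plane.
    offset = abs(sum(n * c for n, c in zip(wall.normal, door_center)) + wall.d)
    coplanar = angle <= max_angle_deg and offset <= max_offset_m
    return "non-traversable" if coplanar else "traversable"


wall = Plane(normal=(1.0, 0.0, 0.0), d=-2.0)          # the wall x = 2
closed_door = Plane(normal=(1.0, 0.0, 0.0), d=-2.02)  # flush with the wall
open_door = Plane(normal=(0.0, 1.0, 0.0), d=-1.0)     # swung 90 degrees
print(classify_door(closed_door, wall, (2.02, 1.0, 1.0)))
print(classify_door(open_door, wall, (2.4, 1.0, 1.0)))
```

Using both an angle and an offset check avoids treating a plane that is parallel to the wall but well in front of it as part of the wall.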

Key Results

Qualitative evaluations on indoor office sequences demonstrate reliable doorway detection. Integrating the method into vS-Graphs enriches the scene graph with passage-level abstractions and improves room connectivity modeling.
The evaluation is qualitative: no numerical accuracy or robustness comparison against other methods is given, and quantitative benchmarking is listed as future work.
Geometric opening validation acts as a check on candidate passages and is meant to reduce false positives, particularly in scenes with occlusions or complex backgrounds.

Significance

This research holds significant implications for academia and industry, particularly in indoor navigation and Building Information Modeling (BIM). By introducing passage-aware structural mapping, the paper provides a novel solution for indoor robot navigation, addressing the underexplored area of door and passage detection in traditional Visual SLAM frameworks. This method not only enhances the structural and semantic level of map reconstruction but also lays the groundwork for future BIM-informed SLAM applications.

Technical Contribution

Technical contributions include: 1) Introducing a method that combines geometric, semantic, and topological cues for passage detection, filling a gap in existing SLAM frameworks regarding door and passage detection; 2) By integrating this method into vS-Graphs, its effectiveness in improving room connectivity modeling and scene understanding is validated; 3) Providing an open-source codebase to facilitate further research and application in this field.

Novelty

The method brings passage awareness into RGB-D Visual SLAM, detecting doors and passages by combining geometric, semantic, and topological cues. Unlike approaches that depend on markers placed in the environment beforehand, it needs no such markers, which makes it more practical and scalable.

Limitations

The method may perform poorly in dynamic environments or with fast-moving cameras, as it relies on camera-wall interactions across consecutive keyframes.
In extreme lighting conditions, degraded RGB-D sensor performance may reduce the accuracy of door and passage detection.
The method has been validated only on indoor office sequences and has yet to be tested in larger-scale or more complex environments.

Future Work

Future research directions include: 1) Incorporating doors and passages directly into factor graph optimization for tighter coupling between traversability reasoning and pose estimation; 2) Conducting quantitative benchmarking across more diverse indoor environments; 3) Exploring deeper integration with BIM models to enhance structural consistency and completeness.

AI Executive Summary

Doors and passages are critical structural elements for indoor robot navigation, yet modern Visual SLAM frameworks often overlook them. Existing SLAM methods mainly focus on static objects such as walls, tables, and chairs, and give comparatively little attention to detecting and modeling passages.

This paper proposes a passage-aware structural mapping approach that detects doors and traversable openings by jointly fusing geometric, semantic, and topological cues. Integrating the method into vS-Graphs enriches the scene graph with passage-level abstractions and improves room connectivity modeling. Specifically, doors are modeled as planar entities embedded within walls and classified as traversable or non-traversable based on their coplanarity with the supporting wall.

Passages are inferred through two complementary strategies: traversal evidence accumulated from camera-wall interactions across consecutive keyframes, and geometric opening validation based on discontinuities in the mapped wall geometry. Qualitative evaluations show that the method reliably detects doorways in indoor office sequences, laying the groundwork for future BIM-informed SLAM applications.
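One way to picture the traversal-evidence strategy is to check whether the camera moved from one side of a mapped wall plane to the other between consecutive keyframes, and to accumulate such crossings until enough of them agree. The sketch below is a minimal illustration under that assumption; the function names, the clustering radius, and the vote threshold are hypothetical and not taken from the paper.

```python
import math


def signed_distance(normal, d, point):
    """Signed distance from a point to the plane n . x + d = 0 (unit normal)."""
    return sum(n * p for n, p in zip(normal, point)) + d


def wall_crossings(camera_positions, normal, d):
    """Points where the path between consecutive keyframes crosses the wall plane."""
    crossings = []
    for p0, p1 in zip(camera_positions, camera_positions[1:]):
        s0 = signed_distance(normal, d, p0)
        s1 = signed_distance(normal, d, p1)
        if s0 * s1 < 0:  # the two keyframes lie on opposite sides of the wall
            t = s0 / (s0 - s1)  # linear interpolation parameter at the plane
            crossings.append(tuple(a + t * (b - a) for a, b in zip(p0, p1)))
    return crossings


def passage_candidates(crossings, radius=0.6, min_votes=2):
    """Greedily cluster crossings; clusters with enough votes become candidates."""
    clusters = []
    for c in crossings:
        for cluster in clusters:
            center = [sum(v) / len(cluster) for v in zip(*cluster)]
            if math.dist(c, center) <= radius:
                cluster.append(c)
                break
        else:
            clusters.append([c])
    return [
        tuple(sum(v) / len(cluster) for v in zip(*cluster))
        for cluster in clusters
        if len(cluster) >= min_votes
    ]


# The camera walks through the wall x = 2 and later walks back.
path = [(1.0, 1.0, 1.5), (3.0, 1.0, 1.5), (3.0, 1.1, 1.5), (1.0, 1.1, 1.5)]
crossings = wall_crossings(path, normal=(1.0, 0.0, 0.0), d=-2.0)
print(passage_candidates(crossings))
```

Requiring several agreeing crossings before reporting a candidate is one simple guard against a single noisy pose estimate creating a spurious passage.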

The evaluation is qualitative, so the summary offers no numerical comparison of accuracy or robustness against other methods. Geometric opening validation is meant to reduce false positives, particularly in scenes with occlusions or complex backgrounds.

However, the method may perform poorly in dynamic environments or with fast-moving cameras, as it relies on camera-wall interactions across consecutive keyframes. In extreme lighting conditions, degraded RGB-D sensor performance may also reduce the accuracy of door and passage detection. Future research directions include incorporating doors and passages directly into factor graph optimization for tighter coupling between traversability reasoning and pose estimation.

Deep Analysis

Background

Simultaneous Localization and Mapping (SLAM) has emerged as a fundamental capability of modern autonomous robots, allowing them to estimate their pose while incrementally reconstructing the surrounding environment. Among the available sensing modalities in SLAM, vision sensors provide a cost-effective means of capturing rich visual and structural data, leading to the emergence of Visual SLAM (VSLAM). Despite considerable progress in VSLAM, challenges remain in retrieving geometric information, especially in scenarios involving indoor navigation. Doors and passages, as critical structural elements in indoor environments, remain underexplored in existing VSLAM frameworks. By augmenting VSLAM with semantic information, more interpretable and structurally meaningful map reconstruction can be achieved, opening new possibilities for further research and applications.

Core Problem

In indoor environments, doors and passages are key elements that define rooms and establish their inter-connectivity. However, existing VSLAM methods fall short in detecting and modeling these elements. This is primarily due to the lack of effective detection of geometric discontinuities in walls and insufficient integration of semantic and topological information for doors and passages. Addressing this issue is crucial for improving the robustness and efficiency of indoor navigation, especially in complex and dynamic environments. Current methods often rely on prior environmental markers, limiting their practicality and scalability. Therefore, developing a method that can reliably detect doors and passages without relying on environmental markers is a significant challenge in current research.

Innovation

The core innovations of this paper include the introduction of a passage-aware structural mapping method that detects doors and traversable openings by jointly fusing geometric, semantic, and topological cues. Specific innovations include: 1) Modeling doors as planar entities embedded within walls and classifying them as traversable or non-traversable based on their coplanarity with the supporting wall; 2) Proposing two complementary strategies for passage inference: traversal evidence accumulated from camera-wall interactions across consecutive keyframes, and geometric opening validation based on discontinuities in the mapped wall geometry; 3) Integrating this method into vS-Graphs as a proof of concept, enriching its scene graph with passage-level abstractions and improving room connectivity modeling. These innovations provide a novel solution for indoor robot navigation, addressing the underexplored area of door and passage detection in traditional Visual SLAM frameworks.
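To make the idea of geometric opening validation more concrete, the sketch below projects wall points onto the wall's horizontal axis and looks for a gap whose width is plausible for a doorway. It assumes a vertical wall, a z-up frame with the floor near z = 0, and illustrative width and height limits; these choices are assumptions for the example, not the paper's procedure.

```python
import math


def horizontal_axis(normal, up=(0.0, 0.0, 1.0)):
    """Unit vector lying in the wall plane and perpendicular to `up`.

    Assumes the wall is vertical, so `normal` is not parallel to `up`.
    """
    axis = (
        up[1] * normal[2] - up[2] * normal[1],
        up[2] * normal[0] - up[0] * normal[2],
        up[0] * normal[1] - up[1] * normal[0],
    )
    norm = math.sqrt(sum(c * c for c in axis))
    return tuple(c / norm for c in axis)


def find_openings(wall_points, normal, min_width=0.7, max_width=1.5,
                  height_band=(0.3, 1.8)):
    """Return (start, end) intervals along the wall where no points were mapped.

    Only points inside `height_band` are used, so a lintel above a doorway
    does not hide the gap. All limits are illustrative values in metres.
    """
    axis = horizontal_axis(normal)
    coords = sorted(
        sum(pi * ai for pi, ai in zip(p, axis))
        for p in wall_points
        if height_band[0] <= p[2] <= height_band[1]
    )
    return [(a, b) for a, b in zip(coords, coords[1:])
            if min_width <= b - a <= max_width]


# Wall x = 2 sampled every 5 cm along y, with a gap between y = 1.5 and y = 2.4.
wall = [(2.0, i / 20, z) for i in range(81) for z in (0.5, 1.0, 1.5)
        if not 30 < i < 48]
lintel = [(2.0, i / 20, 2.2) for i in range(31, 48)]  # wall above the doorway
print(find_openings(wall + lintel, normal=(1.0, 0.0, 0.0)))
```

The width limits reject both small holes caused by sparse depth data and very large unmapped regions that are more likely unexplored wall than a doorway.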

Methodology

The methodology of this paper includes the following key steps:

Input: Given an RGB-D point cloud at the VSLAM KeyFrame level, a panoptic segmentation method such as YOSO is used to extract semantically meaningful planar entities, including walls and doors.

Process: Each KeyFrame is annotated with pixel-wise semantic labels and instance-level masks, which are projected onto the point cloud to obtain semantically segmented point subsets.

Output: RANSAC plane fitting is applied to estimate semantically validated planar entities, and detected entities are inserted into the map for continuous structural reconstruction.

Passage Inference: Passages are inferred through two strategies: 1) Traversal evidence accumulated from camera-wall interactions; 2) Geometric opening validation based on discontinuities in the mapped wall geometry.
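The first three steps above can be sketched as follows: back-project the depth pixels covered by one instance mask into 3D, then fit a plane to that point subset with RANSAC. This is a minimal pure-Python sketch; the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the inlier threshold, and the iteration count are assumed values, and the segmentation mask is taken as given rather than produced by YOSO.

```python
import math
import random


def backproject_masked(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels to camera-frame 3D points (pinhole model)."""
    points = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, keep) in enumerate(zip(depth_row, mask_row)):
            if keep and z > 0:  # skip pixels outside the mask or without depth
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_from_three(p0, p1, p2):
    """Plane (unit normal, d) through three points, or None if they are collinear."""
    n = _cross(_sub(p1, p0), _sub(p2, p0))
    norm = math.sqrt(_dot(n, n))
    if norm < 1e-9:
        return None
    n = tuple(c / norm for c in n)
    return n, -_dot(n, p0)


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Return the plane with the most inliers among randomly sampled candidates."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_three(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        inliers = [p for p in points if abs(_dot(n, p) + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

A typical call would pass the points from `backproject_masked` for a single wall or door instance to `ransac_plane` and keep the plane only if the inlier count is high enough; that acceptance rule is left to the caller here.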

Experiments

The experiments use data collected in indoor office environments with the SMapper device to validate the proposed passage detection pipeline. The method runs inside vS-Graphs, which serves as the host framework, and the evaluation is qualitative, focusing on whether doorways are detected reliably. Tunable parameters include the RANSAC plane-fitting thresholds and the distance and angle thresholds used by the passage inference strategies.

Results

Qualitative results show that the proposed method reliably detects doorways in indoor office sequences, laying the groundwork for future BIM-informed SLAM applications. No quantitative comparison with other methods is reported. Geometric opening validation is meant to reduce false positives, particularly in scenes with occlusions or complex backgrounds. The detected passages enrich the scene graph with passage-level abstractions, improving room connectivity modeling and scene understanding.

Applications

The application scenarios of this method include indoor robot navigation, Building Information Modeling (BIM), and smart building management. By detecting and modeling doors and passages, robots can better understand the structural environment, improving navigation and path planning efficiency. In BIM applications, this method can be used to verify and update building models, enhancing the intelligence of building management. The method can also be used in smart building management systems to improve building safety and energy efficiency by detecting and monitoring passage usage.
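As a small illustration of how passage-level information could support path planning, the sketch below treats rooms as graph nodes and traversable passages as edges, then searches for a room-to-room route. The dictionary layout and the room and door identifiers are hypothetical stand-ins; they do not reproduce the vS-Graphs scene graph format.

```python
from collections import deque


def build_room_graph(passages):
    """Adjacency list of rooms connected by traversable passages."""
    graph = {}
    for passage in passages:
        if not passage["traversable"]:
            continue  # closed doors do not connect rooms
        a, b = passage["rooms"]
        graph.setdefault(a, []).append((b, passage["id"]))
        graph.setdefault(b, []).append((a, passage["id"]))
    return graph


def route(graph, start, goal):
    """Breadth-first search; returns the passage ids to cross, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        room, path = queue.popleft()
        if room == goal:
            return path
        for next_room, passage_id in graph.get(room, []):
            if next_room not in seen:
                seen.add(next_room)
                queue.append((next_room, path + [passage_id]))
    return None


passages = [
    {"id": "door_1", "rooms": ("office", "corridor"), "traversable": True},
    {"id": "door_2", "rooms": ("corridor", "lab"), "traversable": True},
    {"id": "door_3", "rooms": ("office", "lab"), "traversable": False},
]
graph = build_room_graph(passages)
print(route(graph, "office", "lab"))
```

Because the direct office-to-lab door is marked non-traversable, the search goes through the corridor instead, which is the kind of decision passage-level labels make possible.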

Limitations & Outlook

Although the method detects doorways reliably in the qualitative evaluation, it may perform poorly in dynamic environments or with fast-moving cameras, as it relies on camera-wall interactions across consecutive keyframes. In extreme lighting conditions, degraded RGB-D sensor performance may also reduce the accuracy of door and passage detection. The method has been validated only on indoor office sequences and has yet to be tested in larger-scale or more complex environments. Future research directions include incorporating doors and passages directly into factor graph optimization for tighter coupling between traversability reasoning and pose estimation.

Plain Language (accessible to non-experts)

Imagine you're playing a maze game at home. The game requires you to find the best path from one room to another. Doors and passages are like key points in the maze, determining whether you can pass through smoothly. Our research is like adding a new layer of intelligence to this maze game, allowing you to not only see walls and furniture but also recognize which doors are open and which passages are traversable.

Our method is like equipping your game character with special glasses that can identify openings in walls and tell you whether these openings are passable. This way, you can find the path to the next room faster without worrying about hitting a dead end.

In this way, the method helps you navigate complex mazes with more confidence. The glasses rely not only on what an opening looks like but also on the shape of the walls and on where you have already walked. They are not perfect, though: in very harsh lighting the depth sensing behind them can degrade, and the answers become less reliable.

In short, this research is like providing a new navigation method for your maze game, making you more confident and efficient in exploring the unknown.

ELI14 (explained like you're 14)

Hey there! Today I'm going to tell you about a super cool research project that helps robots find their way indoors as easily as we do at home! Imagine you're playing hide and seek at home and want to find the fastest path from the living room to the bedroom. Doors and passages are what you need to pay attention to because they determine whether you can pass through smoothly.

Our research is like giving robots a pair of super eyes that can not only see walls and furniture but also recognize which doors are open and which passages are traversable. This way, robots can find their way around the house as easily as you do, without worrying about getting lost.

What's even cooler is that these super eyes don't rely on appearance alone: they also use the shape of the walls and the path the robot has already taken to decide whether an opening can be crossed. They do have limits, though. In really extreme lighting the depth sensor struggles, and the robot's guesses about doors get less accurate.

So, next time you're playing hide and seek at home, imagine how much easier it would be if you had a pair of super eyes like this! That's the goal of our research: to make robots smarter and more efficient when navigating indoors!

Glossary

RGB-D Sensor

A sensor capable of capturing both color images (RGB) and depth information (D).

Used to acquire visual and structural data of indoor environments.

Visual SLAM

A technique for simultaneous localization and mapping using visual sensors.

Used for indoor robot navigation and environment reconstruction.

Semantic SLAM

SLAM technology that incorporates semantic information to recognize and label different objects in the environment.

Enhances the interpretability and structural significance of map reconstruction.

Panoptic Segmentation

An image segmentation technique that performs both semantic and instance segmentation.

Used to extract semantically meaningful planar entities.

RANSAC

Random Sample Consensus, an iterative method for estimating model parameters that remains effective even when the data contain many outliers.

Used for plane fitting and semantic validation.

BIM

Building Information Modeling, a digital representation method for building design and management.

Used to enhance SLAM's structural consistency and completeness.

Topological Cues

Information about how entities in space relate to one another, such as adjacency and connectivity between rooms.

Used for passage detection and environment modeling.

Geometric Opening Validation

Validation of a candidate passage by detecting discontinuities in the mapped wall geometry.

Used to reduce false positives and improve detection accuracy.

vS-Graphs

A framework that tightly couples Visual SLAM and 3D scene graph generation.

Used to validate the effectiveness of the proposed method.

SMapper

A multi-modal data acquisition platform for SLAM benchmarking.

Used for experimental data collection to validate the proposed method.

Open Questions (unanswered questions from this research)

1. How to improve the robustness of door and passage detection in dynamic environments? Existing methods primarily rely on geometric and semantic information in static environments, which may change in dynamic environments, leading to decreased detection accuracy. New algorithms are needed that can update and adjust detection results in real-time in dynamic environments.
2. How to improve the performance of RGB-D sensors in extreme lighting conditions? Changes in lighting conditions can affect the acquisition of depth information by sensors, thereby affecting the accuracy of door and passage detection. New sensor technologies or image processing algorithms need to be researched to improve detection performance under different lighting conditions.
3. How to validate the effectiveness of this method in larger-scale or more complex environments? Current experiments are primarily conducted in indoor office environments and have yet to be tested in larger-scale or more complex environments. Broader experiments are needed to verify the applicability of this method in different environments.
4. How to incorporate doors and passages directly into factor graph optimization? The current method primarily relies on separate processes for passage detection and pose estimation, and has yet to achieve a tight coupling of the two. New optimization algorithms need to be developed to achieve tighter coupling between traversability reasoning and pose estimation.
5. How to achieve deeper integration with BIM models? The current method primarily detects passages through geometric and semantic information and has yet to fully utilize structural information in BIM models. New integration methods need to be researched to enhance structural consistency and completeness.

Applications

Immediate Applications

Indoor Robot Navigation

By detecting and modeling doors and passages, robots can better understand the structural environment, improving navigation and path planning efficiency.

Building Information Modeling (BIM)

This method can be used to verify and update building models, enhancing the intelligence of building management.

Smart Building Management

By detecting and monitoring passage usage, building safety and energy efficiency can be improved.

Long-term Vision

Smart City Planning

By applying this method on a large scale, the intelligence level of urban planning and management can be improved, achieving more efficient resource allocation and use.

Autonomous Vehicle Navigation

In the future, this method can be extended to navigation systems for autonomous vehicles, improving their navigation capabilities in complex urban environments.

Abstract

Doorways and passages are critical structural elements for indoor robot navigation, yet they remain underexplored in modern Visual SLAM (VSLAM) frameworks. This paper presents a passage-aware structural mapping approach for RGB-D VSLAM that detects doors and traversable openings by jointly fusing geometric, semantic, and topological cues. Doors are modeled as planar entities embedded within walls and classified as traversable or non-traversable based on their coplanarity with the supporting wall. Passages are inferred through two complementary strategies: traversal evidence accumulated from camera-wall interactions across consecutive keyframes, and geometric opening validation based on discontinuities in the mapped wall geometry. The proposed method is integrated into vS-Graphs as a proof of concept, enriching its scene graph with passage-level abstractions and improving room connectivity modeling. Qualitative evaluations on indoor office sequences demonstrate reliable doorway detection, and the framework lays the foundation for exploiting these elements in BIM-informed VSLAM. The source code is publicly available at https://github.com/snt-arg/visual_sgraphs/tree/doorway_integration.


