DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs
The DENALI dataset enables non-line-of-sight spatial reasoning with low-cost LiDARs, covering 72,000 real-world scenes.
Key Findings
Methodology
This study introduces the DENALI dataset for non-line-of-sight (NLOS) perception with low-cost LiDARs. By capturing space-time histograms for 72,000 scenes, the work demonstrates that consumer-grade LiDARs can support data-driven NLOS inference. The dataset is benchmarked on three tasks: object localization, shape classification, and size estimation. Machine learning models such as 1D and 3D CNNs were employed to evaluate how much NLOS information the LiDAR histograms carry.
Key Results
- Result 1: Using 1D CNN for object localization achieved an RMSE of 0.046 meters, showcasing the potential of low-cost LiDARs in NLOS perception.
- Result 2: In the shape classification task, the 1D CNN model achieved a macro-F1 score of 0.38, showing that shape recognition from low-cost LiDAR histograms is feasible but remains challenging.
- Result 3: For size prediction, the model achieved an accuracy of 0.95, validating the feasibility of data-driven approaches in NLOS perception.
Significance
This research highlights the potential of low-cost LiDARs in non-line-of-sight imaging through the DENALI dataset, addressing the gap in consumer-grade LiDARs' ability to perform NLOS perception in complex scenes. The study paves the way for more sophisticated perception tasks in mobile devices and robotics, with significant academic and industrial implications.
Technical Contribution
Technical contributions include the first large-scale capture of space-time histograms from low-cost LiDARs, demonstrating the application potential of consumer-grade LiDARs in NLOS perception. The machine learning models used in the study provide new data-driven methods for achieving NLOS perception without relying on laboratory-grade equipment.
Novelty
This study is the first to propose the DENALI dataset, focusing on NLOS perception with low-cost LiDARs. Compared to existing high-cost laboratory-grade equipment, the DENALI dataset demonstrates the application potential of consumer-grade LiDARs in real-world scenarios.
Limitations
- Limitation 1: The dataset was captured under controlled conditions, not fully reflecting the diversity of dynamic real-world environments.
- Limitation 2: The LiDAR model used is limited and may not represent the performance of all consumer-grade LiDARs.
- Limitation 3: The model's performance varies under different lighting conditions, indicating current methods' shortcomings in disentangling object properties, scene geometry, and ambient illumination.
Future Work
Future research directions include expanding the dataset to cover more dynamic scenes, developing more advanced models to better disentangle object, geometry, and lighting factors, and exploring how to optimize NLOS perception capabilities of low-cost LiDARs in practical applications.
AI Executive Summary
LiDAR sensors have become an indispensable part of mobile devices and robotics. However, consumer-grade LiDARs typically provide only a single depth value per pixel, discarding the rich information carried by multi-bounce light signals. These multi-bounce signals can reveal the presence of hidden objects, but conventional reconstruction methods struggle to achieve such non-line-of-sight (NLOS) perception on consumer-grade hardware.
To address this issue, the research team developed the DENALI dataset, the first large-scale real-world dataset focusing on space-time histograms from low-cost LiDARs. The dataset captures 72,000 scenes, covering various object shapes, positions, lighting conditions, and spatial resolutions. Through this data, the study demonstrates how consumer-grade LiDARs can be used for accurate data-driven NLOS perception.
The study employed various machine learning models, such as 1D CNN and 3D CNN, to evaluate the NLOS perception capabilities of LiDAR data. Experimental results show that the 1D CNN achieved an RMSE of 0.046 meters in the object localization task, a macro-F1 score of 0.38 in the shape classification task, and an accuracy of 0.95 in the size prediction task. These results indicate that the histogram signals from low-cost LiDARs are sufficient to support a range of NLOS perception tasks.
The introduction of the DENALI dataset not only fills the gap in consumer-grade LiDARs' ability to perform NLOS perception in complex scenes but also paves the way for more sophisticated perception tasks in mobile devices and robotics. The study emphasizes the potential of low-cost LiDARs in NLOS imaging, with significant academic and industrial implications.
Nevertheless, the study also points out the current methods' limitations, such as the dataset being captured under controlled conditions, not fully reflecting the diversity of dynamic real-world environments. Additionally, the model's performance varies under different lighting conditions, indicating current methods' shortcomings in disentangling object properties, scene geometry, and ambient illumination. Future research directions include expanding the dataset to cover more dynamic scenes, developing more advanced models to better disentangle object, geometry, and lighting factors, and exploring how to optimize NLOS perception capabilities of low-cost LiDARs in practical applications.
Deep Analysis
Background
LiDAR technology has seen widespread application in recent years, particularly in autonomous driving, robotics, and mobile devices. Traditional LiDAR sensors are primarily used for measuring the depth information of a scene by emitting laser pulses and recording the return times of photons. However, this method typically only utilizes the direct return light signals, ignoring the rich information carried by multi-bounce light signals. Multi-bounce light signals can reveal the presence of hidden objects, a principle that underlies non-line-of-sight (NLOS) imaging research. Although laboratory-grade LiDAR equipment has made some progress in NLOS imaging, consumer-grade LiDARs face challenges in achieving similar functionality due to hardware limitations. Thus, achieving NLOS perception on low-cost LiDARs has become an important research topic.
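The time-of-flight principle described above reduces to a one-line formula: the distance to an object is half the round-trip path length of the laser pulse. A minimal sketch (illustrative, not taken from the paper):

```python
# Speed of light in meters per second.
C = 299_792_458.0

def tof_to_distance(round_trip_time_s: float) -> float:
    """Convert a photon's round-trip time to a one-way distance.

    The pulse travels to the target and back, so the one-way
    distance is half the total path the light covers.
    """
    return C * round_trip_time_s / 2.0

# A return arriving 10 nanoseconds after emission sits ~1.5 m away.
print(tof_to_distance(10e-9))  # → 1.49896229
```

Multi-bounce returns obey the same relation, but along longer indirect paths, which is why they land in later time bins of the histogram.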
Core Problem
Consumer-grade LiDARs typically output only a single depth value per pixel, ignoring the rich information carried by multi-bounce light signals. Traditional NLOS imaging methods rely on high-cost laboratory-grade equipment, while consumer-grade LiDARs face challenges in achieving similar functionality due to hardware limitations. The core problem lies in how to utilize the space-time histogram data from consumer-grade LiDARs for NLOS perception to reveal the presence of hidden objects. Solving this problem can not only enhance the application potential of LiDARs in mobile devices and robotics but also support more complex perception tasks.
Innovation
The core innovation of this study lies in the introduction of the DENALI dataset, the first large-scale real-world dataset focusing on NLOS perception with low-cost LiDARs. The dataset captures space-time histograms for 72,000 scenes, covering various object shapes, positions, lighting conditions, and spatial resolutions. Through this data, the study demonstrates how consumer-grade LiDARs can be used for accurate data-driven NLOS perception. Compared to traditional NLOS imaging methods, the DENALI dataset showcases the application potential of consumer-grade LiDARs in real-world scenarios, paving the way for more sophisticated perception tasks in mobile devices and robotics.
Methodology
- Dataset Construction: Capture space-time histograms for 72,000 scenes, covering various object shapes, positions, lighting conditions, and spatial resolutions.
- Model Selection: Employ various machine learning models, such as 1D CNN and 3D CNN, to evaluate the NLOS perception capabilities of LiDAR data.
- Data-Driven Methods: Utilize the dataset for object localization, shape classification, and size estimation, demonstrating the potential of consumer-grade LiDARs in NLOS perception.
- Experimental Design: Conduct experiments under different lighting conditions and object positions to validate the models' performance in NLOS perception tasks.
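To make the model-selection step concrete, the sketch below applies a single 1D convolution to a toy time-binned histogram. This is a hand-rolled illustration, not the paper's architecture; the histogram values and kernel are invented:

```python
def conv1d(signal, kernel, stride=1):
    """Valid-mode 1D convolution: slide the kernel over the
    time-binned histogram and return the feature map."""
    out = []
    for start in range(0, len(signal) - len(kernel) + 1, stride):
        window = signal[start:start + len(kernel)]
        out.append(sum(w * k for w, k in zip(window, kernel)))
    return out

# Toy histogram: a strong direct return (bin 2) and a weaker
# multi-bounce return (bin 6) over 10 time bins.
histogram = [0, 1, 9, 2, 0, 1, 4, 1, 0, 0]
peak_kernel = [-1, 2, -1]  # responds to isolated peaks
features = conv1d(histogram, peak_kernel)
print(features)  # → [-7, 15, -5, -3, -2, 6, -2, -1]
```

A real 1D CNN stacks many learned kernels with nonlinearities and a regression or classification head, but this sliding dot product over time bins is the core operation that lets it pick out direct and multi-bounce peaks.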
Experiments
The experimental design includes testing under different lighting conditions and object positions to validate the models' performance in NLOS perception tasks. The dataset used covers 72,000 scenes, including various object shapes, positions, and spatial resolutions. Various machine learning models, such as 1D CNN and 3D CNN, were employed to evaluate the NLOS perception capabilities of LiDAR data. Key hyperparameters include the models' learning rate, batch size, and number of training epochs. Additionally, ablation studies were conducted to analyze the impact of different model components on NLOS perception tasks.
Results
Experimental results show that the 1D CNN achieved an RMSE of 0.046 meters in the object localization task, a macro-F1 score of 0.38 in the shape classification task, and an accuracy of 0.95 in the size prediction task. These results indicate that the histogram signals from low-cost LiDARs are sufficient to support a range of NLOS perception tasks. Additionally, the study found that the model's performance varies under different lighting conditions, indicating current methods' shortcomings in disentangling object properties, scene geometry, and ambient illumination.
Applications
The introduction of the DENALI dataset provides new possibilities for the application of low-cost LiDARs in non-line-of-sight imaging. Direct application scenarios include object localization, shape classification, and size estimation in mobile devices and robotics. These applications can achieve complex perception tasks without relying on high-cost laboratory-grade equipment, having a significant impact on the industry.
Limitations & Outlook
Despite showcasing the potential of low-cost LiDARs in NLOS imaging, the study also points out the current methods' limitations. The dataset was captured under controlled conditions, not fully reflecting the diversity of dynamic real-world environments. Additionally, the model's performance varies under different lighting conditions, indicating current methods' shortcomings in disentangling object properties, scene geometry, and ambient illumination. Future research directions include expanding the dataset to cover more dynamic scenes, developing more advanced models to better disentangle object, geometry, and lighting factors, and exploring how to optimize NLOS perception capabilities of low-cost LiDARs in practical applications.
Plain Language (Accessible to non-experts)
Imagine you're in a large room filled with obstacles, and you need to find a hidden object. Normally, you'd use your eyes to see the object, but what if it's blocked by something else? It's like trying to find your way out of a maze, where you rely on echoes or reflections to figure out where things are. LiDAR sensors are like your eyes; they emit lasers and measure the time it takes for the light to return, estimating the distance to objects. Traditional LiDAR only tells you the distance to objects you can see directly, but in reality, the light might bounce multiple times before returning, like echoes helping you locate hidden objects. Researchers have developed a new dataset focused on using these multi-bounce light signals to perceive the location of hidden objects. It's like giving your eyes a new ability, allowing you to know where things are even if you can't see them directly. This method shows how low-cost LiDAR devices can achieve non-line-of-sight perception, paving the way for more complex perception tasks in mobile devices and robotics.
ELI14 (Explained like you're 14)
Hey there, friends! Did you know that our phones and some robots have a sensor called LiDAR that can measure the distance to objects? Imagine you're playing hide and seek, and your friend is hiding in a place you can't see. Normally, you'd use your eyes to find them, but what if you could use a special kind of light to 'see' those hidden places? LiDAR is like this special light. It sends out lasers and then measures the time it takes for them to come back to figure out the distance to objects. Researchers found that these lasers might bounce multiple times before returning, like echoes, helping us find hidden objects. They developed a new dataset focused on using these multi-bounce light signals to perceive the location of hidden objects. It's like giving our eyes a new ability, allowing us to know where things are even if we can't see them directly. Isn't that amazing? In the future, we can use this method to make our phones and robots smarter, able to find hidden things in complex environments!
Glossary
LiDAR (Light Detection and Ranging)
LiDAR is a technology that estimates object distances by emitting lasers and measuring the return time of photons. It is widely used in autonomous driving, robotics, and mobile devices.
In the paper, LiDAR is used to capture space-time histograms for non-line-of-sight perception.
NLOS (Non-Line-of-Sight)
Non-line-of-sight imaging is a method that uses multi-bounce light signals to perceive occluded objects. It can reveal the presence of hidden objects.
In the paper, NLOS perception is achieved by analyzing LiDAR's space-time histograms.
Space-Time Histogram
A space-time histogram records the temporal distribution of returning photons, including direct and multi-bounce light signals.
In the study, space-time histograms are used to analyze multi-bounce signals for NLOS perception.
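A space-time histogram amounts to per-bin photon counting over arrival time. A minimal sketch, with arrival times and bin width invented for illustration:

```python
def build_histogram(arrival_times_ns, bin_width_ns, num_bins):
    """Bin photon arrival times (in ns) into a time-resolved
    histogram, as a pulsed LiDAR accumulates counts per time bin."""
    counts = [0] * num_bins
    for t in arrival_times_ns:
        b = int(t // bin_width_ns)
        if 0 <= b < num_bins:
            counts[b] += 1
    return counts

# Direct return near 4 ns, weaker multi-bounce return near 9 ns.
photons = [3.9, 4.0, 4.1, 4.0, 9.0, 9.2]
print(build_histogram(photons, bin_width_ns=1.0, num_bins=12))
# → [0, 0, 0, 1, 3, 0, 0, 0, 0, 2, 0, 0]
```

The early peak is the direct return a conventional depth sensor would report; the later, weaker peak is the multi-bounce signal that NLOS methods exploit.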
1D CNN (One-Dimensional Convolutional Neural Network)
A 1D CNN is a neural network used for processing one-dimensional data, commonly used in time series analysis.
In the paper, 1D CNN is used to analyze LiDAR's space-time histograms for object localization.
3D CNN (Three-Dimensional Convolutional Neural Network)
A 3D CNN is a neural network used for processing three-dimensional data, commonly used in video and 3D image analysis.
In the paper, 3D CNN is used to analyze LiDAR's space-time histograms for shape classification.
RMSE (Root Mean Square Error)
RMSE is a metric used to measure the difference between predicted and true values, with smaller values indicating more accurate predictions.
In the paper, RMSE is used to evaluate the model's performance in object localization tasks.
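RMSE is straightforward to compute; a minimal sketch with hypothetical localization estimates (the numbers below are invented, not the paper's data):

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted and true values."""
    sq_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical localization estimates vs. ground truth (meters).
pred = [1.02, 2.95, 0.48]
true = [1.00, 3.00, 0.50]
print(round(rmse(pred, true), 4))  # → 0.0332
```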
Macro-F1 Score
The macro-F1 score averages the per-class F1 scores (each the harmonic mean of precision and recall), weighting all classes equally regardless of how frequent they are.
In the paper, the macro-F1 score is used to evaluate the model's performance in shape classification tasks.
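A minimal macro-F1 computation; the shape labels below are toy values invented for illustration:

```python
def macro_f1(y_true, y_pred, classes):
    """Macro-F1: unweighted mean of per-class F1 scores."""
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels: "cube" is predicted well, "sphere" less so.
true_labels = ["cube", "cube", "sphere", "sphere"]
pred_labels = ["cube", "cube", "cube", "sphere"]
print(round(macro_f1(true_labels, pred_labels, ["cube", "sphere"]), 3))
# → 0.733
```

Because each class contributes equally, a few hard-to-recognize shapes drag the macro-F1 down even when common shapes are classified well, which helps interpret the 0.38 reported in the paper.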
Ablation Study
An ablation study is a method of evaluating the impact of specific components on overall performance by removing them from the model.
In the paper, ablation studies are used to analyze the impact of different model components on NLOS perception tasks.
Data-Driven Method
A data-driven method is an approach that trains models and makes predictions by analyzing large amounts of data.
In the paper, data-driven methods are used to utilize LiDAR data for NLOS perception.
Consumer-Grade LiDAR
Consumer-grade LiDAR is a cost-effective LiDAR sensor suitable for consumer electronics like mobile devices and robotics.
In the paper, consumer-grade LiDAR is used to capture space-time histograms for NLOS perception.
Open Questions (Unanswered questions from this research)
1. How can low-cost LiDARs achieve NLOS perception in dynamic real-world environments? The current dataset was captured under controlled conditions, not fully reflecting the diversity of dynamic real-world environments.
2. How can the spatial resolution of consumer-grade LiDARs be improved without increasing hardware costs? Current consumer-grade LiDARs face challenges in achieving high-resolution NLOS perception due to hardware limitations.
3. How can more advanced models be developed to better disentangle object, geometry, and lighting factors? The current models' performance varies under different lighting conditions, indicating shortcomings in disentangling these factors.
4. How can the NLOS perception capabilities of low-cost LiDARs be optimized for practical applications? While the study demonstrates the potential of consumer-grade LiDARs in NLOS imaging, further optimization is needed.
5. How can complex perception tasks be achieved without relying on high-cost laboratory-grade equipment? The DENALI dataset showcases the potential of consumer-grade LiDARs, but many challenges remain in practical applications.
Applications
Immediate Applications
Object Localization in Mobile Devices
Using the DENALI dataset and low-cost LiDAR, object localization can be achieved in mobile devices, enhancing their perception capabilities in complex environments.
Shape Classification in Robotics
By analyzing LiDAR's space-time histograms, robots can achieve shape classification without relying on high-cost equipment, improving navigation in complex environments.
Size Estimation in Augmented Reality
Using data-driven methods, augmented reality devices can achieve size estimation without increasing hardware costs, enhancing user experience.
Long-term Vision
NLOS Perception in Smart Cities
In the future, low-cost LiDAR can be used for NLOS perception in smart cities, enabling real-time monitoring of hidden objects and enhancing urban safety.
Complex Scene Perception in Autonomous Driving
By optimizing the NLOS perception capabilities of low-cost LiDARs, autonomous vehicles can achieve more accurate perception in complex scenes, improving driving safety.
Abstract
Consumer LiDARs in mobile devices and robots typically output a single depth value per pixel. Yet internally, they record full time-resolved histograms containing direct and multi-bounce light returns; these multi-bounce returns encode rich non-line-of-sight (NLOS) cues that can enable perception of hidden objects in a scene. However, severe hardware limitations of consumer LiDARs make NLOS reconstruction with conventional methods difficult. In this work, we motivate a complementary direction: enabling NLOS perception with low-cost LiDARs through data-driven inference. We present DENALI, the first large-scale real-world dataset of space-time histograms from low-cost LiDARs capturing hidden objects. We capture time-resolved LiDAR histograms for 72,000 hidden-object scenes across diverse object shapes, positions, lighting conditions, and spatial resolutions. Using our dataset, we show that consumer LiDARs can enable accurate, data-driven NLOS perception. We further identify key scene and modeling factors that limit performance, as well as simulation-fidelity gaps that hinder current sim-to-real transfer, motivating future work toward scalable NLOS vision with consumer LiDARs.