HumDex: Humanoid Dexterous Manipulation Made Easy

TL;DR

HumDex combines IMU-based tracking with learning-based hand retargeting and two-stage imitation learning to make humanoid dexterous manipulation portable, improving data collection efficiency and generalization.

cs.RO Advanced 2026-03-13
Liang Heng, Yihe Tang, Jiajun Xu, Henghui Bao, Di Huang, Yue Wang
humanoid robots, dexterous manipulation, imitation learning, IMU tracking, data collection

Key Findings

Methodology

The HumDex system employs IMU-based tracking combined with a learning-driven hand retargeting method to achieve portable and high-precision humanoid whole-body dexterous manipulation. The system utilizes a two-stage imitation learning framework, pre-training on diverse human motion data to learn generalizable priors, and fine-tuning on robot data to bridge the embodiment gap for precise execution.
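
The pre-train then fine-tune idea can be illustrated with a deliberately tiny example. The sketch below is not the paper's model: it assumes a one-dimensional linear policy, synthetic "human" and "robot" data that share a slope but differ by a constant offset (a stand-in for the embodiment gap), and plain gradient descent. All names and numbers are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (observation, action)


def fit_linear(data: List[Sample], w: float, b: float,
               lr: float, steps: int) -> Tuple[float, float]:
    """Gradient descent on mean squared error for action = w * obs + b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(data: List[Sample], w: float, b: float) -> float:
    """Mean squared error of the linear policy on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# Hypothetical toy data: plentiful "human" demonstrations (action = 2 * obs)
# and only two "robot" demonstrations with an offset (action = 2 * obs + 0.3).
human_data = [(k / 10, 2.0 * k / 10) for k in range(-10, 11)]
robot_data = [(-0.5, 2.0 * -0.5 + 0.3), (0.5, 2.0 * 0.5 + 0.3)]
robot_test = [(k / 10, 2.0 * k / 10 + 0.3) for k in range(-10, 11)]

# Stage 1: pre-train on human data to learn the shared structure (the slope).
w_pre, b_pre = fit_linear(human_data, 0.0, 0.0, lr=0.1, steps=500)

# Stage 2: fine-tune briefly on the small robot dataset to absorb the offset.
w_ft, b_ft = fit_linear(robot_data, w_pre, b_pre, lr=0.1, steps=20)

# Baseline: the same small robot budget, but starting from scratch.
w_sc, b_sc = fit_linear(robot_data, 0.0, 0.0, lr=0.1, steps=20)

err_finetuned = mse(robot_test, w_ft, b_ft)
err_scratch = mse(robot_test, w_sc, b_sc)
print(f"fine-tuned: {err_finetuned:.5f}  from scratch: {err_scratch:.5f}")
```

In this toy setup the fine-tuned policy ends with a lower test error than the from-scratch baseline, because pre-training has already recovered the slope and the robot data only has to correct the offset.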

Key Results

  • HumDex achieved a 90% teleoperation success rate on the Scan&Pack task, compared with 0% for the baseline system.
  • On tasks such as Hang Towel and Open Door, HumDex's teleoperation success rate averaged 74.6%, compared with 57.5% for the baseline system.
  • The two-stage training framework improved generalization to new positions, objects, and backgrounds while reducing data acquisition costs.

Significance

HumDex addresses the bottleneck of collecting high-quality demonstration data, which is most acute in complex whole-body tasks. By improving data collection efficiency and task execution success rates, it makes humanoid robots more practical for both academic research and industrial use.

Technical Contribution

HumDex offers several technical innovations: 1) a portable IMU tracking system that resolves the portability-precision trade-off of traditional systems; 2) a learning-driven hand retargeting method that achieves smooth and natural hand motions; 3) a two-stage imitation learning framework that enhances system generalization capabilities.

Novelty

HumDex is presented as the first system to combine IMU tracking with learning-based methods for portable, high-precision humanoid dexterous manipulation. It avoids the fixed infrastructure that optical tracking requires, and it avoids the reduced accuracy and occlusion issues of VR-based systems.

Limitations

  • The HumDex system still faces embodiment gap issues when handling complex whole-body movements, which may lead to operational failures.
  • The system's performance in highly dynamic environments needs further validation.
  • The generalization capability of the learning methods may be limited under extreme conditions.

Future Work

Future research directions include: 1) enhancing the system's adaptability in dynamic environments; 2) optimizing learning algorithms to improve generalization capabilities; 3) expanding the system's application to more complex task scenarios.

AI Executive Summary

The HumDex system aims to address the bottleneck of high-quality demonstration data collection in humanoid dexterous manipulation. Existing teleoperation systems often suffer from limited portability, occlusion, or insufficient precision, hindering their applicability to complex whole-body tasks. HumDex achieves portable and high-precision humanoid whole-body dexterous manipulation through IMU-based tracking and a learning-driven hand retargeting method.

The HumDex system employs a two-stage imitation learning framework. It first pre-trains on diverse human motion data to learn generalizable priors, then fine-tunes on robot data to bridge the embodiment gap for precise execution. Experimental results demonstrate that this approach significantly improves the system's generalization to new configurations, objects, and backgrounds, with minimal data acquisition costs.

In experiments, HumDex performed exceptionally well in several challenging tasks. For instance, in the Scan&Pack task, HumDex achieved a 90% teleoperation success rate, while the baseline system failed to complete the task. Additionally, in tasks like Hang Towel and Open Door, HumDex's teleoperation success rate averaged 74.6%, significantly higher than the baseline system's 57.5%.

HumDex's technical contributions include: 1) a portable IMU tracking system that resolves the portability-precision trade-off of traditional systems; 2) a learning-driven hand retargeting method that achieves smooth and natural hand motions; 3) a two-stage imitation learning framework that enhances system generalization capabilities. These innovations provide stronger support for the practical application of humanoid robots.

Despite the significant advancements made by HumDex, there are still some limitations. For example, the system still faces embodiment gap issues when handling complex whole-body movements, which may lead to operational failures. Additionally, the system's performance in highly dynamic environments needs further validation. Future research directions include enhancing the system's adaptability in dynamic environments, optimizing learning algorithms to improve generalization capabilities, and expanding the system's application to more complex task scenarios.

Deep Analysis

Background

Humanoid robots hold great promise for performing complex, long-horizon manipulation tasks. However, current robotic systems often rely on imitation learning, which requires high-quality task demonstration data. While significant progress has been made in tabletop robot data collection, teleoperation systems for humanoid robots remain less mature. Existing motion-capture systems, such as optical tracking or exoskeletons, achieve high accuracy but require fixed infrastructure, severely limiting the environments in which data can be collected. In contrast, VR-based alternatives offer greater portability but suffer from reduced accuracy and occlusion issues.

Core Problem

The core problem in humanoid dexterous manipulation is the efficient collection of high-quality demonstration data. Existing teleoperation systems often suffer from limited portability, occlusion, or insufficient precision, hindering their applicability to complex whole-body tasks. This bottleneck is particularly pronounced for humanoid robots with dexterous hands due to their complex morphology.

Innovation

The core innovations of the HumDex system include: 1) a portable IMU tracking system that resolves the portability-precision trade-off of traditional systems; 2) a learning-driven hand retargeting method that achieves smooth and natural hand motions; 3) a two-stage imitation learning framework that enhances system generalization capabilities. These innovations significantly improve data collection efficiency and task execution success rates.

Methodology

  • The HumDex system employs IMU-based tracking to achieve portable and high-precision humanoid whole-body dexterous manipulation.
  • The system uses a learning-driven hand retargeting method to generate smooth and natural hand motions.
  • A two-stage imitation learning framework: first pre-training on diverse human motion data, then fine-tuning on robot data to bridge the embodiment gap.

Experiments

The experimental design includes several challenging tasks, such as Scan&Pack, Hang Towel, and Open Door. The Unitree-G1 humanoid robot and 20-DoF dexterous hands were used for testing. The experiments compared the teleoperation success rate, data collection efficiency, and task execution success rate of the HumDex system with baseline systems.

Results

Experimental results show that the HumDex system performed exceptionally well in several tasks. For instance, in the Scan&Pack task, HumDex achieved a 90% teleoperation success rate, while the baseline system failed to complete the task. Additionally, in tasks like Hang Towel and Open Door, HumDex's teleoperation success rate averaged 74.6%, significantly higher than the baseline system's 57.5%.
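
Success rates like these are easier to interpret alongside a confidence interval. This summary does not state how many trials underlie each rate, so the trial count in the sketch below (20) is purely hypothetical. The function implements the standard Wilson score interval for a binomial proportion.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int,
                    z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a success proportion (z = 1.96 is about 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / trials + z * z / (4 * trials * trials)
    ) / denom
    return center - half_width, center + half_width


# Hypothetical trial count: 18 successes out of 20 corresponds to a 90% rate.
low, high = wilson_interval(18, 20)
print(f"90% observed over 20 trials: roughly [{low:.2f}, {high:.2f}]")
```

With a small number of trials the interval is wide, which is worth keeping in mind when comparing rates that differ by only a few points.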

Applications

The HumDex system can be applied to various complex task scenarios, such as industrial automation, service robots, and medical assistance. Its portability and high precision make it highly suitable for practical applications.

Limitations & Outlook

Despite the significant advancements made by HumDex, there are still some limitations. For example, the system still faces embodiment gap issues when handling complex whole-body movements, which may lead to operational failures. Additionally, the system's performance in highly dynamic environments needs further validation.

Plain Language Accessible to non-experts

Imagine you're in a kitchen, cooking a meal, and you need to stir a pot of soup while chopping vegetables at the same time. The HumDex system is like having an efficient kitchen assistant that copies what you do. You wear lightweight sensors, HumDex precisely tracks your movements, and the robot mimics your every action. Whether it's stirring or chopping, the robot's hands move smoothly and naturally without you having to manually adjust anything. More importantly, this assistant can learn from your recorded movements over time, gradually improving its skills and adapting to different kitchen environments and task demands. Even when you're not in the kitchen, it can complete tasks independently based on your usual operating habits.

ELI14 Explained like you're 14

Hey there! Did you know? HumDex is a super cool system for controlling a humanoid robot, so it can do lots of things, like opening doors, hanging towels, and even scanning items! It's kind of like the character you control in a video game, except the robot is moving around in the real world. HumDex has a special ability: it tracks your movements using some tiny sensors and has the robot copy them. Just like when you're playing a VR game, and you wear a headset and controllers, the game character moves with you. What's even cooler is that the robot can learn from the movements HumDex records and get better at doing tasks on its own over time! So, one day, it might become a great helper in your life!

Glossary

HumDex

HumDex is a portable teleoperation system designed for humanoid whole-body dexterous manipulation. It utilizes IMU-based tracking and a learning-driven hand retargeting method to achieve high-precision motion tracking and smooth, natural hand motions.

In the paper, HumDex is used to address the bottleneck of high-quality demonstration data collection.

IMU Tracking

IMU (Inertial Measurement Unit) tracking is a method of tracking object motion by measuring acceleration and angular velocity. It is commonly used in portable devices to provide high-precision motion capture.

The HumDex system uses IMU tracking technology to achieve portable and high-precision whole-body motion capture.
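
To make the definition concrete, here is a minimal sketch of how angular velocity readings turn into an orientation estimate. It is generic gyroscope integration with unit quaternions, not HumDex's tracking pipeline. It assumes ideal, bias-free readings expressed in the sensor's body frame; practical trackers usually also fuse accelerometer data to limit drift.

```python
import math
from typing import Tuple

Quat = Tuple[float, float, float, float]  # (w, x, y, z), unit norm
Vec3 = Tuple[float, float, float]


def quat_multiply(a: Quat, b: Quat) -> Quat:
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def integrate_gyro(q: Quat, omega: Vec3, dt: float) -> Quat:
    """Advance orientation q by body-frame angular velocity omega (rad/s) over dt seconds."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        return q  # negligible rotation during this step
    s = math.sin(angle / 2) / rate
    dq = (math.cos(angle / 2), wx * s, wy * s, wz * s)
    w, x, y, z = quat_multiply(q, dq)
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    return (w / norm, x / norm, y / norm, z / norm)  # renormalize against rounding


# Usage: spin about the z axis at 90 degrees per second for one second.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = integrate_gyro(q, (0.0, 0.0, math.pi / 2), 0.01)
yaw = 2 * math.atan2(q[3], q[0])
print(f"yaw after 1 s: {math.degrees(yaw):.1f} degrees")
```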

Hand Retargeting

Hand retargeting is a technique that maps human hand movements to a robot hand, often requiring solutions to embodiment gap issues to ensure accurate execution of human actions.

The HumDex system employs a learning-driven hand retargeting method to achieve smooth and natural hand motions.
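
As a rough illustration of what "learning" a retargeting map means, the sketch below fits a map from a human finger flexion angle to a robot joint command using paired samples, then clamps the result to the robot's joint limits. This summary does not describe the paper's retargeting model, so ordinary least squares on a single joint is used here as a stand-in. The sample values and joint limits are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Closed-form least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retarget(human_flexion: float, slope: float, intercept: float,
             lower: float, upper: float) -> float:
    """Map a human flexion angle to a robot joint angle within joint limits."""
    command = slope * human_flexion + intercept
    return min(max(command, lower), upper)


# Hypothetical paired samples (radians): human flexion vs. desired robot joint.
human = [0.0, 0.5, 1.0, 1.5]
robot = [0.0, 0.4, 0.8, 1.2]
slope, intercept = fit_line(human, robot)

# Hypothetical robot joint limits of [0.0, 1.0] rad.
mid_pose = retarget(0.5, slope, intercept, 0.0, 1.0)   # stays inside the limits
full_curl = retarget(1.5, slope, intercept, 0.0, 1.0)  # clamped to the upper limit
print(mid_pose, full_curl)
```

The clamp is where the embodiment gap shows up in this toy version: the human finger can flex further than the robot joint allows, so part of the human motion has no direct robot equivalent.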

Imitation Learning

Imitation learning is a machine learning method that involves learning new skills by observing and mimicking the behavior of others. It is widely used in robotics to learn complex manipulation tasks.

The HumDex system uses a two-stage imitation learning framework to enhance system generalization capabilities.

Embodiment Gap

The embodiment gap refers to the physical structural differences between humans and robots, which can lead to inaccuracies when directly mapping human actions to robots.

HumDex fine-tunes on robot data to bridge the embodiment gap and ensure precise execution.

Generalization Capability

Generalization capability refers to a system's ability to perform well in environments outside of its training data. It is an important metric for evaluating machine learning model performance.

The HumDex system significantly improves generalization to new configurations, objects, and backgrounds through its two-stage training framework.

Teleoperation

Teleoperation refers to the technology of remotely controlling devices to perform tasks. It is widely used in robotics for tasks requiring precise control.

The HumDex system is a portable teleoperation system designed for humanoid whole-body dexterous manipulation.

Data Collection Efficiency

Data collection efficiency refers to the ability to collect high-quality data within a given time frame. Improving data collection efficiency can significantly reduce the training costs of machine learning models.

The HumDex system significantly improves data collection efficiency through IMU tracking technology and learning methods.

Smooth and Natural Hand Motions

Smooth and natural hand motions refer to the ability of a robot hand to mimic human hand movements in a fluid and realistic manner.

The HumDex system achieves smooth and natural hand motions through a learning-driven hand retargeting method.

Two-Stage Imitation Learning Framework

A two-stage imitation learning framework is a learning method that first pre-trains on diverse human motion data and then fine-tunes on robot data.

The HumDex system uses a two-stage imitation learning framework to enhance system generalization capabilities.

Open Questions Unanswered questions from this research

  • How can the HumDex system's adaptability in dynamic environments be improved? The current system's performance in highly dynamic environments needs further validation, and optimizing algorithms and hardware configurations may be necessary to enhance its stability and responsiveness.
  • How can the embodiment gap issue in handling complex whole-body movements be further addressed? While the system bridges the embodiment gap to some extent through fine-tuning, its performance under extreme conditions still needs improvement.
  • How can the generalization capability of the HumDex system be further improved? Although the two-stage training framework significantly enhances the system's generalization capability, there is still room for improvement when facing entirely new tasks and environments.
  • In multi-task scenarios, how can the HumDex system's task-switching capability be optimized? The current system performs well in single tasks but may face challenges when switching between multiple tasks.
  • How can the computational cost of the HumDex system be reduced? While the system performs well in terms of precision and portability, its computational cost may limit its application in resource-constrained environments.
  • How can the stability of the HumDex system during long-term operations be improved? Long-term operations may lead to sensor drift and system fatigue, requiring further research to enhance the system's durability.
  • How can the application scenarios of the HumDex system be expanded? Although the system performs well in several tasks, its practical effectiveness in specific industries and applications still needs verification.
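
The sensor-drift question above has a simple mechanical cause that a few lines can show: any constant bias in a gyroscope's angular velocity reading, once integrated, becomes an orientation error that grows linearly with time. The bias value below is hypothetical and is not a measurement of HumDex's sensors.

```python
def integrated_angle_error(bias_rad_s: float, duration_s: float, dt: float) -> float:
    """Orientation error (rad) from integrating a constant gyro bias with no correction."""
    steps = round(duration_s / dt)
    error = 0.0
    for _ in range(steps):
        error += bias_rad_s * dt  # each sample adds a small, same-signed error
    return error


# Hypothetical bias of 0.001 rad/s, sampled at 100 Hz.
for minutes in (1, 5, 10):
    err = integrated_angle_error(0.001, minutes * 60.0, 0.01)
    print(f"after {minutes:2d} min: {err:.3f} rad")
```

Because the error never averages out, long sessions generally need some form of correction, such as periodic recalibration or fusion with a drift-free reference.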

Applications

Immediate Applications

Industrial Automation

The HumDex system can be used for complex tasks in industrial automation, such as fine operations on assembly lines. Its high precision and portability enable it to work efficiently in variable industrial environments.

Service Robots

In the service industry, the HumDex system can help robots perform various tasks, such as delivering items, cleaning, and customer service. Its smooth and natural motions make it easier to interact with humans.

Medical Assistance

The HumDex system can be used in medical assistance to help perform fine surgical operations or rehabilitation training. Its high-precision motion control can improve the safety and effectiveness of medical operations.

Long-term Vision

Smart Homes

In the future, the HumDex system can be integrated into smart homes to help perform daily chores, such as cleaning, cooking, and maintenance. Its autonomous learning ability will make it a valuable assistant in household life.

Space Exploration

In space exploration, the HumDex system can be used for remote operation of complex equipment, performing maintenance and assembly tasks. Its portability and high precision make it suitable for use in space environments.

Abstract

This paper investigates humanoid whole-body dexterous manipulation, where the efficient collection of high-quality demonstration data remains a central bottleneck. Existing teleoperation systems often suffer from limited portability, occlusion, or insufficient precision, which hinders their applicability to complex whole-body tasks. To address these challenges, we introduce HumDex, a portable teleoperation system designed for humanoid whole-body dexterous manipulation. Our system leverages IMU-based motion tracking to address the portability-precision trade-off, enabling accurate full-body tracking while remaining easy to deploy. For dexterous hand control, we further introduce a learning-based retargeting method that generates smooth and natural hand motions without manual parameter tuning. Beyond teleoperation, HumDex enables efficient collection of human motion data. Building on this capability, we propose a two-stage imitation learning framework that first pre-trains on diverse human motion data to learn generalizable priors, and then fine-tunes on robot data to bridge the embodiment gap for precise execution. We demonstrate that this approach significantly improves generalization to new configurations, objects, and backgrounds with minimal data acquisition costs. The entire system is fully reproducible and open-sourced at https://github.com/physical-superintelligence-lab/HumDex.
