Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin

TL;DR

Pure tactile-based deep RL enables multi-finger robot to control small object separation with success rates close to 94%, validated in real-world transfer.

cs.RO 🔴 Advanced 2026-05-30 91 views

Ulf Kasolowsky Berthold Bäuml

AI Reader Arxiv Page Download PDF

robot manipulation tactile sensing deep reinforcement learning sim-to-real transfer micro-object handling

Key Findings

Methodology

This work introduces a reinforcement learning framework that leverages spatially-resolved tactile skin as the primary sensory input for micro-object separation tasks. The system employs MuJoCo physics simulation to model the multi-finger hand dynamics, with tactile inputs simulated at high or low resolution (4x4 taxels). The policy network, trained via PPO, receives sparse rewards based on whether the target number of objects remains between the fingers. An auxiliary contact point estimator is trained concurrently to improve spatial perception. Domain randomization techniques—such as varying joint offsets, friction, and sensor noise—are used to enhance transferability. After training in simulation, the policy is deployed on a real DLR-Hand II equipped with tactile skin, demonstrating successful sim-to-real transfer and high success rates across different target object counts.

Key Results

In simulation, the ideal high-resolution tactile sensor nearly perfectly solves the task with success rates exceeding 98%. The low-resolution (4x4 taxels) sensor still improves success by up to 20% compared to using only joint sensors, especially in more challenging scenarios with multiple objects.
Incorporating the contact point estimator boosts success rates to 84% in simulation, with significant improvements in localizing contact points and reducing errors in multi-object settings.
Real-world experiments show success rates of 94%, 88%, and 85% for target object counts of 1, 2, and 3 respectively, closely matching simulation outcomes, confirming the robustness of the learned policy.

Significance

This research pioneers a tactile-only approach to micro-object separation, eliminating reliance on vision. It demonstrates that spatially-resolved tactile sensing combined with reinforcement learning can achieve precise micro-manipulation. The ability to transfer policies trained purely in simulation to real hardware significantly reduces development costs and accelerates deployment. This advances the field of autonomous micro-assembly, with broad implications for manufacturing, medical robotics, and delicate handling tasks where visual occlusion or lighting conditions are problematic. The work also highlights the importance of tactile perception in fine manipulation, inspiring new directions in sensor design and learning algorithms for dexterous robots.

Technical Contribution

The main technical innovations include: 1) integrating spatially-resolved tactile sensing with deep RL for micro-object manipulation; 2) designing a sparse reward framework that effectively guides the policy toward target object counts; 3) developing a contact point estimation network that enhances spatial perception; 4) applying domain randomization to improve sim-to-real transfer robustness. The approach enables end-to-end learning of complex finger control strategies solely based on tactile feedback, surpassing traditional rule-based or vision-dependent methods. The combination of tactile sensing, auxiliary estimation, and RL constitutes a novel paradigm for dexterous micro-manipulation.

Novelty

This is the first work to successfully train a purely tactile-driven policy for controlled separation of small objects with multiple fingers, without visual input. The use of spatially-resolved tactile skin, combined with reinforcement learning and sim-to-real transfer, addresses a long-standing challenge in micro-manipulation. Unlike prior work focusing on larger objects or simple reorientation, this research handles objects smaller than 6mm, demonstrating a new level of dexterity and sensing sophistication. The integration of an auxiliary contact point estimator further distinguishes this approach from existing methods.

Limitations

While successful in controlled scenarios, the current approach may struggle with highly irregular objects, shape variations, or dynamic environments where tactile feedback alone might be insufficient. Handling complex contact dynamics and multiple objects simultaneously remains challenging.
The reliance on simulation, despite domain randomization, may still face transfer issues in environments with unmodeled noise or hardware wear. Real-time computational costs for high-resolution tactile processing could limit scalability.
Training requires substantial computational resources and time, limiting rapid adaptation to new tasks or object types. Future work should focus on improving training efficiency and robustness under diverse conditions.

Future Work

Future directions include extending the framework to handle more complex objects like screws or nuts with varied shapes and materials, which pose additional challenges for contact modeling. Incorporating multi-modal sensing, such as force and temperature, could further improve perception accuracy. Developing multi-finger coordination strategies and hierarchical control schemes will enable more sophisticated micro-manipulation tasks. Additionally, optimizing simulation environments and learning algorithms for faster convergence will facilitate real-time deployment in industrial settings. Ultimately, integrating this tactile-based micro-manipulation into autonomous assembly lines and medical robots will be the key goal.

AI Executive Summary

Micro-manipulation of tiny objects remains a formidable challenge in robotics, especially when visual cues are unreliable or unavailable. Traditional approaches often depend heavily on visual sensors, which can be occluded or affected by lighting conditions, limiting their effectiveness in delicate tasks such as assembling microelectronics or handling fragile biological tissues. This paper introduces a novel approach that leverages purely tactile sensing, specifically spatially-resolved tactile skin, combined with deep reinforcement learning, to enable a multi-finger robotic hand to perform controlled separation of small objects.

The core idea is to train a policy in simulation that uses tactile feedback to determine how to manipulate objects with high precision. The simulation employs MuJoCo physics engine to model the hand's dynamics, contact interactions, and tactile sensor responses. The tactile sensor array, either ideal high-resolution or low-resolution (4x4 taxels), provides spatial pressure maps that serve as the primary input. The policy network, trained via PPO, receives sparse rewards based on whether the target number of objects remains between the fingers after manipulation. To improve perception, an auxiliary contact point estimator is trained concurrently, predicting the spatial distribution of contact points, which enhances the policy's spatial awareness.

Training involves domain randomization to simulate real-world variability, including joint offsets, friction coefficients, and sensor noise. This ensures that the learned policy is robust enough to transfer from simulation to the physical robot. The real robot used in experiments is the DLR-Hand II equipped with tactile skin, and the policy is deployed directly after training.

Experimental results demonstrate that the tactile-only approach achieves success rates close to 94% for single-object separation, with performance degrading gracefully for multiple objects. The low-resolution tactile sensor, despite its limited spatial coverage, still significantly improves performance over joint sensors alone. The successful sim-to-real transfer confirms the practicality of the method, opening new avenues for autonomous micro-manipulation in industrial and medical applications.

This work's significance lies in its demonstration that high-precision micro-object handling can be achieved solely through tactile perception, reducing reliance on vision and complex sensing setups. The integration of tactile sensing, reinforcement learning, and simulation-based training marks a substantial step forward in dexterous robotics. Future research will focus on handling more complex objects, multi-modal sensing, and real-time control, aiming to realize fully autonomous micro-assembly systems that operate reliably in unstructured environments.

Deep Analysis

Background

The field of robotic manipulation has seen rapid development with the advent of deep learning and advanced sensing technologies. Early efforts focused on vision-based control, leveraging cameras and depth sensors for object detection and grasping. However, vision-based systems face limitations in occluded or dark environments, especially when dealing with tiny objects. Recent advances introduced tactile sensing as a complementary modality, inspired by human touch, which provides rich spatial and force information. Prior work on tactile manipulation includes the use of high-resolution tactile sensors like GelSight and BioTac, primarily for grasp stability and slip detection. Nonetheless, most studies target larger objects or simple in-hand reorientation tasks. Micro-object manipulation, especially with objects smaller than 6mm, remains underexplored due to challenges in sensing resolution, contact modeling, and control complexity. This paper builds on these foundations, integrating spatial tactile sensing with reinforcement learning to address the precise control of small objects without visual cues.

Core Problem

The core challenge addressed is enabling a robotic hand to perform controlled separation of small objects, specifically pellets of 6mm diameter, using only tactile feedback. The difficulty stems from the limited spatial resolution of tactile sensors, the complex contact dynamics at small scales, and the need for precise finger control to avoid dropping or misplacing objects. Additionally, the task requires the robot to decide which objects to keep or drop based solely on tactile information, without visual guidance. Achieving reliable performance in real-world scenarios is complicated by sensor noise, contact uncertainties, and the variability in object placement. The problem is further exacerbated when scaling to multiple objects, where the decision-making process becomes more intricate, demanding sophisticated perception and control strategies. Addressing these issues is crucial for advancing micro-manipulation capabilities in autonomous robots.

Innovation

This research introduces several key innovations: 1) the use of spatially-resolved tactile skin as the primary sensing modality, providing detailed contact distribution data; 2) a reinforcement learning framework that learns a policy directly from tactile inputs, guided by sparse rewards based on target object count; 3) a contact point estimation network that predicts contact locations, enhancing spatial awareness; 4) the application of domain randomization to improve sim-to-real transfer robustness; 5) end-to-end training of multi-finger control strategies capable of handling objects smaller than 6mm. These innovations collectively address the sensing and control limitations of prior approaches, enabling precise micro-object separation solely through tactile feedback.

Methodology

�� Environment setup: Use MuJoCo physics engine to simulate a multi-finger robotic hand with realistic contact dynamics, modeling the DLR-Hand II equipped with a tactile skin array. • Object initialization: Randomly generate configurations of 12 small pellets (diameter 6mm) on a contact plane, mimicking real stacking scenarios, with added Gaussian noise for variability. • Policy training: Implement PPO algorithm, input includes target object number d, tactile pressure maps (high or low resolution), and auxiliary contact point predictions. • Perception integration: Stack recent observations (joint angles, tactile images, contact estimates) over 0.25s to form the input to the policy network. • Reward design: Sparse reward based on whether the number of contact points matches the target d, with additional stability constraints to prevent slipping or rolling. • Domain randomization: Randomize joint offsets, friction coefficients, sensor noise, and tactile skin elasticity to enhance transfer robustness. • Contact point estimation: Train a neural network with supervised labels from simulation to predict contact distributions, aiding the policy. • Deployment: Transfer the trained policy to the physical DLR-Hand II with tactile skin, perform minimal fine-tuning, and evaluate success rates across multiple trials.

Experiments

The experimental setup involves training in simulation with 160 parallel MuJoCo environments, each with randomized initial pellet configurations. The training runs for approximately 5 hours, with hyperparameters tuned for stability and convergence. Evaluation metrics include success rate (matching target pellet number), contact localization accuracy, and robustness to sensor resolution. Ablation studies compare high-resolution versus low-resolution tactile sensors, with and without the contact point estimator. The real-world validation involves deploying the learned policy on the DLR-Hand II equipped with tactile skin, performing multiple trials for each target number (1-3 pellets). Success rates are recorded and compared to simulation results, analyzing the impact of tactile resolution and estimator accuracy. Additional tests assess the system's robustness to environmental disturbances and sensor noise, ensuring practical viability.

Results

Simulation results show that with an ideal high-resolution tactile sensor, success rates exceed 98%, while low-resolution sensors (4x4 taxels) still improve success by up to 20% over joint sensors alone. Incorporating the contact point estimator boosts success rates to 84% in simulation, with improved localization accuracy. In real-world tests, success rates of 94%, 88%, and 85% are achieved for target object counts of 1, 2, and 3, respectively, closely matching simulation predictions. The tactile-only approach demonstrates robustness against sensor resolution limitations, with performance degradation being gradual rather than abrupt. The experiments confirm that tactile perception, combined with reinforcement learning, can effectively control micro-object separation without visual input, opening new avenues for micro-manipulation in unstructured environments.

Applications

This approach is directly applicable to micro-assembly lines, delicate packaging, and minimally invasive surgical robots, where visual occlusion or lighting conditions hinder vision-based control. It requires a multi-finger robotic platform equipped with spatial tactile sensors and trained policies. The method enables autonomous micro-object handling, reducing manual labor and increasing precision. Long-term, integrating multi-modal sensing and multi-finger coordination could lead to fully autonomous micro-assembly systems capable of handling complex tasks such as microelectronics assembly, biomedical device fabrication, and fragile object sorting, significantly transforming manufacturing and healthcare industries.

Limitations & Outlook

Despite promising results, the current system faces limitations in handling irregularly shaped objects, materials with different frictional properties, and highly dynamic contact scenarios. The reliance on simulation, although mitigated by domain randomization, may still encounter transfer issues in environments with unmodeled noise or wear. The computational cost of tactile data processing and policy inference limits real-time performance for more complex tasks. Additionally, training duration remains substantial, and the approach may require task-specific fine-tuning for different object types. Future work should focus on improving perception robustness, reducing training time, and extending capabilities to handle more diverse and complex micro-manipulation scenarios.

Plain Language Accessible to non-experts

想象你在厨房里做菜，你需要把一些非常细小的调料倒进碗里。因为调料太细，光用眼睛很难看清它们的具体位置，但你可以用手指去感觉。你用手指轻轻触摸调料，感觉到它们在指尖的存在，然后慢慢倒掉多余的部分，只留下你想要的数量。这就像机器人用一种特殊的“皮肤”感受微小物体的位置，然后通过学习知道如何控制手指，把多余的物体倒掉，直到剩下你想要的数量。这个过程不用看，只靠感觉和学习，机器人就像你一样，用触觉完成微妙的操作。

ELI14 Explained like you're 14

想象你在厨房里准备一道菜，有很多细碎的调料，你需要把它们倒到碗里，但又不能倒太多或太少。你用手指轻轻触摸调料，感觉到它们在指尖的触感，然后慢慢倒掉多余的，直到剩下你想要的数量。这就像机器人用一种特殊的“皮肤”感受微小物体的位置，然后学会用手指控制，把多余的倒掉，只留下你想要的数量。它不用看，只用感觉和学习，就能完成这个微妙的任务。这个技术让机器人变得像人一样聪明，能用感觉完成复杂的微操作，就像你用手感受调料一样。

Glossary

PPO (Proximal Policy Optimization)

A reinforcement learning algorithm that optimizes policy distributions to balance exploration and exploitation, ensuring stable and efficient training.

Used for training the robot's control policy.

Sparse Reward

A reward signal that is only given when a specific condition is met, encouraging the agent to reach the goal with minimal feedback during most of the process.

Reward based on whether the target number of objects is achieved.

Spatially-Resolved Tactile Skin

A tactile sensor array capable of providing pressure distribution across its surface, enabling detailed contact localization.

Primary sensory input for the manipulation policy.

MuJoCo

A physics engine used for simulating contact-rich robotic manipulation tasks with high fidelity.

Platform for training and validating policies.

Domain Randomization

A technique that introduces variability in simulation parameters to improve the transferability of learned policies to real-world environments.

Enhances robustness of the trained policy.

Contact Point Estimator

A neural network trained to predict the spatial distribution of contact points between the fingers and objects.

Assists the policy in spatial perception.

Deep Reinforcement Learning

A machine learning approach combining deep neural networks with reinforcement learning to solve complex control tasks.

Framework used for policy training.

Sim-to-Real Transfer

The process of deploying policies trained in simulation onto real robotic hardware with minimal fine-tuning.

Key step for practical application.

Open Questions Unanswered questions from this research

1 While the current approach demonstrates high success rates, it remains uncertain how well it generalizes to objects with irregular shapes, different materials, or in highly dynamic environments. The tactile sensing resolution, although effective for small spherical pellets, might need enhancement for more complex geometries. Additionally, the scalability to multi-object stacking or more complex manipulation sequences remains unexplored. Future research should focus on multi-modal sensing integration, adaptive control strategies, and reducing training time to facilitate broader industrial deployment.

Applications

Immediate Applications

Micro-assembly in electronics

Automated placement and separation of microelectronic components where visual cues are obstructed or unreliable.

Fragile object handling

Delicate packaging or sorting of fragile items like glass beads or biological samples, reducing damage risk.

Medical micro-manipulation

Assisting in minimally invasive surgeries or laboratory procedures requiring high-precision micro-object control.

Long-term Vision

Autonomous micro-assembly lines

Fully robotic micro-assembly systems capable of handling complex tasks without human intervention, transforming manufacturing.

Micro-robotic surgical assistants

Robots capable of performing delicate surgeries or diagnostics inside the human body, leveraging tactile sensing for safety and precision.

Abstract

We introduce and solve the novel task of controlled separation of small objects with two fingers of a multi-purpose robotic hand: after grasping into a box of small objects, the task is to drop as many of them until a desired number remains between the fingers. The objects are small compared to the width of the fingers but also in absolute terms. In our case little pellets with a diameter of only 6mm are handled. We show that the task can be performed purely tactile (no vision) using a spatially-resolved tactile skin on a fingertip. The separation policy is trained in simulation via reinforcement learning using a straightforward sparse reward, which basically checks if the desired number of objects is reached. In simulation experiments, we provide an exhaustive analysis of the benefits of using spatially-resolved tactile feedback: while an ideal (high-resolution) tactile sensor allows solving the task almost perfectly, a sensor with lower spatial resolution (here 4x4 taxels) still leads to an improvement of up to 20% compared to using only the fingers' joint sensors. For this analysis, we further train an estimator alongside the policy that predicts the ground truth contact positions. Finally, we demonstrate the successful sim-to-real transfer for the DLR-Hand II equipped with a tactile skin.

cs.RO

References (16)

Fine Manipulation Using a Tactile Skin: Learning in Simulation and Sim-to-Real Transfer

Ulf Kasolowsky, Berthold Bäuml

2024 5 citations ⭐ Influential View Analysis →

OpenAI Gym

Greg Brockman, Vicki Cheung, Ludwig Pettersson et al.

2016 5545 citations View Analysis →

Composing Dextrous Grasping and In-Hand Manipulation via Scoring with a Reinforcement Learning Critic

Lennart Röstel, Dominik Winkelbauer, Johannes Pitz et al.

2025 6 citations View Analysis →

Solving Rubik's Cube with a Robot Hand

OpenAI, Ilge Akkaya, Marcin Andrychowicz et al.

2019 1449 citations View Analysis →

Inter-finger Small Object Manipulation With DenseTact Optical Tactile Sensor

Won Kyung Do, Bianca Aumann, Camille Chungyoun et al.

2023 17 citations View Analysis →

In-Hand Singulation and Scooping Manipulation with a 5 DOF Tactile Gripper

Yuhao Zhou, Pokuang Zhou, Shaoxiong Wang et al.

2024 8 citations View Analysis →

Learning Purely Tactile In-Hand Manipulation with a Torque-Controlled Hand

Leon Sievers, Johannes Pitz, B. Bäuml

2022 54 citations View Analysis →

DLR-Hand II: next generation of a dextrous robot hand

J. Butterfaß, M. Grebenstein, Hong Liu et al.

2001 920 citations

Agile Justin: An upgraded member of DLR's family of lightweight and torque controlled humanoids

B. Bäuml, Tobias Hammer, R. Wagner et al.

2014 32 citations

Learning to Pick by Digging: Data-Driven Dig-Grasping for Bin Picking from Clutter

Chao Zhao, Zhekai Tong, Juan Rojas et al.

2022 13 citations

Blind Bin Picking of Small Screws Through In-finger Manipulation With Compliant Robotic Fingers

Matthew Ishige, T. Umedachi, Yoshihisa Ijiri et al.

2020 7 citations

MuJoCo: A physics engine for model-based control

E. Todorov, Tom Erez, Yuval Tassa

2012 7244 citations

Vision-Sensorless Bin-Picking System Using Compliant Fingers with Proximity Sensors

Michihisa Ohara, Keisuke Koyama, Kensuke Harada

2025 1 citations

Learning dexterous in-hand manipulation

Marcin Andrychowicz, Bowen Baker, Maciek Chociej et al.

2018 2169 citations View Analysis →

Proximal Policy Optimization Algorithms

John Schulman, Filip Wolski, Prafulla Dhariwal et al.

2017 27889 citations View Analysis →

Stable-Baselines3: Reliable Reinforcement Learning Implementations

A. Raffin, Ashley Hill, A. Gleave et al.

2021 2818 citations

Learning Controlled Separation of Small Objects Between Two Fingers with a Tactile Skin

Key Findings

Methodology

Key Results

Significance

Technical Contribution

Novelty

Limitations

Future Work

AI Executive Summary

Deep Analysis

Background

Core Problem

Innovation

Methodology

Experiments

Results

Applications

Limitations & Outlook

Plain Language Accessible to non-experts

ELI14 Explained like you're 14

Glossary

PPO (Proximal Policy Optimization)

Sparse Reward

Spatially-Resolved Tactile Skin

MuJoCo

Domain Randomization

Contact Point Estimator

Deep Reinforcement Learning

Sim-to-Real Transfer

Open Questions Unanswered questions from this research

Applications

Immediate Applications

Micro-assembly in electronics

Fragile object handling

Medical micro-manipulation

Long-term Vision

Autonomous micro-assembly lines

Micro-robotic surgical assistants

Abstract

References (16)

Related Papers

Increasing Resilience of Continuum Robots via Motion Planning Algorithms

ARC: Adaptive Robust Joint State and Covariance Estimation

Do as I Do: Dexterous Manipulation Data from Everyday Human Videos

Observability and Consistency Analysis for Visual-Inertial Navigation with Anchored Feature Parameterizations

Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement

R2RDreamer: 3D-aware Data Augmentation for Spatially-generalized 2D Manipulation Policies