Jun 01, 2026
Contact-Aware Pick and Place with a 6-DoF Arm
Introduction
I am developing a real-world automated pick-and-place application for a 6-DoF robotic arm, targeting the UR10e and UR5e platforms from Universal Robots. The immediate application is laboratory automation: sample sorting, reagent handling, and repetitive bench-top tasks that are high-value targets for automation but require fine manipulation and reliable contact handling. The long-term goal is a generalized manipulation platform that transfers across research and industrial settings.
Approach
The core architecture is a three-layer hierarchy. A collision-free motion planner (RRT-Connect) handles gross arm movement. A phase-aware sequential task framework decomposes the pick-and-place into discrete stages, each with its own completion condition and sub-policy. Residual reinforcement learning is reserved for the contact-rich phases (grasp and placement), where classical planning breaks down. This design deliberately minimizes what RL needs to learn, which reduces sample complexity and makes sim-to-real transfer more tractable.
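The layering described above can be sketched as a small phase machine. This is a minimal illustration, not the project's actual code: the names `Phase`, `SequentialTask`, `residual_scale`, and the `"q"` key in the state dictionary are all hypothetical, and the bounded additive correction is one common way to compose a residual policy with a classical controller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

import numpy as np

State = Dict[str, np.ndarray]            # hypothetical observation dictionary
Policy = Callable[[State], np.ndarray]   # maps a state to a joint-space command


@dataclass
class Phase:
    name: str
    base_policy: Policy                  # classical controller for this phase
    is_done: Callable[[State], bool]     # per-phase completion condition
    residual: Optional[Policy] = None    # learned correction, contact phases only
    residual_scale: float = 0.05         # bounds how far RL can shift the command


class SequentialTask:
    def __init__(self, phases: List[Phase]):
        self.phases = phases
        self.index = 0

    def reset(self) -> None:
        # Return to the first phase at the start of every episode.
        self.index = 0

    @property
    def finished(self) -> bool:
        return self.index >= len(self.phases)

    def act(self, state: State) -> np.ndarray:
        # Skip every phase whose completion condition already holds.
        while not self.finished and self.phases[self.index].is_done(state):
            self.index += 1
        if self.finished:
            # Assumption: hold the current joint configuration once done.
            return state["q"]
        phase = self.phases[self.index]
        action = phase.base_policy(state)
        if phase.residual is not None:
            # Clip the learned output so the classical command dominates.
            correction = np.clip(phase.residual(state), -1.0, 1.0)
            action = action + phase.residual_scale * correction
        return action
```

Because free-space phases leave `residual` unset, the learned policy only ever influences the command during contact phases, which is the scope restriction the design relies on.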
Technical Stack
MuJoCo, dm_control, RRT-Connect path planning, multi-seed inverse kinematics, sequential task decomposition, PD control with feedforward gravity compensation, reinforcement learning.
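As an illustration of the control items in this stack, the sketch below pairs a time-parameterized joint target with a PD law and a feedforward gravity term. It is a generic NumPy sketch under stated assumptions: the function names, the cubic smoothstep profile, and the zero desired velocity are illustrative choices, and the gravity torque is passed in as a plain array rather than read from the simulator.

```python
import numpy as np


def smoothstep(s):
    """Cubic ease-in/ease-out on [0, 1] with zero slope at both ends."""
    s = np.clip(s, 0.0, 1.0)
    return 3.0 * s**2 - 2.0 * s**3


def interpolated_target(q_start, q_goal, elapsed, duration):
    """Joint target as a function of elapsed time, not of step count."""
    alpha = smoothstep(elapsed / duration)
    return (1.0 - alpha) * q_start + alpha * q_goal


def pd_gravity_torque(q, qd, q_des, gravity_torque, kp, kd):
    """PD on position error plus a feedforward gravity term.

    Assumes a desired joint velocity of zero, so the derivative term
    simply damps the measured velocity qd.
    """
    return kp * (q_des - q) - kd * qd + gravity_torque
```

Indexing the target by elapsed time means the commanded trajectory stays the same when the control timestep changes. In MuJoCo, the bias-force vector `qfrc_bias` would be a natural source for the feedforward term; it combines gravity with Coriolis and centrifugal forces, so it reduces to pure gravity only at zero velocity.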
Status
This project is in active development. Completed components:
- Multi-seed inverse kinematics: Levenberg-Marquardt (LM) solver with multiple seed configurations, singularity detection, and quaternion double-cover handling
- Sequential task framework: phase-based task decomposition with per-phase completion conditions, clean policy/task boundary, and episode-safe resets
- Sub-policy architecture: per-phase sub-policies with timestep-independent interpolated position control and feedforward gravity compensation
- Physics-accurate simulation: custom UR10e + Robotiq 2F-85 3D models built with PyMJCF, including TCP site instrumentation and dynamic visualization markers
- Visualization utilities: mocap-based runtime markers, configurable camera rendering, and video export pipeline
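The quaternion double-cover handling mentioned for the IK solver can be illustrated with a short orientation-error routine. This is a sketch of the general technique rather than the project's implementation: the function names are hypothetical, and quaternions are assumed to be in (w, x, y, z) order, which is the convention MuJoCo uses.

```python
import numpy as np


def quat_conjugate(q):
    w, x, y, z = q
    return np.array([w, -x, -y, -z])


def quat_multiply(a, b):
    """Hamilton product of two (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])


def orientation_error(q_target, q_current):
    """Rotation-vector error from current to target along the shorter arc."""
    q_target = q_target / np.linalg.norm(q_target)
    q_current = q_current / np.linalg.norm(q_current)
    # q and -q encode the same orientation. Flip the target so the dot
    # product is non-negative, which keeps the error angle within [0, pi].
    if np.dot(q_target, q_current) < 0.0:
        q_target = -q_target
    q_err = quat_multiply(q_target, quat_conjugate(q_current))
    vec = q_err[1:]
    norm = np.linalg.norm(vec)
    if norm < 1e-12:
        return np.zeros(3)
    angle = 2.0 * np.arctan2(norm, q_err[0])
    return angle * vec / norm
```

Without the sign flip, a target expressed as `-q` would produce an error of nearly a full turn in the opposite direction, even though the orientation is the same as for `q`.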
To Do:
- Proximity-aware penalty: reward shaping to discourage near-collisions during approach and transfer phases
- Obstacle avoidance with real-time updates: dynamic replanning as scene geometry changes
- Feedback linearization: full gravity and Coriolis compensation for improved low-speed trajectory tracking (see the TMT method in the Compass-Gait Walker post for the formulation)
- Residual RL for contact phases: learned corrective policy over grasp descent and placement, trained on top of the classical planner
- Sim-to-real transfer: domain randomization, actuator modelling, and latency compensation for deployment on physical hardware
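For the feedback linearization item above, the difference from the current PD controller with gravity feedforward can be written out using the standard textbook manipulator model. The symbols below ($M$, $C$, $g$, $K_p$, $K_d$, $q_d$) are the usual textbook notation and are assumptions here; the TMT formulation in the referenced post may arrange the terms differently.

```latex
% Rigid-body manipulator dynamics
M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + g(q) = \tau

% PD control with feedforward gravity compensation, with e = q_d - q
\tau_{\mathrm{PD}+g} = K_p\,e + K_d\,\dot{e} + g(q)

% Feedback linearization (computed torque)
\tau_{\mathrm{FL}} = M(q)\left(\ddot{q}_d + K_p\,e + K_d\,\dot{e}\right)
                     + C(q,\dot{q})\,\dot{q} + g(q)

% Substituting \tau_{\mathrm{FL}} into the dynamics yields linear error dynamics
\ddot{e} + K_d\,\dot{e} + K_p\,e = 0
```

With an accurate model, the second law cancels the nonlinear terms so that each joint's tracking error behaves as a decoupled second-order linear system, whereas the first law only cancels the static gravity load.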