OCRTOC is a cloud-based benchmark for robotic grasping and manipulation. The benchmark focuses on the object rearrangement problem, specifically on table organization tasks, which require essential capabilities for service robots: understanding their environment, making long-horizon decisions, and performing robust interactions with the physical world. With the OCRTOC benchmark, we aim to lower the barrier to conducting reproducible research and to accelerate progress in embodied AI. With the rise of LLMs and foundation models, we hope participants can leverage these recent technologies to make breakthroughs in this challenge.
Challenge structure
This year, we are organizing OCRTOC as a two-phase challenge: Qualification and Real-world Evaluation. During the Qualification Phase, we provide a simulation environment implemented in MuJoCo for participants to develop their solutions. After testing against the provided training tasks, participants submit their code to a Docker registry, and we evaluate each solution by running the submitted code on a set of held-out evaluation tasks on a remote server that participants cannot access. The top-ranked teams qualify for the Real-world Evaluation Phase, in which we use objects from the IPA-3D1K dataset to conduct real-world evaluations. We will provide the qualified teams with a dataset collected from the real world to enable Sim2Real transfer. The top teams will be awarded prizes (see the Prizes section below).
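For orientation, a minimal MuJoCo control loop might look like the sketch below. The scene file name, the 10-second horizon, and the zero-action placeholder policy are illustrative assumptions, not the actual OCRTOC interface; the real task scenes and robot API come with the starter kit.

```python
import mujoco

# Load a task scene; "ocrtoc_task_scene.xml" is a hypothetical file name.
model = mujoco.MjModel.from_xml_path("ocrtoc_task_scene.xml")
data = mujoco.MjData(model)

def policy(qpos):
    # Placeholder controller: participants substitute their own solution.
    return [0.0] * model.nu

while data.time < 10.0:               # run 10 simulated seconds
    data.ctrl[:] = policy(data.qpos)  # write actuator commands
    mujoco.mj_step(model, data)       # advance the physics
```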
Challenge Timeline
Call for participation: 15/05/2024
Qualification phase: 15/05/2024 - 15/07/2024
Preparation for real world: 15/07/2024 - 30/07/2024
Evaluation in real world: 01/08/2024 - 15/08/2024
Award ceremony: to be announced
Prizes
Gold: 1 team, 1000 euro
Silver: 3 teams, 500 euro per team
Bronze: 5 teams, 200 euro per team
Language-conditioned rearrangement tasks
In the language-conditioned rearrangement task, the robot is given a natural-language instruction that specifies the goal of the manipulation (e.g., “Pick one screwdriver and place it in the top right basket”). We provide a demonstration dataset collected by an expert policy, including robot observations, actions, and language annotations, which participants can use to train their models (a hypothetical data-loading sketch follows the camera views below).
Third-person view camera
Eye-in-hand camera
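To give a concrete picture of how the demonstrations might be consumed, here is a hypothetical data-loading sketch. The file name and the field names ("observations", "actions", "language") are assumptions for illustration; the actual format is defined by the released dataset.

```python
import numpy as np

# Hypothetical demonstration file; the real layout is defined by the
# OCRTOC demonstration dataset.
demo = np.load("demo_0001.npz", allow_pickle=True)
observations = demo["observations"]  # e.g. third-person-view / eye-in-hand images
actions = demo["actions"]            # expert robot actions
instruction = str(demo["language"])  # e.g. "Pick one screwdriver and place it ..."

# A simple imitation-learning setup regresses the expert action from
# each (observation, instruction) pair with any language-conditioned policy.
for obs, act in zip(observations, actions):
    pass  # feed (obs, instruction) into a model and supervise with act
```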
Pose-conditioned rearrangement tasks
The robot is required to move the objects present in the scene from their initial pose configurations to given target pose configurations. The overlap between the desired and actual object placements is used to compute the task’s success rate; an illustrative version of such an overlap metric is sketched after the task list below. No demonstration dataset is provided for this task. In the subsequent figures, the transparent objects indicate the target pose configurations.
Pick objects into the basket
Pick an object into a basket and move the basket to a desired position
Organize YCB tools to desired poses
Prepare dining table
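The exact scoring code is not reproduced here; the sketch below shows one plausible reading of the overlap metric as an intersection-over-union on occupancy masks, purely for illustration.

```python
import numpy as np

def overlap_ratio(actual: np.ndarray, target: np.ndarray) -> float:
    """Intersection-over-union between two boolean occupancy masks.

    `actual` marks cells occupied by the object in its final pose and
    `target` marks cells of the desired target pose. This is an
    illustrative stand-in, not the official OCRTOC scoring code.
    """
    intersection = np.logical_and(actual, target).sum()
    union = np.logical_or(actual, target).sum()
    return float(intersection / union) if union else 0.0
```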
Citation
Ziyuan Liu, Wei Liu, Yuzhe Qin, Fanbo Xiang, Songyan Xin, Maximo A. Roa, Berk Calli, Hao Su, Yu Sun, and Ping Tan, “OCRTOC: A Cloud-Based Competition and Benchmark for Robotic Grasping and Manipulation”.
The paper is available at https://arxiv.org/pdf/2104.11446.pdf
Organizers and Contributors (alphabetical order)
Abdalla Swikir
TUM
Dong Chen
Huawei German Research Center
Florian Jordan
Fraunhofer IPA
Jochen Lindermayr
Fraunhofer IPA
Peter So
TUM
Ping Tan
HKUST
Sami Haddadin
TUM
Yu Sun
University of South Florida
Zhen Chen
TUM
Ziyuan Liu
Huawei German Research Center