Datasets

We provide datasets containing transitions from the real and simulated RL environments for the Push and Lift tasks. For each task, several datasets are available that were collected with different behavior policies. Possible use cases for these datasets are offline RL, imitation learning, and learning a dynamics model. There are also versions of the datasets that include camera images, which could be of interest for computer vision research.

The datasets are provided as part of the gymnasium environments implemented in trifinger_rl_datasets (see TriFingerDatasetEnv). The environments are compatible with the interface used by D4RL and can therefore be used easily in other frameworks such as D3RLPY. The datasets are stored in the Zarr format, which allows for fast access.

The behavior policies have been trained in simulation and were then used to collect the datasets on the real and simulated robot. For more information about the datasets and benchmarking results for offline RL algorithms we refer to the paper Benchmarking Offline Reinforcement Learning on Real-Robot Hardware.

[Figure: overview of the datasets (datasets_overview.png)]

Installing the Software

The datasets are provided via the trifinger_rl_datasets package. See Simulation for installation instructions. Note that installing the package does not download the datasets. Instead, the datasets are downloaded on demand when requested in a Python script.

Loading a Dataset

The datasets are provided as part of the gymnasium environments implemented in trifinger_rl_datasets (see TriFingerDatasetEnv), similar to the D4RL environments. Importing the package registers the environments automatically, so they can be instantiated via gym.make().

A complete dataset can be obtained via get_dataset() as follows (see demo/load_dataset.py):

import gymnasium as gym

import trifinger_rl_datasets

env = gym.make("trifinger-cube-push-sim-expert-v0")
dataset = env.get_dataset()

dataset is a dictionary with the keys "observations", "actions", "rewards", "timeouts", "terminals" and, if present in the dataset, "images". Each entry contains the corresponding part of the transitions as a NumPy array. Following the D4RL convention, the observation of the state in which an episode ended is discarded, so all arrays have the same first dimension.
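Because all arrays share the same first dimension, a random minibatch of transitions can be drawn by indexing every array with the same indices. The following is a minimal sketch on a small synthetic dictionary that stands in for the result of get_dataset(); the helper name sample_batch and the array sizes are made up for illustration and are not part of the package.

```python
import numpy as np


def sample_batch(dataset, batch_size, rng):
    """Draw the same random transitions from every array in the dataset dict."""
    n_transitions = len(dataset["observations"])
    idx = rng.integers(0, n_transitions, size=batch_size)
    return {key: array[idx] for key, array in dataset.items()}


# Synthetic stand-in for env.get_dataset(); all sizes are arbitrary.
n, obs_size, action_size = 100, 6, 3
dataset = {
    "observations": np.zeros((n, obs_size), dtype=np.float32),
    "actions": np.zeros((n, action_size), dtype=np.float32),
    "rewards": np.zeros(n, dtype=np.float32),
    "timeouts": np.zeros(n, dtype=bool),
    "terminals": np.zeros(n, dtype=bool),
}

batch = sample_batch(dataset, batch_size=8, rng=np.random.default_rng(0))
print({key: value.shape for key, value in batch.items()})
```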

When calling get_dataset() for the first time for a specific environment, the dataset is automatically downloaded. By default the datasets are stored in ~/.trifinger_rl_datasets. A different path can be chosen by passing the data_dir argument to gym.make:

env = gym.make(
   "trifinger-cube-push-sim-expert-v0",
   data_dir="/path/to/datasets/directory",
)

If the dataset is already present in the specified directory, it is not downloaded again.

As an alternative to the automatic download on demand, the datasets can also be downloaded manually from the Edmond repository.

Working with Flat and Nested Observations

By default, the observations are provided as a flattened array. You can obtain the index ranges that correspond to parts of the observations via get_obs_indices() as follows (see demo/using_flat_observations.py):

obs_indices, obs_shapes = env.get_obs_indices()

obs_indices is a dictionary with the same structure as the nested observations (see Tasks for information on the observation space). The values are tuples of the form (start, end) that correspond to the start and end index of the corresponding part in the flattened observation array. obs_shapes is a dictionary that also shares the structure of the observations and contains the shapes of the corresponding observation parts.

Obtaining the index ranges is useful when working with the flat observations. For example, to obtain the joint angles of the robot in the first observation of the dataset, you can do the following:

obs = dataset["observations"][0]
position = obs[slice(*obs_indices["robot_observation"]["position"])]
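The same idea can be wrapped in a small helper that also restores the original shape of an observation part by using obs_shapes. This is only a sketch: the helper get_obs_part and the index layout below are hypothetical, and in practice the values returned by env.get_obs_indices() would be used instead.

```python
import numpy as np


def get_obs_part(flat_obs, obs_indices, obs_shapes, *keys):
    """Slice one named part out of flat observations and restore its shape."""
    indices, shape = obs_indices, obs_shapes
    for key in keys:  # walk down the nested dictionaries
        indices, shape = indices[key], shape[key]
    start, end = indices  # used like slice(start, end), so end is exclusive
    part = flat_obs[..., start:end]
    return part.reshape(flat_obs.shape[:-1] + tuple(shape))


# Hypothetical layout, made up for illustration only.
obs_indices = {"robot_observation": {"position": (0, 9), "velocity": (9, 18)}}
obs_shapes = {"robot_observation": {"position": (9,), "velocity": (9,)}}

# Two fake flat observations with 18 entries each.
observations = np.arange(2 * 18, dtype=np.float32).reshape(2, 18)
velocity = get_obs_part(
    observations, obs_indices, obs_shapes, "robot_observation", "velocity"
)
print(velocity.shape)  # (2, 9)
```

Because the slice is taken along the last axis, the helper works both for a single flat observation and for a whole batch of them.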

Alternatively, the observations can also be provided as a list of nested dictionaries by passing flatten_obs=False to gym.make:

env = gym.make(
   "trifinger-cube-push-sim-expert-v0",
   flatten_obs=False,
)

This makes it easier to work with the observations but incurs an overhead when loading the dataset from disk. The observations can then also be filtered when loading the datasets (and when obtaining them from the simulated environment) by passing a nested dictionary to obs_to_keep (see demo/load_filtered_dicts.py):

obs_to_keep = {
     "robot_observation": {
         "position": True,
         "velocity": True,
         "fingertip_force": False,
     },
     "camera_observation": {"object_keypoints": True},
 }

env = gym.make(
     "trifinger-cube-push-sim-expert-v0",
     flatten_obs=False,
     obs_to_keep=obs_to_keep,
 )

This will only put the selected parts of "robot_observation" and "camera_observation" in the observation. To transform the observation back to a flat array after filtering, set the keyword argument flatten_obs to True. Note that the step and reset functions in the simulated environment transform observations in the same manner as the get_dataset method to ensure compatibility.
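To illustrate what such a filter does conceptually, the sketch below applies a nested dictionary of booleans to a nested observation. It is not the package's implementation: the function filter_obs is hypothetical, the observation values are made up, and the sketch assumes that keys missing from obs_to_keep are dropped.

```python
def filter_obs(obs, obs_to_keep):
    """Keep only the entries of a nested observation that are marked True.

    Assumption of this sketch: keys missing from obs_to_keep are dropped.
    """
    filtered = {}
    for key, keep in obs_to_keep.items():
        if isinstance(keep, dict):
            sub = filter_obs(obs[key], keep)
            if sub:  # skip groups that end up empty
                filtered[key] = sub
        elif keep:
            filtered[key] = obs[key]
    return filtered


# Made-up observation values for illustration.
obs = {
    "robot_observation": {
        "position": [0.0] * 9,
        "velocity": [0.0] * 9,
        "fingertip_force": [0.0] * 3,
    },
    "camera_observation": {"object_keypoints": [[0.0] * 3] * 8},
}
obs_to_keep = {
    "robot_observation": {
        "position": True,
        "velocity": True,
        "fingertip_force": False,
    },
    "camera_observation": {"object_keypoints": True},
}

filtered = filter_obs(obs, obs_to_keep)
print(sorted(filtered["robot_observation"]))  # ['position', 'velocity']
```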

Loading Parts of a Dataset

It is also possible to load only parts of a dataset into memory. This is particularly useful for the datasets containing camera images as they are up to 100 GB in size. To load only a range of transitions into memory, you can use the rng argument of get_dataset() (see demo/load_dataset_part.py):

dataset = env.get_dataset(rng=(start, end))

It is also possible to load data corresponding to specific timesteps by passing a list of indices to get_dataset() via the indices argument (see demo/random_access.py):

import numpy as np
indices = np.random.randint(0, 1000, size=10)
dataset = env.get_dataset(indices=indices)

Note, however, that accessing many small parts of the dataset is slower than loading larger contiguous parts.
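One way to process a large dataset with bounded memory is to iterate over contiguous chunks. The sketch below only computes the ranges; the helper chunk_ranges is hypothetical, and it assumes that the end of a range is exclusive, which should be checked against the package before relying on it.

```python
def chunk_ranges(n_timesteps, chunk_size):
    """Yield contiguous (start, end) ranges that together cover all timesteps."""
    for start in range(0, n_timesteps, chunk_size):
        yield start, min(start + chunk_size, n_timesteps)


# In practice n_timesteps could come from env.get_dataset_stats()["n_timesteps"]
# and each range would be passed on as env.get_dataset(rng=(start, end)).
ranges = list(chunk_ranges(n_timesteps=2500, chunk_size=1000))
print(ranges)  # [(0, 1000), (1000, 2000), (2000, 2500)]
```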

Terminals and Timeouts

To be compatible with the D4RL interface, the dataset contains terminals and timeouts. By default, terminals are always set to False whereas timeouts are set to True only at the end of episodes. This behavior can be changed by passing set_terminals=True to gym.make when instantiating the environment. This will set the terminals to True at the end of an episode and the timeouts to False.
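Since the end of an episode is marked in either timeouts or terminals depending on this setting, episode boundaries can be recovered by combining both arrays. The following is a minimal sketch on synthetic flags; the helper episode_ranges is hypothetical and ignores a trailing incomplete episode, if any.

```python
import numpy as np


def episode_ranges(timeouts, terminals):
    """Return (start, end) index pairs, end exclusive, one per episode."""
    ends = np.flatnonzero(np.logical_or(timeouts, terminals)) + 1
    starts = np.concatenate(([0], ends[:-1]))
    return list(zip(starts.tolist(), ends.tolist()))


# Synthetic flags for three episodes of four steps each (made-up sizes).
timeouts = np.zeros(12, dtype=bool)
timeouts[[3, 7, 11]] = True
terminals = np.zeros(12, dtype=bool)

print(episode_ranges(timeouts, terminals))  # [(0, 4), (4, 8), (8, 12)]
```

Using the logical OR of both arrays means the same code works regardless of whether set_terminals=True was passed.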

Getting the Statistics of a Dataset

When working with a dataset, it is often useful to know how many transitions it contains. You can use the get_dataset_stats() method for this:

stats = env.get_dataset_stats()

stats is a dictionary that contains the keys "n_timesteps", "obs_size" and "action_size".
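These numbers can be used, for example, to get a rough idea of how much memory the observations and actions will occupy before loading them. The sketch below assumes 4 bytes per value (float32), which may not match the actual dtype of the arrays; the helper estimate_size_bytes and the numbers in the example are made up for illustration.

```python
def estimate_size_bytes(stats, bytes_per_value=4):
    """Rough in-memory size of observations and actions.

    Assumes every value takes bytes_per_value bytes (4 for float32); the
    actual dtype of the arrays may differ.
    """
    values_per_step = stats["obs_size"] + stats["action_size"]
    return stats["n_timesteps"] * values_per_step * bytes_per_value


# Made-up numbers for illustration; use env.get_dataset_stats() in practice.
stats = {"n_timesteps": 1_000_000, "obs_size": 100, "action_size": 9}
print(estimate_size_bytes(stats) / 1e6, "MB")  # 436.0 MB
```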

Accessing the Camera Images

All datasets come in two versions, with and without camera images (see table below). The datasets without camera images are much smaller and more practical when no image data is needed. Datasets with camera images contain -image- in their name.

The camera images are not part of the observations but are contained in a separate NumPy array under the key "images":

env = gym.make("trifinger-cube-push-sim-expert-image-v0")
dataset = env.get_dataset()
images = dataset["images"]
n_timesteps, n_cameras, n_channels, height, width = images.shape

The first dimension of the array corresponds to timesteps in the environment, the second one to the camera ID, the third one to the channel and the last two dimensions to the image height and width.

As the control loop is running at 50 Hz while the cameras are capturing images at 10 Hz, an image is repeated for 5 timesteps on average in images. This makes it easy to associate images with the corresponding remaining observations and actions but introduces a redundancy.
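To see the redundancy concretely, the timesteps at which a new image appears can be found by comparing consecutive entries. This is only a sketch on synthetic data with a hypothetical helper first_occurrences; note that two consecutive captures that happen to be pixel-identical would be merged by this approach.

```python
import numpy as np


def first_occurrences(images):
    """Indices of timesteps at which the image differs from the previous one."""
    flat = images.reshape(len(images), -1)
    changed = np.any(flat[1:] != flat[:-1], axis=1)
    return np.concatenate(([0], np.flatnonzero(changed) + 1))


# Synthetic stand-in: 3 distinct frames, each repeated for 5 timesteps.
# Shape: (n_timesteps, n_cameras, n_channels, height, width).
frames = np.arange(3, dtype=np.uint8).reshape(3, 1, 1, 1, 1) * np.ones(
    (1, 3, 3, 4, 4), dtype=np.uint8
)
images = np.repeat(frames, 5, axis=0)

idx = first_occurrences(images)
print(idx.tolist())  # [0, 5, 10]
```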

Alternatively, the sequence of camera images can be obtained without any repetitions by calling the method get_image_data():

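# Load all camera images without repetitions ...
images = env.get_image_data()
# ... or only a range of them (n_images is a placeholder for the end of the range)
images = env.get_image_data(rng=(0, n_images))
n_camera_timesteps, n_cameras, n_channels, height, width = images.shape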

The shape of the returned array is as described in the code snippet above unless timestep_dimension=False is passed. In this case the shape is (n_images, n_channels, height, width) and the images from all cameras are given one after another in the first dimension. In the default case, the first dimension corresponds to camera timesteps, unlike with get_dataset(), where it corresponds to control timesteps. The method get_image_data() also supports the indices argument to obtain only a subset of the images.
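In both layouts the channel dimension comes before height and width. Many plotting and image libraries expect the channel dimension last, so a conversion along the following lines may be useful. This is a sketch on a synthetic array with made-up sizes; the helper to_hwc is not part of the package.

```python
import numpy as np


def to_hwc(images):
    """Move the channel axis last: (..., C, H, W) -> (..., H, W, C)."""
    return np.moveaxis(images, -3, -1)


# Synthetic stand-in with made-up sizes:
# (n_camera_timesteps, n_cameras, n_channels, height, width).
images = np.zeros((4, 3, 3, 27, 36), dtype=np.uint8)

frame = to_hwc(images[0, 1])  # first camera timestep, camera with ID 1
print(frame.shape)  # (27, 36, 3)
print(to_hwc(images).shape)  # (4, 3, 27, 36, 3)
```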

The method get_image_stats() can be used to get the statistics of the image data:

image_stats = env.get_image_stats()

It returns a dictionary with keys "n_images", "n_cameras", "n_channels", "image_shape", and "reorder_pixels" (which is related to internal compression in the dataset file).

Specifying a Path to a Dataset

When calling the methods get_dataset(), get_dataset_stats(), get_image_data(), and get_image_stats(), the path to the dataset can also be specified directly via the zarr_path argument.

Available Datasets

We currently provide four types of datasets:

  • Expert: All trajectories were collected with the expert policies included in the trifinger-rl-example package.

  • Weak&Expert: Half of the trajectories were collected with the expert policy and the other half with a weaker policy (an early training checkpoint) with added Gaussian noise on the actions.

  • Half-Expert: Only the expert trajectories from the corresponding Weak&Expert dataset are included. Comparing the performance on Weak&Expert and Half-Expert isolates the impact of additional suboptimal trajectories.

  • Mixed: A range of training checkpoints was used for data collection.

The datasets follow the naming convention

trifinger-cube-<task>-<sim/real>-<dataset_type>[-image]-v0.zarr

where

  • <task> can be either push or lift,

  • <sim/real> can be either sim or real,

  • <dataset_type> can be expert, smooth-expert, weak-n-expert, half-expert or mixed, and

  • -image is included if the dataset contains the camera images in addition to the pose estimates.
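The naming convention can be expressed as a small helper that assembles environment names. The function dataset_name below is hypothetical and only illustrates the scheme; it does not check whether a given combination actually exists (the table of available datasets is authoritative).

```python
TASKS = ("push", "lift")
DATASET_TYPES = ("expert", "smooth-expert", "weak-n-expert", "half-expert", "mixed")


def dataset_name(task, real, dataset_type, image=False):
    """Assemble an environment name following the naming convention."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if dataset_type not in DATASET_TYPES:
        raise ValueError(f"unknown dataset type: {dataset_type!r}")
    parts = ["trifinger-cube", task, "real" if real else "sim", dataset_type]
    if image:
        parts.append("image")
    return "-".join(parts) + "-v0"


print(dataset_name("push", real=False, dataset_type="expert"))
# trifinger-cube-push-sim-expert-v0
print(dataset_name("lift", real=True, dataset_type="mixed", image=True))
# trifinger-cube-lift-real-mixed-image-v0
```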

The trifinger-cube-lift-real-smooth-expert-v0 dataset was furthermore collected with an exponential moving average applied to the actions (which was also applied during training). This resulted in smoother trajectories and slightly higher returns but not in a higher success rate.

The trifinger_rl_datasets package contains the following environments/datasets:

Datasets

Task-Sim/Real  Environment name                                 File size
-------------  -----------------------------------------------  ---------
Push-Real      trifinger-cube-push-real-expert-v0               935M
               trifinger-cube-push-real-expert-image-v0         80G
               trifinger-cube-push-real-weak-n-expert-v0        859M
               trifinger-cube-push-real-weak-n-expert-image-v0  80G
               trifinger-cube-push-real-half-expert-v0          449M
               trifinger-cube-push-real-half-expert-image-v0    46G
               trifinger-cube-push-real-mixed-v0                935M
               trifinger-cube-push-real-mixed-image-v0          80G
Push-Sim       trifinger-cube-push-sim-expert-v0                811M
               trifinger-cube-push-sim-expert-image-v0          30G
               trifinger-cube-push-sim-weak-n-expert-v0         859M
               trifinger-cube-push-sim-weak-n-expert-image-v0   31G
               trifinger-cube-push-sim-half-expert-v0           449M
               trifinger-cube-push-sim-half-expert-image-v0     17G
               trifinger-cube-push-sim-mixed-v0                 811M
               trifinger-cube-push-sim-mixed-image-v0           30G
Lift-Real      trifinger-cube-lift-real-smooth-expert-v0        1.3G
               trifinger-cube-lift-real-smooth-expert-image-v0  99G
               trifinger-cube-lift-real-expert-v0               1.2G
               trifinger-cube-lift-real-expert-image-v0         100G
               trifinger-cube-lift-real-weak-n-expert-v0        1.2G
               trifinger-cube-lift-real-weak-n-expert-image-v0  100G
               trifinger-cube-lift-real-half-expert-v0          1.2G
               trifinger-cube-lift-real-half-expert-image-v0    100G
               trifinger-cube-lift-real-mixed-v0                1.2G
               trifinger-cube-lift-real-mixed-image-v0          100G
Lift-Sim       trifinger-cube-lift-sim-expert-v0                1.2G
               trifinger-cube-lift-sim-expert-image-v0          38G
               trifinger-cube-lift-sim-weak-n-expert-v0         1.2G
               trifinger-cube-lift-sim-weak-n-expert-image-v0   38G
               trifinger-cube-lift-sim-half-expert-v0           668M
               trifinger-cube-lift-sim-half-expert-image-v0     20G
               trifinger-cube-lift-sim-mixed-v0                 1.2G
               trifinger-cube-lift-sim-mixed-image-v0           38G

  • Pushing Task

    Each episode has a length of 750 steps. Each dataset contains approximately 2880000 transitions.

  • Lifting Task

    Each episode has a length of 1500 steps. Each dataset contains approximately 3600000 transitions.
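As a quick sanity check, the approximate number of episodes per dataset and the duration of an episode follow from these figures and the 50 Hz control rate mentioned earlier. The numbers below are taken from the text above and are approximate.

```python
CONTROL_RATE_HZ = 50  # control loop frequency stated in the camera section

# Approximate transition counts and episode lengths as stated above.
tasks = {
    "push": {"transitions": 2_880_000, "episode_length": 750},
    "lift": {"transitions": 3_600_000, "episode_length": 1500},
}

episodes = {}
durations = {}
for name, info in tasks.items():
    episodes[name] = info["transitions"] // info["episode_length"]
    durations[name] = info["episode_length"] / CONTROL_RATE_HZ
    print(name, episodes[name], "episodes of", durations[name], "s")
# push 3840 episodes of 15.0 s
# lift 2400 episodes of 30.0 s
```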

Note

Before May 15, 2023 there were errors in the weak-n-expert and half-expert datasets and in the trifinger-cube-lift-sim-expert-* datasets. If you downloaded before this date, please delete the local files and download again.

For more details on the datasets like average success rate and return we refer to the paper Benchmarking Offline Reinforcement Learning on Real-Robot Hardware.