Datasets
We provide datasets containing transitions from the real and simulated RL environment for the Push and Lift tasks. For each task, several datasets are available that have been collected with different behavior policies. Possible use cases for these datasets are offline RL, imitation learning, or learning a dynamics model. There are also versions of the datasets that include camera images, which could be of interest for computer vision research.
The datasets are provided as part of the gymnasium environments implemented
in trifinger_rl_datasets (see TriFingerDatasetEnv).
The environments are compatible with the interface used by
D4RL and can therefore easily be used
with other frameworks such as D3RLPY. The datasets are stored in the Zarr format, which allows for
fast access.
The behavior policies have been trained in simulation and were then used to collect the datasets on the real and simulated robot. For more information about the datasets and benchmarking results for offline RL algorithms we refer to the paper Benchmarking Offline Reinforcement Learning on Real-Robot Hardware.
Installing the Software
The datasets are provided via the trifinger_rl_datasets package. See Simulation for installation instructions. Note that installing the package does not download the datasets. Instead, the datasets are downloaded on demand when requested in a Python script.
Loading a Dataset
The datasets are provided as part of the gymnasium environments implemented
in trifinger_rl_datasets (see TriFingerDatasetEnv)
similar to the D4RL environments. When the package is imported, the environments
are registered automatically and can be instantiated via gym.make().
A complete dataset can be obtained via get_dataset() as follows (see demo/load_dataset.py):
import gymnasium as gym
import trifinger_rl_datasets
env = gym.make("trifinger-cube-push-sim-expert-v0")
dataset = env.get_dataset()
dataset is a dictionary with keys "observations", "actions", "rewards", "timeouts", "terminals"
and "images" (if present in the dataset) that contain the corresponding parts of the transitions as Numpy arrays.
Note that the observation corresponding to the state the episode ended in is discarded following the D4RL convention. All
arrays therefore have the same first dimension.
When calling get_dataset() for the first time for a specific environment, the dataset is automatically
downloaded. By default the datasets are stored in ~/.trifinger_rl_datasets. A different path can be chosen
by passing the data_dir argument to gym.make:
env = gym.make(
"trifinger-cube-push-sim-expert-v0",
data_dir="/path/to/datasets/directory",
)
If the dataset is already present in the specified directory, it is not downloaded again.
As an alternative to the automatic download on demand, the datasets can also be downloaded manually from the Edmond repository.
Working with Flat and Nested Observations
By default, the observations are provided as a flattened array. You can obtain the index
ranges that correspond to parts of the observations via
get_obs_indices() as follows
(see demo/using_flat_observations.py):
obs_indices, obs_shapes = env.get_obs_indices()
obs_indices is a dictionary with the same structure as the nested observations (see
Tasks for information on the observation space). The values
are tuples of the form (start, end) that correspond to the start and end index of the
corresponding part in the flattened observation array. obs_shapes is a dictionary
that also shares the structure of the observations and contains the shapes of the
corresponding observation parts.
Obtaining the index ranges is useful when working with the flat observations. For example, to obtain the joint angles of the robot in the first observation of the dataset, you can do the following:
obs = dataset["observations"][0]
position = obs[slice(*obs_indices["robot_observation"]["position"])]
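The same index ranges can be used to rebuild a nested view of a flat observation. The following is a minimal sketch and not part of the package: unflatten is a hypothetical helper, and the index layout and shapes are made up for illustration. In practice, obs_indices and obs_shapes would come from env.get_obs_indices().

```python
import numpy as np


def unflatten(flat_obs, obs_indices, obs_shapes):
    """Rebuild a nested dict from a flat observation (hypothetical helper)."""
    nested = {}
    for key, value in obs_indices.items():
        if isinstance(value, dict):
            # Recurse into sub-dictionaries such as "robot_observation".
            nested[key] = unflatten(flat_obs, value, obs_shapes[key])
        else:
            start, end = value
            nested[key] = flat_obs[start:end].reshape(obs_shapes[key])
    return nested


# Made-up layout for illustration only; the real one is returned by
# env.get_obs_indices().
obs_indices = {
    "robot_observation": {"position": (0, 9), "velocity": (9, 18)},
    "camera_observation": {"object_keypoints": (18, 42)},
}
obs_shapes = {
    "robot_observation": {"position": (9,), "velocity": (9,)},
    "camera_observation": {"object_keypoints": (8, 3)},
}

flat_obs = np.arange(42, dtype=np.float32)
nested_obs = unflatten(flat_obs, obs_indices, obs_shapes)
```

This assumes that (start, end) follows Python slicing semantics, which matches the slice(*...) usage shown above.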
Alternatively, the observations can also be provided as a list of nested dictionaries by passing
flatten_obs=False to gym.make:
env = gym.make(
"trifinger-cube-push-sim-expert-v0",
flatten_obs=False,
)
This makes it easier to work with the observations but incurs an overhead when
loading the dataset from disk. The observations can then also be filtered when
loading the datasets (and when obtaining them from the simulated environment) by
passing a nested dictionary to obs_to_keep (see demo/load_filtered_dicts.py):
obs_to_keep = {
"robot_observation": {
"position": True,
"velocity": True,
"fingertip_force": False,
},
"camera_observation": {"object_keypoints": True},
}
env = gym.make(
"trifinger-cube-push-sim-expert-v0",
flatten_obs=False,
obs_to_keep=obs_to_keep,
)
This will only put the selected parts of "robot_observation" and "camera_observation" in the observation. To transform the observation back to a flat array after filtering, set the keyword argument flatten_obs to True. Note that the step and reset functions in the simulated environment transform observations in the same manner as the get_dataset method to ensure compatibility.
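To illustrate what filtering with a nested dictionary of booleans means, here is a minimal pure-Python sketch. It is not the package's implementation: filter_nested is a hypothetical helper, and the observation content (including the "confidence" entry) is placeholder data.

```python
def filter_nested(obs, keep):
    """Keep only the entries of `obs` that are marked True in `keep`."""
    filtered = {}
    for key, flag in keep.items():
        if isinstance(flag, dict):
            # Recurse into sub-dictionaries and drop them if nothing is kept.
            sub = filter_nested(obs[key], flag)
            if sub:
                filtered[key] = sub
        elif flag:
            filtered[key] = obs[key]
    return filtered


# Placeholder observation for illustration only.
obs = {
    "robot_observation": {
        "position": [0.0] * 9,
        "velocity": [0.0] * 9,
        "fingertip_force": [0.0] * 3,
    },
    "camera_observation": {
        "object_keypoints": [[0.0] * 3 for _ in range(8)],
        "confidence": 1.0,
    },
}
obs_to_keep = {
    "robot_observation": {
        "position": True,
        "velocity": True,
        "fingertip_force": False,
    },
    "camera_observation": {"object_keypoints": True},
}

filtered = filter_nested(obs, obs_to_keep)
```

In this sketch, entries that are set to False or not mentioned at all in obs_to_keep are dropped.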
Loading Parts of a Dataset
It is also possible to load only parts of a dataset into memory. This is particularly useful
for the datasets containing camera images as they are up to 100 GB in size. To load only
a range of transitions into memory, you can use the rng argument of get_dataset() (see
demo/load_dataset_part.py):
dataset = env.get_dataset(rng=(start, end))
It is also possible to load data corresponding to specific timesteps by passing a list of indices
to get_dataset() via the indices argument (see demo/random_access.py):
import numpy as np
indices = np.random.randint(0, 1000, size=10)
dataset = env.get_dataset(indices=indices)
Note, however, that accessing many small parts of the dataset is slower than loading larger contiguous parts.
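Because contiguous reads are faster, one option for processing a large dataset is to walk through it in consecutive ranges. The sketch below only generates the (start, end) pairs; chunk_ranges is a hypothetical helper, and it assumes that end is exclusive as in Python slicing, which should be checked against the package.

```python
def chunk_ranges(n_timesteps, chunk_size):
    """Yield (start, end) pairs that cover [0, n_timesteps) in order."""
    for start in range(0, n_timesteps, chunk_size):
        yield start, min(start + chunk_size, n_timesteps)


ranges = list(chunk_ranges(n_timesteps=10, chunk_size=4))
# ranges == [(0, 4), (4, 8), (8, 10)]

# Each pair could then be passed on, for example:
#     part = env.get_dataset(rng=(start, end))
```

The total number of timesteps could be taken from the "n_timesteps" entry returned by get_dataset_stats().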
Terminals and Timeouts
To be compatible with the D4RL interface, the dataset contains terminals and
timeouts. By default, terminals are always set to False whereas timeouts are set to True only
at the end of episodes. This behavior can be changed by passing set_terminals=True to
gym.make when instantiating the environment. This will set the terminals to True at the
end of an episode and the timeouts to False.
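Since the arrays are stored as one long sequence of transitions, the end-of-episode flags are what allows splitting them back into episodes. The following sketch uses synthetic arrays in place of dataset["timeouts"] and dataset["rewards"]; episode_slices is a hypothetical helper.

```python
import numpy as np


def episode_slices(episode_ends):
    """Return one slice per episode from a boolean end-of-episode array."""
    end_indices = np.flatnonzero(episode_ends)
    # An episode starts right after the previous one ended.
    starts = np.concatenate(([0], end_indices[:-1] + 1))
    return [slice(int(s), int(e) + 1) for s, e in zip(starts, end_indices)]


# Synthetic stand-ins: three episodes of three steps each.
timeouts = np.zeros(9, dtype=bool)
timeouts[[2, 5, 8]] = True
rewards = np.arange(9, dtype=np.float32)

episodes = episode_slices(timeouts)
returns = [float(rewards[s].sum()) for s in episodes]
```

To be independent of the set_terminals setting, the flags could be combined first, for example with np.logical_or(dataset["timeouts"], dataset["terminals"]).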
Getting the Statistics of a Dataset
When working with a dataset, it is often useful to know how many transitions it contains. You can
use the get_dataset_stats() method for this:
stats = env.get_dataset_stats()
stats is a dictionary that contains the keys "n_timesteps", "obs_size" and "action_size".
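These statistics can be used, for instance, to get a rough idea of how much memory the observations and actions will need before loading them. The sketch below assumes 4 bytes per value (float32), which is an assumption and not stated above, and uses placeholder values for "obs_size" and "action_size"; approx_size_gb is a hypothetical helper.

```python
def approx_size_gb(stats, bytes_per_value=4):
    """Rough in-memory size of observations and actions in GB."""
    values_per_step = stats["obs_size"] + stats["action_size"]
    return stats["n_timesteps"] * values_per_step * bytes_per_value / 1e9


# Placeholder values; the real ones come from env.get_dataset_stats().
example_stats = {"n_timesteps": 2_880_000, "obs_size": 100, "action_size": 9}
size_gb = approx_size_gb(example_stats)
```

Rewards, timeouts and terminals add comparatively little on top of this, and camera images are not covered by this estimate.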
Accessing the Camera Images
All datasets come in two versions, with and without camera images (see table below). The datasets
without camera images are much smaller and more practical when no image data is needed. Datasets with
camera images contain -image- in their name.
The camera images are not part of the observations but are contained in a separate Numpy array under
the key images:
env = gym.make("trifinger-cube-push-sim-expert-image-v0")
dataset = env.get_dataset()
images = dataset["images"]
n_timesteps, n_cameras, n_channels, height, width = images.shape
The first dimension of the array corresponds to timesteps in the environment, the second one to the camera ID, the third one to the channel and the last two dimensions to the image height and width.
As the control loop is running at 50 Hz while the cameras are capturing images at 10 Hz, an image is
repeated for 5 timesteps on average in images. This makes it easy to associate images with the
corresponding remaining observations and actions but introduces a redundancy.
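The redundancy can be pictured with a small NumPy sketch. It uses a tiny synthetic array instead of real camera data and idealises the timing so that every frame is repeated exactly 5 times, whereas in the datasets this only holds on average.

```python
import numpy as np

control_rate_hz = 50
camera_rate_hz = 10
repeats = control_rate_hz // camera_rate_hz  # 5 control steps per frame

# Tiny stand-in for camera frames: (n_frames, n_cameras, channels, H, W).
frames = np.arange(4 * 3 * 1 * 2 * 2, dtype=np.uint8).reshape(4, 3, 1, 2, 2)

# Idealised per-timestep array: every frame repeated exactly 5 times.
per_timestep = np.repeat(frames, repeats, axis=0)

# Under this idealisation, every 5th entry recovers the original frames.
recovered = per_timestep[::repeats]
```

The per-timestep array is five times larger than the underlying sequence of frames, which is the price paid for the simple one-to-one alignment with observations and actions.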
Alternatively, the sequence of camera images can be obtained without any repetitions by calling the
method get_image_data():
images = env.get_image_data()
images = env.get_image_data(rng=(0, n_images))
n_camera_timesteps, n_cameras, n_channels, height, width = images.shape
The shape of the returned array is as described in the code snippet above unless timestep_dimension=False is passed.
In this case the shape is (n_images, n_channels, height, width) and the images from all cameras
are given one after another in the first dimension. With timestep_dimension=True, the first dimension corresponds to
camera timesteps rather than control timesteps as with get_dataset(). The method
get_image_data() also supports the indices argument
to obtain only a subset of the images.
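The relationship between the two layouts can be sketched with a plain NumPy reshape. This is only an illustration on a zero-filled array: it assumes that the flat layout lists the cameras of one camera timestep consecutively, which is an assumption about the ordering and should be verified on real data.

```python
import numpy as np

n_camera_timesteps, n_cameras, n_channels, height, width = 4, 3, 3, 2, 2

# Layout with a separate camera-timestep dimension.
with_timestep_dim = np.zeros(
    (n_camera_timesteps, n_cameras, n_channels, height, width), dtype=np.uint8
)

# Merge the timestep and camera axes into a single image axis.
flat = with_timestep_dim.reshape(-1, n_channels, height, width)

# Split the image axis again into timesteps and cameras.
restored = flat.reshape(
    n_camera_timesteps, n_cameras, n_channels, height, width
)
```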
The method get_image_stats() can be used to get the statistics
of the image data:
image_stats = env.get_image_stats()
It returns a dictionary with keys "n_images", "n_cameras", "n_channels", "image_shape",
and "reorder_pixels" (which is related to internal compression in the dataset file).
Specifying a Path to a Dataset
When calling the methods get_dataset(),
get_dataset_stats(), get_image_data(),
and get_image_stats() the path to the dataset can also be specified directly via the
zarr_path argument.
Available Datasets
We currently provide four types of datasets:
Expert: All trajectories were collected with the expert policies included in the trifinger-rl-example package.
Weak&Expert: Half of the trajectories are collected with the expert policy and the other half with a weaker policy (an early training checkpoint) with added Gaussian noise on the actions.
Half-Expert: Only the expert trajectories from the corresponding Weak&Expert dataset are included. Comparing the performance on Weak&Expert and Half-Expert isolates the impact of additional suboptimal trajectories.
Mixed: A range of training checkpoints was used for data collection.
The datasets follow the naming convention
trifinger-cube-<task>-<sim/real>-<dataset_type>[-image]-v0.zarr
where
<task> can be either push or lift,
<sim/real> can be either sim or real,
<dataset_type> can be expert, smooth-expert, weak-n-expert, half-expert or mixed, and
-image is included if the dataset contains the camera images in addition to the pose estimates.
The trifinger-cube-lift-real-smooth-expert-v0 dataset was furthermore collected with an exponential moving average applied to the actions (which was also applied during training). This resulted in smoother trajectories and slightly higher returns but not in a higher success rate.
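The naming convention can be captured in a few lines of Python. The helper dataset_name below is hypothetical and not part of the package; it only composes the environment name (without the .zarr extension) and does not check whether a given combination actually exists. For example, smooth-expert is only listed for the real Lift task.

```python
def dataset_name(task, backend, dataset_type, with_images=False):
    """Compose an environment name following the naming convention."""
    tasks = {"push", "lift"}
    backends = {"sim", "real"}
    dataset_types = {
        "expert", "smooth-expert", "weak-n-expert", "half-expert", "mixed"
    }
    if task not in tasks:
        raise ValueError(f"unknown task: {task}")
    if backend not in backends:
        raise ValueError(f"unknown backend: {backend}")
    if dataset_type not in dataset_types:
        raise ValueError(f"unknown dataset type: {dataset_type}")
    image_suffix = "-image" if with_images else ""
    return f"trifinger-cube-{task}-{backend}-{dataset_type}{image_suffix}-v0"


name = dataset_name("push", "sim", "expert")
image_name = dataset_name("lift", "real", "mixed", with_images=True)
```

The resulting string is what would be passed to gym.make().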
The trifinger_rl_datasets package contains the following environments/datasets:
| Task-Sim/Real | Environment name | File size |
|---|---|---|
| Push-Real | trifinger-cube-push-real-expert-v0 | 935M |
| | trifinger-cube-push-real-expert-image-v0 | 80G |
| | trifinger-cube-push-real-weak-n-expert-v0 | 859M |
| | trifinger-cube-push-real-weak-n-expert-image-v0 | 80G |
| | trifinger-cube-push-real-half-expert-v0 | 449M |
| | trifinger-cube-push-real-half-expert-image-v0 | 46G |
| | trifinger-cube-push-real-mixed-v0 | 935M |
| | trifinger-cube-push-real-mixed-image-v0 | 80G |
| Push-Sim | trifinger-cube-push-sim-expert-v0 | 811M |
| | trifinger-cube-push-sim-expert-image-v0 | 30G |
| | trifinger-cube-push-sim-weak-n-expert-v0 | 859M |
| | trifinger-cube-push-sim-weak-n-expert-image-v0 | 31G |
| | trifinger-cube-push-sim-half-expert-v0 | 449M |
| | trifinger-cube-push-sim-half-expert-image-v0 | 17G |
| | trifinger-cube-push-sim-mixed-v0 | 811M |
| | trifinger-cube-push-sim-mixed-image-v0 | 30G |
| Lift-Real | trifinger-cube-lift-real-smooth-expert-v0 | 1.3G |
| | trifinger-cube-lift-real-smooth-expert-image-v0 | 99G |
| | trifinger-cube-lift-real-expert-v0 | 1.2G |
| | trifinger-cube-lift-real-expert-image-v0 | 100G |
| | trifinger-cube-lift-real-weak-n-expert-v0 | 1.2G |
| | trifinger-cube-lift-real-weak-n-expert-image-v0 | 100G |
| | trifinger-cube-lift-real-half-expert-v0 | 1.2G |
| | trifinger-cube-lift-real-half-expert-image-v0 | 100G |
| | trifinger-cube-lift-real-mixed-v0 | 1.2G |
| | trifinger-cube-lift-real-mixed-image-v0 | 100G |
| Lift-Sim | trifinger-cube-lift-sim-expert-v0 | 1.2G |
| | trifinger-cube-lift-sim-expert-image-v0 | 38G |
| | trifinger-cube-lift-sim-weak-n-expert-v0 | 1.2G |
| | trifinger-cube-lift-sim-weak-n-expert-image-v0 | 38G |
| | trifinger-cube-lift-sim-half-expert-v0 | 668M |
| | trifinger-cube-lift-sim-half-expert-image-v0 | 20G |
| | trifinger-cube-lift-sim-mixed-v0 | 1.2G |
| | trifinger-cube-lift-sim-mixed-image-v0 | 38G |
Pushing Task
Each episode has a length of 750 steps. Each dataset contains approximately 2880000 transitions.
Lifting Task
Each episode has a length of 1500 steps. Each dataset contains approximately 3600000 transitions.
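As a quick sanity check, the figures above translate into the following approximate episode counts and episode durations. The calculation uses the 50 Hz control rate mentioned in the section on camera images; since the transition counts are approximate, the episode counts are approximate as well.

```python
CONTROL_RATE_HZ = 50  # control loop frequency of the robot

tasks = {
    "push": {"episode_length": 750, "n_transitions": 2_880_000},
    "lift": {"episode_length": 1500, "n_transitions": 3_600_000},
}

summary = {}
for name, task in tasks.items():
    n_episodes = task["n_transitions"] // task["episode_length"]
    duration_s = task["episode_length"] / CONTROL_RATE_HZ
    summary[name] = (n_episodes, duration_s)

# summary == {"push": (3840, 15.0), "lift": (2400, 30.0)}
```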
Note
Before May 15, 2023 there were errors in the weak-n-expert and half-expert datasets and in the trifinger-cube-lift-sim-expert-* datasets. If you downloaded before this date, please delete the local files and download again.
For more details on the datasets like average success rate and return we refer to the paper Benchmarking Offline Reinforcement Learning on Real-Robot Hardware.