Simulation

TriFingerPro robot in simulation.

We provide a simulated version of the TriFinger platform based on PyBullet. It can be useful for getting started, for debugging your policies, and for identifying differences between rigid body simulations and real robots.

Get the Software

The simulated environment is included in the trifinger_rl_datasets Python package which also contains methods for downloading and working with the datasets (see Datasets for details on the datasets). The RL environments follow the interface defined by the gymnasium package.

Installation

Option 1: Using Apptainer

We are using Apptainer to provide a portable, reproducible environment for running code on the robots and in simulation. All you need is a Linux computer with Apptainer installed and our Apptainer image which contains all required packages.

You can pull the image with

$ apptainer pull oras://ghcr.io/open-dynamic-robot-initiative/trifinger_singularity/trifinger_user:rrc2022

For more information on Apptainer, see Apptainer/Singularity.

Note

When you pull the image with the command above, the resulting file will be named trifinger_user_rrc2022.sif. In the following, we will simply use trifinger_rl.sif to keep the commands shorter. You can either rename the file or adjust the commands accordingly.
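
For example, to rename the image so that it matches the file name used in the commands further below:

$ mv trifinger_user_rrc2022.sif trifinger_rl.sif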

Option 2: Installing via pip

If you want to develop without Apptainer, you can also install the software locally using pip. We recommend using venv to have a controlled environment:

# setup venv
$ python3 -m venv path/to/trifinger_rl_venv
$ . path/to/trifinger_rl_venv/bin/activate
(trifinger_rl_venv) $ pip install -U pip  # make sure you use the latest version of pip

# install the trifinger_rl_datasets package
(trifinger_rl_venv) $ pip install trifinger_rl_datasets
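
A quick way to check that the installation succeeded is to import the package; the command exits without output if everything is in place:

(trifinger_rl_venv) $ python3 -c "import trifinger_rl_datasets"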

Note, however, that on the robots your code is run inside the Apptainer image, so before submitting to the real robots, please make sure that your code works with the Apptainer setup. See Real Robot for more details.

Instantiating and Using the Environment

The environments can be instantiated via gymnasium where the environment names coincide with the dataset names (see also demo/simulation_rollout.py in the trifinger_rl_datasets repository).

import gymnasium as gym

import trifinger_rl_datasets

env = gym.make("trifinger-cube-push-sim-expert-v0")
obs, info = env.reset()
truncated = False

while not truncated:
    obs, rew, terminated, truncated, info = env.step(env.action_space.sample())

More details on the dataset environments can be found in Datasets. Alternatively, the simulation environment can also be instantiated directly.

from trifinger_rl_datasets.sim_env import SimTriFingerCubeEnv

env = SimTriFingerCubeEnv()
obs, info = env.reset()
truncated = False

while not truncated:
    obs, rew, terminated, truncated, info = env.step(env.action_space.sample())

This gives you more control over the environment parameters but does not automatically align the parameters with those used for data collection.
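
If you want to see which parameters can be customized, one option is to inspect the constructor signature (a minimal sketch; see the class documentation for what the individual parameters mean):

import inspect

from trifinger_rl_datasets.sim_env import SimTriFingerCubeEnv

# print the names and default values of the constructor arguments
print(inspect.signature(SimTriFingerCubeEnv.__init__))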

Starting from the Example Package

If you want to deploy your policy on the real robots at some point then we recommend starting from the trifinger-rl-example package which you can fork on GitHub. This package also contains the expert policies we used to record the expert datasets for the Push and Lift tasks.

All you have to do is implement a class that follows the interface of trifinger_rl_datasets.PolicyBase:

class PolicyBase(ABC):
    """Base class defining interface for policies."""

    def __init__(
        self, action_space: gym.Space, observation_space: gym.Space, episode_length: int
    ):
        """
        Args:
            action_space:  Action space of the environment.
            observation_space:  Observation space of the environment.
            episode_length:  Number of steps in one episode.
        """
        # Note: you may not use some (or all) of the arguments but your
        # class still needs to accept them.
        ...

    @staticmethod
    def get_policy_config() -> PolicyConfig:
        """Returns the policy configuration.

        This specifies what kind of observations the policy expects.
        """
        # If flatten_obs is False the observation is returned in the
        # form of a dictionary. image_obs determines whether the
        # observation includes the camera images.
        return PolicyConfig(flatten_obs=True, image_obs=True)

    def reset(self) -> None:
        """Will be called at the beginning of each episode."""
        ...  # may be left empty if not needed

    @abstractmethod
    def get_action(self, observation: ObservationType) -> np.ndarray:
        """Returns action that is executed on the robot.

        Args:
            observation: Observation of the current time step.

        Returns:
            Action that is sent to the robot.
        """
        ...

Add your code and final policies to this package and make sure all files are installed properly (e.g. using package_data for non-Python files like models). For more information on how to run the policies on the real robot, see Real Robot.
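
For non-Python files, a setup.cfg entry along the following lines can be used (illustrative only; adjust the package name and file pattern to your own layout, e.g. if your model is stored as a .pt file inside the package directory):

[options.package_data]
trifinger_rl_example =
    *.pt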

By default the policy is expected to work with flattened observations (i.e. all values in one flat array). You can also use observations that come as structured dictionaries. For this, simply set flatten_obs to False when creating the PolicyConfig object in get_policy_config(). To include camera images in the observation, set image_obs to True. In that case, if flatten_obs is True, the observation will be a tuple of the flattened observation (excluding the camera images) and the camera images as a numpy array; if flatten_obs is False, the observation will be a dictionary which also contains the camera images. See Datasets for more details on the observation space.
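
As a minimal sketch, a policy satisfying this interface could look as follows. It simply samples random actions and uses flat observations without camera images; this assumes PolicyConfig can be imported from trifinger_rl_datasets alongside PolicyBase.

from trifinger_rl_datasets import PolicyBase, PolicyConfig


class RandomPolicy(PolicyBase):
    """Illustrative policy that returns random actions."""

    def __init__(self, action_space, observation_space, episode_length):
        # only the action space is needed for sampling random actions
        self.action_space = action_space

    @staticmethod
    def get_policy_config():
        # flat observations, no camera images
        return PolicyConfig(flatten_obs=True, image_obs=False)

    def reset(self):
        pass  # no internal state to reset

    def get_action(self, observation):
        # ignore the observation and return a random action
        return self.action_space.sample()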

Important

At the moment, the GPUs in the robot PCs running the user policies are not enabled. This will change in the coming months, however.

Adding Dependencies

If you need any third-party libraries that are not yet installed in the container, there are two options for adding them:

  1. Any Python dependencies that are pip-installable can simply be listed as requirements of your package (under install_requires in the setup.cfg, if your package is based on the example package). See the snippet after this list for an example.

  2. If you have dependencies that cannot be installed with pip, you can extend the Apptainer image. See Add Custom Dependencies to the Container.
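
For the first option, a setup.cfg entry could look roughly like this (the listed packages are only examples):

[options]
install_requires =
    trifinger_rl_datasets
    torch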

Evaluate Policy

Run Evaluation

Once you have your policy ready, you can perform rollouts in the environment. Using the provided evaluation script also serves as a way to test whether your policy complies with the required interface before submitting it to the real robot. To evaluate your policy locally, run the following commands with the venv activated:

# for the push task:
(trifinger_rl_venv) $ python3 -m trifinger_rl_datasets.evaluate_sim push trifinger_rl_example.example.TorchPushPolicy --output results_push.json

# for the lift task:
(trifinger_rl_venv) $ python3 -m trifinger_rl_datasets.evaluate_sim lift trifinger_rl_example.example.TorchLiftPolicy --output results_lift.json

Simply replace the specified policy class with your own. The policy is expected to implement the interface of trifinger_rl_datasets.PolicyBase. If your policy class loads a model from a separate file and you intend to run it on the real robot, make sure this file is installed with your package (i.e. do not rely on local files). See the example package for how this can be done.

The script will run multiple episodes with the specified policy and print the averaged results to the terminal at the end. If --output is set, the results are also written to the specified file.

There are a few more options you may use during testing (e.g. to enable visualisation or change the number of episodes); see --help for a complete list.
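
For example, to list all available options:

(trifinger_rl_venv) $ python3 -m trifinger_rl_datasets.evaluate_sim --help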

Set up Evaluation Environment using Apptainer

To test whether your policy will run on the robot PC, you can run it inside the Apptainer container. In the following, we describe how to set up a Python venv in your workspace and use it from within the container to install your package for evaluation.

The file system of the container itself is read-only, but by default your home directory is mounted into the container (i.e. if you read or write files from your home directory inside the container, this will access your actual home!). We use this to create a venv in the workspace in your home directory which will be used inside the container.

In the following, it is assumed that your code is in a package that is located in ~/workspace/trifinger_rl_example. Adjust paths accordingly, if your setup differs.

# Go to your workspace.
$ cd ~/workspace

# Open a shell inside the container.  Since your home directory is automatically
# bound into the container, your workspace is accessible from there.
$ apptainer shell --cleanenv trifinger_rl.sif

# Some necessary environment setup inside the container (needs to be called
# every time you start a new apptainer shell).
Apptainer> source /setup.bash

# Create venv with access to system packages (you may choose any other name)
Apptainer> python3 -m venv --system-site-packages ./trifinger_rl_venv
Apptainer> source ./trifinger_rl_venv/bin/activate
(trifinger_rl_venv) Apptainer> pip install -U pip  # make sure newest version of pip is used

# With activated venv, install your package (adjust path, if you called it differently)
(trifinger_rl_venv) Apptainer> cd ./trifinger_rl_example
(trifinger_rl_venv) Apptainer> pip install .

Note

This assumes that you are working inside your home directory, which is automatically bound with read-write access into the container. If you are working in a directory outside of your home, you need to bind it into the container using --bind (see the Apptainer help for more information).

Once set up (see above), the following steps are enough (of course, you need to re-install your package after you make changes):

$ apptainer shell --cleanenv trifinger_rl.sif
Apptainer> source /setup.bash  # setup environment inside the container
Apptainer> source ./trifinger_rl_venv/bin/activate  # always source this _after_ /setup.bash!
(trifinger_rl_venv) Apptainer> # now you can run training, tests, evaluation, etc.

Once you have your evaluation environment set up, you can run the following commands inside the Apptainer shell (note that trifinger_rl_datasets is already installed inside the container, so there is no need to install it manually):

# for the push task:
(trifinger_rl_venv) Apptainer> python3 -m trifinger_rl_datasets.evaluate_sim push trifinger_rl_example.example.TorchPushPolicy --output results_push.json

# for the lift task:
(trifinger_rl_venv) Apptainer> python3 -m trifinger_rl_datasets.evaluate_sim lift trifinger_rl_example.example.TorchLiftPolicy --output results_lift.json