**********
Simulation
**********

.. image:: ../images/trifingerpro_sim_with_cube.png
    :alt: TriFingerPro robot in simulation.

We provide a simulated version of the TriFinger platform based on PyBullet.  It
can be useful for getting started, for debugging your policies, and for
identifying differences between rigid body simulations and real robots.


Get the Software
================

The simulated environment is included in the `trifinger_rl_datasets`_ Python
package, which also contains methods for downloading and working with the
datasets (see :doc:`../datasets/index` for details on the datasets).  The RL
environments follow the interface defined by the `gymnasium`_ package.


Installation
------------

Option 1: Using Apptainer
~~~~~~~~~~~~~~~~~~~~~~~~~

We are using Apptainer to provide a portable, reproducible environment for
running code on the robots and in simulation.  All you need is a Linux computer
with Apptainer installed and our Apptainer image, which contains all required
packages.  You can pull the image with

.. code-block:: bash

    $ apptainer pull oras://ghcr.io/open-dynamic-robot-initiative/trifinger_singularity/trifinger_user:rrc2022

.. TODO: Adapt URL of image and make sure it has trifinger_rl_datasets.

For more information on Apptainer, see :doc:`../singularity`.

.. note::

    When you pull the image with the command above, the resulting file will be
    named ``trifinger_user_rrc2022.sif``.  In the following, we will only use
    ``rrc2022.sif`` to keep the commands shorter.  You can either rename the
    file or adjust the commands accordingly.


Option 2: Installing via pip
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you want to develop without Apptainer, you can also install the software
locally using pip.  We recommend using *venv* to have a controlled environment:

.. code-block:: bash

    # set up the venv
    $ python3 -m venv path/to/trifinger_rl_venv
    $ . path/to/trifinger_rl_venv/bin/activate
    (trifinger_rl_venv) $ pip install -U pip  # make sure you use the latest version of pip

    # install the trifinger_rl_datasets package
    (trifinger_rl_venv) $ pip install trifinger_rl_datasets

Note, however, that on the robots code is run in the Apptainer image, **so
before submitting to the real robots, please make sure that your code is
working with the Apptainer setup**.  See :doc:`../real_robot/index` for more
details.

.. TODO: Adapt URL of image and make sure it has trifinger_rl_datasets.


Instantiating and using the Environment
---------------------------------------

The environments can be instantiated via gymnasium, where the environment names
coincide with the dataset names (see also ``demo/simulation_rollout.py`` in the
`trifinger_rl_datasets`_ repository).

.. code-block:: python

    import gymnasium as gym

    import trifinger_rl_datasets

    env = gym.make("trifinger-cube-push-sim-expert-v0")

    obs, info = env.reset()

    truncated = False
    while not truncated:
        obs, rew, terminated, truncated, info = env.step(env.action_space.sample())

More details on the dataset environments can be found in
:doc:`../datasets/index`.

Alternatively, the simulation environment can also be instantiated directly.

.. code-block:: python

    from trifinger_rl_datasets.sim_env import SimTriFingerCubeEnv

    env = SimTriFingerCubeEnv()

    obs, info = env.reset()

    truncated = False
    while not truncated:
        obs, rew, terminated, truncated, info = env.step(env.action_space.sample())

This gives you more control over the environment parameters but does not
automatically align the parameters with those used for data collection.
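Either way of creating the environment exposes the standard gymnasium action
and observation spaces.  Printing them is a quick way to check what a policy
will later be constructed with (see the next section).  The following is only a
small sketch using the environment name from the examples above and the
standard gymnasium API:

.. code-block:: python

    import gymnasium as gym

    import trifinger_rl_datasets  # noqa: F401  (registers the environments with gymnasium)

    env = gym.make("trifinger-cube-push-sim-expert-v0")

    # These spaces are also what a policy class receives in its constructor
    # (see "Starting from the Example Package" below).
    print("Action space:     ", env.action_space)
    print("Observation space:", env.observation_space)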
.. _starting_from_the_example_package:

Starting from the Example Package
=================================

If you want to deploy your policy on the real robots at some point, we
recommend starting from the `trifinger-rl-example`_ package, which you can fork
on GitHub.  This package also contains the **expert policies** we used to
record the expert datasets for the Push and Lift tasks.

All you have to do is to implement a class that follows the interface of
:class:`trifinger_rl_datasets.PolicyBase`:

.. code-block:: python

    class PolicyBase(ABC):
        """Base class defining interface for policies."""

        def __init__(
            self,
            action_space: gym.Space,
            observation_space: gym.Space,
            episode_length: int,
        ):
            """
            Args:
                action_space: Action space of the environment.
                observation_space: Observation space of the environment.
                episode_length: Number of steps in one episode.
            """
            # Note: you may not use some (or all) of the arguments, but your
            # class still needs to accept them.
            ...

        @staticmethod
        def get_policy_config() -> PolicyConfig:
            """Returns the policy configuration.

            This specifies what kind of observations the policy expects.
            """
            # If flatten_obs is False, the observation is returned in the
            # form of a dictionary.  image_obs determines whether the
            # observation includes the camera images.
            return PolicyConfig(flatten_obs=True, image_obs=True)

        def reset(self) -> None:
            """Will be called at the beginning of each episode."""
            ...  # may be left empty if not needed

        @abstractmethod
        def get_action(self, observation: ObservationType) -> np.ndarray:
            """Returns action that is executed on the robot.

            Args:
                observation: Observation of the current time step.

            Returns:
                Action that is sent to the robot.
            """
            ...

A minimal example implementation of this interface is sketched at the end of
this section.

Add your code and final policies to this package and make sure all files are
installed properly (e.g. using ``package_data`` for non-Python files like
models).  For more information on how to run the policies on the real robot,
see :doc:`../real_robot/index`.

By default, the policy is expected to work with flattened observations (i.e.
all values in one flat array).  You can also use observations that come as
structured dictionaries.  For this, simply set ``flatten_obs`` to ``False``
when creating the ``PolicyConfig`` object in ``get_policy_config()``.

To include camera images in the observation, set ``image_obs`` to ``True``.  If
``flatten_obs`` is ``True``, the observation will then be a tuple containing
the flattened observation (except for the camera images) and the camera images
in a numpy array.  If ``flatten_obs`` is ``False``, the observation will be a
dictionary which also contains the camera images.  See :doc:`../datasets/index`
for more details on the observation space.

.. important::

    At the moment, the GPUs in the robot PCs running the user policies are not
    enabled.  This will change during the coming months, however.

.. TODO: Remove this once the GPUs are enabled.


Adding Dependencies
-------------------

If you need any third-party libraries that are not yet installed in the
container, there are two options for adding them:

1. Any Python dependencies that are pip-installable can simply be listed as
   requirements of your package (under ``install_requires`` in the
   ``setup.cfg``, if your package is based on the example package).
2. If you have dependencies that cannot be installed with pip, you can extend
   the Apptainer image.  See :ref:`singularity_extend_container`.
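Minimal Policy Example
----------------------

As referenced above, the following is a minimal sketch of a policy class that
satisfies the ``PolicyBase`` interface.  It simply commands a zero action at
every step; the imports assume that ``PolicyBase`` and ``PolicyConfig`` can be
imported from ``trifinger_rl_datasets`` as in the example package, and a real
submission would of course load and evaluate a trained model in
``get_action()``.

.. code-block:: python

    import gymnasium as gym
    import numpy as np

    from trifinger_rl_datasets import PolicyBase, PolicyConfig


    class ZeroActionPolicy(PolicyBase):
        """Toy policy that always commands a zero action."""

        def __init__(
            self,
            action_space: gym.Space,
            observation_space: gym.Space,
            episode_length: int,
        ):
            # Not all arguments are used here, but the constructor still has
            # to accept them (see the interface above).
            self._action_space = action_space

        @staticmethod
        def get_policy_config() -> PolicyConfig:
            # Flat observations without camera images.
            return PolicyConfig(flatten_obs=True, image_obs=False)

        def reset(self) -> None:
            pass  # no internal state to reset

        def get_action(self, observation) -> np.ndarray:
            # A trained policy would compute the action from the observation here.
            return np.zeros(self._action_space.shape, dtype=np.float32)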
.. _simulation_phase_evaluate_policy:

Evaluate Policy
===============

Run Evaluation
--------------

Once you have your policy ready, you can perform rollouts in the environment.
Using the provided evaluation script also serves as a way to test whether your
policy complies with the required interface before submitting it to the real
robot.

To evaluate your policy locally, run the following commands with the venv
activated:

.. code-block:: bash

    # for the push task:
    (trifinger_rl_venv) python3 -m trifinger_rl_datasets.evaluate_sim push trifinger_rl_example.example.TorchPushPolicy --output results_push.json

    # for the lift task:
    (trifinger_rl_venv) python3 -m trifinger_rl_datasets.evaluate_sim lift trifinger_rl_example.example.TorchLiftPolicy --output results_lift.json

Simply replace the specified policy class with your own.  The policy is
expected to implement the interface of
:class:`trifinger_rl_datasets.PolicyBase`.  If your policy class loads a model
from a separate file and you intend to run it on the real robot, make sure this
file is installed with your package (i.e. do not rely on local files).  See the
example package for how this can be done.

The script will run multiple episodes with the specified policy and in the end
print the averaged results to the terminal.  If ``--output`` is set, the
results are also written to the specified file.

There are a few more options you may use during testing (e.g. to enable
visualisation or change the number of episodes), see ``--help`` for a complete
list.


.. _setup_workspace_with_apptainer:

Set up Evaluation Environment using Apptainer
---------------------------------------------

To test whether your policy will run on the robot PC, you can run it inside the
Apptainer container.  In the following we describe how to set up a Python venv
in your workspace and use it from within the container to install your package
for evaluation.

The file system of the container itself is read-only, but by default your home
directory is mounted into the container (i.e. if you read or write files from
your home directory inside the container, this will access your actual home!).
We use this to create a *venv* in the workspace in your home directory, which
will then be used inside the container.

In the following, it is assumed that your code is in a package that is located
in ``~/workspace/trifinger_rl_example``.  Adjust the paths accordingly if your
setup differs.

.. code-block:: bash

    # Go to your workspace.
    $ cd ~/workspace

    # Open a shell inside the container.  Since your home directory is automatically
    # bound into the container, your workspace is accessible from there.
    $ apptainer shell --cleanenv trifinger_rl.sif

    # Some necessary environment setup inside the container (needs to be called
    # every time you start a new apptainer shell).
    Apptainer> source /setup.bash

    # Create a venv with access to system packages (you may choose any other name).
    Apptainer> python3 -m venv --system-site-packages ./trifinger_rl_venv
    Apptainer> source ./trifinger_rl_venv/bin/activate
    (trifinger_rl_venv) Apptainer> pip install -U pip  # make sure the newest version of pip is used

    # With the venv activated, install your package (adjust the path if you called it differently).
    (trifinger_rl_venv) Apptainer> cd ./trifinger_rl_example
    (trifinger_rl_venv) Apptainer> pip install .

.. TODO: Adjust name of container once it is available.

.. note::

    This assumes that you are working inside your home directory, which is
    automatically bound with read-write access into the container.  In case you
    are working in some directory outside of your home, you need to bind it
    into the container using ``--bind`` (see the Apptainer help for more
    information).
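For example, if your package lives outside of your home directory, a bind mount
could look like the following sketch (``/data/my_workspace`` is only a
placeholder, adjust it to your setup):

.. code-block:: bash

    # Bind a directory from outside your home into the container
    # (it becomes available at the same path inside).
    $ apptainer shell --cleanenv --bind /data/my_workspace trifinger_rl.sif
    Apptainer> cd /data/my_workspace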
Once set up (see above), the following steps are enough (of course, you need to
re-install your package after you made changes):

.. code-block:: bash

    $ apptainer shell --cleanenv trifinger_rl.sif
    Apptainer> source /setup.bash  # set up the environment inside the container
    Apptainer> source ./trifinger_rl_venv/bin/activate  # always source this _after_ /setup.bash!
    (trifinger_rl_venv) Apptainer> # now you can run training, tests, evaluation, etc.

.. TODO: Adjust name of container once it is available.

Once you have your evaluation environment set up, you can run the following
commands inside the Apptainer shell (note that ``trifinger_rl_datasets`` is
already installed inside the container, so there is no need to install it
manually):

.. code-block:: bash

    # for the push task:
    (trifinger_rl_venv) Apptainer> python3 -m trifinger_rl_datasets.evaluate_sim push trifinger_rl_example.example.TorchPushPolicy --output results_push.json

    # for the lift task:
    (trifinger_rl_venv) Apptainer> python3 -m trifinger_rl_datasets.evaluate_sim lift trifinger_rl_example.example.TorchLiftPolicy --output results_lift.json


.. _gymnasium: https://gymnasium.farama.org/
.. _trifinger_rl_datasets: https://github.com/rr-learning/trifinger_rl_datasets
.. _trifinger-rl-example: https://github.com/rr-learning/trifinger-rl-example