**********
Simulation
**********

.. image:: ../images/trifingerpro_sim_with_cube.png
    :alt: TriFingerPro robot in simulation.

We provide a simulated version of the TriFinger platform based on PyBullet.  It
can be useful for getting started, for debugging your policies, and for
identifying differences between rigid body simulations and real robots.


Get the Software
================

The simulated environment is included in the `trifinger_rl_datasets`_ Python
package, which also contains methods for downloading and working with the
datasets (see :doc:`../datasets/index` for details on the datasets).  The RL
environments follow the interface defined by the `gymnasium`_ package.


Installation
------------

Option 1: Using Apptainer
~~~~~~~~~~~~~~~~~~~~~~~~~

We are using Apptainer to provide a portable, reproducible environment for
running code on the robots and in simulation.  All you need is a Linux computer
with Apptainer installed and our Apptainer image, which contains all required
packages.  You can pull the image with

.. code-block:: bash

    $ apptainer pull oras://ghcr.io/open-dynamic-robot-initiative/trifinger_singularity/trifinger_user:rrc2022

.. TODO: Adapt URL of image and make sure it has trifinger_rl_datasets.

For more information on Apptainer, see :doc:`../singularity`.

.. note::

    When you pull the image with the command above, the resulting file will be
    named ``trifinger_user_rrc2022.sif``.  In the following, we will only use
    ``rrc2022.sif`` to keep the commands shorter.  You can either rename the
    file or adjust the commands accordingly.


Option 2: Installing via pip
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you want to develop without Apptainer, you can also install the software
locally using pip.  We recommend using *venv* to have a controlled environment:

.. code-block:: bash

    # set up the venv
    $ python3 -m venv path/to/trifinger_rl_venv
    $ . path/to/trifinger_rl_venv/bin/activate
    (trifinger_rl_venv) $ pip install -U pip  # make sure you use the latest version of pip

    # install the trifinger_rl_datasets package
    (trifinger_rl_venv) $ pip install trifinger_rl_datasets

Note, however, that on the robots code is run in the Apptainer image, **so
before submitting to the real robots, please make sure that your code is
working with the Apptainer setup**.  See :doc:`../real_robot/index` for more
details.

.. TODO: Adapt URL of image and make sure it has trifinger_rl_datasets.


Instantiating and using the Environment
---------------------------------------

The environments can be instantiated via gymnasium, where the environment names
coincide with the dataset names (see also ``demo/simulation_rollout.py`` in the
`trifinger_rl_datasets`_ repository).

.. code-block:: python

    import gymnasium as gym

    import trifinger_rl_datasets

    env = gym.make("trifinger-cube-push-sim-expert-v0")

    obs, info = env.reset()

    truncated = False
    while not truncated:
        obs, rew, terminated, truncated, info = env.step(env.action_space.sample())

More details on the dataset environments can be found in
:doc:`../datasets/index`.

Alternatively, the simulation environment can also be instantiated directly.

.. code-block:: python

    from trifinger_rl_datasets.sim_env import SimTriFingerCubeEnv

    env = SimTriFingerCubeEnv()

    obs, info = env.reset()

    truncated = False
    while not truncated:
        obs, rew, terminated, truncated, info = env.step(env.action_space.sample())

This gives you more control over the environment parameters but does not
automatically align the parameters with those used for data collection.
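Either way of creating the environment exposes the standard gymnasium action
and observation spaces.  Printing them is a quick way to check what a policy
will later be constructed with (see the next section).  The following is only a
small sketch using the environment name from the examples above and the
standard gymnasium API:

.. code-block:: python

    import gymnasium as gym

    import trifinger_rl_datasets  # noqa: F401  (registers the environments with gymnasium)

    env = gym.make("trifinger-cube-push-sim-expert-v0")

    # These spaces are also what a policy class receives in its constructor
    # (see "Starting from the Example Package" below).
    print("Action space:     ", env.action_space)
    print("Observation space:", env.observation_space)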
.. _starting_from_the_example_package:

Starting from the Example Package
=================================

If you want to deploy your policy on the real robots at some point, we
recommend starting from the `trifinger-rl-example`_ package, which you can fork
on GitHub.  This package also contains the **expert policies** we used to
record the expert datasets for the Push and Lift tasks.

All you have to do is to implement a class that follows the interface of
:class:`trifinger_rl_datasets.PolicyBase`:

.. code-block:: python

    class PolicyBase(ABC):
        """Base class defining interface for policies."""

        def __init__(
            self,
            action_space: gym.Space,
            observation_space: gym.Space,
            episode_length: int,
        ):
            """
            Args:
                action_space: Action space of the environment.
                observation_space: Observation space of the environment.
                episode_length: Number of steps in one episode.
            """
            # Note: you may not use some (or all) of the arguments, but your
            # class still needs to accept them.
            ...

        @staticmethod
        def get_policy_config() -> PolicyConfig:
            """Returns the policy configuration.

            This specifies what kind of observations the policy expects.
            """
            # If flatten_obs is False, the observation is returned in the
            # form of a dictionary.  image_obs determines whether the
            # observation includes the camera images.
            return PolicyConfig(flatten_obs=True, image_obs=True)

        def reset(self) -> None:
            """Will be called at the beginning of each episode."""
            ...  # may be left empty if not needed

        @abstractmethod
        def get_action(self, observation: ObservationType) -> np.ndarray:
            """Returns action that is executed on the robot.

            Args:
                observation: Observation of the current time step.

            Returns:
                Action that is sent to the robot.
            """
            ...

A minimal example implementation of this interface is sketched at the end of
this section.

Add your code and final policies to this package and make sure all files are
installed properly (e.g. using ``package_data`` for non-Python files like
models).  For more information on how to run the policies on the real robot,
see :doc:`../real_robot/index`.

By default, the policy is expected to work with flattened observations (i.e.
all values in one flat array).  You can also use observations that come as
structured dictionaries.  For this, simply set ``flatten_obs`` to ``False``
when creating the ``PolicyConfig`` object in ``get_policy_config()``.

To include camera images in the observation, set ``image_obs`` to ``True``.  If
``flatten_obs`` is ``True``, the observation will then be a tuple containing
the flattened observation (except for the camera images) and the camera images
in a numpy array.  If ``flatten_obs`` is ``False``, the observation will be a
dictionary which also contains the camera images.  See :doc:`../datasets/index`
for more details on the observation space.

.. important::

    At the moment, the GPUs in the robot PCs running the user policies are not
    enabled.  This will change during the coming months, however.

.. TODO: Remove this once the GPUs are enabled.


Adding Dependencies
-------------------

If you need any third-party libraries that are not yet installed in the
container, there are two options for adding them:

1. Any Python dependencies that are pip-installable can simply be listed as
   requirements of your package (under ``install_requires`` in the
   ``setup.cfg``, if your package is based on the example package).
2. If you have dependencies that cannot be installed with pip, you can extend
   the Apptainer image.  See :ref:`singularity_extend_container`.
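Minimal Policy Example
----------------------

As referenced above, the following is a minimal sketch of a policy class that
satisfies the ``PolicyBase`` interface.  It simply commands a zero action at
every step; the imports assume that ``PolicyBase`` and ``PolicyConfig`` can be
imported from ``trifinger_rl_datasets`` as in the example package, and a real
submission would of course load and evaluate a trained model in
``get_action()``.

.. code-block:: python

    import gymnasium as gym
    import numpy as np

    from trifinger_rl_datasets import PolicyBase, PolicyConfig


    class ZeroActionPolicy(PolicyBase):
        """Toy policy that always commands a zero action."""

        def __init__(
            self,
            action_space: gym.Space,
            observation_space: gym.Space,
            episode_length: int,
        ):
            # Not all arguments are used here, but the constructor still has
            # to accept them (see the interface above).
            self._action_space = action_space

        @staticmethod
        def get_policy_config() -> PolicyConfig:
            # Flat observations without camera images.
            return PolicyConfig(flatten_obs=True, image_obs=False)

        def reset(self) -> None:
            pass  # no internal state to reset

        def get_action(self, observation) -> np.ndarray:
            # A trained policy would compute the action from the observation here.
            return np.zeros(self._action_space.shape, dtype=np.float32)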
.. _simulation_phase_evaluate_policy:

Evaluate Policy
===============

Run Evaluation
--------------

Once you have your policy ready, you can perform rollouts in the environment.
Using the provided evaluation script also serves as a way to test whether your
policy complies with the required interface before submitting it to the real
robot.

To evaluate your policy locally, run the following commands with the venv
activated:

.. code-block:: bash

    # for the push task:
    (trifinger_rl_venv) python3 -m trifinger_rl_datasets.evaluate_sim push trifinger_rl_example.example.TorchPushPolicy --output results_push.json

    # for the lift task:
    (trifinger_rl_venv) python3 -m trifinger_rl_datasets.evaluate_sim lift trifinger_rl_example.example.TorchLiftPolicy --output results_lift.json

Simply replace the specified policy class with your own.  The policy is
expected to implement the interface of
:class:`trifinger_rl_datasets.PolicyBase`.  If your policy class loads a model
from a separate file and you intend to run it on the real robot, make sure this
file is installed with your package (i.e. do not rely on local files).  See the
example package for how this can be done.

The script will run multiple episodes with the specified policy and in the end
print the averaged results to the terminal.  If ``--output`` is set, the
results are also written to the specified file.

There are a few more options you may use during testing (e.g. to enable
visualisation or change the number of episodes), see ``--help`` for a complete
list.


.. _setup_workspace_with_apptainer:

Set up Evaluation Environment using Apptainer
---------------------------------------------

To test whether your policy will run on the robot PC, you can run it inside the
Apptainer container.  In the following we describe how to set up a Python venv
in your workspace and use it from within the container to install your package
for evaluation.

The file system of the container itself is read-only, but by default your home
directory is mounted into the container (i.e. if you read or write files from
your home directory inside the container, this will access your actual home!).
We use this to create a *venv* in the workspace in your home directory, which
will then be used inside the container.

In the following, it is assumed that your code is in a package that is located
in ``~/workspace/trifinger_rl_example``.  Adjust the paths accordingly if your
setup differs.

.. code-block:: bash

    # Go to your workspace.
    $ cd ~/workspace

    # Open a shell inside the container.  Since your home directory is automatically
    # bound into the container, your workspace is accessible from there.
    $ apptainer shell --cleanenv trifinger_rl.sif

    # Some necessary environment setup inside the container (needs to be called
    # every time you start a new apptainer shell).
    Apptainer> source /setup.bash

    # Create a venv with access to system packages (you may choose any other name).
    Apptainer> python3 -m venv --system-site-packages ./trifinger_rl_venv
    Apptainer> source ./trifinger_rl_venv/bin/activate
    (trifinger_rl_venv) Apptainer> pip install -U pip  # make sure the newest version of pip is used

    # With the venv activated, install your package (adjust the path if you called it differently).
    (trifinger_rl_venv) Apptainer> cd ./trifinger_rl_example
    (trifinger_rl_venv) Apptainer> pip install .

.. TODO: Adjust name of container once it is available.

.. note::

    This assumes that you are working inside your home directory, which is
    automatically bound with read-write access into the container.  In case you
    are working in some directory outside of your home, you need to bind it
    into the container using ``--bind`` (see the Apptainer help for more
    information).
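For example, if your package lives outside of your home directory, a bind mount
could look like the following sketch (``/data/my_workspace`` is only a
placeholder, adjust it to your setup):

.. code-block:: bash

    # Bind a directory from outside your home into the container
    # (it becomes available at the same path inside).
    $ apptainer shell --cleanenv --bind /data/my_workspace trifinger_rl.sif
    Apptainer> cd /data/my_workspace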
Once set up (see above), the following steps are enough (of course, you need to
re-install your package after you made changes):

.. code-block:: bash

    $ apptainer shell --cleanenv trifinger_rl.sif
    Apptainer> source /setup.bash  # set up the environment inside the container
    Apptainer> source ./trifinger_rl_venv/bin/activate  # always source this _after_ /setup.bash!
    (trifinger_rl_venv) Apptainer> # now you can run training, tests, evaluation, etc.

.. TODO: Adjust name of container once it is available.

Once you have your evaluation environment set up, you can run the following
commands inside the Apptainer shell (note that ``trifinger_rl_datasets`` is
already installed inside the container, so there is no need to install it
manually):

.. code-block:: bash

    # for the push task:
    (trifinger_rl_venv) Apptainer> python3 -m trifinger_rl_datasets.evaluate_sim push trifinger_rl_example.example.TorchPushPolicy --output results_push.json

    # for the lift task:
    (trifinger_rl_venv) Apptainer> python3 -m trifinger_rl_datasets.evaluate_sim lift trifinger_rl_example.example.TorchLiftPolicy --output results_lift.json


.. _gymnasium: https://gymnasium.farama.org/
.. _trifinger_rl_datasets: https://github.com/rr-learning/trifinger_rl_datasets
.. _trifinger-rl-example: https://github.com/rr-learning/trifinger-rl-example