Attention
This challenge has ended!
This documentation is only for the Real Robot Challenge 2020, which has ended. Subsequent challenges have their own documentation; see the challenge website for more information.
The CubeEnv Gym Environment
The CubeEnv environment is registered as rrc_simulation.gym_wrapper:real_robot_challenge_phase_1-v1 and can be created with gym.make() using that name.
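For example, the environment might be created like this (a sketch, assuming the rrc_simulation package is installed; gym.make() forwards the keyword arguments to CubeEnv.__init__()):

    import gym

    from rrc_simulation.gym_wrapper.envs import cube_env

    # The initializer is a required argument; here random initial and goal
    # poses are sampled at difficulty level 1.
    env = gym.make(
        "rrc_simulation.gym_wrapper:real_robot_challenge_phase_1-v1",
        initializer=cube_env.RandomInitializer(difficulty=1),
        action_type=cube_env.ActionType.POSITION,
        frameskip=1,
        visualization=False,
    )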
class rrc_simulation.gym_wrapper.envs.cube_env.CubeEnv(initializer, action_type=ActionType.POSITION, frameskip=1, visualization=False)

Gym environment for moving cubes with the simulated TriFingerPro robot.
__init__(initializer, action_type=ActionType.POSITION, frameskip=1, visualization=False)

Initialize.
- Parameters
  - initializer – Initializer class for providing initial cube pose and goal pose. See RandomInitializer and FixedInitializer.
  - action_type (ActionType) – Specify which type of actions to use. See ActionType for details.
  - frameskip (int) – Number of actual control steps to be performed in one call of step().
  - visualization (bool) – If true, the pyBullet GUI is run for visualization.
compute_reward(achieved_goal, desired_goal, info)

Compute the reward for the given achieved and desired goal.
- Parameters
  - achieved_goal (dict) – Current pose of the object.
  - desired_goal (dict) – Goal pose of the object.
  - info (dict) – An info dictionary containing a field "difficulty" which specifies the difficulty level.
- Returns
  The reward that corresponds to the provided achieved goal w.r.t. the desired goal. Note that the following should always hold true:

      ob, reward, done, info = env.step(action)
      assert reward == env.compute_reward(
          ob["achieved_goal"],
          ob["desired_goal"],
          info,
      )

- Return type
  float
reset()

Resets the environment to an initial state and returns an initial observation.
Note that this function should not reset the environment’s random number generator(s); random variables in the environment’s state should be sampled independently between multiple calls to reset(). In other words, each call of reset() should yield an environment suitable for a new episode, independent of previous episodes.
- Returns
  The initial observation.
- Return type
  observation (object)
seed(seed=None)

Sets the seed for this env's random number generator.
Note: Spaces need to be seeded separately. E.g. if you want to sample actions directly from the action space using env.action_space.sample(), you can set a seed there using env.action_space.seed().

- Returns
  List of seeds used by this environment. This environment only uses a single seed, so the list contains only one element.
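For example, to make a run reproducible, both the environment and the action space can be seeded (a minimal sketch):

    env.seed(42)
    env.action_space.seed(42)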
step(action)

Run one timestep of the environment's dynamics.

When the end of the episode is reached, you are responsible for calling reset() to reset this environment's state.

- Parameters
  action – An action provided by the agent (depends on the selected ActionType).
- Returns
  - observation (dict): agent's observation of the current environment.
  - reward (float): amount of reward returned after previous action.
  - done (bool): whether the episode has ended, in which case further step() calls will return undefined results.
  - info (dict): info dictionary containing the difficulty level of the goal.
- Return type
tuple
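Putting these methods together, a typical interaction loop might look like the following (a sketch; env is assumed to be created as shown above):

    observation = env.reset()
    done = False
    while not done:
        # Sample a random action; a real policy would compute one from
        # the observation instead.
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)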
class rrc_simulation.gym_wrapper.envs.cube_env.ActionType(value)

Different action types that can be used to control the robot.
- TORQUE
  Use pure torque commands. The action is a list of torques (one per joint) in this case.
- POSITION
  Use joint position commands. The action is a list of angular joint positions (one per joint) in this case. Internally a PD controller is executed for each action to determine the torques that are applied to the robot.
- TORQUE_AND_POSITION
  Use both torque and position commands. In this case the action is a dictionary with keys "torque" and "position" which contain the corresponding lists of values (see above). The torques resulting from the position controller are added to the torques in the action before applying them to the robot.
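As an illustration, actions for the different types could be constructed as follows (a sketch; the nine entries correspond to the TriFingerPro's nine joints, three per finger):

    import numpy as np

    # ActionType.TORQUE: one torque value per joint.
    torque_action = np.zeros(9)

    # ActionType.POSITION: one angular joint position per joint.
    position_action = np.zeros(9)

    # ActionType.TORQUE_AND_POSITION: dictionary with both parts.
    mixed_action = {"torque": np.zeros(9), "position": np.zeros(9)}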
class rrc_simulation.gym_wrapper.envs.cube_env.RandomInitializer(difficulty)

Initializer that samples random initial states and goals.
class rrc_simulation.gym_wrapper.envs.cube_env.FixedInitializer(difficulty, initial_state, goal)

Initializer that uses fixed values for initial pose and goal.
__init__(difficulty, initial_state, goal)

Initialize.
- Parameters
  - difficulty (int) – Difficulty level of the goal. This is still needed even for a fixed goal, as it is also used for computing the reward (the cost function is different for the different levels).
  - initial_state (move_cube.Pose) – Initial pose of the object.
  - goal (move_cube.Pose) – Goal pose of the object.
- Raises
  Exception – If initial_state or goal are not valid. See move_cube.validate_goal.
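A FixedInitializer could be set up like this (a sketch; the import path of move_cube and the keyword arguments of move_cube.Pose are assumptions, and the pose values are purely illustrative):

    from rrc_simulation.gym_wrapper.envs import cube_env
    from rrc_simulation.tasks import move_cube

    # Assumed Pose fields: a 3D position and an (x, y, z, w) orientation
    # quaternion.
    initial_state = move_cube.Pose(
        position=[0.0, 0.0, 0.0325], orientation=[0.0, 0.0, 0.0, 1.0]
    )
    goal = move_cube.Pose(
        position=[0.05, 0.05, 0.0325], orientation=[0.0, 0.0, 0.0, 1.0]
    )

    initializer = cube_env.FixedInitializer(
        difficulty=1, initial_state=initial_state, goal=goal
    )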