Getting Started
The TriFingerPlatform Class
The TriFingerPlatform class contains the pyBullet simulation of the TriFinger robot and provides an interface that is mostly identical to that of the real robot. This way, transitioning from simulation to the real robot later should require only minimal changes to your code.
#!/usr/bin/env python3
"""Simple demo on how to use the TriFingerPlatform interface."""
import time

import trifinger_simulation


if __name__ == "__main__":
    platform = trifinger_simulation.TriFingerPlatform(visualization=True)

    # Move the fingers to random positions
    while True:
        position = platform.spaces.robot_position.gym.sample()
        finger_action = platform.Action(position=position)

        # apply action for a few steps, so the fingers can move to the target
        # position and stay there for a while
        for _ in range(100):
            t = platform.append_desired_action(finger_action)

            # sleep after each step so that the visualization happens in real
            # time
            time.sleep(platform.get_time_step())

        # show the latest observations
        robot_observation = platform.get_robot_observation(t)
        print("Finger0 Joint Positions: %s" % robot_observation.position[:3])

        # the cube pose is part of the camera observation
        camera_observation = platform.get_camera_observation(t)
        cube_pose = camera_observation.object_pose
        print("Cube Position (x, y, z): %s" % cube_pose.position)
For documentation of the full API, see The TriFingerPlatform Class.
Gym Environment
We provide a basic Gym environment, SimCubeTrajectoryEnv, for the challenge task.
Below is a simple demo showing how the environment is used to run a “random policy” for one episode.
from rrc_example_package import cube_trajectory_env


class RandomPolicy:
    """Dummy policy which uses random actions."""

    def __init__(self, action_space):
        self.action_space = action_space

    def predict(self, observation):
        return self.action_space.sample()


if __name__ == "__main__":
    env = cube_trajectory_env.SimCubeTrajectoryEnv(
        goal_trajectory=None,  # sample random goal
        action_type=cube_trajectory_env.ActionType.POSITION,
        step_size=100,
        visualize=True,
    )

    policy = RandomPolicy(env.action_space)

    observation = env.reset()
    is_done = False
    while not is_done:
        action = policy.predict(observation)
        observation, reward, is_done, info = env.step(action)
        print("reward:", reward)
The environment uses the same reward function that will also be used to evaluate submissions at the end of this stage of the challenge.
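Since the same reward function is used for evaluation, a quick sanity check for a policy is to accumulate the per-step rewards over an episode. Below is a minimal sketch of this using random actions; it assumes the evaluation score aggregates per-step rewards, so treat the result only as a rough indicator.

from rrc_example_package import cube_trajectory_env


if __name__ == "__main__":
    env = cube_trajectory_env.SimCubeTrajectoryEnv(
        goal_trajectory=None,
        action_type=cube_trajectory_env.ActionType.POSITION,
        step_size=100,
    )

    observation = env.reset()
    accumulated_reward = 0.0
    is_done = False
    while not is_done:
        # random actions stand in for an actual policy here
        action = env.action_space.sample()
        observation, reward, is_done, info = env.step(action)
        accumulated_reward += reward
    print("Accumulated reward:", accumulated_reward)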
See the API documentation for more details.
You may use this environment as is for your training, but you may also modify it or implement your own (or take a completely different approach; there is no obligation to use Gym).
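As an example of a lightweight modification that leaves the environment itself untouched, the sketch below wraps it in a Gym ActionWrapper that rescales actions from [-1, 1] to the bounds of the action space. This assumes the action space is a gym.spaces.Box, as is the case for the POSITION action type used above, and that nine joint positions are expected (three fingers with three joints each); check the action space of your environment before relying on this.

import gym
import numpy as np

from rrc_example_package import cube_trajectory_env


class RescaleActionWrapper(gym.ActionWrapper):
    """Map actions from [-1, 1] to the bounds of the wrapped action space."""

    def action(self, action):
        low = self.env.action_space.low
        high = self.env.action_space.high
        # linear mapping from [-1, 1] to [low, high]
        return low + (np.asarray(action) + 1.0) * 0.5 * (high - low)


if __name__ == "__main__":
    env = RescaleActionWrapper(
        cube_trajectory_env.SimCubeTrajectoryEnv(
            goal_trajectory=None,
            action_type=cube_trajectory_env.ActionType.POSITION,
            step_size=100,
        )
    )

    observation = env.reset()
    # an all-zero action now maps to the middle of the position range
    observation, reward, is_done, info = env.step(np.zeros(9))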