
senseact's People

Contributors

armahmood, bjkomer, csherstan, danielsnider, dkorenkevych, gauthamvasan, homayoonfarrahi, jaberg, williampma


senseact's Issues

How to set seed for random number generator

I want to repeat the same experiment with the exact same starting parameters to see how repeatable the learning curve is. However, I couldn't find a way to set the seed for the random number generator. Can you please point me in the right direction? I'm using Dynamixel servos.

Thank you!
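A minimal sketch of what has worked in similar setups, assuming the DXL environment accepts a NumPy random state the way the bundled example scripts do; the random_state parameter name and the tf_set_seeds helper are taken from those examples and should be checked against the current code:

import numpy as np
from senseact.utils import tf_set_seeds  # helper used in the bundled examples; seed TF directly if it's absent

SEED = 1
# Create a fixed NumPy random state and reuse it everywhere randomness enters.
rand_state = np.random.RandomState(SEED).get_state()
np.random.set_state(rand_state)
tf_set_seeds(np.random.randint(1, 2**31 - 1))

# Hypothetical: pass the same state to the environment so targets/resets repeat;
# check the DxlReacher1DEnv constructor for the actual parameter name.
# env = DxlReacher1DEnv(..., random_state=rand_state)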

How to achieve replicable results in SenseAct?

Sorry for the simple question.
I'm trying to replicate my results across runs in SenseAct using PPO. I've set a constant seed to get a fixed random state and have verified that the state is the same across runs. However, the simulator seems to still be randomly generating both the initial network and targets/resets (the returns are very inconsistent, as are the observations).

In Appendix A.5 of Benchmarking Reinforcement Learning Algorithms on Real-World Robots, it is mentioned that "For the agent, randomization is used to initialize the network and sample actions. For the environment, randomization is used to generate targets and resets. By using the same randomization seed across multiple experiments in this set of experiments, we ensure that the environment generates the same sequence of targets and resets, the agent is initialized with the same network, and it generates the same or similar sequence of actions for a particular task. "

Could someone please clarify how this is done? I have attempted to seed both NumPy and TensorFlow with a fixed value. Furthermore, I have attempted to set each TensorFlow operation to use a fixed seed.

Thanks!
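For what it's worth, a common source of non-determinism with TF 1.x policies is setting the graph-level seed after the graph or session already exists. A sketch where the ordering is the important part (the PPO entry point is left abstract, since its name varies between baselines versions):

import numpy as np
import tensorflow as tf

SEED = 0
np.random.seed(SEED)
# The graph-level seed must be set before any variables (i.e. the policy network)
# are created, otherwise per-op seeds still vary between runs.
tf.set_random_seed(SEED)

# ... construct the environment, then build the policy and call the PPO learn
# function, passing the same seed if it accepts one.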

How do I implement this?

I'm happy I've finally found an implementation of reinforcement learning for Universal Robots.
I've had an ongoing issue with trying to interface my learning algorithm with a real robot. I've looked at rllab with ROS but couldn't get anything to work, and I don't know enough to rewrite an algorithm as a ROS node. I've even looked at using OPC UA to try to move data between ROS and my learning algorithm.

I've installed SenseAct OK and can run the inverted pendulum example.
However, I can't seem to find any documentation on how to implement anything on a real robot. I have a UR5 and I'm using Ubuntu 16.04.
Where does the communication between the UR5 and the algorithm happen? How do I initiate this communication?
Is there a tutorial on building an environment with a real UR5 and testing different algorithms?
I'm 18 months into a 3-year PhD on robotics and reinforcement learning and I still haven't figured out how to interface an algorithm with my robot.
Please help.
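In case it helps while documentation is being written: the pattern in examples/advanced/ur5_reacher.py is roughly the one below. The constructor arguments are an approximation from memory and should be checked against senseact/envs/ur/reacher_env.py; the key point is that the env talks to the UR5 controller directly over TCP via its IP address, with no ROS involved.

import numpy as np
from senseact.envs.ur.reacher_env import ReacherEnv
from senseact.utils import NormalizedEnv

rand_state = np.random.RandomState(1).get_state()

# Argument names/values below are approximate placeholders, not the exact signature.
env = ReacherEnv(
    setup="UR5_default",        # assumed setup key for joint limits, speeds, etc.
    host="192.168.1.100",       # IP address of the UR5 controller box
    dof=2,
    control_type="velocity",
    dt=0.04,
    random_state=rand_state,
)
env = NormalizedEnv(env)
env.start()  # starts the communicator process that streams packets to/from the UR5

# `env` now behaves like a Gym env: hand it to TRPO/PPO exactly as in the example script.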

Add check for latency for all serial connections

To ensure good performance, all serial connections should be set to low latency. We should add a check for this for Dynamixels and Creates.

To check the latency of a particular serial port:
cat /sys/bus/usb-serial/devices/ttyUSB0/latency_timer

To change the latency:

sudo apt install setserial
setserial /dev/ttyUSB0 low_latency

At present the default value is 16, but the setserial command sets it to 1. If the computer is restarted or the device is removed and reinserted, it resets to the default. I'm not sure how to correct this.

This works on Debian variants, but I don't know about OSX or Windows.
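A sketch of what such a check could look like, reading the same sysfs entry as above just before the communicator opens the port; the helper name and warning text are placeholders, not existing SenseAct code:

import os
import warnings

def check_serial_latency(device="ttyUSB0", max_latency_ms=1):
    """Warn if the USB-serial latency_timer is above max_latency_ms."""
    path = "/sys/bus/usb-serial/devices/{}/latency_timer".format(device)
    if not os.path.exists(path):
        # e.g. OSX/Windows, or not a usb-serial device; nothing to check
        return
    with open(path) as f:
        latency = int(f.read().strip())
    if latency > max_latency_ms:
        warnings.warn(
            "{} latency_timer is {} ms; run `setserial /dev/{} low_latency` "
            "to reduce it to 1 ms for better real-time performance.".format(
                device, latency, device))

check_serial_latency("ttyUSB0")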

Remove custom PID controller from DXL envs

Both DxlReacher1DEnv and DxlTracker1DEnv use a custom PID controller to return the motor to the starting position. This is unnecessary; it is straightforward to switch to position control.

However, implementing this change would require reworking the DXLCommunicator and providing a mechanism for specifying different control modes, since the communicator currently assumes torque control.

UR Reacher on UR e-series

System: I'm using PR #29 on Ubuntu 18.04 with both real UR5s and the URSim in Virtualbox

Intro: I've successfully run SenseAct on the CB3 (v.3.7.0) and the new e-series (v.5.1.2) simulators.

Problem: I'm trying to move from the UR offline simulators to the real UR5s, this time the newest e-series v.5.1.2. However, when I run the example code I get an error stating

AttributeError: 'ReacherEnv' object has no attribute 'angle_return_point'

This does not occur when connecting to the offline simulator.

Refactor dxl_basic_functions

dxl_basic_functions.py is written in such a way that you have to go in and tweak the code by hand to set the port, motor id, etc.
I suggest it should be refactored into a class to which you can pass in the communications information.
Also, I noticed that there were a number of calls in there that were specific to the MX64, like accessing the registers. This could easily be refactored so that the appropriate motor is passed into the class when it is instantiated.
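A rough sketch of what the suggested refactoring could look like; the class name and method bodies are made up, and the register-map import location should be verified against the repo:

from senseact.devices.dxl import dxl_mx64  # assumed location of the MX64 register map

class DxlHelper(object):
    """Hypothetical class wrapping the globals dxl_basic_functions currently hard-codes."""

    def __init__(self, port_str="/dev/ttyUSB0", motor_id=1, baudrate=1000000,
                 register_map=dxl_mx64.MX64):
        self.port_str = port_str
        self.motor_id = motor_id
        self.baudrate = baudrate
        # The motor-specific register map is injected, so MX64 vs. other models
        # is decided at construction time instead of being baked into the functions.
        self.register_map = register_map
        self.port = None

    def open(self):
        # delegate to the existing driver (dxl_commv1 / dxl_driver_v1) to open the port
        raise NotImplementedError

    def read_register(self, name):
        # look the register up in self.register_map instead of hard-coded MX64 offsets
        raise NotImplementedError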

Wiki feedback: additional packages

I'm just going to dump a number of points I had written down about the wiki.

I had to install a number of additional packages:

  • python3-pip
  • setuptools
  • wheel
  • psutil
  • python3-tk

Additionally, it wasn't clear which experiments I needed to install baselines for. So the first time I ran dxl_reacher.py I got an error about missing the baselines package.

DDPG + HER to replace TRPO

I want to replace TRPO with DDPG + HER and am having difficulties. The combination only works with a task that is registered with Gym. How did TRPO avoid that?
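For context, the bundled TRPO examples never call gym.make: they instantiate the SenseAct environment object directly and pass it to the baselines learn function, which is why no Gym registration is needed. If the DDPG + HER code insists on a registered task id, one option is to register the SenseAct env yourself; a sketch, where the id, entry_point path and episode length are placeholders to adapt:

import gym
from gym.envs.registration import register

# Hypothetical registration so gym.make() can build the SenseAct env;
# the module path and class name below must be adapted to your setup.
register(
    id="DxlReacherReal-v0",
    entry_point="senseact.envs.dxl.dxl_reacher_env:DxlReacher1DEnv",
    max_episode_steps=200,
)

env = gym.make("DxlReacherReal-v0")

Note that HER also expects a goal-conditioned (dict) observation space with achieved and desired goals, which the SenseAct environments do not expose out of the box, so an additional wrapper would likely be needed.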

Plotting Crashes on `create2_docker.py` and `create2_mover.py` Scripts

Plotting on both the create2_docker.py and create2_mover.py scripts crashes after running for a few minutes with the following error:

Process Process-4:
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "create2_mover.py", line 186, in plot_create2_mover
    ax2.set_ylim([np.min(rets), np.max(rets) + 1])
  File "/home/pi/.local/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 2442, in amin
    initial=initial)
  File "/home/pi/.local/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 83, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity

For reference this is running on a Raspberry Pi 3 and the plotting is being sent via X-forwarding through SSH to a MacBook.
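The error happens because the plotting process runs before any episode has finished, so rets is empty when np.min is called. A guard like the following sketch (inside plot_create2_mover, keeping the script's existing variable names) avoids the crash:

# Patch sketch for the create2_mover.py / create2_docker.py plotting loops:
# only touch the axis limits once at least one return has been recorded.
if len(rets) > 0:
    ax2.set_ylim([np.min(rets), np.max(rets) + 1])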

Train on specific points? Discrete action_space [with distinct points]!

Hi, thanks a lot for sharing.

The observation and action spaces are defined as Gym Box objects, as seen in self._action_space and self._observation_space.

During training, this moves the end-effector to random target points (within a limited space).

Is it possible to train the end-effector to move from a specific start point to a specific end point?
For example, by defining the observation and action spaces as discrete spaces with distinct points, rather than continuous Box objects.

Any help would be appreciated...
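One way to get discrete actions without modifying the environment internals is a thin Gym wrapper that maps a small set of action indices onto fixed continuous commands; pinning the start and end points would additionally require controlling how the env samples its target, which differs per environment. A sketch of the wrapper part:

import numpy as np
import gym
from gym import spaces

class DiscreteActionWrapper(gym.ActionWrapper):
    """Expose a Discrete action space over a fixed table of continuous commands."""

    def __init__(self, env, action_table):
        super(DiscreteActionWrapper, self).__init__(env)
        self._table = np.asarray(action_table, dtype=np.float64)
        self.action_space = spaces.Discrete(len(self._table))

    def action(self, a):
        # map the discrete index back to the continuous action the wrapped env expects
        return self._table[int(a)]

# e.g. three canned velocity commands for a 1-DOF reacher (values are placeholders):
# env = DiscreteActionWrapper(env, action_table=[[-0.5], [0.0], [0.5]])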

the meaning of Hyperparameters

Hi,
I do not understand the meaning of some of the hyperparameters, like those in the attached screenshot (P.S. I could only find an explanation for γ).
Could you explain the others?

Thanks a lot!

dxl_basic_functions: bulk_read_test fails block length assertion

MX64AT.

Try this:

bulk_read_test([1])

Result:

Traceback (most recent call last):
File "/usr/lib/python3.5/code.py", line 91, in runcode
exec(code, self.locals)
File "", line 1, in
File "/home/craig/workspace/SenseAct/senseact/devices/dxl/dxl_basic_functions.py", line 239, in bulk_read_test
vals_dict = dxl_commv1.bulk_read(port, [goal_block, pos_block], dxl_ids)
File "/home/craig/workspace/SenseAct/senseact/devices/dxl/dxl_driver_v1.py", line 298, in bulk_read
assert len(blocks)==len(dxl_ids)
AssertionError

Checking the blocks returned, there are two blocks for the single motor: the first is named "goal_pos" and the second "present_pos". Surprisingly, the blocks are neither marked with their corresponding motor id nor organized by id.
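From the assertion, bulk_read apparently expects one id per block, so reading two blocks from a single motor needs the id repeated (or the driver needs to pair blocks with ids explicitly). A quick workaround sketch inside bulk_read_test, using the names from the traceback:

# Workaround sketch: repeat the motor id once per block so that
# len(blocks) == len(dxl_ids) holds in dxl_driver_v1.bulk_read.
blocks = [goal_block, pos_block]
vals_dict = dxl_commv1.bulk_read(port, blocks, dxl_ids * len(blocks))

Whether the returned dict then distinguishes the two blocks is worth checking, given the naming issue described above.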

Desktop crashed!?

I was running dxl_reacher and my desktop crashed and rebooted. I was running other things as well, though, so I don't know whether it was the SenseAct software, but I figured I'd file this issue in case anyone else sees something similar.

UR Reacher on CB2

System: I'm using PR #29 on Ubuntu 18.04 with both real UR5s and the URSim in Virtualbox

Intro: I've successfully run SenseAct on the CB3 (v.3.7.0) and the new e-series (v.5.1.2) simulators. I've also tested the code as-is on a real CB2, which didn't work.

Problem: I think it's because the _compute_sensation function in reacher_env.py uses the fields 'i_control', 'v_actual', and 'safety_mode', which are unavailable on the CB2.

Are the three above-mentioned fields actually used? It doesn't seem like it, so I'll download the CB2 (v.1.8.14035) simulator and try it with the three fields removed.

UR 3.5.4.10845 ur5_reacher.py Error

I want to reach the target using DRL.
So I changed the host IP and ran the Python script ur5_reacher.py,
but it didn't work; I get the following warning and error:

Warning: incomplete packet from UR

ERROR:root:One of the environment subprocess has died, closing all processes.
Traceback (most recent call last):
File "ur5_reacher.py", line 185, in
main()
File "ur5_reacher.py", line 84, in main
callback=kindred_callback
File "/home/geonhee-ml/rl_ws/src/SenseAct/baselines/baselines/trpo_mpi/trpo_mpi.py", line 199, in learn
seg = seg_gen.next()
File "/home/geonhee-ml/rl_ws/src/SenseAct/baselines/baselines/trpo_mpi/trpo_mpi.py", line 56, in traj_segment_generator
ob, rew, new, _ = env.step(ac)
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/site-packages/senseact/utils.py", line 150, in step
wrapped_step = self._wrapped_env.step(scaled_action)
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/site-packages/senseact/rtrl_base_env.py", line 243, in step
self.act(action)
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/site-packages/senseact/rtrl_base_env.py", line 235, in act
raise e
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/site-packages/senseact/rtrl_base_env.py", line 232, in act
self._write_action(action)
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/site-packages/senseact/rtrl_base_env.py", line 401, in _write_action
raise Exception("Environment has been shutdown due to subprocess error.")
Exception: Environment has been shutdown due to subprocess error.
Process Process-4:
Traceback (most recent call last):
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/managers.py", line 709, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "ur5_reacher.py", line 125, in plot_ur5_reacher
old_size = len(shared_returns['episodic_returns'])
File "", line 2, in getitem
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/managers.py", line 713, in _callmethod
self._connect()
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/managers.py", line 700, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/home/geonhee-ml/anaconda2/envs/senseact/lib/python3.5/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory

I think the software version is different.
Please let me know what the problem is.
Thanks in advance for your comments.

Warning: incomplete packet from UR

So I can get my UR5 robot moving using the example reacher scripts and also the pre-trained model
python3 examples/advanced/ur5_reacher.py ./examples/advanced/pre_trained_models/ur_reacher_2_trpo.pkl

The problem is that the robot makes the same movement repeatedly. There is no difference between the untrained and the pre-trained models. In both situations, I get a recurring warning saying
Warning: incomplete packet from UR
I suspect this is the joint feedback to SenseAct, so perhaps the agent doesn't know where it is?
I'm using URSoftware 3.8.0.61336.
I will try to downgrade my URSoftware to 3.3.4.310 and retest.

I have now downgraded my UR software to 3.3.4.310 and it is working brilliantly! I'll put a video on Twitter (@dommckean).
I do still have some questions, though.
In the console output, what is meant by
Hiccup of 1.48ms overhead between UR packets
and
WARNING:root:Agent has over-run its allocated dt, it has been 0.4206395149230957 since the last observation, 0.3806395149230957 more than allowed
Should I be concerned?

Thanks again for this awesome package.

Simulating UR in ROS

Hello, I have been looking for a repository like this for a long time. I am currently working on a project that involves simulating and coordinating two UR5s in ROS/Gazebo for a pick-and-place task: the first robot picks up an object from a given position and moves to a handover position, and the second robot takes the object from the first and carries it to the goal position. I am using Robotiq85 grippers for grasping and can successfully control both robots. So far I have performed this task by hard-coding it (video here: https://www.youtube.com/watch?v=n6Vk9lIxKkg), but I want to perform it using PPO. For that I need to create an environment such that, when the agent takes an action, that action is executed in the simulated world through the MoveIt Python interface (the library used to command robot motion in the Gazebo world). What sort of modifications do I need to make to the environment and the agent in order to train the robot in simulation?

Extract communicators for Create environments

The DXL and UR5 environments have been updated to move communicator setup outside the environments; communicators are now passed in as a parameter. The same needs to be done for the Create2 envs (see the sketch after the file list below).

Affected files:
senseact/envs/create2/create2_docker_env.py
senseact/envs/create2/create2_mover_env.py
examples/advanced/create2_docker.py
examples/advanced/create2_mover.py
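A sketch of the target pattern, mirroring what the DXL/UR5 envs now do; the Create2Communicator import path and the constructor/keyword arguments shown below are assumptions and would need to match the actual code:

# Hypothetical post-refactor usage for the Create2 envs (names are assumptions).
from senseact.devices.create2.create2_communicator import Create2Communicator
from senseact.envs.create2.create2_docker_env import Create2DockerEnv

# The communicator is built by the caller...
comm = Create2Communicator(port="/dev/ttyUSB0", baudrate=115200)

# ...and handed to the environment, instead of the env constructing it internally.
env = Create2DockerEnv(communicator_setups={"Create2": comm}, dt=0.045)
env.start()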

Display visualization of simulator mujoco's virtual world

I'm glad you have created the simulation example, but I would like to see the action happening in the simulated world. For the double pendulum simulation, examples/sim_double_pendulum.py, it would be very nice to have a configuration option that, if enabled, displays a visualization of the pendulum swinging in simulation.

I can try to implement this if you point me to documentation on how to render the virtual world.

Thanks!
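If it helps, a bare-bones mujoco-py rendering loop looks roughly like the one below; wiring it into examples/sim_double_pendulum.py would presumably mean creating the viewer in the simulator process and calling render() after each step (the XML path is a placeholder):

import mujoco_py

# Stand-alone sketch; in sim_double_pendulum.py the model/sim objects would come
# from the existing simulator process rather than being loaded here.
model = mujoco_py.load_model_from_path("double_pendulum.xml")  # placeholder path
sim = mujoco_py.MjSim(model)
viewer = mujoco_py.MjViewer(sim)

for _ in range(1000):
    sim.step()
    viewer.render()  # opens a window showing the simulated world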

dxl_basic_functions.py: rewrite to allow configurable port, id, baud, etc.

This file offers some useful helper functions that are higher level than the driver but not wrapped in SenseAct's Communicator. However, they are tied to a specific port, baud rate, etc. Since these functions are effectively duplicated by the DXLCommunicator, it would probably make sense to turn them into shared functions.

Where does the experiment data get saved to?

Sorry for the silly question, but
can you please tell me where the experiment results are saved?
Once the policy has been trained, where is that policy saved, if at all? Or are only the weights saved?

Thanks.
