Comments (22)
Hello,
I have the same problem during the training process. Have you solved it?
Can you give me some advice?
Thank you in advance.
Lu
from chapter16-robot-learning-in-simulation.
Sorry, I switched to V-REP 3.6.2, but it still did not work.
Me too.
Also, if we switch to the other control mode, "end_position", or give a random target in the "joint_velocity" control mode, the SAC network breaks and the reward value is always unstable.
Hi,
The Sawyer simulation in V-REP seems to be unstable sometimes, which can lead to a broken gripper during exploration. This is why the code restarts the environment every 20 episodes during training; with that in place, on my side the agent can smoothly finish a training run of thousands of episodes.
So could you @jianye0428 check, via visualization of the robot scene, whether the gripper is still broken after the restarting code runs?
I'm not sure if this problem is caused by the package versions. To make sure this project works well, we recommend using V-REP 3.6.2 and the compatible PyRep that we forked here, rather than directly installing the latest versions.
As for the "end_position" mode @luweiqing, this project is not a solution for that. You may need to change the code a bit and fine-tune it to make that mode work.
Best,
Zihan
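The restart-every-20-episodes pattern described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `Env`, its methods, and `train` are hypothetical stand-ins for the V-REP Sawyer environment and training loop.

```python
RESTART_EVERY = 20  # restart the simulation every 20 episodes, as described above

class Env:
    """Hypothetical stand-in for the V-REP Sawyer environment."""
    def reset(self):
        return 0.0                    # dummy initial observation
    def step(self, action):
        return 0.0, 0.0, True         # observation, reward, done (one-step episodes)
    def close(self):
        pass

def train(num_episodes):
    env = Env()
    restarts = 0
    for episode in range(num_episodes):
        if episode > 0 and episode % RESTART_EVERY == 0:
            env.close()               # tear down a possibly-corrupted scene
            env = Env()               # rebuild it: the gripper is intact again
            restarts += 1
        state = env.reset()
        done = False
        while not done:
            state, reward, done = env.step(action=0.0)
    return restarts
```

The point of the periodic rebuild is that any physics corruption (such as a broken gripper) accumulated in the old scene is discarded along with it.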
Hello,
Thanks for the reply. I'll give it a try and report back later.
Best,
Jian
Hello,
I have tried V-REP 3.6.2 and the forked PyRep package, but it still did not work.
Best,
Jian
> Hello,
> I have tried V-REP 3.6.2 and the forked PyRep package, but it still did not work.
> Best,
> Jian

Can you check if the gripper is broken during exploration, and whether it is still broken after the environment is restarted by the code?
Hi,
Thank you for your sincere reply.
I have solved the error "Gripper position is nan" during thousands of episodes.
I have trained for about 80,000 episodes, but the reward value is always unstable and does not converge. The success rate is very low.
Do I need more episodes of training?
Can you give me some advice on how many episodes I need to train for the value function to be stable?
Best,
Lu
As for the error "Gripper position is nan": it occurs because the output of the policy network is [nan, nan, nan, nan, nan, nan, nan], which triggers this check:
if math.isnan(ax):  # capture the broken gripper cases during exploration
    print('Gripper position is nan.')
    self.reinit()
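Since the broken gripper shows up as NaN in the policy output, the same guard can be extended to the whole action vector rather than a single component. `action_is_broken` is a hypothetical helper for illustration, not a function from the repository:

```python
import math

def action_is_broken(action):
    """Return True if any component of the action vector is NaN,
    e.g. a 7-dimensional joint-velocity command from the policy network."""
    return any(math.isnan(a) for a in action)
```

Calling the environment's reinitialization whenever this returns True mirrors the single-component check quoted above, but also catches the case where only some later joint commands are NaN.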
> Hello,
> I have tried V-REP 3.6.2 and the forked PyRep package, but it still did not work.
> Best,
> Jian

> Can you check if the gripper is broken during exploration, and whether it is still broken after the environment is restarted by the code?

I think it is still broken when the environment is reinitialized. In my training process the reward turned out to be zero once the episodes exceeded 20, and I get the same error.
Best,
Jian
> Hi,
> Thank you for your sincere reply.
> I have solved the error "Gripper position is nan" during thousands of episodes.
> I have trained for about 80,000 episodes, but the value is always unstable and does not converge. The success rate is very low.
> Do I need more episodes of training?
> Can you give me some advice on how many episodes I need to train for the value function to be stable?
> Best,
> Lu
Hello,
I tried the forked PyRep package but still get the same error.
Can I ask how you solved the problem? Did you just use the forked PyRep, or did you change other things?
Best,
Jian
Hi.
I changed the robot from Sawyer to Baxter.
The reward value is always unstable and does not converge even though I trained for more than 80,000 episodes.
> Hi.
> I changed the robot from Sawyer to Baxter.
> The reward value is always unstable and does not converge even though I trained for more than 80,000 episodes.
Did you change the environment script after you changed the robot from Sawyer to Baxter? Since the environment is basically customized for Sawyer, I'm not sure it would directly work with Baxter. As for Sawyer, it only takes thousands of episodes for me to get some preliminary learning results, as shown by the learning curve in the README.
> Hello,
> I have tried V-REP 3.6.2 and the forked PyRep package, but it still did not work.
> Best,
> Jian

> Can you check if the gripper is broken during exploration, and whether it is still broken after the environment is restarted by the code?

> I think it is still broken when the environment is reinitialized. In my training process the reward turned out to be zero once the episodes exceeded 20, and I get the same error.
> Best,
> Jian
If so, I would say different things happen after the reinitialization on your side and mine. In my tests the gripper is complete and works well after reinitialization, even if it was broken before. I currently do not know what causes this difference.
Today the error "Gripper position is nan" appeared again. I am sure that I changed the environment script. On the other hand, I set the target object to a random position; is this project a solution for that?
Or what other changes should I make to the network?
I found the reason why the error comes out:
the code should read -=, not =-.
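For anyone hitting the same typo: the two spellings parse very differently in Python, which is easy to miss because `x =- y` is still valid code, just with a completely different meaning.

```python
x = 10
x -= 3   # augmented assignment: x = x - 3, so x is now 7
y = 10
y =- 3   # parsed as y = -3: the old value of y is silently discarded
```

Since both lines run without any error, the mistake only shows up later as wrong values, for example NaN propagating through the network.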
I find that reducing the number of threads can avoid the error "Gripper position is nan".
> I find that reducing the number of threads can avoid the error "Gripper position is nan".

Do you mean processes? How many did you use when you met the error?
> I find that reducing the number of threads can avoid the error "Gripper position is nan".

> Do you mean processes? How many did you use when you met the error?

Yes, I was using more than 4 processes when I met the error. And as the number of training episodes increases, the value function converges when I use only 1 process.
And I have a question: how do you aggregate the final training results of multi-threaded training? A3C uses multi-threaded sampling but still single-threaded training.
> And I have a question: how do you aggregate the final training results of multi-threaded training? A3C uses multi-threaded sampling but still single-threaded training.
If you use multi-threading, variables and objects can be shared across threads within a process, in which case you can log the results easily by reading those shared objects; if you use multiprocessing, a queue can be used to send information across processes.
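The cross-process case can be sketched with Python's `multiprocessing` module. This is an illustrative sketch, not code from this project; `worker` and `collect` are made-up names, and the "reward" is a placeholder:

```python
import multiprocessing as mp

def worker(worker_id, queue, episodes):
    # Each worker runs its own episodes and reports results through the queue.
    for episode in range(episodes):
        reward = float(worker_id)          # placeholder for a real episode reward
        queue.put((worker_id, episode, reward))

def collect(num_workers=2, episodes=3):
    queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(i, queue, episodes))
             for i in range(num_workers)]
    for p in procs:
        p.start()
    # Drain exactly num_workers * episodes results, then join the workers.
    results = [queue.get() for _ in range(num_workers * episodes)]
    for p in procs:
        p.join()
    return results
```

Draining the queue before joining matters: a child process that still has items buffered in the queue may not exit, so `join()` could otherwise block.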
Hi, I've been trying to train the sac_learn file, but I was getting the "Gripper position is nan" error. I tried the suggestions here: I was using 4 parallel processes, then 2, and both cases crashed with the gripper position error. Now I've been running the training with just 1 process; it has been 13 hours by now, episode 24k+, and the episode reward is still around -3 to -2; sometimes there's a 7, but that is quite rare.
I'm using Ubuntu 18.04 as the OS, Python 3.6.9, V-REP Pro Edu 3.6.2, this GitHub PyRep version, and PyTorch 1.8.1 with CUDA 11.1, on an RTX 2080 Super GPU and a Ryzen 7 3800X CPU.