pccproject / pcc-rl
Reinforcement learning resources for PCC.
License: Other
The most important part of this project (using the gym model as the congestion controller in UDT) cannot be run!
There is no PCCProject/PCC-Uspace.git repo with a deep-learning branch to check out.
I wonder whether you really finished the experiments.
Hi,
I started working with your code, and first I must say it is amazing and easy to understand.
But I still have two questions:
Thanks,
Ido
I cannot understand the README clearly. Could you please write out complete steps for online training and testing? (For example, I do not know how to use uspace and pcc-rl for training and testing.)
Which version of stable-baselines does the repository support? I could not get it to run with the latest version.
Traceback (most recent call last):
  File "stable_solve.py", line 21, in <module>
    from stable_baselines import PPO1
ImportError: cannot import name 'PPO1'
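One possible cause, assuming a stable-baselines 2.x install: that package only exposes PPO1 (along with DDPG, GAIL and TRPO) when mpi4py can be imported, so a missing mpi4py can produce this kind of ImportError even with a supported TensorFlow 1.x. The sketch below only checks which of the relevant modules can be located; the helper name `find_missing_modules` is hypothetical and not part of any library.

```python
import importlib.util


def find_missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules the PPO1 import is assumed to depend on in stable-baselines 2.x.
    missing = find_missing_modules(["tensorflow", "stable_baselines", "mpi4py"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All three modules were found; the cause is likely elsewhere.")
```

If mpi4py shows up as missing, installing it (together with a system MPI library) is worth trying before changing stable-baselines versions.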
I tested the trained model by using ./app/pccclient send 127.0.0.1 9000 --pcc-rate-control=python -pyhelper=loaded_client -pypath=/path/to/pcc-rl/src/udt-plugins/testing/ --history-len=10 --pcc-utility-calc=linear --model-path=/path/to/your/model/
But the send rate cannot exceed 0.5 Mbps. When I changed MIN_RATE in loaded_client.py to 5, the send rate could not exceed 5 Mbps. I wonder what is wrong. Thanks.
Hi.
I checked your model from #5 and found that it works well.
When I run your code with the default settings (nothing changed), the rewards don't grow.
How can I train a model like yours with stable_solve.py?
Thanks.
This error occurs when trying to load the model from stable_solve.py (attached below).
icml_paper_model.zip
The error comes from loading the model like this:
with tf.io.gfile.GFile(pbFile, "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())
To load the model successfully, change the way the model is saved to:
with model.graph.as_default():
    tf.train.write_graph(tf.get_default_graph(), "path-to-folder", 'saved_model.pb', as_text=False)
Hi, thanks for the clearly written testbed for RL congestion control.
After going through the code, I noticed that the range of the action space is set to [-1e12, 1e12]. Even though a DELTA_SCALE (default 0.25) is later applied to the action, that is still a very large action space. Why is this range used without any normalization? Could you please explain? Thanks a lot.
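To make the question concrete, here is a minimal sketch of how a scaled action could turn into a sending rate. The multiplicative update and the clamp bounds are assumptions for illustration, not a confirmed copy of the repository's code; only the DELTA_SCALE default of 0.25 comes from the post above.

```python
DELTA_SCALE = 0.25  # default quoted in the post above


def apply_rate_delta(rate, action, min_rate=0.5, max_rate=1000.0):
    """Scale the raw action, apply it multiplicatively, then clamp.

    min_rate and max_rate are hypothetical bounds chosen for this sketch.
    """
    delta = action * DELTA_SCALE
    if delta >= 0.0:
        new_rate = rate * (1.0 + delta)
    else:
        new_rate = rate / (1.0 - delta)
    return max(min_rate, min(max_rate, new_rate))


if __name__ == "__main__":
    # Extreme actions saturate at the clamp; moderate ones change the rate by 25%.
    for action in (-1e12, -1.0, 0.0, 1.0, 1e12):
        print(action, apply_rate_delta(10.0, action))
```

Under these assumed bounds, the extreme ends of the action range simply saturate at the clamp, so an unnormalized action space still yields finite rates, though it may make exploration harder.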
Dear author,
Thank you for your efforts in putting the source code online. I had a problem when reproducing PCC-RL. After following the steps in DeepLearning_Readme.md, the client and server in the environment I built can ping each other. However, when training with shim_solver, iter is always 0 and the agent cannot interact with the environment. Where should I look to diagnose this problem?
I look forward to your answer. Thank you.
Hi,
I cannot use the stable-baselines setup because of a TensorFlow version issue, so I tried PPO from stable-baselines3 instead. However, both the reward and the ewma reward are very unstable and cannot reach the 7000 reported in the paper. Is the print(reward..., ewma...) in the reset() function the reward reported in the paper? If so, which one is the reported training reward (reward or ewma reward)?
As a side point, did you find the learning performance sensitive to the PPO parameters (e.g. timesteps_per_actorbatch, schedule, etc.)? The new version of PPO in stable-baselines3 does not have these parameters, so it is hard to replicate the training process in this repo exactly.
Thank you very much!
Bests,
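For readers comparing the two printed numbers: a per-episode reward and its exponentially weighted moving average can differ a lot when episodes are noisy. The sketch below shows the general technique only; the smoothing factor `alpha` and the seeding rule are assumptions for illustration and are not taken from the repository or the paper.

```python
def ewma_series(rewards, alpha=0.1):
    """Return the exponentially weighted moving average after each reward.

    The first reward seeds the average; alpha weights the newest value.
    """
    smoothed = []
    average = None
    for reward in rewards:
        if average is None:
            average = float(reward)
        else:
            average = alpha * reward + (1.0 - alpha) * average
        smoothed.append(average)
    return smoothed


if __name__ == "__main__":
    # A single noisy episode moves the smoothed value far less than the raw one.
    print(ewma_series([100.0, 100.0, 900.0, 100.0], alpha=0.1))
```

A small `alpha` makes the smoothed curve lag the raw reward, so the two values should not be expected to match at any given episode.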
Hello Authors,
Thank you for your efforts in putting the source code online. I have a question, though: I could not find a trained model to test, and Aurora has not been added to Pantheon the way PCC and PCC Vivace have. Could you at least provide us with the trained model?
Thank you in advance
Hello, I want to test Aurora's performance in Pantheon, but I ran into some problems when using the local model.
This is my pantheon_report:
pantheon_report.pdf
I have the same problem as #12 and #16 when I use one machine,
but when I test the remote model, it seems to work well.
pantheon_report.pdf
I wonder whether you could give me some advice.
Thanks for your help!
Hello,
I seem to have encountered the same issue here. How did you resolve it?
Thank you very much!
I am using python3.7, tensorflow==1.15, stable-baselines==2.10.0, gym==0.18.0
/usr/local/lib/python3.7/site-packages/sklearn/utils/validation.py:37: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
LARGE_SPARSE_SUPPORTED = LooseVersion(scipy_version) >= '0.14.0'
WARNING: Logging before flag parsing goes to stderr.
W0409 20:02:53.006458 4432291328 lazy_loader.py:50]
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
Traceback (most recent call last):
  File "stable_solve.py", line 21, in <module>
    from stable_baselines import PPO1
ImportError: cannot import name 'PPO1' from 'stable_baselines' (/usr/local/lib/python3.7/site-packages/stable_baselines/__init__.py)
Best,
Juexiao
Originally posted by @jessewjx in #10 (comment)
Hi Author,
Thanks for your work.
What is the unit of the throughput in network_sim.py (Mbps or Kbps)? In the code it is 40-4000.
Also, in online training/testing, what unit of throughput is used to compute the reward?
Looking forward to your reply.
Thanks,
Tianbo
Hi:
When I run stable_solve.py in PCC_RL/src/gym, I find that the rewards don't grow or converge. How long does it take to converge?
Best wishes
Hello,
I am conducting some experiments based on the model attached by @r02b in #5. However, the throughput cannot exceed 0.5 Mbps. Could you please help me figure it out? Thank you in advance.
tergel@congestion-control:~/RL/PCC-Uspace/src$ app/pccclient send ulsan 5003 --pcc-rate-control=python -pyhelper=loaded_client -pypath=/home/tergel/Test/RL/PCC-RL/src/udt-plugins/testing/ --history-len=10 --pcc-utility-calc=linear --model-path=/home/tergel/RL/PCC-Uspace/src/ICML_training_model/icml_paper_model/
Starting sending rate = 2.24e+06
2021-03-02 15:19:00.117791: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/tergel/RL/PCC-Uspace/src/core
2021-03-02 15:19:00.117820: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2021-03-02 15:19:00.117837: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (congestion-control): /proc/driver/nvidia/version does not exist
2021-03-02 15:19:00.118395: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-03-02 15:19:00.143709: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2900000000 Hz
2021-03-02 15:19:00.144754: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55eed27df4c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-03-02 15:19:00.144772: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /home/tergel/Test/RL/PCC-RL/src/udt-plugins/testing/loaded_agent.py:25: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
connect
finished connect
Rate (Mbps) RTT (ms) Sent Lost
Begining of the data send
1 0 10000 0 0
2 0 10000 0 0
End of the data send
0 1410065408
145600000 1410065408
Begining of the data send
3 3.62224 5.24698 311 0
4 1.49082 5.24526 439 0
5 0.885169 5.24785 515 0
6 0.617292 5.22444 568 0
7 0.500824 5.29394 611 0
8 0.489168 5.28796 653 0
9 0.477526 5.25476 694 0
10 0.489174 5.23419 736 0
11 0.489176 5.21565 778 0
12 0.47753 5.28155 819 0
13 0.489176 5.23707 861 0
14 0.489179 5.20155 903 0
15 0.477531 5.26266 944 0
16 0.489178 5.26317 986 0
17 0.489178 9.91258 1028 0
18 0.477531 10.8524 1069 0
19 0.489176 5.75902 1111 0
20 0.489176 5.80497 1153 0
21 0.477531 6.59316 1194 0
22 0.489176 8.03863 1236 0
23 0.489179 8.08277 1278 0
24 0.477532 8.11315 1319 0
25 0.48918 8.70412 1361 0
26 0.489179 5.87617 1403 0
27 0.477533 6.12219 1444 0
28 0.489177 7.04131 1486 0
29 0.489177 7.99061 1528 0
30 0.477531 10.4724 1569 0
31 0.489177 5.26468 1611 0
32 0.489177 5.2363 1653 0
33 0.477529 5.62193 1694 0
34 0.489181 7.06206 1736 0
...
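When comparing runs like the one above, it can help to summarize the per-interval lines instead of reading them by eye. This is a small sketch that assumes each data line has five whitespace-separated fields (index, rate in Mbps, RTT in ms, packets sent, packets lost), as the header and rows above suggest; `mean_rate_mbps` is a hypothetical helper, not part of pccclient.

```python
def mean_rate_mbps(log_lines, skip_first=0):
    """Average the rate column over lines shaped like 'idx rate rtt sent lost'.

    Lines that do not match that shape (banners, TensorFlow messages) are
    ignored. skip_first drops that many leading samples, e.g. startup ones.
    """
    rates = []
    for line in log_lines:
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            int(fields[0])            # interval index
            rate = float(fields[1])   # assumed to be the rate in Mbps
        except ValueError:
            continue
        rates.append(rate)
    rates = rates[skip_first:]
    return sum(rates) / len(rates) if rates else None


if __name__ == "__main__":
    sample = [
        "Begining of the data send",
        "7 0.500824 5.29394 611 0",
        "8 0.489168 5.28796 653 0",
        "9 0.477526 5.25476 694 0",
    ]
    print(mean_rate_mbps(sample))
```

Feeding it the full client output, with `skip_first` set to drop the startup intervals, gives a single number per run that is easier to compare across MIN_RATE settings.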
Thanks for sharing the code, but we cannot find PCCProject/PCC-Uspace.git/Deep_Learning_Readme.md.
There is also no README at ./src/gym/online/README.md with detailed instructions on how to load trained models. Could the author explain how to test the trained model using PCC-Uspace?