
ccm_madrl_mec's People

Contributors

tesfayz

ccm_madrl_mec's Issues

Inquiry about the per action DQN in the code and Huawei's dataset

Hello, I have read your paper and open-source code and have benefited a lot from them. As I am just getting started, there are still some points I don't understand and would like to ask about. First, regarding the combinatorial optimization problem in the paper: agents are deployed on the devices through the MADDPG algorithm, and a DQN agent sits in the edge network. The DQN agent outputs offloading scheduling actions based on the network status and the device actions, but I could not find any DQN-related part in the open-source code. In addition, the paper mentions using Huawei's task dataset for validation, and I do not understand how this dataset is applied to the MADRL environment. Thank you for taking the time to reply. Wishing you smooth research.

Dear author, I have some questions about this code.

In line 276 of CCM_MADDPG.py, why is it "newactor_action_var = self.actors[agent_id](states_var[:, agent_id, :])" rather than "newactor_action_var = self.actors[agent_id](next_states_var[:, agent_id, :])" when calculating the next target actions for each agent? I hope to get your answer. Thank you very much.
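For context on this question, here is a minimal, dependency-free sketch of the two places where actions are recomputed in standard MADDPG. It is not the repository's code: every name below (`make_policy`, `critic_td_target`, `actor_objective`, the toy linear policies and critic) is hypothetical. In the standard algorithm, the critic's TD target uses the target actors on the next observations, while the actor loss uses the current actor on the current observation. Which of the two line 276 belongs to is not confirmed here.

```python
# Toy sketch of the two action computations in standard MADDPG.
# All names are hypothetical and are NOT taken from CCM_MADDPG.py.

GAMMA = 0.9  # discount factor (assumed value for illustration)


def make_policy(weight):
    """Return a toy deterministic policy: action = weight * observation."""
    return lambda obs: weight * obs


def critic(states, actions):
    """Toy centralized critic: Q = sum of all states + sum of all actions."""
    return sum(states) + sum(actions)


actors = [make_policy(1.0), make_policy(2.0)]
target_actors = [make_policy(0.5), make_policy(1.5)]
target_critic = critic  # a real implementation keeps a separate target network


def critic_td_target(reward, next_states, done):
    """Critic update: target actors act on the NEXT observations."""
    next_actions = [pi(s) for pi, s in zip(target_actors, next_states)]
    return reward + GAMMA * (1.0 - done) * target_critic(next_states, next_actions)


def actor_objective(agent_id, states, batch_actions):
    """Actor update: the CURRENT actor acts on the CURRENT observation.

    Only agent_id's action is recomputed; the other agents' actions
    come from the replay batch.
    """
    actions = list(batch_actions)
    actions[agent_id] = actors[agent_id](states[agent_id])
    return -critic(states, actions)  # minimizing this maximizes Q


states, next_states = [1.0, 2.0], [3.0, 4.0]
y = critic_td_target(reward=1.0, next_states=next_states, done=0.0)
loss = actor_objective(agent_id=0, states=states, batch_actions=[0.0, 0.0])
```

If the line in question feeds the actor loss, using the current states would match the standard formulation; if it feeds the TD target, next states would be expected.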

On the problem of reward function

Dear author:
First of all, thank you for your project, which has given me a lot of inspiration. I have a question about the reward function.
I printed the energy, energy penalty, time, and time penalty, and found that the energy and time values differ by several orders of magnitude: time is ≈10^-1 while energy is ≤10^-3. Their weights are both 0.5 (step10). I would like to understand the rationale for designing the reward this way, and whether the delay and energy should be normalized or standardized before they enter the reward function.
Besides, regarding the weights of the reward function, whether 0.5 and 0.5 or 1 and 5, did you arrive at them through repeated experimentation? Is there a more principled way to set them?
These are my questions, and I would appreciate any guidance you can give.
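To make the scale concern concrete, the sketch below compares a raw weighted sum with one where each term is first divided by a reference scale. This is one common option, not the author's method: the function `normalized_cost` and the reference values `time_ref` and `energy_ref` (for example, the cost of fully local execution) are assumptions introduced for illustration. The sample magnitudes are the ones reported in the question.

```python
def normalized_cost(time_s, energy_j, time_ref, energy_ref,
                    w_time=0.5, w_energy=0.5):
    """Weighted cost after dividing each term by a reference scale.

    time_ref and energy_ref are hypothetical baselines chosen by the user;
    they make both terms dimensionless and of comparable size.
    """
    if time_ref <= 0 or energy_ref <= 0:
        raise ValueError("reference scales must be positive")
    return w_time * (time_s / time_ref) + w_energy * (energy_j / energy_ref)


# Magnitudes reported in the question: time ~1e-1, energy ~1e-3.
time_s, energy_j = 0.1, 0.001

# Raw weighted sum: the energy term is dwarfed by the time term.
raw_cost = 0.5 * time_s + 0.5 * energy_j
energy_share_raw = (0.5 * energy_j) / raw_cost  # roughly 1% of the total

# Normalized weighted sum: both terms contribute equally in this example.
norm_cost = normalized_cost(time_s, energy_j, time_ref=0.1, energy_ref=0.001)
reward = -norm_cost  # a reward is typically the negative of the cost
```

With raw values, equal weights of 0.5 do not mean equal influence: the agent is effectively optimizing delay alone. After scaling, the weights express the intended trade-off directly.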

Experiment time

Dear author, I am very interested in your work. May I ask how long one experiment takes to run?

Error when running the code

Hello, I am a beginner. When I run run.py, I get the following error: InfdexofResult = sys.argv[1] # set run runnumber for indexing results,
IndexError: list index out of range. What is the cause?
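This error means the script was started without a command-line argument: `sys.argv[1]` exists only when something follows the script name, for example `python run.py 1`. The sketch below shows one way to fall back to a default when the argument is missing. The helper `get_run_index` and the default value `"0"` are hypothetical and not part of the repository.

```python
import sys


def get_run_index(argv, default="0"):
    """Return the run index from argv[1], or a default when it is missing.

    argv[0] is always the script name, so a run index is present
    only when the list has at least two entries.
    """
    return argv[1] if len(argv) > 1 else default


if __name__ == "__main__":
    index_of_result = get_run_index(sys.argv)
    print(f"run index: {index_of_result}")
```

Passing the run number explicitly on the command line avoids the error without any code change.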

Regarding code issues

Hello author, the paper states that the master decides whether to accept an offloading task based on the state information and actions of all client agents. However, I could not find the corresponding operation in the code. Could you please point it out?

Overfitting in reproduction

I find your work interesting and have tried to reproduce it, but I have run into overfitting many times. I noticed that your paper says this is normal, but I don't quite understand how the algorithm can be shown to outperform a benchmark algorithm if overfitting occurs. Under the same environment parameters, overfitting sometimes occurs, while at other times training and optimization proceed normally.

env

Sorry to bother you, I want to ask what 'env' is.
