neural-combinatorial-rl-pytorch's People

Contributors

pemami4911


neural-combinatorial-rl-pytorch's Issues

Error when executing main.sh

I ran main.sh (bash ./main.sh) and got this error:

Validating

0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "./trainer.py", line 297, in <module>
R, probs, actions, action_idxs = model(bat)
File "/home/riko/anaconda3/envs/pytorch2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/riko/notebook/neural-combinatorial-rl-pytorch/neural-combinatorial-rl-pytorch/neural_combinatorial_rl.py", line 512, in forward
probs_, action_idxs = self.actor_net(embedded_inputs)
File "/home/riko/anaconda3/envs/pytorch2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/riko/notebook/neural-combinatorial-rl-pytorch/neural-combinatorial-rl-pytorch/neural_combinatorial_rl.py", line 372, in forward
enc_h)
File "/home/riko/anaconda3/envs/pytorch2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/riko/notebook/neural-combinatorial-rl-pytorch/neural-combinatorial-rl-pytorch/neural_combinatorial_rl.py", line 219, in forward
embedded_inputs, beam, batch_size, n_best, i)
File "/home/riko/notebook/neural-combinatorial-rl-pytorch/neural-combinatorial-rl-pytorch/neural_combinatorial_rl.py", line 295, in decode_beam
if all_idxs.size(0) > 1:
RuntimeError: dimension specified as 0 but tensor has no dimensions

I worked around this by changing line 253 of neural_combinatorial_rl.py to pass 1 to multinomial():
Before: idxs = probs.multinomial().squeeze(1)
After: idxs = probs.multinomial(1).squeeze(1)
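
For reference, a minimal standalone sketch of the changed call (as far as I understand, newer PyTorch releases require num_samples to be passed explicitly; multinomial(1) returns a [batch, 1] tensor, hence the squeeze(1)):

import torch

probs = torch.tensor([[0.1, 0.2, 0.7],
                      [0.5, 0.3, 0.2]])
# num_samples=1 -> shape [batch, 1]; squeeze(1) -> shape [batch]
idxs = probs.multinomial(1).squeeze(1)
print(idxs)  # e.g. tensor([2, 0])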

In addition, I removed the $ from RANDOM_SEED=$1 because that line in main.sh also caused an error.

TSP_20 task reward function

Hi, thanks for sharing your code.

In the file "tsp_task.py", I see the TSP reward defined in the usual way as the sum of distances between consecutive vertices in the cylce. But it looks like you also started to generalize this, or maybe had to manually tune things a bit for tsp_20? I see this:

# For TSP_20 - map to a number between 0 and 1
# min_len = 3.5
# max_len = 10.
# TODO: generalize this for any TSP size
#tour_len = -0.1538*tour_len + 1.538 
#tour_len[tour_len < 0.] = 0.
return tour_len

So for the results you showed for the tsp_20 and tsp_50 tasks, did you use the usual sum of distances, which would run as-is in the current uncommented code, or did you find it helped to apply the affine-then-clip operation you have commented out? Are those values derived theoretically or empirically?
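
For what it's worth, the commented-out constants look to me like a linear rescaling of the tour length from [min_len, max_len] onto [1, 0]; this is my guess, not something stated in the code:

# My guess at where -0.1538 and 1.538 come from (not stated in the code):
# a linear map of tour_len from [min_len, max_len] onto [1, 0].
min_len, max_len = 3.5, 10.0
slope = -1.0 / (max_len - min_len)         # = -0.1538...
intercept = max_len / (max_len - min_len)  # =  1.538...
print(slope, intercept)
# tour_len would then be rescaled as slope * tour_len + intercept,
# with negative values clipped to 0, matching the commented-out lines.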

Any other thoughts here in terms of generalizing the reward to work with arbitrary tour length?

Thanks!

How to run the program

This is a paper I was interested in, and I am very grateful for your implementation. I have a question about the program. Sorry for the rudimentary question: I have read the README, but I don't know how to run the program because I have little knowledge of networking, shell commands, etc. Could you please tell me how to train and test the model?

Subset selection

Is this implementation usable for tasks where not all of the input is selected, but only a subset of it? I have seen partial support for this (such as the finish_symbol variable), but I believe it's not finished?

Lots of deprecation warnings with a newer version of PyTorch. Do you have any idea?

Hi, thank you for your repo! It's an awesome implementation and has helped me a lot. However, when running it with a newer version of PyTorch, it raises lots of deprecation warnings like this:

IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.

I searched around the project but found uint8 nowhere. Do you have any idea how to fix this? Or do I have to run it on version 0.4? (I currently use 1.2, since the API hasn't changed much since 0.4.)
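
For reference, a minimal example of the pattern I suspect is responsible, and the fix I have in mind (this is a guess on my part: under older PyTorch semantics, comparison ops produced uint8 masks implicitly, which would explain why grepping for "uint8" finds nothing):

import torch

x = torch.arange(5)
mask = torch.tensor([1, 0, 1, 0, 1], dtype=torch.uint8)

y = x[mask]         # indexing with a uint8 mask triggers the deprecation warning
y = x[mask.bool()]  # casting the mask to torch.bool silences it
print(y)            # tensor([0, 2, 4])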

request returns 400

When I run the download_google_drive_file function, the request returns a 400 error.

variable length inputs

Hi, thanks a lot for your code. I haven't had a chance to look through it in detail yet, but it seems like a pretty solid implementation. One question, though: in your to-do list, I see this item:

"Add support for variable length inputs"

I just wanted to clarify what this means: in general a pointer net can deal with arbitrary-size inputs, so which inputs does this refer to that apparently have a fixed length?

Thanks!

beam search

The To Do mentions "Finish implementing beam search decoding to support > 1 beam".

It looks like you have various beam width variables in different places of the code. Was your note because:

  1. you implemented beam search but it didn't seem to work as well when width>1 and you wanted to investigate further, or
  2. there are still some places in the code that would need to be modified before using beam search?

If 2), then which places would need to be modified?

Thanks!

a question

Hi,

first thank you for your repo!

I just want to ask: is there a source where I can learn more about the architecture of your work, besides the paper you mention? For example, maybe you wrote a blog post about your algorithm or something similar.

Kind regards,
Paddy

update:

Also, could you please tell me how I can reuse the model I've trained? I mean, I have finished training and want to apply the model to sort a list, e.g.:

python model.py [3,4,6,5,1,2]

And the output would be a sorted list:

[1,2,3,4,5,6]
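
To make the question concrete, this is roughly the usage I have in mind. It's a generic PyTorch sketch based on my own guesses about the checkpoint path and input shape (the 4-tuple return value is taken from the tracebacks quoted in other issues here), not your actual inference API:

import torch

# Hypothetical checkpoint path; the input shape is a guess on my part.
model = torch.load('outputs/sorting-model.pt')
model.eval()

inp = torch.tensor([3., 4., 6., 5., 1., 2.]).view(1, 1, -1)  # [batch, 1, seq_len]?
with torch.no_grad():
    R, probs, actions, action_idxs = model(inp)
print(action_idxs)  # hoping for the permutation that sorts the input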

plot_reward.py: reward_csv?

Hi,
plot_reward.py looks like the script you used to create the figures in the README showing the average TSP tour length as a function of training progress.
In that script, there are two arguments which are required (paths to two things):

  • data
  • reward_csv

But in trainer.py I don't see these being saved out (for example, I don't see any pandas import or CSV writing).
So perhaps you had another script to parse the logfiles and produce this reward_csv? If so, could you provide it?
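
In the meantime, this is the kind of minimal log-parsing script I imagined (hypothetical; it assumes plain-text log lines like "Step 1000" and "Average train reward: ...", as in the training output quoted in the "mask" issue):

import csv
import re

rows = []
step = None
with open('train.log') as f:          # hypothetical path to a captured training log
    for line in f:
        m = re.match(r'Step\s+(\d+)', line)
        if m:
            step = int(m.group(1))
            continue
        m = re.match(r'Average train reward:\s+([-\d.]+)', line)
        if m and step is not None:
            rows.append((step, float(m.group(1))))

with open('reward.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['step', 'avg_train_reward'])
    writer.writerows(rows)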

Thanks!

mask

Hello @pemami4911,

The problem really was with the mask. I've fixed it, and the network has started to learn. My Decoder is now:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

# Var(...) is a small helper defined elsewhere in my code that returns a
# weight Variable of the given shape.

class Decoder(nn.Module):
    def __init__(self, features_dim, hidden_size, n_layers=1):
        super(Decoder, self).__init__()

        # Attention parameters
        self.W1 = Var(hidden_size, hidden_size)
        self.W2 = Var(hidden_size, hidden_size)
        self.b2 = Var(hidden_size)
        self.V = Var(hidden_size)

        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True, num_layers=n_layers)

    def forward(self, input, hidden, enc_outputs, mask, prev_idxs):
        # Pointer-style attention over the encoder outputs
        w1e = torch.matmul(enc_outputs, self.W1)
        w2h = (torch.matmul(hidden[0][-1], self.W2) + self.b2).unsqueeze(1)

        u = F.tanh(w1e + w2h)
        a = torch.matmul(u, self.V)

        # Mask out already-selected inputs before the softmax
        a, mask = self.apply_mask(a, mask, prev_idxs)
        a = F.softmax(a)
        res, hidden = self.lstm(input, hidden)
        return a, hidden, mask

    def apply_mask(self, attentions, mask, prev_idxs):
        if mask is None:
            mask = Variable(torch.ones(attentions.size())).cuda()

        maskk = mask.clone()

        if prev_idxs is not None:
            # Zero out the mask entry of the index selected at the previous step
            for i, j in zip(range(attentions.size(0)), prev_idxs.data):
                maskk[i, j[0]] = 0

            # maskk.log() is -inf at masked positions, so those logits get
            # zero probability after the softmax
            masked = maskk * attentions + maskk.log()
        else:
            masked = attentions

        return masked, maskk

For n=10, I'm obtaining the following while training the AC (actor-critic) version:


Step 0
Average train model loss:  -81.07959747314453
Average train critic loss:  29.443866729736328
Average train pred-reward:  -0.08028505742549896
Average train reward:  5.2866997718811035
Average loss:  -15.106804847717285
------------------------
Step  1000
Average train model loss:  -0.7814755792869255
Average train critic loss:  0.7740849611759186
Average train pred-reward:  4.219553744537756
Average train reward:  4.272005982398987
Average loss:  -6.201663847446442
------------------------

(...)

Step  19000
Average train model loss:  -0.06441724334075116
Average train critic loss:  0.1361817416474223
Average train pred-reward:  3.0573679950237276
Average train reward:  3.0583059163093567
Average loss:  -1.5689961900115013

I've checked that the returned solutions are all feasible, so it seems that it really is converging.
I will clean up my code, and I hope that by the end of the week I will have some training-history plots and test-set validation.
If you like, I can share my notebook.

Best regards

RuntimeError

Hi, when I run
python3 trainer.py
I get the following error:

Traceback (most recent call last):
File "trainer.py", line 203, in <module>
R, probs, actions, actions_idxs = model(bat)
File "/home/maroon/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/home/maroon/PycharmProjects/neural-combinatorial-rl-pytorch/neural_combinatorial_rl.py", line 539, in forward
R = self.objective_fn(actions, self.use_cuda)
File "/home/maroon/PycharmProjects/neural-combinatorial-rl-pytorch/sorting_task.py", line 49, in reward
current += res.float()
RuntimeError: invalid argument 1: the number of sizes provided must be greater or equal to the number of dimensions in the tensor at /pytorch/torch/lib/TH/generic/THTensor.c:299

How can I solve it? Thank you.

n_epochs > 1 : multinomial sampling error probs<0

Hi, for both the sorting task and the TSP task, when I run with any n_epochs > 1, toward the beginning of the 2nd epoch I get the following error in the stochastic decoder in neural_combinatorial_rl.py, related to multinomial sampling. When I print out "probs", I see that each row very quickly collapses to a single 1 with all other entries 0, and on the following iteration everything is NaN. Any idea what's going on here, or did you just end up avoiding this by running a single epoch with many iterations?

Thanks!

(using the pytorch-0.4 branch)

(below is shown for the sort_10 task)

probs
tensor([[0.0000, 0.0000, 0.0000, 0.5416, 0.4584, 0.0000],
[0.0000, 0.0000, 0.5079, 0.4921, 0.0000, 0.0000],
[0.0000, 0.5320, 0.0000, 0.0000, 0.0000, 0.4680],
[0.0000, 0.0000, 0.0000, 0.5146, 0.4854, 0.0000],
[0.0000, 0.0000, 0.0000, 0.0000, 0.5275, 0.4725],
[0.0000, 0.0000, 0.4794, 0.0000, 0.0000, 0.5206],
[0.4945, 0.5055, 0.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.5195, 0.0000, 0.4805, 0.0000],
[0.5135, 0.0000, 0.0000, 0.0000, 0.4865, 0.0000],
[0.4877, 0.0000, 0.5123, 0.0000, 0.0000, 0.0000],
[0.5207, 0.0000, 0.0000, 0.0000, 0.0000, 0.4793],
[0.0000, 0.5231, 0.0000, 0.0000, 0.0000, 0.4769]],
grad_fn=<...>)
probs
tensor([[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1.]], grad_fn=<...>)
probs
tensor([[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan]], grad_fn=<...>)

... leads to...

Traceback (most recent call last):
File "trainer.py", line 295, in <module>
R, probs, actions, actions_idxs = model(bat)
File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 514, in forward
probs_, action_idxs = self.actor_net(embedded_inputs)
File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 374, in forward
enc_h)
File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 189, in forward
selections)
File "C:\Users\me\Desktop\neural-combinatorial-rl-pytorch\neural_combinatorial_rl.py", line 255, in decode_stochastic
idxs = c.sample()
File "C:\Users\me\Anaconda3\envs\myenv\lib\site-packages\torch\distributions\categorical.py", line 90, in sample
sample_2d = torch.multinomial(probs_2d, 1, True)
RuntimeError: invalid argument 2: invalid multinomial distribution (encountering probability entry < 0) at c:\programdata\miniconda3\conda-bld\pytorch_1533090623466\work\aten\src\th\generic/THTensorRandom.cpp:407
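
In case it's useful for debugging, here is a minimal guard I'm considering adding just before the sampling call. It's purely illustrative (not code from this repo) and only surfaces the collapse earlier; it doesn't fix the underlying divergence:

import torch

def check_probs(probs, eps=1e-8):
    # Fail fast once the policy output contains NaNs (i.e. it has diverged),
    # then clamp exact zeros so torch.multinomial never sees invalid entries.
    if torch.isnan(probs).any():
        raise RuntimeError("NaN in probs -- the policy has diverged")
    probs = probs.clamp(min=eps)
    return probs / probs.sum(dim=-1, keepdim=True)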
