
online-3d-bpp-drl's Introduction

Online 3D Bin Packing with Constrained Deep Reinforcement Learning


Online-3D-BPP-DRL

Video link of our project: YouTube, bilibili

This repository contains the implementation of the paper Online 3D Bin Packing with Constrained Deep Reinforcement Learning.

Install

To make this project work, there are two things you should do:
* Install the Python packages listed in 'requirements.txt' (by 'pip install -r requirements.txt').
* Use Python 3.7 (this code works on Python 3.7).

Run

We provide a unified interface in 'main.py'. Below are examples of running our project.

For training:

Example: train a new model on randomly generated sequences.
You can run 'python main.py --mode train --use-cuda --item-seq rs'.
It takes about one day to obtain a model with satisfactory performance.

You can run 'python main.py --help' for information on the common parameters.
Many other parameters of our project are defined in 'arguments.py', all with default values; change them as needed.

For testing:

Example:
To test a model trained on sequences generated by the CUT-2 algorithm (see our article for details),
run 'python main.py --mode test --load-model --use-cuda --data-name cut_2.pt --load-name default_cut_2.pt'.

If you want to see how the model works in a lookahead setting,
run 'python main.py --mode test --load-model --use-cuda --data-name cut_2.pt --load-name default_cut_2.pt --preview x', where x is the lookahead number.

Code for the user-study applications, the multi-bin algorithm, and the MCTS baseline is also provided;
please check 'user_study/', 'multi_bin/', and 'MCTS/' for details.

Tips

* Different input state sizes need different CNN encoders; you can adjust the network architecture in ./acktr/model.py to suit your needs (a hypothetical encoder sketch follows this list).

* The predicted mask is mainly for reducing MCTS computing costs. If you only need the BPP-1 model, you can replace the predicted mask with a ground-truth mask during training, which makes training easier.

* If you relax the stability constraints, you may get better packing results, but this may be dangerous in practice.

* The computing overhead of our implementation is sensitive to the size of the network layers; avoid very large layers in your network architecture.

* The difficulty of a bin packing problem depends on its item set, and the performance of the trained model is affected by it as well.
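
For reference, here is a minimal sketch (not the actual architecture in ./acktr/model.py) of a small CNN encoder for an L x W height-map observation. The channel counts, hidden size, and grid resolution are illustrative assumptions; adapt them to your own input state size.

import torch
import torch.nn as nn

# Hypothetical encoder sketch; layer sizes are assumptions, not the repository's model.
class HeightMapEncoder(nn.Module):
    def __init__(self, in_channels=1, grid_l=10, grid_w=10, hidden=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.Sequential(nn.Linear(64 * grid_l * grid_w, hidden), nn.ReLU())

    def forward(self, x):
        # x: (batch, in_channels, L, W) height map plus any extra feature channels
        return self.fc(self.conv(x))

# Example: encode a batch of 4 observations on a 10 x 10 grid.
features = HeightMapEncoder()(torch.zeros(4, 1, 10, 10))
print(features.shape)  # torch.Size([4, 256])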

Statement

Hang Zhao and Qijin She are co-authors of this repository.

Some of the code is adapted from the open-source project 'pytorch-a2c-ppo-acktr-gail' (https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail).

License

Note that this source code is released for academic use only. Please do not use it for commercial purposes without authorization from the authors. The method is patent-protected. For commercial use, please contact Kai Xu ([email protected]).

Citation

If you are interested, please cite the following paper:

@inproceedings{DBLP:conf/aaai/ZhaoS0Y021,
  author    = {Hang Zhao and
               Qijin She and
               Chenyang Zhu and
               Yin Yang and
               Kai Xu},
  title     = {Online 3D Bin Packing with Constrained Deep Reinforcement Learning},
  booktitle = {Thirty-Fifth {AAAI} Conference on Artificial Intelligence, {AAAI}
               2021, Thirty-Third Conference on Innovative Applications of Artificial
               Intelligence, {IAAI} 2021, The Eleventh Symposium on Educational Advances
               in Artificial Intelligence, {EAAI} 2021, Virtual Event, February 2-9,
               2021},
  pages     = {741--749},
  publisher = {{AAAI} Press},
  year      = {2021},
  url       = {https://ojs.aaai.org/index.php/AAAI/article/view/16155},
  timestamp = {Wed, 02 Jun 2021 18:09:11 +0200},
  biburl    = {https://dblp.org/rec/conf/aaai/ZhaoS0Y021.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


online-3d-bpp-drl's Issues

How to run the code with a2c

Choosing the default acktr algorithm results in "RuntimeError: symeig_cuda: the algorithm failed to converge", so I chose to run the a2c algorithm instead, but it still stops with an error. The command I used to run a2c is "python main.py --mode train --use-cuda --item-seq rs --algorithm a2c --lr 1e-6 --eps 1e-5 --alpha 0.99", and the error message is as follows:
File "main.py", line 183, in train_model
value_loss, action_loss, dist_entropy, prob_loss, graph_loss = agent.update(rollouts)
File "/mnt/Online-3D-BPP-DRL-main/acktr/algo/acktr_pipeline.py", line 59, in update
mask_len = self.args.container_size[0]*self.args.container_size[1]
AttributeError: 'NoneType' object has no attribute 'container_size'

By the way, before I ran into the above problem, I had changed "parser.add_argument('--learning_rate' ...)" to "parser.add_argument('--lr' ...)" in arguments.py, thus avoiding the problem of the learning rate argument being inaccessible.
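
For anyone who hits the same flag mismatch, here is a hedged sketch of one way to let arguments.py accept both spellings by registering two option strings on a single argument; the default value below is an assumption:

import argparse

# Hypothetical sketch: accept both --lr and --learning_rate for the same option.
parser = argparse.ArgumentParser()
parser.add_argument('--lr', '--learning_rate', dest='lr', type=float, default=1e-6,
                    help='learning rate (either flag works)')

args = parser.parse_args(['--learning_rate', '3e-4'])
print(args.lr)  # 0.0003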

Unable to access or execute the env file bin3D.py

After registering my envs, I am trying to figure out why I am unable to run the code.
Below are the outputs from inspecting the environment in the interpreter and from running the training command.

python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gym
>>> import envs
>>> gym.make("Bpp-v0")
[2021-05-10 12:07:01,457] Making new env: Bpp-v0
/home/abc/.local/lib/python3.6/site-packages/gym/envs/registration.py:17: PkgResourcesDeprecationWarning: Parameters to load are deprecated.  Call .resolve and .require separately.
  result = entry_point.load(False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/abc/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 161, in make
    return registry.make(id)
  File "/home/abc/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 119, in make
    env = spec.make()
  File "/home/abc/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 86, in make
    env = cls(**self._kwargs)
  File "/home/abc/Online-3D-BPP-DRL/envs/bpp0/bin3D.py", line 20, in __init__
    assert box_set is not None
AssertionError
python3 main.py --mode train --load-model --use-cuda --item-seq sample
continue training model "default_cut_2.pt"
the dataset used:  cut_2.pt
the range of item size:   (2, 2, 2, 5, 5, 5)
the size of bin:   (10, 10, 10)
the number of known items:   1
item sequence generator:   sample
enable_rotation:  False
use cuda:   True
item set:  [(2, 2, 2), (2, 2, 3), (2, 2, 4), (2, 2, 5), (2, 3, 2), (2, 3, 3), (2, 3, 4), (2, 3, 5), (2, 4, 2), (2, 4, 3), (2, 4, 4), (2, 4, 5), (2, 5, 2), (2, 5, 3), (2, 5, 4), (2, 5, 5), (3, 2, 2), (3, 2, 3), (3, 2, 4), (3, 2, 5), (3, 3, 2), (3, 3, 3), (3, 3, 4), (3, 3, 5), (3, 4, 2), (3, 4, 3), (3, 4, 4), (3, 4, 5), (3, 5, 2), (3, 5, 3), (3, 5, 4), (3, 5, 5), (4, 2, 2), (4, 2, 3), (4, 2, 4), (4, 2, 5), (4, 3, 2), (4, 3, 3), (4, 3, 4), (4, 3, 5), (4, 4, 2), (4, 4, 3), (4, 4, 4), (4, 4, 5), (4, 5, 2), (4, 5, 3), (4, 5, 4), (4, 5, 5), (5, 2, 2), (5, 2, 3), (5, 2, 4), (5, 2, 5), (5, 3, 2), (5, 3, 3), (5, 3, 4), (5, 3, 5), (5, 4, 2), (5, 4, 3), (5, 4, 4), (5, 4, 5), (5, 5, 2), (5, 5, 3), (5, 5, 4), (5, 5, 5)]
please input the test name: test
Traceback (most recent call last):
  File "main.py", line 248, in <module>
    main(args)
  File "main.py", line 40, in main
    train_model()
  File "main.py", line 81, in train_model
    envs = make_vec_envs(env_name, config.seed, config.num_processes, config.gamma, config.log_dir, device, False)
  File "/home/abc/Online-3D-BPP-DRL/acktr/envs.py", line 103, in make_vec_envs
    data_name = None)
TypeError: make() got an unexpected keyword argument '_adjust_ratio'

Is there something wrong with the README?

Firstly, the code here uses 'arguments.py', not 'config.py'.
Secondly, the --load-model training parameter does not take true or false.
Also, the code does not accept (0,1,2,3) for the CUDA device (same for [0,1,2,3]).

ValueError: cannot reshape array of size 1600 into shape (10,10)

Traceback (most recent call last):
  File "E:/User002/Online-3D-BPP-DRL-main1/main.py", line 248, in <module>
    main(args)
  File "E:/User002/Online-3D-BPP-DRL-main1/main.py", line 40, in main
    train_model()
  File "E:/User002/Online-3D-BPP-DRL-main1/main.py", line 135, in train_model
    box_mask = get_possible_position(observation, config.container_size)
  File "E:\User002\Online-3D-BPP-DRL-main1\acktr\utils.py", line 47, in get_possible_position
    plain = box_info[0].reshape((container_size[0], container_size[1]))
ValueError: cannot reshape array of size 1600 into shape (10,10)

[Paper] Is Figure 2 (left) correct?

Hi, @alexfrom0815
Thank you for your excellent contribution to the online BPP research!

I met a problem when reading your paper.
As I understand it, you parameterize the bin as an L x W grid and compute the height map accordingly.
Then you take the FLB point of an item as the reference for the loading position, and a feasibility mask is calculated based on the constraints (1. enough space, 2. stability).

However, when I try to reproduce the mask in Figure 2 with your code, the result is very different.

The points I'd like to verify:

  1. Is the green item 3 x 3 x 3 in Figure 2?
    I couldn't find the exact item size in your paper, so I used 3 x 3 x 3 when reproducing the mask.

  2. Are the L and W labels on the grid swapped (should it be W on the X axis and L on the Y axis)?
    In Figure 2 the X axis seems to correspond to L and the Y axis to W, but in the code ./acktr/utils.py x maps to width and y to length.

  3. Is the mask in Figure 2 correct?
    The mask is confusing because I still can't reproduce it after adjusting the item size and the x, y order.

I would appreciate it if you could kindly answer my questions.
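
As a point of comparison, below is a minimal sketch (not the authors' implementation) of a feasibility mask computed over an L x W height map with the item's FLB corner as the reference point. It checks only that the item fits inside the bin; the stability rule is omitted, and all sizes are assumptions.

import numpy as np

def feasibility_mask(height_map, item, bin_height):
    # height_map: (L, W) array of stacked heights; item: (l, w, h); FLB corner placed at (x, y).
    L, W = height_map.shape
    l, w, h = item
    mask = np.zeros((L, W), dtype=bool)
    for x in range(L - l + 1):
        for y in range(W - w + 1):
            base = height_map[x:x + l, y:y + w].max()  # the item rests on the highest point under it
            mask[x, y] = base + h <= bin_height
    return mask

# Example: a 3 x 3 x 3 item on the flat floor of a 10 x 10 x 10 bin.
print(feasibility_mask(np.zeros((10, 10), dtype=int), (3, 3, 3), 10).sum())  # 64 feasible FLB cells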

Learning Online-3D-BPP-DRL - Get amount of the used containers

Hello,

I'm studying 3D bin packing with this repository: https://github.com/alexfrom0815/Online-3D-BPP-DRL.
I have some trouble with the evaluation.py file. I'm using my own data by generating a '.pt' file, but I don't know how to retrieve the number of containers used and the number of items packed in each one.
Could you please help with this?

Thanks in advance

Learning Online-3D-BPP-DRL - Get amount of the used containers

Hello and thank you for this very interesting article and the resources it provides.
I'm trying to reorient my career and I'm very interested in reinforcement learning,
I am studying your algorithms for solving packing problems, and I would like to know the meaning of the terms (LASH, OnlineBPH, BR, MACS, etc.) referred to in your work as deep reinforcement learning and heuristic methods.

When I enabled rotation: "RuntimeError: CUDA error: device-side assert triggered"

Hi, alexfrom0815,

When I enabled rotation, I got errors like the following.
Do you know what caused this?

Traceback (most recent call last):
  File "main.py", line 258, in <module>
    main(args)
  File "main.py", line 42, in main
    train_model()
  File "main.py", line 184, in train_model
    obs, reward, done, infos = envs.step(action)
  File "/mnt/.../baselines/common/vec_env/vec_env.py", line 107, in step
    self.step_async(actions)
  File "/mnt/.../Online-3D-BPP-DRL/acktr/envs.py", line 188, in step_async
    actions = actions.cpu().numpy()
RuntimeError: CUDA error: device-side assert triggered
/opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/THCTensorRandom.cuh:193: void sampleMultinomialOnce(long *, long, int, T *, T *, int, int) [with T = float, AccT = float]: block: [15,0,0], thread: [0,0,0] Assertion `THCNumerics<T>::ge(val, zero)` failed.
 ...
/opt/conda/conda-bld/pytorch_1556653114079/work/aten/src/THC/THCTensorRandom.cuh:193: void sampleMultinomialOnce(long *, long, int, T *, T *, int, int) [with T = float, AccT = float]: block: [15,0,0], thread: [191,0,0] Assertion `THCNumerics<T>::ge(val, zero)` failed.

EOFError at connection.py

@alexfrom0815

During training I meet with a problem:

...
    (critic): Sequential(
      (0): Conv2d(64, 4, kernel_size=(1, 1), stride=(1, 1))
      (1): ReLU()
      (2): Flatten()
      (3): Linear(in_features=400, out_features=256, bias=True)
      (4): ReLU()
    )
    (critic_linear): Linear(in_features=256, out_features=1, bias=True)
  )
  (dist): Categorical(
    (linear): Linear(in_features=256, out_features=100, bias=True)
  )
)
Rotation: False
Process ForkProcess-1:
Traceback (most recent call last):
  File "/home/yhx/anaconda3/envs/online3dbpp/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/yhx/anaconda3/envs/online3dbpp/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/HDD_4T/Reps/baselines/baselines/common/vec_env/shmem_vec_env.py", line 123, in _subproc_worker
    cmd, data = pipe.recv()
  File "/home/yhx/anaconda3/envs/online3dbpp/lib/python3.7/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/yhx/anaconda3/envs/online3dbpp/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/yhx/anaconda3/envs/online3dbpp/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

By printing debug information, I found that the problem (a segmentation fault) occurs around here:

(at kfac.py)
      if self.steps % self.Tf == 0:
          # My asynchronous implementation exists, I will add it later.
          # Experimenting with different ways to this in PyTorch.

          self.d_g[m], self.Q_g[m] = torch.symeig(
              self.m_gg[m], eigenvectors=True)
          self.d_a[m], self.Q_a[m] = torch.symeig(
              self.m_aa[m], eigenvectors=True)

          self.d_a[m].mul_((self.d_a[m] > 1e-6).float())
          self.d_g[m].mul_((self.d_g[m] > 1e-6).float())

I suspect the problem is in torch.symeig, since I found several issues about it. But unlike those reports, my code stops at the first episode (instead of stopping after several hours of training). Is there any solution to this problem? Many thanks!

How to map to real-world data?

Hello
Thanks for sharing this awesome work.

The proposed method works well in simulation for a small state space, but how did you map the algorithm to real-world input, such as a high-resolution image (e.g. 640x480) from a camera?
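
This is not an official answer, but one hedged way to bridge a dense camera observation and the coarse grid state is to convert the depth image to heights above the bin floor (after cropping/rectifying to the bin area) and max-pool it down to the L x W resolution. All resolutions below are assumptions.

import numpy as np

def to_height_map(height_image, grid=(10, 10)):
    # height_image: dense per-pixel height above the bin floor (e.g. derived from a 640x480 depth image)
    H, W = height_image.shape
    gl, gw = grid
    H, W = (H // gl) * gl, (W // gw) * gw          # crop so the image divides evenly into grid cells
    cells = height_image[:H, :W].reshape(gl, H // gl, gw, W // gw)
    return cells.max(axis=(1, 3))                  # max height per cell (conservative for collision checks)

hm = to_height_map(np.random.rand(480, 640) * 10.0)
print(hm.shape)  # (10, 10)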

RuntimeError: CUDA out of memory.

When I tried to use a different dataset to train a new model, I got this error:
RuntimeError: CUDA out of memory. Tried to allocate 828.00 MiB (GPU 0; 8.00 GiB total capacity; 5.02 GiB already allocated; 209.38 MiB free; 5.18 GiB reserved in total by PyTorch)

Extrapolating to 4D

Can this method be used for 4D bin packing as well? If yes, what changes need to be made?

visual graph

Hello, I want to reproduce the three-dimensional packing diagrams in this paper, such as Figure 1, Figure 4, Figure 18, and Figure 19. How can I generate graphics like these? What should I do?
Looking forward to your reply, and thank you for sharing.
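
Not the authors' plotting code, but a minimal matplotlib sketch that draws axis-aligned boxes inside a bin is one way to produce figures like these; the item list below is made up for illustration.

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3d projection)

# Each item is (x, y, z, l, w, h): FLB corner position plus size. Made-up example data.
items = [(0, 0, 0, 5, 5, 3), (5, 0, 0, 5, 5, 2), (0, 5, 0, 4, 5, 4)]

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for x, y, z, l, w, h in items:
    ax.bar3d(x, y, z, l, w, h, alpha=0.6, edgecolor='k')  # one cuboid per packed item
ax.set_xlim(0, 10); ax.set_ylim(0, 10); ax.set_zlim(0, 10)  # assumed bin size 10 x 10 x 10
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()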

gym registration

Hello,

I tried to register the "Bpp-v0" environment by adding the following code to ./Online-3D-BPP-DRL/envs/bpp0/__init__.py, but I am getting the error "gym.error.UnregisteredEnv: No registered env with id: Bpp-v0".

Could you please advise?

Thank you.


from gym.envs.registration import register

register(
    id='Bpp-v0',
    entry_point='bpp0.bin3D:PackingGame',
)
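
Not a confirmed fix, but the usual registration pattern is to call register() in a package __init__.py with an entry point that is importable from where you run Python, and to import that package before gym.make(); the module path below is an assumption about this repository's layout.

# Hypothetical sketch, e.g. in envs/__init__.py (or re-exported from envs/bpp0/__init__.py):
from gym.envs.registration import register

register(
    id='Bpp-v0',
    # 'envs.bpp0.bin3D:PackingGame' is an assumed module path; it must be importable
    # from the directory in which you start Python.
    entry_point='envs.bpp0.bin3D:PackingGame',
)

# Then, in a script or interpreter session:
# import gym
# import envs          # importing the package runs register() above
# env = gym.make('Bpp-v0')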

RuntimeError: symeig_cuda: the algorithm failed to converge

Traceback (most recent call last):
  File "main.py", line 258, in <module>
    main(args)
  File "main.py", line 42, in main
    train_model()
  File "main.py", line 209, in train_model
    value_loss, action_loss, dist_entropy, prob_loss, graph_loss = agent.update(rollouts)
  File "D:\Online-3D-BPP-DRL-main\acktr\algo\acktr_pipeline.py", line 98, in update
    self.optimizer.step()
  File "C:\Users\Sty\anaconda3\envs\TF-GPU\lib\site-packages\torch\optim\optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "D:\Online-3D-BPP-DRL-main\acktr\algo\kfac.py", line 215, in step
    self.d_a[m], self.Q_a[m] = torch.symeig(
RuntimeError: symeig_cuda: the algorithm failed to converge; 1001 off-diagonal elements of an intermediate tridiagonal form did not converge to zero.
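
This is not a verified fix for this repository, but a common workaround sketch for K-FAC eigendecomposition failures is to symmetrise the curvature factor and add a small diagonal jitter before the decomposition (on newer PyTorch, torch.linalg.eigh replaces the removed torch.symeig); the epsilon below is an assumption.

import torch

def safe_eigh(mat, eps=1e-6):
    mat = 0.5 * (mat + mat.transpose(-1, -2))       # enforce symmetry
    jitter = eps * torch.eye(mat.shape[-1], device=mat.device, dtype=mat.dtype)
    # torch.linalg.eigh replaces the deprecated torch.symeig(..., eigenvectors=True)
    return torch.linalg.eigh(mat + jitter)

d, Q = safe_eigh(torch.rand(5, 5))
print(d.shape, Q.shape)  # torch.Size([5]) torch.Size([5, 5])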

BrokenPipeError

Hello and thank you for this very interesting article and the resources it provides.
During training I run into the following problem:
BrokenPipeError: [WinError 232] The pipe is being closed.
F:\anaconda3\envs\3D-BPP-DRL\lib\site-packages\torch\nn\_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "F:/Online-3D-BPP-DRL-main/main.py", line 234, in <module>
    main(args)
  File "F:/Online-3D-BPP-DRL-main/main.py", line 24, in main
    train_model(args)
  File "F:/Online-3D-BPP-DRL-main/main.py", line 122, in train_model
    obs = envs.reset()
  File "F:\Online-3D-BPP-DRL-main\acktr\envs.py", line 178, in reset
    obs = self.venv.reset()
  File "F:\Online-3D-BPP-DRL-main\baselines\common\vec_env\vec_normalize.py", line 47, in reset
    obs = self.venv.reset()
  File "F:\Online-3D-BPP-DRL-main\baselines\common\vec_env\shmem_vec_env.py", line 66, in reset
    pipe.send(('reset', None))
  File "F:\anaconda3\envs\3D-BPP-DRL\lib\multiprocessing\connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "F:\anaconda3\envs\3D-BPP-DRL\lib\multiprocessing\connection.py", line 280, in _send_bytes
    ov, err = _winapi.WriteFile(self._handle, buf, overlapped=True)
BrokenPipeError: [WinError 232] The pipe is being closed.

Process finished with exit code 1

Great thanks!
