python run.py --graph_size 50 --eval_only --load_path pretrained/tsp50-epoch-198.pt --T_max 5000 \
--val_size 10000 --val_dataset datasets/tsp_50_10000.pkl --val_m 1 --no_tb --no_saving --no_DDP
'val_dataset': 'datasets/tsp_50_10000.pkl',
'val_m': 1,
'val_size': 10000,
'world_size': 1}
TSP with 50 nodes. Do assert: False
{'Total': 280801, 'Trainable': 280801}
Distributed: False
[*] Loading data from pretrained/tsp50-epoch-198.pt
Inference with x1 augments...
10000 instances initialized.
rollout: 1%|▏ | 58/5000 [00:21<30:37, 2.69it/s]
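Extrapolating from that tqdm rate (my own back-of-the-envelope arithmetic, not a number from the paper): at ~2.69 it/s, the full T_max = 5000 rollout for this TSP-50 batch works out to roughly half an hour, consistent with the ~30-minute ETA shown above.

```python
# Rough wall-time estimate from the tqdm readout above
# (my observed numbers, not figures from the paper).
steps = 5000          # T_max improvement steps per rollout
rate = 2.69           # iterations/second reported by tqdm
total_seconds = steps / rate
total_minutes = total_seconds / 60
print(f"Estimated rollout time: {total_minutes:.0f} min")  # ~31 min
```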
For TSP-100 I need to reduce the batch size, for example to 4096, and the rollout for a single batch then takes about 40 minutes.
According to the paper, inference should take 6 and 18 minutes for TSP-50 and TSP-100 respectively (on one GPU with batch size 512). As far as I can tell, the GPU used for the paper is also a TITAN RTX. Were any other techniques used to speed up inference?
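To quantify the gap (this is my own arithmetic, and it assumes the paper's 18-minute figure covers the full 10,000-instance TSP-100 validation set):

```python
# Hypothetical comparison of my observed timing vs. the reported figure.
# All "observed" numbers are from my TITAN RTX runs, not from the paper.
import math

val_size = 10000          # --val_size
batch_size = 4096         # reduced batch size I had to use for TSP-100
batches = math.ceil(val_size / batch_size)   # number of rollout batches
observed_total_min = batches * 40            # ~40 min per batch observed
paper_total_min = 18                         # figure reported for TSP-100
slowdown = observed_total_min / paper_total_min
print(batches, observed_total_min, round(slowdown, 1))
# 3 batches -> ~120 min total, roughly 6-7x slower than reported
```

So even ignoring the smaller batch size, I am seeing a several-fold slowdown relative to the paper on what looks like the same hardware.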