Comments (3)
Can you explain what errors you are getting?
from graphcmr.
In trainer.py, I made the following modifications:
```python
# create GraphCNN
self.graph_cnn = GraphCNN(self.mesh.adjmat,
                          self.mesh.ref_vertices.t(),
                          num_channels=self.options.num_channels,
                          num_layers=self.options.num_layers)
self.graph_cnn = torch.nn.DataParallel(self.graph_cnn, device_ids=[0, 1]).to(self.device)

# SMPL parameter regressor
self.smpl_param_regressor = SMPLParamRegressor()
self.smpl_param_regressor = torch.nn.DataParallel(self.smpl_param_regressor, device_ids=[0, 1]).to(self.device)
```
Note that `self.device = device(type='cuda')`. My machine has two GPUs, so their ids are `[0, 1]`. To avoid the pre-trained weight mismatch issue, I tried training from scratch, but I got the following traceback:
```
Traceback (most recent call last):
  File "train.py", line 21, in <module>
    trainer.train()
  File "/home/GraphCMR/utils/base_trainer.py", line 65, in train
    out = self.train_step(batch)
  File "/home/GraphCMR/train/trainer.py", line 144, in train_step
    pred_vertices_sub, pred_camera = self.graph_cnn(images)
  File "/home/anaconda3/envs/gcnn/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/gcnn/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/anaconda3/envs/gcnn/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/anaconda3/envs/gcnn/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
RuntimeError: arguments are located on different GPUs at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:479
```
Since each batch is split into per-GPU mini-batches, it seems that the model did not access its tensors on the right device.
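A common cause of "arguments are located on different GPUs" under `DataParallel` is a tensor stored as a plain attribute on the module (e.g. the reference vertices or adjacency matrix) rather than registered as a buffer: `DataParallel` replicates parameters and buffers to each device, but a plain attribute stays on the original GPU. A minimal sketch of the difference, with a hypothetical module name:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Illustrative module: 'ref' must end up on the same GPU as the input."""
    def __init__(self, ref):
        super().__init__()
        # A plain attribute (self.ref = ref) would stay on cuda:0 and break
        # DataParallel; a registered buffer is replicated to every device.
        self.register_buffer('ref', ref)

    def forward(self, x):
        # Both operands are on x.device, because buffers move with the module.
        return x + self.ref

model = Block(torch.zeros(4))
out = model(torch.ones(2, 4))
print(out.shape)  # torch.Size([2, 4])
```

This runs on CPU as shown; under `DataParallel`, each replica gets its own copy of `ref` on its assigned GPU.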
Actually, when constructing the GraphResBlock, I pass in the graph adjacency matrix, which is a sparse tensor. I tried to store it as a buffer, but this is not possible because sparse tensors are not serializable, so it would crash when trying to save a checkpoint. So you have to bring this further up in the pipeline and pass it in your call to forward. Even with this trick it might not work, because last time I checked, PyTorch did not support batches of sparse tensors. If you make it work, feel free to submit a PR.
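The workaround described above, passing the sparse adjacency matrix as a forward argument instead of storing it on the module, can be sketched as follows. The class and dimensions here are hypothetical, not GraphCMR's actual layer:

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """Sketch of a graph layer that takes the (sparse) adjacency matrix as a
    forward argument, so nothing sparse is stored on the module itself."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adjmat):
        # adjmat: sparse (V, V); x: dense (V, C_in).
        # torch.spmm multiplies sparse @ dense, but only for a single
        # (unbatched) sparse matrix -- hence the batching limitation above.
        return torch.spmm(adjmat, self.linear(x))

# Usage: a tiny 2-vertex graph with a sparse COO adjacency matrix.
indices = torch.tensor([[0, 1, 1], [1, 0, 1]])
values = torch.ones(3)
adj = torch.sparse_coo_tensor(indices, values, (2, 2))
layer = GraphConv(8, 16)
out = layer(torch.randn(2, 8), adj)
print(out.shape)  # torch.Size([2, 16])
```

Note that `torch.sparse_coo_tensor` is the modern constructor; the Python 2.7 / old-PyTorch environment in the traceback above would use the older `torch.sparse.FloatTensor` API instead.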
from graphcmr.