Comments (15)
This is due to a discrepancy between PyTorch versions, and we're working on it. For now, only PyTorch 0.4.0 is supported. We'll update once we fix it.
from vid2vid.
This should be fixed now. Please pull the latest code and try again.
from vid2vid.
I pulled the latest code, but when I reinstalled flownet2-pytorch I got the errors below:
nvcc fatal : Unsupported gpu architecture 'compute_70'
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
Does it only support CUDA 9.0?
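For anyone hitting the same nvcc error: the compute_70 (Volta) target was introduced with CUDA 9.0, so an older toolkit rejects it. A minimal sketch for checking this from the text printed by `nvcc --version` follows. The helper names `parse_nvcc_release` and `supports_compute_70` are made up for illustration, and the sample string is not taken from this thread.

```python
import re


def parse_nvcc_release(version_text):
    """Return the CUDA toolkit release as (major, minor), or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_compute_70(release):
    """True if the toolkit release is new enough to target compute_70."""
    # Volta targets (compute_70 / sm_70) arrived with CUDA 9.0.
    return release is not None and release >= (9, 0)


if __name__ == "__main__":
    # Illustrative text in the format printed by `nvcc --version`.
    sample = "Cuda compilation tools, release 8.0, V8.0.61"
    print(supports_compute_70(parse_nvcc_release(sample)))
```

If this reports that compute_70 is unsupported, the options are upgrading the toolkit or dropping the Volta flags from the build.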
from vid2vid.
@zzzkk2009
I'm now using the previous version of the code with PyTorch changed to 0.4.0, and it works.
〒▽〒
from vid2vid.
Everything may not be OK, though:
the result directory doesn't contain a file named index.html.
There are many images in the result directory.
from vid2vid.
I commented out these two lines:
'-gencode', 'arch=compute_70,code=sm_70',
'-gencode', 'arch=compute_70,code=compute_70'
in the following files:
/vid2vid/models/flownet2_pytorch/networks/channelnorm_package/set_up.py
/vid2vid/models/flownet2_pytorch/networks/resample2d_package/set_up.py
/vid2vid/models/flownet2_pytorch/networks/correlation_package/set_up.py
After that I can get the result directory, but it has no index.html file either, same as you.
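Instead of commenting the lines out by hand, the flag list could be built from the installed CUDA release. This is only a sketch of that idea, not the code in the flownet2 setup files: `gencode_flags`, `MIN_CUDA_FOR_ARCH`, and the architecture list in the example are assumptions, and only the compute_70 threshold (CUDA 9.0) is modelled.

```python
# Minimum CUDA release needed per compute capability.
# Only compute_70 is modelled here; other architectures are assumed supported.
MIN_CUDA_FOR_ARCH = {"70": (9, 0)}


def gencode_flags(cuda_release, archs):
    """Build nvcc -gencode flags, skipping archs the toolkit cannot target."""
    kept = [
        arch for arch in archs
        if cuda_release >= MIN_CUDA_FOR_ARCH.get(arch, (0, 0))
    ]
    flags = []
    for arch in kept:
        # Native machine code for this architecture.
        flags += ["-gencode", "arch=compute_{0},code=sm_{0}".format(arch)]
    if kept:
        # PTX for the last kept architecture, for forward compatibility.
        flags += ["-gencode", "arch=compute_{0},code=compute_{0}".format(kept[-1])]
    return flags


if __name__ == "__main__":
    # Hypothetical architecture list; adjust to the GPUs you build for.
    print(gencode_flags((8, 0), ["60", "61", "70"]))
```

Note that dropping the compute_70 flags only makes the build pass on older toolkits. On a Volta GPU with CUDA 9.0 or later you would want to keep them.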
from vid2vid.
@zzzkk2009
╮(╯▽╰)╭
from vid2vid.
@tcwang0509 Segmentation fault still exists
from vid2vid.
@kekedan are you able to run flownet2?
from vid2vid.
@tcwang0509
Running flownet2 failed, so I compiled flownet2 with PyTorch 0.2, and now it works.
from vid2vid.
@tcwang0509 Hi, I'm trying to test the model by running 'bash ./scripts/street/test_2048.sh', but I am also getting a segmentation fault. I am using CUDA 9.2 and torch 0.4.1. I also get a segmentation fault when I try training on my own dataset. I downloaded flownet2 by running the provided script.
from vid2vid.
Hi @tcwang0509, the segmentation fault issue still exists.
With torch=0.4.1 and cuda=9.0, flownet2 compiles OK. Testing the pre-trained Cityscapes model is OK, but training triggers the segfault on both single and multiple GPUs. Here's the complete stdout when training on a single GPU:
------------ Options -------------
TTUR: False
add_face_disc: False
basic_point_only: False
batchSize: 1
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
dataroot: datasets/Cityscapes/
dataset_mode: temporal
debug: False
densepose_only: False
display_freq: 100
display_id: 0
display_winsize: 512
feat_num: 3
fg: True
fg_labels: [26]
fineSize: 512
fp16: False
gan_mode: ls
gpu_ids: [0]
input_nc: 3
isTrain: True
label_feat: False
label_nc: 35
lambda_F: 10.0
lambda_T: 10.0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
max_frames_backpropagate: 1
max_frames_per_gpu: 6
max_t_step: 1
model: vid2vid
nThreads: 2
n_blocks: 9
n_blocks_local: 3
n_downsample_E: 3
n_downsample_G: 2
n_frames_D: 3
n_frames_G: 3
n_frames_total: 6
n_gpus_gen: 1
n_layers_D: 3
n_local_enhancers: 1
n_scales_spatial: 1
n_scales_temporal: 2
name: label2city_256_g1
ndf: 64
nef: 32
netE: simple
netG: composite
ngf: 128
niter: 10
niter_decay: 10
niter_fix_global: 0
niter_step: 5
no_canny_edge: False
no_dist_map: False
no_first_img: False
no_flip: False
no_flow: False
no_ganFeat: False
no_html: False
no_vgg: False
norm: batch
num_D: 1
openpose_only: False
output_nc: 3
phase: train
pool_size: 1
print_freq: 100
random_drop_prob: 0.05
random_scale_points: False
remove_face_labels: False
resize_or_crop: scaleWidth
save_epoch_freq: 1
save_latest_freq: 1000
serial_batches: False
sparse_D: False
tf_log: False
use_instance: True
use_single_G: False
which_epoch: latest
-------------- End ----------------
CustomDatasetDataLoader
dataset [TemporalDataset] was created
#training videos = 6
vid2vid
---------- Networks initialized -------------
-----------------------------------------------
---------- Networks initialized -------------
-----------------------------------------------
create web directory ./checkpoints/label2city_256_g1/web...
Segmentation fault
Many thanks.
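Since the log above ends in a bare "Segmentation fault" with no traceback, one way to narrow it down is Python's standard `faulthandler` module, which prints the Python-level stack when the process receives SIGSEGV. A minimal sketch, assuming it is placed at the very top of the training entry script; `enable_crash_traceback` is a made-up helper name.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Dump Python tracebacks of all threads to `stream` on a fatal signal."""
    # Fall back to the interpreter's original stderr if no stream is given.
    target = stream if stream is not None else sys.__stderr__
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    enable_crash_traceback()
    # ... the rest of the training script would run here ...
```

The same effect is available without editing any code by starting the interpreter with `python -X faulthandler`. The resulting traceback should at least show which Python call (for example, a compiled flownet2 layer) was running when the crash happened.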
from vid2vid.
@michaelshiyu Hi, did you solve the segmentation fault error? I have the same problem...
CUDA 9.0
PyTorch 1.0.0
from vid2vid.
Hi, @zhuhaozh!
Yes, it works now after I installed things by reading the Dockerfile and following the set-up there. There are some bugs in the Dockerfile, I think. For example, I think the desired environment uses cuda 9.0, but the torch install instruction there would have you install a PyTorch version compiled with cuda 8.0, which will result in extremely slow runtimes if your cuda is actually 9.0.
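A quick way to spot that kind of mismatch is to compare the CUDA version PyTorch was built against with the toolkit on the machine. The sketch below assumes `torch.version.cuda` is available in your PyTorch build (it is in the versions I am aware of); `cuda_builds_match` and `major_minor` are made-up helper names.

```python
def major_minor(version):
    """Reduce a version string such as '9.0.176' to the tuple (9, 0)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def cuda_builds_match(torch_cuda, system_cuda):
    """True if PyTorch's CUDA build and the system toolkit agree on major.minor."""
    if torch_cuda is None:
        # A CPU-only PyTorch build reports no CUDA version.
        return False
    return major_minor(torch_cuda) == major_minor(system_cuda)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        # Compare this value with the release reported by `nvcc --version`.
        print("PyTorch built against CUDA:", torch.version.cuda)
```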
I'm not sure what caused the segfault earlier so I will just post as much info about my current working set-up as possible. Hopefully, this would work for you and anyone else stuck with this issue.
Right now my working environment has
GPU: NVIDIA Tesla V100 w/ driver version 390.30
python 3.5.6
cuda 9.0
cudnn 7
And here's the complete output of my conda list. This might be more information than you need, though.
# Name Version Build Channel
_libgcc_mutex 0.1 main
absl-py 0.7.1 pypi_0 pypi
astor 0.8.0 pypi_0 pypi
backcall 0.1.0 pypi_0 pypi
ca-certificates 2019.5.15 0
certifi 2018.8.24 py35_1
cffi 1.12.3 pypi_0 pypi
chardet 3.0.4 pypi_0 pypi
colorama 0.3.7 pypi_0 pypi
cycler 0.10.0 pypi_0 pypi
decorator 4.4.0 pypi_0 pypi
dill 0.3.0 pypi_0 pypi
dominate 2.3.5 pypi_0 pypi
future 0.17.1 pypi_0 pypi
gast 0.2.2 pypi_0 pypi
grpcio 1.22.0 pypi_0 pypi
h5py 2.9.0 pypi_0 pypi
idna 2.8 pypi_0 pypi
imageio 2.5.0 pypi_0 pypi
ipython 7.6.1 pypi_0 pypi
ipython-genutils 0.2.0 pypi_0 pypi
jedi 0.14.1 pypi_0 pypi
keras-applications 1.0.8 pypi_0 pypi
keras-preprocessing 1.1.0 pypi_0 pypi
kiwisolver 1.1.0 pypi_0 pypi
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
markdown 3.1.1 pypi_0 pypi
matplotlib 3.0.3 pypi_0 pypi
mock 3.0.5 pypi_0 pypi
ncurses 6.1 he6710b0_1
networkx 2.3 pypi_0 pypi
numpy 1.16.4 pypi_0 pypi
opencv-python 4.1.0.25 pypi_0 pypi
openssl 1.0.2s h7b6447c_0
parso 0.5.1 pypi_0 pypi
pexpect 4.7.0 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 6.1.0 pypi_0 pypi
pip 19.1.1 pypi_0 pypi
prompt-toolkit 2.0.9 pypi_0 pypi
protobuf 3.9.0 pypi_0 pypi
ptyprocess 0.6.0 pypi_0 pypi
pycparser 2.19 pypi_0 pypi
pygments 2.4.2 pypi_0 pypi
pyparsing 2.4.1 pypi_0 pypi
python 3.5.6 hc3d631a_0
python-dateutil 2.8.0 pypi_0 pypi
pytz 2019.1 pypi_0 pypi
pywavelets 1.0.3 pypi_0 pypi
readline 7.0 h7b6447c_5
requests 2.22.0 pypi_0 pypi
scikit-image 0.15.0 pypi_0 pypi
scipy 1.2.0 pypi_0 pypi
setproctitle 1.1.10 pypi_0 pypi
setuptools 40.2.0 py35_0
six 1.12.0 pypi_0 pypi
sqlite 3.29.0 h7b6447c_0
tensorboard 1.13.1 pypi_0 pypi
tensorboardx 1.8 pypi_0 pypi
tensorflow 1.13.1 pypi_0 pypi
tensorflow-estimator 1.13.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tk 8.6.8 hbc83047_0
torch 0.4.0 pypi_0 pypi
torchvision 0.2.0 pypi_0 pypi
tqdm 4.32.2 pypi_0 pypi
traitlets 4.3.2 pypi_0 pypi
urllib3 1.25.3 pypi_0 pypi
wcwidth 0.1.7 pypi_0 pypi
werkzeug 0.15.5 pypi_0 pypi
wheel 0.31.1 py35_0
xz 5.2.4 h14c3975_4
zlib 1.2.11 h7b6447c_3
from vid2vid.
I have tested several CUDA, cuDNN and PyTorch versions; the latest combination is PyTorch 1.0.1, CUDA 9.0 and cuDNN 7.1.2. Every combination hit the same error (segmentation fault (core dumped)). I have no idea how to solve the problem.
Many thanks!!!
from vid2vid.