
ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers

Official implementation of ViewFormer. ViewFormer is a NeRF-free neural rendering model based on the transformer architecture. The model is capable of both novel view synthesis and camera pose estimation. It is evaluated on previously unseen 3D scenes.

Paper    Web    Demo



Citation

If you use this code for an academic publication, please cite the corresponding paper using the following citation:

@inproceedings{kulhanek2022viewformer,
  title={ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers},
  author={Kulh{\'a}nek, Jon{\'a}{\v{s}} and Derner, Erik and Sattler, Torsten and Babu{\v{s}}ka, Robert},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022},
}

Getting started

Start by creating a Python 3.8 venv. From the activated environment, run the following command in the directory containing setup.py:

pip install -e .
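
For example (a minimal sketch; it assumes python3.8 is on your PATH and that you run it from the repository root):

python3.8 -m venv .venv        # create the Python 3.8 virtual environment
source .venv/bin/activate      # activate it
pip install -e .               # install viewformer in editable mode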

Model checkpoints

All model checkpoints are available online here:

https://data.ciirc.cvut.cz/public/projects/2022ViewFormer/checkpoints

All evaluation commands will download and extract the appropriate checkpoint automatically if you specify the checkpoint as one of the following:

7scenes-finetune-both-transformer-tf	 
7scenes-finetune-transformer-transformer-tf
7scenes-finetuned-interiornet-codebook-th
co3d-10cat-codebook-th
co3d-all-codebook-th
co3dv2-all-codebook-th
co3d-10cat-noloc-transformer-tf
co3d-10cat-transformer-tf
co3d-all-noloc-transformer-tf
co3dv2-all-noloc-transformer-tf
interiornet-codebook-th
interiornet-transformer-tf
shapenet-srn-codebook-th
shapenet-srn-transformer-tf
sm7-codebook-th
sm7-transformer-tf

All evaluation commands need one transformer model (a model ending with -transformer-tf) and the associated codebook model (a model ending with -codebook-th). Please read the evaluation section for more details. You are also advised to explore the demo notebook.
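
For example, assuming checkpoint names can be passed in place of local checkpoint paths (as described above), the following sketch pairs the InteriorNet transformer with its codebook so that both checkpoints are downloaded and extracted automatically; the remaining arguments are taken from the evaluation section below:

viewformer-cli evaluate transformer \
    --codebook-model interiornet-codebook-th \
    --transformer-model interiornet-transformer-tf \
    --loader-path "{dataset path}" \
    --loader dataset \
    --loader-split test \
    --batch-size 1 \
    --image-size 128 \
    --job-dir . \
    --num-eval-sequences 1000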

Predictions

If you want to compare with our method, you can download the predictions here:

https://data.ciirc.cvut.cz/public/projects/2022ViewFormer/predictions/

If any dataset is missing, please open an issue.

Getting datasets

In this section, we describe how to prepare the data for training. We assume that your environment is ready and that you want to store the dataset in the {output path} directory.

Shepard-Metzler-Parts-7

Please first visit https://github.com/deepmind/gqn-datasets.

viewformer-cli dataset generate \
    --loader sm7 \
    --image-size 128 \
    --output {output path}/sm7 \
    --max-sequences-per-shard 2000 \
    --split train

viewformer-cli dataset generate \
    --loader sm7 \
    --image-size 128 \
    --output {output path}/sm7 \
    --max-sequences-per-shard 2000 \
    --split test

InteriorNet

Download the dataset into the directory {source} by following the instructions here: https://interiornet.org/. Then, proceed as follows:

viewformer-cli dataset generate \
    --loader interiornet \
    --path {source} \
    --image-size 128  \
    --output {output path}/interiornet \
    --max-sequences-per-shard 50 \
    --shuffle \
    --split train

viewformer-cli dataset generate \
    --loader interiornet \
    --path {source} \
    --image-size 128  \
    --output {output path}/interiornet \
    --max-sequences-per-shard 50 \
    --shuffle \
    --split test

Common Objects in 3D

Download the dataset into the directory {source} by following the instructions here: https://ai.facebook.com/datasets/CO3D-dataset.

Install the following dependencies: plyfile>=0.7.4 and pytorch3d. Then, generate the dataset for 10 categories as follows:

viewformer-cli dataset generate \
    --loader co3d \
    --path {source} \
    --image-size 128  \
    --output {output path}/co3d \
    --max-images-per-shard 6000 \
    --shuffle \
    --categories "plant,teddybear,suitcase,bench,ball,cake,vase,hydrant,apple,donut" \
    --split train

viewformer-cli dataset generate \
    --loader co3d \
    --path {source} \
    --image-size 128  \
    --output {output path}/co3d \
    --max-images-per-shard 6000 \
    --shuffle \
    --categories "plant,teddybear,suitcase,bench,ball,cake,vase,hydrant,apple,donut" \
    --split val

Alternatively, generate the full dataset as follows:

viewformer-cli dataset generate \
    --loader co3d \
    --path {source} \
    --image-size 128  \
    --output {output path}/co3d \
    --max-images-per-shard 6000 \
    --shuffle \
    --split train

viewformer-cli dataset generate \
    --loader co3d \
    --path {source} \
    --image-size 128  \
    --output {output path}/co3d \
    --max-images-per-shard 6000 \
    --shuffle \
    --split val

ShapeNet cars and chairs dataset

Download and extract the SRN datasets into the directory {source}. The files can be found here: https://drive.google.com/drive/folders/1OkYgeRcIcLOFu1ft5mRODWNQaPJ0ps90.

Then, generate the dataset as follows:

viewformer-cli dataset generate \
    --loader shapenet \
    --path {source} \
    --image-size 128  \
    --output {output path}/shapenet-{category}/shapenet \
    --categories {category} \
    --max-sequences-per-shard 50 \
    --shuffle \
    --split train

viewformer-cli dataset generate \
    --loader shapenet \
    --path {source} \
    --image-size 128  \
    --output {output path}/shapenet-{category}/shapenet \
    --categories {category} \
    --max-sequences-per-shard 50 \
    --shuffle \
    --split test

where {category} is either cars or chairs.

Faster preprocessing

To speed up preprocessing, you can add --shards {process id}/{num processes} to the command and run multiple instances of the command in separate processes.
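
For example, a minimal sketch that splits the InteriorNet train split across 4 parallel processes (the 4-way split, the 1-based process ids, and the shell loop are illustrative assumptions):

for i in 1 2 3 4; do
    viewformer-cli dataset generate \
        --loader interiornet \
        --path {source} \
        --image-size 128 \
        --output {output path}/interiornet \
        --max-sequences-per-shard 50 \
        --shuffle \
        --split train \
        --shards $i/4 &    # process $i of 4
done
wait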

Training the codebook model

The codebook model training uses the PyTorch framework, but the resulting model can be loaded by both TensorFlow and PyTorch. The training code was also prepared for the TensorFlow framework, but to reproduce the results published in the paper, the PyTorch code should be used. To train the codebook model on 8 GPUs, run the following command:

viewformer-cli train codebook \
    --job-dir . \
    --dataset "{dataset path}" \
    --num-gpus 8 \
    --batch-size 352 \
    --n-embed 1024 \
    --learning-rate 1.584e-3 \
    --total-steps 200000

Replace {dataset path} with the actual dataset path. Note that you can use more than one dataset; in that case, separate the dataset paths with commas. Also, if the dataset is not large enough to support sharding, you can reduce the number of data-loading workers using the --num-workers and --num-val-workers arguments. The --job-dir argument specifies the path where the resulting model and logs will be stored. You can also use the --wandb flag, which enables logging to wandb.
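
For instance, a sketch of training on two datasets with reduced data-loading workers and wandb logging enabled (the worker counts are illustrative, not the published settings):

viewformer-cli train codebook \
    --job-dir . \
    --dataset "{dataset path 1},{dataset path 2}" \
    --num-gpus 8 \
    --batch-size 352 \
    --n-embed 1024 \
    --learning-rate 1.584e-3 \
    --total-steps 200000 \
    --num-workers 4 \
    --num-val-workers 1 \
    --wandb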

Finetuning the codebook model

If you want to finetune an existing codebook model, add --resume-from-checkpoint "{checkpoint path}" to the command and increase the number of total steps.
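
For example (a sketch; the increased step count is illustrative, not a recommended value):

viewformer-cli train codebook \
    --job-dir . \
    --dataset "{dataset path}" \
    --num-gpus 8 \
    --batch-size 352 \
    --n-embed 1024 \
    --learning-rate 1.584e-3 \
    --total-steps 250000 \
    --resume-from-checkpoint "{checkpoint path}"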

Transforming the dataset into the code representation

Before the transformer model can be trained, the dataset has to be transformed into the code representation. This can be achieved by running the following command (on a single GPU):

viewformer-cli generate-codes \
    --model "{codebook model checkpoint}" \
    --dataset "{dataset path}" \
    --output "{code dataset path}" \
    --batch-size 64 

We assume that the codebook model checkpoint path (ending with .ckpt) is {codebook model checkpoint} and the original dataset is stored in {dataset path}. The resulting dataset will be stored in {code dataset path}.
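
For example, a sketch continuing the InteriorNet example from the dataset section (the concrete placeholder paths below are hypothetical):

viewformer-cli generate-codes \
    --model "{output path}/codebook/model.ckpt" \
    --dataset "{output path}/interiornet" \
    --output "{output path}/interiornet-codes" \
    --batch-size 64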

Training the transformer model

To train the models with the same hyper-parameters as in the paper, run the commands from the following sections based on the target dataset. We assume that the codebook model checkpoint path (ending with .ckpt) is {codebook model checkpoint} and the associated code dataset is located in {code dataset path}. All commands use 8 GPUs (in our case 8 NVIDIA A100 GPUs).

InteriorNet training

viewformer-cli train transformer \
    --dataset "{code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --sequence-size 20 \
    --n-loss-skip 4 \
    --batch-size 40 \
    --fp16 \
    --total-steps 200000 \
    --localization-weight 5. \
    --learning-rate 8e-5 \
    --weight-decay 0.01 \
    --job-dir . \
    --pose-multiplier 1.

For the variant without localization, use --localization-weight 0. Similarly, for the variant without novel view synthesis, use --image-generation-weight 0.

CO3D finetuning

In order to finetune the model for 10 categories, use the following command:

viewformer-cli train finetune-transformer \
    --dataset "{code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --sequence-size 10 \
    --n-loss-skip 1 \
    --batch-size 80 \
    --fp16 \
    --localization-weight 5 \
    --learning-rate 1e-4 \
    --total-steps 40000 \
    --epochs 40 \
    --weight-decay 0.05 \
    --job-dir . \
    --pose-multiplier 0.05 \
    --checkpoint "{interiornet transformer model checkpoint}"

Here {interiornet transformer model checkpoint} is the path to the InteriorNet checkpoint (usually ending with weights.model.099-last). For the variant without localization, use --localization-weight 0.

For all categories and including localization:

viewformer-cli train finetune-transformer \
    --dataset "{code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --sequence-size 10 \
    --n-loss-skip 1 \
    --batch-size 40 \
    --localization-weight 5 \
    --gradient-clip-val 1. \
    --learning-rate 1e-4 \
    --total-steps 100000 \
    --epochs 100 \
    --weight-decay 0.05 \
    --job-dir . \
    --pose-multiplier 0.05 \
    --checkpoint "{interiornet transformer model checkpoint}"

Here {interiornet transformer model checkpoint} is the path to the InteriorNet checkpoint (usually ending with weights.model.099-last).

For all categories without localization:

viewformer-cli train finetune-transformer \
    --dataset "{code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --sequence-size 10 \
    --n-loss-skip 1 \
    --batch-size 40 \
    --localization-weight 0 \
    --learning-rate 1e-4 \
    --total-steps 100000 \
    --epochs 100 \
    --weight-decay 0.05 \
    --job-dir . \
    --pose-multiplier 0.05 \
    --checkpoint "{interiornet transformer model checkpoint}"

Here {interiornet transformer model checkpoint} is the path to the InteriorNet checkpoint (usually ending with weights.model.099-last).

7-Scenes finetuning

viewformer-cli train finetune-transformer \
    --dataset "{code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --localization-weight 5 \
    --pose-multiplier 5. \
    --batch-size 40 \
    --fp16 \
    --learning-rate 1e-5 \
    --job-dir .  \
    --total-steps 10000 \
    --epochs 10 \
    --checkpoint "{interiornet transformer model checkpoint}"

Here {interiornet transformer model checkpoint} is the path to the InteriorNet checkpoint (usually ending with weights.model.099-last).

ShapeNet finetuning

viewformer-cli train finetune-transformer \
    --dataset "{cars code dataset path},{chairs code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --localization-weight 1 \
    --pose-multiplier 1 \
    --n-loss-skip 1 \
    --sequence-size 4 \
    --batch-size 64 \
    --learning-rate 1e-4 \
    --gradient-clip-val 1 \
    --job-dir .  \
    --total-steps 100000 \
    --epochs 100 \
    --weight-decay 0.05 \
    --checkpoint "{interiornet transformer model checkpoint}"

Here {interiornet transformer model checkpoint} is the path to the InteriorNet checkpoint (usually ending with weights.model.099-last).

SM7 training

viewformer-cli train transformer \
    --dataset "{code dataset path}" \
    --codebook-model "{codebook model checkpoint}" \
    --sequence-size 6 \
    --n-loss-skip 1 \
    --batch-size 128 \
    --fp16 \
    --total-steps 120000 \
    --localization-weight "cosine(0,1,120000)" \
    --learning-rate 1e-4 \
    --weight-decay 0.01 \
    --job-dir . \
    --pose-multiplier 0.2

You can safely replace the cosine schedule for the localization weight with a constant value.

Evaluation

Codebook evaluation

In order to evaluate the codebook model, run the following:

viewformer-cli evaluate codebook \
    --codebook-model "{codebook model checkpoint}" \
    --loader-path "{dataset path}" \
    --loader dataset \
    --loader-split test \
    --batch-size 64 \
    --image-size 128 \
    --num-store-images 0 \
    --num-eval-images 1000 \
    --job-dir . 

Note that the --image-size argument controls the image size used for computing the metrics; you can change it to a different value.

General transformer evaluation

In order to evaluate the transformer model, run the following:

viewformer-cli evaluate transformer \
    --codebook-model "{codebook model checkpoint}" \
    --transformer-model "{transformer model checkpoint}" \
    --loader-path "{dataset path}" \
    --loader dataset \
    --loader-split test \
    --batch-size 1 \
    --image-size 128 \
    --job-dir . \
    --num-eval-sequences 1000

Optionally, you can use --sequence-size to control the context size used for evaluation. Note that the --image-size argument controls the image size used for computing the metrics; you can change it to a different value.
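
For example, a sketch of the same evaluation with a fixed context size (the value 4 is illustrative only):

viewformer-cli evaluate transformer \
    --codebook-model "{codebook model checkpoint}" \
    --transformer-model "{transformer model checkpoint}" \
    --loader-path "{dataset path}" \
    --loader dataset \
    --loader-split test \
    --sequence-size 4 \
    --batch-size 1 \
    --image-size 128 \
    --job-dir . \
    --num-eval-sequences 1000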

Transformer evaluation with different context sizes

In order to evaluate the transformer model with multiple context sizes, run the following:

viewformer-cli evaluate transformer-multictx \
    --codebook-model "{codebook model checkpoint}" \
    --transformer-model "{transformer model checkpoint}" \
    --loader-path "{dataset path}" \
    --loader dataset \
    --loader-split test \
    --batch-size 1 \
    --image-size 128 \
    --job-dir . \
    --num-eval-sequences 1000

Note that the --image-size argument controls the image size used for computing the metrics; you can change it to a different value.

CO3D evaluation

In order to evaluate the transformer model on the CO3D dataset, run the following:

viewformer-cli evaluate co3d \
    --codebook-model "{codebook model checkpoint}" \
    --transformer-model "{transformer model checkpoint}" \
    --path {original CO3D root} \
    --job-dir .

7-Scenes evaluation

In order to evaluate the transformer model on the 7-Scenes dataset, run the following:

viewformer-cli evaluate 7scenes \
    --codebook-model "{codebook model checkpoint}" \
    --transformer-model "{transformer model checkpoint}" \
    --path {original 7-Scenes root} \
    --batch-size 1 \
    --job-dir . \
    --num-store-images 0 \
    --top-n-matched-images 10 \
    --image-match-map {path to top10 matched images}

You can change --top-n-matched-images to 0 if you don't want to use the top 10 closest images in the context. {path to top10 matched images} is the path to a file containing the mapping between the most similar images from the test and train sets. Each line is in the format {relative test image path} {relative train image path}.
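
For illustration, a hypothetical excerpt of such a file (the 7-Scenes-style relative paths are made up; only the two-column, space-separated format matters):

chess/seq-03/frame-000000.color.png chess/seq-01/frame-000123.color.png
chess/seq-03/frame-000010.color.png chess/seq-02/frame-000456.color.png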

CO3Dv2 challenge evaluation

In order to evaluate the transformer model on the CO3Dv2 dataset, install the dataset first by following the instructions here: https://github.com/facebookresearch/co3d. At the moment, installing the dataset downgrades the click package, so fix it by running the following:

pip install --upgrade click

Download the CO3Dv2 dataset to a {co3dv2 dataset root} directory and run the following:

viewformer-cli evaluate co3dv2-challenge \
    --dataset-root "{co3dv2 dataset root}" \
    --output "{output path}" \
    --split "{split (dev/test)}"

By default, the appropriate checkpoint is downloaded automatically. You can override it by supplying your own --codebook-model and --transformer-model.
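
For example, a sketch that overrides the defaults with explicit checkpoints from the list above (whether these are the exact defaults used by the command is an assumption):

viewformer-cli evaluate co3dv2-challenge \
    --dataset-root "{co3dv2 dataset root}" \
    --output "{output path}" \
    --split dev \
    --codebook-model co3dv2-all-codebook-th \
    --transformer-model co3dv2-all-noloc-transformer-tf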

Thanks

We would like to express our sincere gratitude to the authors of the following repositories that we used in our code:

viewformer's Issues

Onnx build error in Demo ipynb

I'm getting this build error for onnx in the ipynb:

Building wheels for collected packages: viewformer, onnx
  Building wheel for viewformer (setup.py) ... done
  Created wheel for viewformer: filename=viewformer-0.0.1-py3-none-any.whl size=125182 sha256=0927054eae3adc599285f862a25d93f8f1628c7ace07150b682bd56870c1f9e7
  Stored in directory: /tmp/pip-ephem-wheel-cache-_u0h7mpm/wheels/1d/2a/36/002883d36cc65fdcbb1b83521fd65bf3759367c93f31db008f
  error: subprocess-exited-with-error
  
  × Building wheel for onnx (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Building wheel for onnx (pyproject.toml) ... error
  ERROR: Failed building wheel for onnx
Successfully built viewformer
Failed to build onnx
ERROR: Could not build wheels for onnx, which is required to install pyproject.toml-based projects

Any idea how to fix it?

I can't find config.json

When I evaluated the code on the CO3D dataset using "viewformer-cli evaluate co3d
--codebook-model "./viewformer-master/co3d-10cat-codebook-th.tar.gz"
--transformer-model "./viewformer-master/co3d-10cat-transformer-tf.tar.gz"
--path "./data/co3d_toy/"
--job-dir .
" ,
there was an error "tensorflow.python.framework.errors_impl.NotFoundError: ./viewformer-master/config.json; No such file or directory". I couldn't find any config.json in the code, did I miss something or how can I get it?

Some questions about the paper

Hello and thanks for sharing this great work!

I would like to ask some questions, since I'm interested in using this work to extract object meshes for unseen instances of a category seen during training.

  • First of all: is this method capable of retrieving the 3D model from a single view within the same category? (Testing on instances not seen during training)

  • Does the extracted 3D model have the correct scale?

  • Is there any method in this repo to generate and export a mesh as .obj/.ply?

Thanks in advance

Evaluation script

Hello! Thanks for the awesome work in this repo. I have successfully downloaded a pretrained model and can run a forward pass; however, when I try to use the evaluation script, I run into the following error.

WARNING:absl:`0` is not a valid tf.function parameter name. Sanitizing to arg_0.
WARNING:absl:`1` is not a valid tf.function parameter name. Sanitizing to arg_1.
Traceback (most recent call last):
  File "/opt/conda/envs/vlp/bin/viewformer-cli", line 33, in <module>
    sys.exit(load_entry_point('viewformer', 'console_scripts', 'viewformer-cli')())
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/click/core.py", line 1130, in _call_
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/aparse/click.py", line 104, in invoke
    return super().invoke(ctx)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ec2-user/novel-cross-view-generation/viewformer/viewformer/evaluate/evaluate_transformer.py", line 218, in main
    codebook_model = load_model(codebook_model)
  File "/home/ec2-user/novel-cross-view-generation/viewformer/viewformer/utils/tensorflow.py", line 50, in load_model
    model = AutoModel.from_config(th_model.config)
  File "/home/ec2-user/novel-cross-view-generation/viewformer/viewformer/models/__init__.py", line 35, in from_config
    return cls(config, **kwargs)
  File "/home/ec2-user/novel-cross-view-generation/viewformer/viewformer/models/vqgan.py", line 270, in _init_
    self.perceptual_loss = lpips(net='vgg')
  File "/home/ec2-user/novel-cross-view-generation/viewformer/viewformer/models/utils.py", line 298, in lpips
    model = load_lpips_model(net=net)
  File "/home/ec2-user/novel-cross-view-generation/viewformer/viewformer/models/utils.py", line 293, in load_lpips_model
    onnx_tf.backend.prepare(model).export_graph(f'{path}.pb')
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/onnx_tf/backend_rep.py", line 115, in export_graph
    signatures=self.tf_module.__call__.get_concrete_function(
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1258, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1238, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 763, in _initialize
    self._variable_creation_fn    # pylint: disable=protected-access
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compiler.py", line 171, in _get_concrete_function_internal_garbage_collected
    concrete_function, _ = self._maybe_define_concrete_function(args, kwargs)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compiler.py", line 166, in _maybe_define_concrete_function
    return self._maybe_define_function(args, kwargs)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compiler.py", line 396, in _maybe_define_function
    concrete_function = self._create_concrete_function(
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compiler.py", line 300, in _create_concrete_function
    func_graph_module.func_graph_from_py_func(
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1214, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 667, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compiler.py", line 484, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1200, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1189, in autograph_handler
    return autograph.converted_call(
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
    result = converted_f(*effective_args, **kwargs)
  File "/tmp/__autograph_generated_filenb6tpkki.py", line 30, in tf____call__
    ag__.for_stmt(ag__.ld(self).graph_def.node, None, loop_body, get_state, set_state, (), {'iterate_names': 'node'})
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 451, in for_stmt
    for_fn(iter_, extra_test, body, get_state, set_state, symbol_names, opts)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 502, in _py_for_stmt
    body(target)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 468, in protected_body
    original_body(protected_iter)
  File "/tmp/__autograph_generated_filenb6tpkki.py", line 23, in loop_body
    output_ops = ag__.converted_call(ag__.ld(self).backend._onnx_node_to_tensorflow_op, (ag__.ld(onnx_node), ag__.ld(tensor_dict), ag__.ld(self).handlers), dict(opset=ag__.ld(self).opset, strict=ag__.ld(self).strict), fscope)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
    result = converted_f(*effective_args, **kwargs)
  File "/tmp/__autograph_generated_filerq2x04bj.py", line 50, in tf___onnx_node_to_tensorflow_op
    ag__.if_stmt(ag__.ld(handlers), if_body_1, else_body_1, get_state_1, set_state_1, ('do_return', 'retval_'), 2)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_filerq2x04bj.py", line 44, in if_body_1
    ag__.if_stmt(ag__.ld(handler), if_body, else_body, get_state, set_state, ('do_return', 'retval_'), 2)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_filerq2x04bj.py", line 36, in if_body
    retval_ = ag__.converted_call(ag__.ld(handler).handle, (ag__.ld(node),), dict(tensor_dict=ag__.ld(tensor_dict), strict=ag__.ld(strict)), fscope)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
    result = converted_f(*effective_args, **kwargs)
  File "/tmp/__autograph_generated_filesv6ir83a.py", line 34, in tf__handle
    ag__.if_stmt(ag__.ld(ver_handle), if_body, else_body, get_state, set_state, ('do_return', 'retval_'), 2)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_filesv6ir83a.py", line 23, in if_body
    ag__.converted_call(ag__.ld(cls).args_check, (ag__.ld(node),), dict(**ag__.ld(kwargs)), fscope)
  File "/opt/conda/envs/vlp/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
    result = converted_f(*effective_args, **kwargs)
  File "/tmp/__autograph_generated_file7c2tsg8q.py", line 8, in tf__args_check
    dtype = ag__.ld(kwargs)['tensor_dict'][ag__.ld(node).inputs[0]].dtype
tensorflow.python.autograph.pyct.error_utils.KeyError: in user code:

    File "/opt/conda/envs/vlp/lib/python3.8/site-packages/onnx_tf/backend_tf_module.py", line 98, in _call_  *
        output_ops = self.backend._onnx_node_to_tensorflow_op(onnx_node,
    File "/opt/conda/envs/vlp/lib/python3.8/site-packages/onnx_tf/backend.py", line 289, in _onnx_node_to_tensorflow_op  *
        return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
    File "/opt/conda/envs/vlp/lib/python3.8/site-packages/onnx_tf/handlers/handler.py", line 58, in handle  *
        cls.args_check(node, **kwargs)
    File "/opt/conda/envs/vlp/lib/python3.8/site-packages/onnx_tf/handlers/backend/sub.py", line 24, in args_check  *
        dtype = kwargs["tensor_dict"][node.inputs[0]].dtype

    KeyError: '0'

This issue seems similar to the one addressed in issue #6, but when I try the same hack, it does not seem to fix the problem. I was wondering if you had any ideas on how to address it?

PyTorch Version

Hi,

Many thanks for your excellent work.

I am truly interested in your work and currently trying to play the code.

May I ask if there is any PyTorch version of the code for training and testing?

Thanks a lot.

Difference in results with CO3Dv1 models and CO3Dv2 models

I am trying to generate some visuals with ViewFormer on CO3Dv2 and I would like to double-check a few things.

The changes that I know of from v1 to v2 are:

  1. The input image now has 4 channels, with the first 3 being masked RGB with a black background and the last channel being a binary mask.

However, I am getting very different results using the same code but with different models.

The first GIF is rendered using co3d-10cat-noloc-transformer-tf, while the second is rendered using co3dv2-all-noloc-transformer-tf.

The first GIF looks reasonable, but the second looks suspicious.

It would be great if you can provide some pointers for me to debug this. Thank you so much!

[Attached GIFs: hydrant_000_ 12, 31, 25 _v1 and hydrant_000_ 12, 31, 25 _v2]

Dataloader generation not unzipping all files

As mentioned in this issue, I tried putting several downloaded InteriorNet zip files in the HD7 folder and ran the dataset generation command:

viewformer-cli dataset generate --loader interiornet --path /home/ec2-user/novel-cross-view-generation/viewformer/data --image-size 128 --output /home/ec2-user/novel-cross-view-generation/viewformer/dataset/interiornet --max-sequences-per-shard 50 --shuffle --split test

However, I am interested in using more than 20 context images, and currently the dataset generation only unzips 20 images per sequence. How can I set it to unzip more of them?

CO3Dv2 Evaluation Code Error

Thanks again for uploading the CO3D weights and code so promptly.

When I try to run

viewformer-cli evaluate co3dv2-challenge --dataset-root /drive2/datasets/co3d --output output --split dev

I receive an error

File "/home/~/code/viewformer/viewformer/evaluate/evaluate_co3dv2_challenge.py", line 98, in main
    prediction_batch = make_batch([
  File "/home/~/code/viewformer/viewformer/evaluate/evaluate_co3dv2_challenge.py", line 99, in <listcomp>
    frame_annotation_map[(x, y)] for x, y, _ in eval_batch
ValueError: too many values to unpack (expected 3)

It appears that for some reason my eval_batch is simply a list and not a nested list

printing eval_batch gives

['171_18628_34226', 87, 'apple/171_18628_34226/images/frame000087.jpg']

What are some steps I can take to debug this issue? Thanks!

Using custom images

As mentioned in #8, there is currently no loader implemented to load data in a generic format. It would be good to support loading COLMAP models to enable people to run ViewFormer on video sequences.

Currently, if you want to write your own loader, it should be quite easy if you start from the 7Scenes loader.
