
panohead's Introduction

PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°

Teaser image

PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°
Sizhe An, Hongyi Xu, Yichun Shi, Guoxian Song, Umit Y. Ogras, Linjie Luo
https://sizhean.github.io/panohead

Abstract: Synthesis and reconstruction of the 3D human head have gained increasing interest in computer vision and computer graphics recently. Existing state-of-the-art 3D generative adversarial networks (GANs) for 3D human head synthesis are either limited to near-frontal views or hard to preserve 3D consistency in large view angles. We propose PanoHead, the first 3D-aware generative model that enables high-quality view-consistent image synthesis of full heads in 360° with diverse appearance and detailed geometry using only in-the-wild unstructured images for training. At its core, we lift up the representation power of recent 3D GANs and bridge the data alignment gap when training from in-the-wild images with widely distributed views. Specifically, we propose a novel two-stage self-adaptive image alignment for robust 3D GAN training. We further introduce a tri-grid neural volume representation that effectively addresses front-face and back-head feature entanglement rooted in the widely-adopted tri-plane formulation. Our method instills prior knowledge of 2D image segmentation in adversarial learning of 3D neural scene structures, enabling compositable head synthesis in diverse backgrounds. Benefiting from these designs, our method significantly outperforms previous 3D GANs, generating high-quality 3D heads with accurate geometry and diverse appearances, even with long wavy and afro hairstyles, renderable from arbitrary poses. Furthermore, we show that our system can reconstruct full 3D heads from single input images for personalized realistic 3D avatars.

Requirements

  • We recommend Linux for performance and compatibility reasons.
  • 1–8 high-end NVIDIA GPUs. We have done all testing and development using V100, RTX3090, and A100 GPUs.
  • 64-bit Python 3.8 and PyTorch 1.11.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • CUDA toolkit 11.3 or later. (Why is a separate CUDA toolkit installation required? We use the custom CUDA extensions from the StyleGAN3 repo. Please see Troubleshooting).
  • Python libraries: see environment.yml for exact library dependencies. You can use the following commands with Miniconda3 to create and activate your Python environment:
    • cd PanoHead
    • conda env create -f environment.yml
    • conda activate panohead
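After activating the environment, a quick sanity check (an optional extra step, not part of the original instructions) is to confirm that PyTorch sees a GPU and was built against a CUDA 11.x toolkit:

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"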

Getting started

Download the whole models folder from the link and put it under the root dir.

Pre-trained networks are stored as *.pkl files that can be referenced using local filenames.

Generating results

# Generate videos using pre-trained model
python gen_videos.py --network models/easy-khair-180-gpc0.8-trans10-025000.pkl \
    --seeds 0-3 --grid 2x2 --outdir=out --cfg Head --trunc 0.7

# Generate images and shapes (as .mrc files) using pre-trained model
python gen_samples.py --outdir=out --trunc=0.7 --shapes=true --seeds=0-3 \
    --network models/easy-khair-180-gpc0.8-trans10-025000.pkl
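The .mrc shape files written by gen_samples.py can be inspected directly from Python. A minimal sketch (assuming the mrcfile package from environment.yml and, as an extra dependency not listed there, scikit-image for marching cubes; the output file name and iso-surface level are assumptions):

import mrcfile
import numpy as np
from skimage.measure import marching_cubes   # extra dependency, not in environment.yml

# 'out/seed0000.mrc' is a hypothetical name; use whatever gen_samples.py wrote to --outdir.
with mrcfile.open('out/seed0000.mrc') as mrc:
    sigma = np.array(mrc.data)               # 3D density grid

# Extract an iso-surface mesh from the density volume; the level may need tuning.
verts, faces, normals, values = marching_cubes(sigma, level=10.0)
print(verts.shape, faces.shape)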

Applications

# Generate full head reconstruction from a single RGB image.
# Please refer to ./gen_pti_script.sh
# For this application we need to specify a dataset folder instead of zip files.
# Segmentation files are not necessary for PTI inversion.

./gen_pti_script.sh

# Generate full head interpolation from two seeds.
# Please refer to ./gen_interpolation.py for the implementation.

python gen_interpolation.py --network models/easy-khair-180-gpc0.8-trans10-025000.pkl \
        --trunc 0.7 --outdir interpolation_out

Using networks from Python

You can use pre-trained networks in your own Python code as follows:

import pickle
import torch

with open('*.pkl', 'rb') as f:          # path to a downloaded network pickle under models/
    G = pickle.load(f)['G_ema'].cuda()  # torch.nn.Module
z = torch.randn([1, G.z_dim]).cuda()    # latent codes
c = torch.cat([cam2world_pose.reshape(-1, 16), intrinsics.reshape(-1, 9)], 1)  # camera parameters
out = G(z, c)                           # run the generator once and reuse its outputs
img = out['image']                      # NCHW, float32, dynamic range [-1, +1], no truncation
mask = out['image_mask']                # NHW, int8, [0, 255]
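In the snippet above, cam2world_pose and intrinsics are left undefined. As a rough sketch of the expected format only (assuming the EG3D camera convention of a flattened 4x4 camera-to-world matrix plus a flattened 3x3 normalized intrinsics matrix; the identity pose is just a placeholder, not a meaningful viewpoint):

# Format sketch only: c has shape [batch, 25] = 16 extrinsic + 9 intrinsic values.
cam2world_pose = torch.eye(4).unsqueeze(0).cuda()   # placeholder 4x4 pose, not a useful view
intrinsics = torch.tensor([[4.2647, 0.0, 0.5],
                           [0.0, 4.2647, 0.5],
                           [0.0, 0.0, 1.0]]).unsqueeze(0).cuda()  # assumed FFHQ-style normalized intrinsics
c = torch.cat([cam2world_pose.reshape(-1, 16), intrinsics.reshape(-1, 9)], 1)   # shape [1, 25]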

The above code requires torch_utils and dnnlib to be accessible via PYTHONPATH. It does not need source code for the networks themselves — their class definitions are loaded from the pickle via torch_utils.persistence.

The pickle contains three networks. 'G' and 'D' are instantaneous snapshots taken during training, and 'G_ema' represents a moving average of the generator weights over several training steps. The networks are regular instances of torch.nn.Module, with all of their parameters and buffers placed on the CPU at import and gradient computation disabled by default.
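The direct call G(z, c) above applies no truncation. To generate truncated samples, comparable to passing --trunc to the command-line tools, one option is to split the call into mapping and synthesis, as projector.py in this repo also does; treat this as a hedged sketch rather than the only supported path:

ws = G.mapping(z, c, truncation_psi=0.7)   # map to W+ with truncation
out = G.synthesis(ws, c)                   # same dict of outputs as G(z, c): 'image', 'image_mask', ...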

Datasets

FFHQ-F(ullhead) consists of the Flickr-Faces-HQ dataset, the K-Hairstyle dataset, and an in-house human head dataset. For head pose estimation, we use WHENet.

Due to license issues, we are not able to release the FFHQ-F dataset that we used to train the model. test_data_img and test_data_seg are just examples showing the dataset structure. For the camera pose convention, please refer to EG3D.

Datasets format

For training purposes, we can use either zip files or regular folders for the image dataset and the segmentation dataset. For PTI, we need to use folders.

To compress a dataset folder into a zip file, we can use dataset_tool_seg.

For example:

python dataset_tool_seg.py --img_source dataset/testdata_img --seg_source  dataset/testdata_seg --img_dest dataset/testdata_img.zip --seg_dest dataset/testdata_seg.zip --resolution 512x512
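The image dataset (folder or zip) is also expected to contain a dataset.json with per-image camera labels, following the EG3D camera convention referenced above. A hedged sketch of the format, with a hypothetical file name and placeholder values (each label is the 25-value concatenation of a flattened 4x4 cam2world matrix and a flattened 3x3 intrinsics matrix):

import json

# Hypothetical example: map each image file name to its 25 camera-label values.
cam_params = {"img00000000.png": [0.0] * 25}   # placeholder values, not a real camera

labels = {"labels": [[name, params] for name, params in cam_params.items()]}
with open("dataset/testdata_img/dataset.json", "w") as f:
    json.dump(labels, f)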

Obtaining camera pose and cropping the images

Please follow the guide

Obtaining segmentation masks

You can try using deeplabv3 or other off-the-shelf tools to generate the masks. For an example using deeplabv3, see misc/segmentation_example.py.
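A minimal sketch of the idea (this is not the repo's misc/segmentation_example.py; it assumes torchvision's pretrained DeepLabV3, where class 15 is "person" in the PASCAL VOC label set, and a hypothetical input path):

import numpy as np
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet101

model = deeplabv3_resnet101(pretrained=True).eval().cuda()   # newer torchvision uses weights= instead
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open('dataset/testdata_img/example.png').convert('RGB')   # hypothetical path
x = preprocess(img).unsqueeze(0).cuda()
with torch.no_grad():
    out = model(x)['out'][0]                   # [21, H, W] class scores
mask = (out.argmax(0) == 15).cpu().numpy()     # True where the pixel is classified as "person"
Image.fromarray((mask * 255).astype(np.uint8)).save('example_seg.png')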

Training

Examples of training using train.py:

# Train with StyleGAN2 backbone from scratch with raw neural rendering resolution=64, using 8 GPUs.
# with segmentation mask, trigrid_depth@3, self-adaptive camera pose loss regularizer@10

python train.py --outdir training-runs --img_data dataset/testdata_img.zip --seg_data dataset/testdata_seg.zip --cfg=ffhq --batch=32 --gpus 8 \
    --gamma=1 --gamma_seg=1 --gen_pose_cond=True --mirror=1 --use_torgb_raw=1 --decoder_activation="none" --disc_module MaskDualDiscriminatorV2 \
    --bcg_reg_prob 0.2 --triplane_depth 3 --density_noise_fade_kimg 200 --density_reg 0 --min_yaw 0 --max_yaw 180 --back_repeat 4 --trans_reg 10 --gpc_reg_prob 0.7


# Second stage finetuning to 128 neural rendering resolution (optional).

python train.py --outdir results --img_data dataset/testdata_img.zip --seg_data dataset/testdata_seg.zip --cfg=ffhq --batch=32 --gpus 8 \
    --resume=~/training-runs/experiment_dir/network-snapshot-025000.pkl \
    --gamma=1 --gamma_seg=1 --gen_pose_cond=True --mirror=1 --use_torgb_raw=1 --decoder_activation="none" --disc_module MaskDualDiscriminatorV2 \
    --bcg_reg_prob 0.2 --triplane_depth 3 --density_noise_fade_kimg 200 --density_reg 0 --min_yaw 0 --max_yaw 180 --back_repeat 4 --trans_reg 10 --gpc_reg_prob 0.7 \
    --neural_rendering_resolution_final=128 --resume_kimg 1000

Metrics

./get_metrics.sh

There are three evaluation modes: all, front, and back, as mentioned in the paper. Please refer to cal_metrics.py for the implementation.

Citation

If you find our repo helpful, please cite our paper using the following bib:

@InProceedings{An_2023_CVPR,
    author    = {An, Sizhe and Xu, Hongyi and Shi, Yichun and Song, Guoxian and Ogras, Umit Y. and Luo, Linjie},
    title     = {PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360deg},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {20950-20959}
}

Development

This is a research reference implementation and is treated as a one-time code drop. As such, we do not accept outside code contributions in the form of pull requests.

Acknowledgements

We thank Shuhong Chen for the discussion during Sizhe's internship.

This repo is heavily based on the NVlabs/eg3d repo; huge thanks to the EG3D authors for releasing their code!

panohead's People

Contributors

seasonsh, sizhean


panohead's Issues

How to export texture?

Thanks for your great work!

My goal is to import highly detailed meshes and textures into 3D software like Blender or Unreal for rendering.

How do I export the mesh and textures (diffuse, specular, normal)?


Is Mask Actually Used in Inversion?

Hi, thank you for the brilliant work!

May I ask a quick question: are the masks actually used during the PTI inversion? The code in projector_withseg.py seems to only read the image directly from the given path, without reading or using the provided masks at all. Is this expected?

If so, may I ask if it is actually possible to utilize a mask during the inversion? At the moment, it seems that inversion would fail pretty badly if the image contains a large area of hair.

Thank you in advance!

Bad cases detection?

Hi, thanks for sharing the code and the pretrained model!
Now I am trying to generate lots of human face pictures with PanoHead, and sometimes I get bad generations, which you mention in the limitations part of the CVPR supplementary material.
I am wondering whether it is possible to detect and filter bad generation results automatically.

Model checkpoint: easy-khair-180-gpc0.8-trans10-025000.pkl

Thanks!

How Do I Get This to Run on Colab?

I'm on Google Colab:

%%shell
git clone https://github.com/zsxkib/replicate-pano-head.git

ls

pip install numpy click pillow scipy torch requests tqdm ninja matplotlib imageio imgui glfw pyopengl imageio-ffmpeg pyspng psutil mrcfile tensorboard torchvision

cd replicate-pano-head

mkdir "/content/replicate-pano-head/models/"

FILE_ID=1FqvQzICV1H4fbQaz8BiWxtiRYxJd4T8N
DEST_PATH="/content/replicate-pano-head/models/easy-khair-180-gpc0.8-trans10-025000.pkl"
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=${FILE_ID}' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=${FILE_ID}" -O ${DEST_PATH} && rm -rf /tmp/cookies.txt

This all works. But when I run the following command I get a weird CUDA error. This is running on one T4 GPU:

python /content/replicate-pano-head/gen_videos.py --network /content/replicate-pano-head/models/easy-khair-180-gpc0.8-trans10-025000.pkl --seeds 0
Loading networks from "/content/replicate-pano-head/models/easy-khair-180-gpc0.8-trans10-025000.pkl"...
Traceback (most recent call last):
  File "/content/replicate-pano-head/gen_videos.py", line 371, in <module>
    generate_images() # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/content/replicate-pano-head/gen_videos.py", line 334, in generate_images
    G = legacy.load_network_pkl(f)['G_ema'].to(device) # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

P.S. I have all the files required

mesh outputs

Could you please share some of your output meshes with the corresponding images?

How to use this for our custom images?

Hello, I would like to know how I can use my own images (downloaded from the internet or taken with a camera) to generate a 360-degree 3D model. Do we have to generate latents using StyleGAN3 and then feed them to the model?

Am I generating colors correctly?

I would like some advice on extracting a voxel representation for generating colors.

I was able to extract very poor vertex color by editing these lines in the G.sample_mixed loop in gen_videos_proj_withseg.py:

sample_result = G.sample_mixed(...)
sigmas[:, head:head+max_batch] = sample_result['sigma']
color_batch = G.torgb(sample_result['rgb'].transpose(1,2)[...,None], ws[0,0,0,:1])
colors[:, head:head+max_batch] = np.transpose(color_batch[...,0], (2, 1, 0))

If I look for the nearest color on the isosurface mesh, it gives me this:

Screenshot 2023-07-25 at 11 03 21

But when I look at the render I see this:

Screenshot 2023-07-25 at 13 20 13

I realize that the render has a final superresolution pass that makes it so clear, but I feel like I might be missing something.

My understanding of the process is something like:

  1. G.sample_mixed takes the samples (xyz coordinates in a 3d grid) and the transformed_ray_directions_expanded (which is just 0,0,-1) and w (which is the latent vectors of shape (14,512) from the mapping network output, combining latent and camera pose) and then outputs a few results (sigma, rgb, and a copy of xyz).
  2. The rgb is not actually rgb, but it is a 32 dimensional feature vector. So we have to decode it to RGB using the G.torgb network. This is what I find tricky. The network seems designed to process 2D images, but here we only have a bundle of N=10M feature vectors. So I pass it in a 10Mx1 image, and I hope this is ok. Also, torgb expects only a single w from the 14 options. I just picked the first one ws[0,0,0,:1] but I'm not sure if this is correct. Would it be better to run torgb for each w and then average them, or find the median, or something else?
  3. Finally, I convert these resulting colors back to voxel space and then use the mesh vertex locations to lookup the closest color.

My questions are:

  1. Is it ok to give torgb a 10Mx1 image or is this damaging the performance of the feature-to-color conversion?
  2. Is it ok to only use the first ws or should I be using multiple ones somehow? Are each of the ws latents representing a different camera pose, or do they represent something else?

Thanks @SizheAn!

Unable to reproduce the post rendering

Hi
I tried everything as described in the Google Colab; unfortunately, I could not get the output shown in the YouTube video.
My post-rendering output looks like this:

image

Could you please tell me what is going wrong here?

Thanks in advance.

tri-grid projection

Hello, thank you for sharing your interesting work!

The authors of EG3D mentioned that their triplane projection is actually a bi-plane projection (XY, XZ, ZX) (see NVlabs/eg3d#67). After examining your code, it seems that this issue is still present. I'm wondering if you have considered the potential problems that may arise from this projection method in the proposed tri-grid?

I look forward to your kind reply!

Possibility to export to .GLB / .GLTF / .OBJ formats?

Hi, I know FBX is a proprietary format, so I'll leave that out.

My question is whether it's possible to take all the "normal" light calculation (generated from the normal map from all 360°) and generate a full 3D mesh in the .glb/.gltf/.obj formats (maybe with embedded textures)?

It would be so helpful if the seeded prompt (face image) with each of their 360° turnaround sequences, could be UV unwrapped on the face for this 3D generated model.
The idea would be to generate the texture with "flat" light, so no shadows are baked in the original seeded generated image.
After creating the normal map, it would then generate a list of 3d points in space (use Z forward coordinate system, please), thus generating the mesh.
From there, (since all human heads will be almost the same) a universal UV unwrapping code can open the UV space texture like this.

And since it's taking 1 picture per degree to generate the correct, anatomically accurate skin, eyes, and hair, maybe that texture information could also be baked in "degrees" progressing from left to right in the previously generated UVd space.

This would be one step closer to actually generating human characters for games and background props for animation in general 3D applications.

Please let me know.
Kind regards.
-Pierre.

training time

What is the approximate training time when using a single RTX 3090?

How to use real images for avatars?

Hi authors, thanks for the great contribution of this work. I am wondering how to preprocess real images into a format like the FFHQ-F dataset that you used. Thank you very much!

Best,
Mike

Project real images

Can you tell me how to project real images and then use them to generate a 3D head?

I tried: !python projector.py --target /content/vv --outdir=out --network /content/PanoHead/models/easy-khair-180-gpc0.8-trans10-025000.pkl

but it gives the following error:

/content/PanoHead
Loading networks from "/content/PanoHead/models/easy-khair-180-gpc0.8-trans10-025000.pkl"...
projecting: [0] /content/vv/34r43.png
camera matrix: torch.Size([1, 0])
Computing W midpoint and stddev using 10000 samples...
Setting up PyTorch plugin "bias_act_plugin"... Done.
Traceback (most recent call last):
  File "/content/PanoHead/projector.py", line 348, in <module>
    run_projection() # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/content/PanoHead/projector.py", line 293, in run_projection
    projected_w_steps = project(
  File "/content/PanoHead/projector.py", line 109, in project
    synth_images = G.synthesis(ws, c=c, noise_mode='const')['image']
  File "<string>", line 87, in synthesis
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/PanoHead/training/volumetric_rendering/ray_sampler.py", line 47, in forward
    x_cam = uv[:, :, 0].view(N, -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

Better how-to guide? Step-by-step readme or something.

I have installed everything on a fresh Linux Mint (latest version as of today).
All the requirements are met, but it stops there; it seems there is an issue with:
"UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3)"

I cannot even get to the part where it generates the head. Also, where do I feed it the image I want to create a head from?

cannot stat '/content/3DDFA_V2/crop_samples/img/*': No such file or directory

I am trying to follow the tutorial video and run the Colab from:

https://colab.research.google.com/github/camenduru/PanoHead-colab/blob/main/PanoHead_custom_colab.ipynb#scrollTo=v9wpwlGfiX2e

After running the second block of code, the following error is produced:

/content/3DDFA_V2
/content/3DDFA_V2/bfm/bfm.py:34: FutureWarning: In the future `np.long` will be defined as the corresponding NumPy scalar.
  self.keypoints = bfm.get('keypoints').astype(np.long)  # fix bug
Traceback (most recent call last):
  File "/content/3DDFA_V2/recrop_images.py", line 322, in <module>
    main(args)
  File "/content/3DDFA_V2/recrop_images.py", line 183, in main
    tddfa = TDDFA(gpu_mode=gpu_mode, **cfg)
  File "/content/3DDFA_V2/TDDFA.py", line 34, in __init__
    self.bfm = BFMModel(
  File "/content/3DDFA_V2/bfm/bfm.py", line 34, in __init__
    self.keypoints = bfm.get('keypoints').astype(np.long)  # fix bug
  File "/usr/local/lib/python3.10/dist-packages/numpy/__init__.py", line 328, in __getattr__
    raise AttributeError("module {!r} has no attribute "
AttributeError: module 'numpy' has no attribute 'long'. Did you mean: 'log'?
cp: cannot stat '/content/3DDFA_V2/crop_samples/img/*': No such file or directory

Neural rendering resolution 128?

Hi Sizhe,

Thanks for this fabulous work. I noticed your default checkpoint G.neural_rendering_resolution is 64 rather than 128, I wonder whether you have fine-tuned a 128 version?

Thanks and looking forward to hearing from you!

"bias_act_plugin"...Failed!

Hello :)

I have been struggling for quite some time now with the PanoHead configuration. I installed all the environment requirements for PanoHead and checked the StyleGAN3 requirements as well.

Here is some information about the current env:

  • Working on Linux
  • Python 3.8.18
  • torch 2.1.1
  • cudatoolkit 11.8.0
  • ninja 1.11.1
  • GCC 11.4.0

nvidia-smi output:
nvidia_smi

nvcc --version output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

I'm trying to execute the following statement:
python gen_videos.py --network models/easy-khair-180-gpc0.8-trans10-025000.pkl --seeds 0-3 --grid 2x2 --output=out --cfg Head --trunc 0.7

Here is the error message I get when executing it:
Loading networks from "models/easy-khair-180-gpc0.8-trans10-025000.pkl"...
Setting up PyTorch plugin "bias_act_plugin"... Failed!
Traceback (most recent call last):
File "/home/research-server/anaconda3/envs/panohead/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/home/research-server/anaconda3/envs/panohead/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "gen_videos.py", line 365, in etc.

I'm still not sure what is causing the error. Does anybody have some ideas?

About new identity

Hi, thank you for your excellent work!
I would like to ask whether it is possible to use the pre-trained model to reconstruct a PLY of a new identity.

How to process K-Hairstyle images?

Dear authors,

Thanks for sharing such a great work on 3D avatar synthesis! Recently I was trying to train PanoHead on other datasets, and I'm wondering how you process the back-head images sampled from K-Hairstyle dataset. For example, you mentioned using WHENet for head pose estimation, but they can only produce 3D rotations of the head, so how to obtain 3D translations and convert them to the pose format of EG3D? Besides, how to crop and align these images as they don't have facial landmarks? Given these questions, I would be super grateful if you could share some scripts for the processing of these back-head images. Thanks!

Details about Pre-trained Models?

Hi, thanks for the great work and for sharing this helpful repository.
However, I do have some questions about the provided pre-trained models.
I see you provided 3 pkl files, and I can infer that the easy-khair-180-gpc0.8-trans10-025000.pkl is the desired one to use.

Could you also provide me some details about how the other two models differ from that one?
Thanks a lot.

Face Images from Different angles

Dear all,

I want to take multiple images of a face as input (from the front, back, and top) and make a 3D model using PanoHead. I do not understand the steps for using multiple seed images. I want to recrop multiple face images and reconstruct a 3D model.

Can it generate multi-view images from an original image?

I can see that your code generates random images from random seed inputs, but can it generate multi-view images from an original image? My current project needs to generate multi-view images from one specified real image, not from random seeds. Thanks.

Does it work with Heads without hair?

Does it work with heads without hair?

I know there was a render of The Rock, but I have only seen meshes/renders with hair. Is it trained with hair, or will bald heads be less well covered than hair?

RuntimeError: CUDA error: invalid device ordinal

I only have 1 NVIDIA GPU with CUDA installed. How do I ensure only 1 GPU is used?

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Image resolution 1024

Is there any plan to release support for, or can it even work on, images of size 1024? If yes, how do we do it?

Dataset.json generation

I'm trying to use it with other images, but I'm having a hard time figuring out how to generate the dataset.json file.

Can you point me in a direction on how to make it?

Thanks. This is a great solution that you've made.

PLY color data

Hello, is it possible to output vertex colors for the PLY export?
