
brain-diffuser's Introduction

Brain-Diffuser

Official repository for the paper "Brain-Diffuser: Natural scene reconstruction from fMRI signals using generative latent diffusion" by Furkan Ozcelik and Rufin VanRullen.

Results

The following are a few of the reconstructions obtained:

Instructions

Requirements

  • Create the conda environment from environment.yml in the main directory by entering conda env create -f environment.yml. The environment is extensive and may include redundant libraries; you can also build a leaner environment by checking the requirements yourself.
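
    For example, a minimal setup sketch (the environment name below is an assumption; use the name: field from environment.yml if activation fails):

    conda env create -f environment.yml
    conda activate brain-diffuser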

Data Acquisition and Processing

  1. Download NSD data from NSD AWS Server:
    cd data
    python download_nsddata.py
    
  2. Download the "COCO_73k_annots_curated.npy" file from HuggingFace NSD
  3. Prepare NSD data for the Reconstruction Task:
    cd data
    python prepare_nsddata.py -sub 1
    python prepare_nsddata.py -sub 2
    python prepare_nsddata.py -sub 5
    python prepare_nsddata.py -sub 7
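
    Since the preparation step runs once per subject, the four commands above can also be written as a small shell loop (a sketch, equivalent to running them one by one):

    cd data
    for sub in 1 2 5 7; do python prepare_nsddata.py -sub $sub; done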
    

First Stage Reconstruction with VDVAE

  1. Download the pretrained VDVAE model files and put them in the vdvae/model/ folder:
    wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-log.jsonl
    wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-model.th
    wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-model-ema.th
    wget https://openaipublic.blob.core.windows.net/very-deep-vaes-assets/vdvae-assets-2/imagenet64-iter-1600000-opt.th
  2. Extract VDVAE latent features of the stimulus images for any subject 'x' using python scripts/vdvae_extract_features.py -sub x
  3. Train regression models from fMRI to VDVAE latent features and save test predictions using python scripts/vdvae_regression.py -sub x
  4. Reconstruct images from the predicted test features using python scripts/vdvae_reconstruct_images.py -sub x
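
Run end to end, the first stage for a single subject is just the three scripts in order (a sketch using subject 1, assuming the model files above are already in vdvae/model/):

    python scripts/vdvae_extract_features.py -sub 1
    python scripts/vdvae_regression.py -sub 1
    python scripts/vdvae_reconstruct_images.py -sub 1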

Second Stage Reconstruction with Versatile Diffusion

  1. Download the pretrained Versatile Diffusion model files "vd-four-flow-v1-0-fp16-deprecated.pth", "kl-f8.pth", and "optimus-vae.pth" from HuggingFace and put them in the versatile_diffusion/pretrained/ folder
  2. Extract CLIP-Text features of captions for any subject 'x' using python scripts/cliptext_extract_features.py -sub x
  3. Extract CLIP-Vision features of stimuli images for any subject 'x' using python scripts/clipvision_extract_features.py -sub x
  4. Train regression models from fMRI to CLIP-Text features and save test predictions using python scripts/cliptext_regression.py -sub x
  5. Train regression models from fMRI to CLIP-Vision features and save test predictions using python scripts/clipvision_regression.py -sub x
  6. Reconstruct images from the predicted test features using python scripts/versatilediffusion_reconstruct_images.py -sub x. The script assumes two 12GB GPUs; edit it to match your setup.
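
Likewise, the whole second stage can be run script by script for one subject (a sketch using subject 1; the first-stage VDVAE reconstructions are assumed to exist already, since they serve as the starting point for the diffusion model):

    python scripts/cliptext_extract_features.py -sub 1
    python scripts/clipvision_extract_features.py -sub 1
    python scripts/cliptext_regression.py -sub 1
    python scripts/clipvision_regression.py -sub 1
    python scripts/versatilediffusion_reconstruct_images.py -sub 1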

Quantitative Evaluation

Results are expected to be close to those reported, but they may vary slightly because of stochasticity in the reconstruction process.

  1. Save test images to a directory using python scripts/save_test_images.py
  2. Extract evaluation features for test images using python scripts/eval_extract_features.py -sub 0
  3. Extract evaluation features for reconstructed images of any subject using python scripts/eval_extract_features.py -sub x
  4. Obtain quantitative metric results for each subject using python scripts/evaluate_reconstruction.py -sub x
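
Put together, the evaluation for one subject looks like this (a sketch using subject 1; the -sub 0 extraction for the ground-truth test images only needs to run once):

    python scripts/save_test_images.py
    python scripts/eval_extract_features.py -sub 0
    python scripts/eval_extract_features.py -sub 1
    python scripts/evaluate_reconstruction.py -sub 1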

ROI Analysis

This analysis currently has a bug that prevents reproducing the exact results, but it gives a good approximation for most ROIs; a fix is planned.

  1. Extract ROI fMRI activations for any subject 'x' using python scripts/roi_extract.py -sub x
  2. Generate VDVAE, CLIP-Text, and CLIP-Vision features from synthetic fMRI using python scripts/roi_generate_features.py -sub x
  3. Generate VDVAE reconstructions for ROIs using python scripts/roi_vdvae_reconstruct.py -sub x
  4. Generate Versatile Diffusion reconstructions for ROIs using python scripts/roi_versatilediffusion_reconstruct.py -sub x
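
The ROI pipeline for one subject, sketched end to end (subject 1 as the example):

    python scripts/roi_extract.py -sub 1
    python scripts/roi_generate_features.py -sub 1
    python scripts/roi_vdvae_reconstruct.py -sub 1
    python scripts/roi_versatilediffusion_reconstruct.py -sub 1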


brain-diffuser's Issues

Conda environment installation failed

I'm using conda 23.5.2

When I run the command in the Anaconda prompt, I get this error:
Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • lcms2==2.12=h3be6417_0
  • nbconvert==6.4.4=py38h06a4308_0
  • colorama==0.4.5=py38h06a4308_0
  • certifi==2022.9.24=py38h06a4308_0
  • libtiff==4.4.0=hecacb30_2
  • pytz==2022.1=py38h06a4308_0
  • lerc==3.0=h295c915_0
  • urllib3==1.26.12=py38h06a4308_0
  • glib==2.69.1=h4ff587b_1
  • chardet==4.0.0=py38h06a4308_1003
  • python==3.8.13=h12debd9_0
  • nbformat==5.3.0=py38h06a4308_0
  • beautifulsoup4==4.11.1=py38h06a4308_0
  • libxslt==1.1.35=h4e12654_0
  • gnutls==3.6.15=he1e5248_0
  • brotlipy==0.7.0=py38h27cfd23_1003
  • astroid==2.11.7=py38h06a4308_0
  • gmp==6.2.1=h295c915_3
  • mkl_fft==1.3.1=py38hd3c417c_0
  • numexpr==2.8.4=py38he184ba9_0
  • pylint==2.14.5=py38h06a4308_0
  • wrapt==1.14.1=py38h5eee18b_0
  • cryptography==38.0.1=py38h9ce1e76_0
  • libiconv==1.16=h7f8727e_2
  • tk==8.6.12=h1ccaba5_0
  • mkl_random==1.2.2=py38h51133e4_0
  • black==22.6.0=py38h06a4308_0
  • gst-plugins-base==1.14.0=h8213a91_2
  • idna==3.4=py38h06a4308_0
  • ujson==5.4.0=py38h6a678d5_0
  • nbclient==0.5.13=py38h06a4308_0
  • nettle==3.7.3=hbbd107a_1
  • numpydoc==1.4.0=py38h06a4308_0
  • intel-openmp==2021.4.0=h06a4308_3561
  • cffi==1.15.1=py38h74dc2b5_0
  • bzip2==1.0.8=h7b6447c_0
  • mkl-service==2.4.0=py38h7f8727e_0
  • entrypoints==0.4=py38h06a4308_0
  • requests==2.28.1=py38h06a4308_0
  • zstd==1.5.2=ha4553b6_0
  • freetype==2.12.1=h4a9f257_0
  • libsodium==1.0.18=h7b6447c_0
  • mkl==2021.4.0=h06a4308_640
  • pyrsistent==0.18.0=py38heee7806_0
  • mypy_extensions==0.4.3=py38h06a4308_1
  • joblib==1.1.1=py38h06a4308_0
  • watchdog==2.1.6=py38h06a4308_0
  • fftw==3.3.9=h27cfd23_1
  • qtconsole==5.3.2=py38h06a4308_0
  • ld_impl_linux-64==2.38=h1181459_1
  • inflection==0.5.1=py38h06a4308_0
  • xz==5.2.6=h5eee18b_0
  • qtwebkit==5.212=h4eab89a_4
  • jellyfish==0.9.0=py38h7f8727e_0
  • bottleneck==1.3.5=py38h7deecbd_0
  • gstreamer==1.14.0=h28cd5cc_2
  • icu==58.2=he6710b0_3
  • giflib==5.2.1=h7b6447c_0
  • libllvm10==10.0.1=hbcb73fb_5
  • pluggy==1.0.0=py38h06a4308_1
  • libpq==12.9=h16c4e8d_3
  • debugpy==1.5.1=py38h295c915_0
  • lz4-c==1.9.3=h295c915_1
  • nest-asyncio==1.5.5=py38h06a4308_0
  • readline==8.2=h5eee18b_0
  • whatthepatch==1.0.2=py38h06a4308_0
  • ca-certificates==2022.10.11=h06a4308_0
  • testpath==0.6.0=py38h06a4308_0
  • ffmpeg==4.3=hf484d3e_0
  • markupsafe==2.1.1=py38h7f8727e_0
  • pyqtwebengine==5.15.7=py38h6a678d5_1
  • typing-extensions==4.3.0=py38h06a4308_0
  • libffi==3.3=he6710b0_2
  • rtree==0.9.7=py38h06a4308_1
  • secretstorage==3.3.1=py38h06a4308_0
  • libwebp-base==1.2.4=h5eee18b_0
  • libgfortran-ng==11.2.0=h00389a5_1
  • pyqt5-sip==12.11.0=py38h6a678d5_1
  • imagesize==1.4.1=py38h06a4308_0
  • lame==3.100=h7b6447c_0
  • wurlitzer==3.0.2=py38h06a4308_0
  • expat==2.4.4=h295c915_0
  • python-lsp-server==1.5.0=py38h06a4308_0
  • libxml2==2.9.14=h74e7548_0
  • spyder==5.3.3=py38h06a4308_0
  • dbus==1.13.18=hb2f20db_0
  • keyring==23.4.0=py38h06a4308_0
  • nspr==4.33=h295c915_0
  • numpy-base==1.23.4=py38h31eccc5_0
  • pathspec==0.9.0=py38h06a4308_0
  • libevent==2.1.12=h8f2d780_0
  • yaml==0.2.5=h7b6447c_0
  • spyder-kernels==2.3.3=py38h06a4308_0
  • qt-webengine==5.15.9=hd2b0992_4
  • typing_extensions==4.3.0=py38h06a4308_0
  • mistune==0.8.4=py38h7b6447c_1000
  • libstdcxx-ng==11.2.0=h1234567_1
  • fontconfig==2.13.1=h6c09931_0
  • libgcc-ng==11.2.0=h1234567_1
  • openh264==2.1.1=h4ff587b_0
  • pysocks==1.7.1=py38h06a4308_0
  • sphinx==5.0.2=py38h06a4308_0
  • tomlkit==0.11.1=py38h06a4308_0
  • libunistring==0.9.10=h27cfd23_0
  • libgfortran5==11.2.0=h1234567_1
  • libpng==1.6.37=hbc83047_0
  • pandas==1.5.1=py38h417a72b_0
  • scikit-learn==1.1.3=py38h6a678d5_0
  • libxcb==1.15=h7f8727e_0
  • lazy-object-proxy==1.6.0=py38h27cfd23_0
  • libtasn1==4.16.0=h27cfd23_0
  • docutils==0.18.1=py38h06a4308_3
  • openssl==1.1.1s=h7f8727e_0
  • pcre==8.45=h295c915_0
  • zlib==1.2.13=h5eee18b_0
  • numpy==1.23.4=py38h14f4228_0
  • jsonschema==4.4.0=py38h06a4308_0
  • libxkbcommon==1.0.1=hfa300c1_0
  • zeromq==4.3.4=h2531618_0
  • nss==3.74=h0370c37_0
  • jpeg==9e=h7f8727e_0
  • psutil==5.9.0=py38h5eee18b_0
  • tomli==2.0.1=py38h06a4308_0
  • libuuid==1.0.3=h7f8727e_2
  • libclang==10.0.1=default_hb85057a_2
  • libspatialindex==1.9.3=h2531618_0
  • sip==6.6.2=py38h6a678d5_0
  • pyqt==5.15.7=py38h6a678d5_1
  • python-lsp-black==1.2.1=py38h06a4308_0
  • _openmp_mutex==5.1=1_gnu
  • krb5==1.19.2=hac12032_0
  • qtpy==2.2.0=py38h06a4308_0
  • qt-main==5.15.2=h327a75a_7
  • sqlite==3.39.3=h5082296_0
  • pytorch==1.12.1=py3.8_cuda11.3_cudnn8.3.2_0
  • ncurses==6.3=h5eee18b_3
  • pillow==9.2.0=py38hace64e9_1
  • libedit==3.1.20210910=h7f8727e_0
  • libgomp==11.2.0=h1234567_1
  • libidn2==2.3.2=h7f8727e_0
  • libdeflate==1.8=h7f8727e_5
  • libwebp==1.2.4=h11a3e52_0
  • cudatoolkit==11.3.1=h2bc3f7f_2
  • jedi==0.18.1=py38h06a4308_1
  • click==8.0.4=py38h06a4308_0

About model parameter settings

Thank you so much for your great work. The code is very detailed and I ran it successfully, but my reproduction results were very poor.
First, the predicted drafts are of poor quality and provide little low-level information to the model; second, the features extracted by CLIP do not seem to be effective in guiding the generation of high-quality images with mix_str = 0.4. In short, the generated images are blurry and lack clear semantics.
Better results can only be achieved by using the test data directly instead of the predicted data. My guess is that the parameters may not be set well.
Could you please provide the final ridge regression and Versatile Diffusion parameter settings, such as Ridge's alpha and max_iter, and Versatile Diffusion's mix_str? Thank you so much.

How to select average users for image reconstruction

Hi, author:
I see that your article reports image reconstructions not only for individual subjects (1, 2, 5, 7) but also for an 'average subject'. However, I could not find any 'average subject' processing in your code, nor an 'average subject' in the NSD dataset. How should this processing be set up?
Looking forward to your answer. Thank you very much.
Your fan

Analysis of reconstruction results

Thank you very much for your work.
I reconstructed the images following your README.md and did not change any parameters. The reconstruction results are very blurry, and some images look like partial screenshots of the original image. May I ask what the problem is?
[Attached: original image 25 and its reconstruction.]

I look forward to your guidance very much.

KeyError: 'vd'

A KeyError: 'vd' occurs when I run cliptext_extract_features.py. Can anyone help me fix this?

versatile diffusion reconstruction is worse than reported

I followed the instructions to run the code. The first-stage VDVAE results are fine and the evaluation metrics match the paper. However, the second-stage Versatile Diffusion reconstructions are worse than in the paper. Can you help locate the problem? Two examples are below.
[Attached: two example image pairs, test samples 70 and 216.]

Analysis of reconstruction results

Thank you very much for your contribution to this project, and I wish you all the best going forward. I have a problem reproducing it: the first-stage reconstruction worked for me, but in the second stage, even though I followed the README step by step, the final generated pictures are only color noise or completely black images, which is very confusing. I would be very grateful for any help.

evaluation scripts

Would you mind providing the n-way identification scripts? Some details about the evaluation are missing from the paper (e.g., which CLIP variant is used, and how the similarity is calculated).

How to view training set images

Hello, I am very interested in your research. Congratulations on your article being accepted. I have a question: how can I view the original images of the training set? I trained and obtained results but could not compare them with the original images, because I did not know where to find them.

Looking forward to your answer.

Issues related to dataset download

Hello, I am very interested in your paper, but when I try to download the dataset the output appears garbled. Which link should the dataset download code use?
