
multimodal-deepfake's People

Contributors

rshaojimmy, tianxingwu


multimodal-deepfake's Issues

The text part of the dataset DGM4

Hi, I have downloaded your DGM4 dataset directly via the link, but after checking it I only found images in the 'manipulation' and 'origin' folders, which is different from your dataset samples.

Missing keys problem

Hello, to use BERT I need to download the model locally and then specify the path to it, but during training I am prompted about Missing keys. Does this have any impact?
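For anyone hitting the same warning, a minimal sketch of how to inspect which keys actually fail to load, assuming the stock HuggingFace transformers API rather than this repo's customized BERT (the local path below is hypothetical):

    from transformers import BertModel

    # output_loading_info=True additionally returns the lists of missing and
    # unexpected keys, so you can check whether they belong to the BERT
    # backbone itself or only to task-specific heads added on top of it.
    model, info = BertModel.from_pretrained(
        "path/to/local/bert-base-uncased",  # hypothetical local path
        output_loading_info=True,
    )
    print(info["missing_keys"])
    print(info["unexpected_keys"])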

How much memory is needed for training

Hello,
I am trying to train, but an out-of-memory error always occurs. Could you tell me how much memory you used and how much time the training took?

Lmac

For the text projection text_feat in Lmac, you use BERT's last hidden layer rather than the CLS token. Was this choice made because you ran experiments comparing the performance?

Dataset link failed

Could you please re-share the link to the dataset? It is not working now.

Details of comparison with uni-modal methods

Thanks for your awesome work!

I was wondering: when comparing with the deepfake detection and sequence tagging methods, do you retrain the model using uni-modal data? If so, are the multi-modal modules such as the Multi-Modal Aggregator removed, or is the input of the other modality replaced with all-zero data?

Training parameters

Hello, when using your dataset and model I was unable to reach the results reported in your paper. Could you tell me the detailed training parameters you used?

The two mtcnn annotations

[screenshot]
The explanation of mtcnn I have seen is that it gives the maximal boundary coordinates of the face. Since this image is real, why are there two sets of coordinates? I have not yet understood what these two coordinates represent, and I hope you can explain.

Suggestion: Dump the command line configs into yaml config

Dear Sir,

  • I'm going to reproduce your work and use the pretrained best checkpoint for transfer learning, but I am struggling to cross-check the config parameters back and forth among config/*.yaml, the *.sh shell scripts, and the parser.add_argument() calls in the Python scripts.

  • I think aggregating all these configs into YAML files would be cleaner and more readable, and would make it more convenient for others to use your checkpoint; a minimal sketch follows below.

Appreciate it.
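One minimal sketch of the suggestion, under the assumption that the parsed argparse namespace holds the final merged configuration (the argument names and file paths below are illustrative, not this repo's actual entry point):

    import argparse
    import yaml

    parser = argparse.ArgumentParser()
    parser.add_argument('--config', default='config/train.yaml')  # hypothetical
    parser.add_argument('--distributed', default=False, action='store_true')
    args = parser.parse_args()

    # Dump the final configuration next to the checkpoints, so one YAML file
    # is enough to reproduce (or transfer from) a given run.
    with open('run_config.yaml', 'w') as f:
        yaml.safe_dump(vars(args), f, default_flow_style=False)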

About Text_Swap

I want more details on text_swap. I have found that, in the dataset, some texts labeled 'orig' and some labeled 'text_swap' are identical. Can you provide a more detailed explanation of text_swap and its fake_text_pos?
Here is an example:
{
  "id": 683133,
  "image": "DGM4/origin/guardian/0385/488.jpg",
  "text": "Making a song and dance David Hasselhoff will perform a oneman show at the Edinburgh festival fringe",
  "fake_cls": "orig",
  "fake_image_box": [],
  "fake_text_pos": [],
  "mtcnn_boxes": [...]
},
{
  "id": 896499,
  "image": "DGM4/origin/guardian/0114/251.jpg",
  "text": "Making a song and dance David Hasselhoff will perform a oneman show at the Edinburgh festival fringe",
  "fake_cls": "text_swap",
  "fake_image_box": [],
  "fake_text_pos": [0, 7, 8, 9, 10, 11, 13, 14, 15, 16],
  "mtcnn_boxes": [...]
}
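For what it's worth, the indices appear to point into the whitespace-split tokens of "text"; a minimal sketch under that assumption (this is an inference from the sample above, not a documented schema):

    # Assumption: fake_text_pos indexes whitespace-split tokens of "text".
    text = ("Making a song and dance David Hasselhoff will perform a oneman "
            "show at the Edinburgh festival fringe")
    fake_text_pos = [0, 7, 8, 9, 10, 11, 13, 14, 15, 16]

    tokens = text.split()
    print([tokens[i] for i in fake_text_pos])
    # ['Making', 'will', 'perform', 'a', 'oneman', 'show', 'the',
    #  'Edinburgh', 'festival', 'fringe']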

Visualization

Hi,
I have read your paper and code and was deeply impressed, but I had some difficulty reproducing your visualizations. How do you produce them? What should I run to get the same results as shown in the Visualization Results section?
Thanks

Error in Training Codes with Default Distributed Setting as False

I am trying out the training code that you have provided. I am not using a distributed GPU system; here is the config I am using.
(I changed the argparse code so that the distributed argument defaults to False.)

But I have encountered this error:

Start training
Traceback (most recent call last):
  File "train.py", line 561, in <module>
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(args, config))
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/train.py", line 416, in main_worker
    train_stats = train(args, model, train_loader, optimizer, tokenizer, epoch, warmup_steps, device, lr_scheduler, config, summary_writer)
  File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/train.py", line 141, in train
    loss_MAC, loss_BIC, loss_bbox, loss_giou, loss_TMG, loss_MLC = model(image, label, text_input, fake_image_box, fake_token_pos, alpha = alpha)
  File "/anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/models/HAMMER.py", line 211, in forward
    self._dequeue_and_enqueue(image_feat_m, text_feat_m)
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/models/HAMMER.py", line 363, in _dequeue_and_enqueue
    image_feats = concat_all_gather(image_feat)
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "Github/MultiModal-DeepFake-root/MultiModal-DeepFake/models/HAMMER.py", line 386, in concat_all_gather
    for _ in range(torch.distributed.get_world_size())]
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 845, in get_world_size
    return _get_group_size(group)
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 306, in _get_group_size
    default_pg = _get_default_group()
  File "anaconda3/envs/DGM4/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 410, in _get_default_group
    raise RuntimeError(
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

Do correct me if I'm wrong, but I have already set distributed to False, so why do the errors still reference the distributed code paths?

Here is the change I made to argparse:
parser.add_argument('--distributed', default=False, action='store_true')
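A possible explanation (an assumption, not the authors' confirmed fix): per the traceback, _dequeue_and_enqueue in models/HAMMER.py calls concat_all_gather unconditionally, so the flag never reaches that code path; also, with action='store_true' the default only applies when the flag is absent, so a --distributed still present in train.sh would flip it back to True. A minimal sketch of a guard that makes the gather a no-op when no process group exists:

    import torch

    @torch.no_grad()
    def concat_all_gather(tensor):
        # Without an initialized process group (single-GPU, non-distributed
        # run), just return the local tensor instead of gathering across ranks.
        if not (torch.distributed.is_available()
                and torch.distributed.is_initialized()):
            return tensor
        tensors_gather = [torch.ones_like(tensor)
                          for _ in range(torch.distributed.get_world_size())]
        torch.distributed.all_gather(tensors_gather, tensor, async_op=False)
        return torch.cat(tensors_gather, dim=0)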

AttributeError: 'list' object has no attribute 'size'

Hi, thanks for your wonderful work!
But when I ran train.sh, I encountered the AttributeError in the title.
I checked the type before and inside model.forward(): before forward(), the type of text.input_ids is correct, i.e., torch.LongTensor, but inside forward() it changes to a list. Can you help me find where the mistake is? Thank you very much!
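As a reference point, a minimal sketch of a tokenizer call that keeps input_ids as a LongTensor end-to-end (assuming the stock HuggingFace tokenizer; the repo's actual call and max_length may differ):

    import torch
    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    text_input = tokenizer(['a news caption'], padding='longest',
                           truncation=True, max_length=128,
                           return_tensors='pt')  # without this, ids are lists
    assert torch.is_tensor(text_input.input_ids)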

test videos

Can we test our own videos with it? And how should a test video be pre-processed to produce the metadata.json file?
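Not an official answer, but a minimal sketch of writing one metadata entry for your own image-text pair, copying the field names visible in the "About Text_Swap" samples above (the exact schema is an assumption; all values here are illustrative placeholders):

    import json

    entry = {
        "id": 0,
        "image": "my_data/frames/000001.jpg",  # e.g. a frame from your video
        "text": "caption describing this frame",
        "fake_cls": "orig",
        "fake_image_box": [],
        "fake_text_pos": [],
        "mtcnn_boxes": [],  # face boxes, e.g. from an MTCNN detector
    }

    with open("metadata.json", "w") as f:
        json.dump([entry], f, indent=2)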

Visualization Results

I have read your paper and code and was deeply impressed, but I had some difficulty reproducing your visualizations. How do you produce them? I tried the grad-cam module and found it difficult to integrate into this project.
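In the absence of an official visualization script, a generic sketch of grabbing an attention map with a forward hook and plotting it as a heatmap. The stand-in layer below is hypothetical; in practice you would register the same hook on a cross-attention module inside the trained HAMMER model, and the 14x14 ViT patch grid is an assumption:

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    # Stand-in for one attention block of the real model.
    layer = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)

    captured = {}
    def grab(module, inputs, output):
        # nn.MultiheadAttention returns (attn_output, attn_weights); the
        # weights are averaged over heads, shape (batch, queries, keys).
        captured['attn'] = output[1].detach()

    handle = layer.register_forward_hook(grab)
    x = torch.randn(1, 197, 64)  # CLS token + 14x14 = 196 image patches
    layer(x, x, x)
    handle.remove()

    # CLS query's attention over the patch grid, plotted as a heatmap.
    heat = captured['attn'][0, 0, 1:].reshape(14, 14)
    plt.imshow(heat.numpy(), cmap='jet')
    plt.savefig('attention.png')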

Data download

Hi, nice work. But I always get an error when downloading the dataset directly from Microsoft 365. Is there another way to download the data? Thanks a lot.

download datasets

The dataset download link has been removed. Could you share a new one? Thank you!

train.sh not running

How do I run train.sh? In VS Code it reports that the command is not found, but if I run it in Git Bash it shows:
$ sh train.sh
Traceback (most recent call last):
  File "train.py", line 18, in <module>
    import torch.nn as nn
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\modules\__init__.py", line 1, in <module>
    from .module import Module
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\modules\module.py", line 7, in <module>
    from ..parameter import Parameter
  File "C:\Users\athen\anaconda3\envs\DGM4\lib\site-packages\torch\nn\parameter.py", line 2, in <module>
    from torch._C import _disabled_torch_function_impl
ModuleNotFoundError: No module named 'torch._C'

I'm running this on a normal i5 CPU.

How to get the region of FA?

Hi, I was wondering how to get the region of FA. In the paper's Section 3.2 (Face Attribute Manipulation), the authors mention, "we first predict ..... using GAN-based methods". Can I understand this as follows: the authors first apply an expression detector to the face to get the expression region (e.g., a smiling mouth), then employ StyleCLIP to modify the expression of the face, and finally replace the original expression region with the modified one?

How many face attributes does the dataset provide (e.g., "smile to angry")? What is the expression detector? And is the replacement process as simple as copy and paste?

Thanks!
