yyk-wew / f3net

Pytorch implementation of F3Net (ECCV 2020 F3Net: Frequency in Face Forgery Network)

f3net's Issues

Improper equals sign causes serious problem

In the function generate_filter, the code should be revised to
“return [[0. if i + j > end or i + j < start else 1. for j in range(size)] for i in range(size)]”.
In the original version, the code is
“return [[0. if i + j > end or i + j <= start else 1. for j in range(size)] for i in range(size)]”, which sets the weight of point (0,0) to 0. This error causes the DC component to disappear from the filter matrix.
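The difference between the two comparisons can be checked directly. The sketch below copies the two expressions quoted in this issue into two functions; the function names and the (start, end, size) argument order are assumptions made for illustration, not taken from the repository.

```python
def generate_filter_original(start, end, size):
    # Expression quoted as the original: `<=` drops coefficients with i + j == start.
    return [[0. if i + j > end or i + j <= start else 1. for j in range(size)]
            for i in range(size)]


def generate_filter_fixed(start, end, size):
    # Expression proposed in this issue: `<` keeps coefficients with i + j == start.
    return [[0. if i + j > end or i + j < start else 1. for j in range(size)]
            for i in range(size)]


size = 4
original = generate_filter_original(0, 2, size)
fixed = generate_filter_fixed(0, 2, size)

# Count positions where the two masks disagree (only i + j == start can differ).
n_diff = sum(original[i][j] != fixed[i][j] for i in range(size) for j in range(size))

print(original[0][0], fixed[0][0], n_diff)  # 0.0 1.0 1
```

One side effect worth checking before adopting the change: with `<`, a coefficient whose i + j equals an edge shared by two adjacent bands is kept by both bands, because the upper edge is already inclusive.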

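Can anybody reproduce the results in the paper?

The AUC under the Original mode is higher than under all the other modes. Does that mean this paper cannot be reproduced?
I ran this code myself and got results similar to those in the README.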

About testing!

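Thanks for your reimplementation! I have a few questions. I extracted the first 300 frames of each video for training. With batchsize=2, when execution reaches auc, r_acc, f_acc = evaluate(model, dataset_path, mode='valid'), evaluating the model takes 15 minutes and then returns Process finished with exit code -1 without any error message. How long does each evaluation take for you?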

Nice Work!

Hi, I wonder whether you can reproduce the same results on the dataset (especially on the Low Quality data)?

About the DCT Transform Matrix

F3Net/models.py

Lines 228 to 230 in 01942ca

    
    def DCT_mat(size):
        m = [[ (np.sqrt(1./size) if i == 0 else np.sqrt(2./size)) * np.cos((j + 0.5) * np.pi * i / size) for j in range(size)] for i in range(size)]
        return m

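The formula in the code above seems to differ from the one in the MathWorks link.
I think the leading parentheses in the code should be removed, as follows: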

def DCT_mat(size):
    m = [[ np.sqrt(1./size) if i == 0 else np.sqrt(2./size) * np.cos((j + 0.5) * np.pi * i / size) for j in range(size)] for i in range(size)]
    return m
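The two variants can be compared numerically. In Python a conditional expression binds more loosely than `*`, so the version without parentheses parses as `a if i == 0 else (b * cos(...))`; and since cos(0) = 1 in row i = 0, both versions should produce the same matrix. The sketch below checks this and also checks orthonormality. The function names are made up for the comparison.

```python
import numpy as np


def dct_mat_repo(size):
    # Version quoted from models.py: the scale factor multiplies the cosine in every row.
    return [[(np.sqrt(1. / size) if i == 0 else np.sqrt(2. / size))
             * np.cos((j + 0.5) * np.pi * i / size)
             for j in range(size)] for i in range(size)]


def dct_mat_proposed(size):
    # Version proposed in this issue: row 0 is the constant sqrt(1/size).
    return [[np.sqrt(1. / size) if i == 0
             else np.sqrt(2. / size) * np.cos((j + 0.5) * np.pi * i / size)
             for j in range(size)] for i in range(size)]


size = 8
a = np.array(dct_mat_repo(size))
b = np.array(dct_mat_proposed(size))

same = bool(np.allclose(a, b))                         # identical matrices
orthonormal = bool(np.allclose(a @ a.T, np.eye(size)))  # D @ D.T == I

print(same, orthonormal)  # True True
```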

Hello, could you share your FF++ dataset with me? My download is too slow.

About training!

Thanks for your nice work. @yyk-wew
But I ran into some problems when training this network. I downloaded the FF++ dataset and tried to train on it. Unfortunately, on the raw videos, the test result is only about 0.7. So I have two questions:
1: I noticed that the authors set the batch size to 128 and trained for 150k iterations. In my experiment, the batch size is 64 and I trained for 40k iterations. Could you offer some details about training? Do I really need to train for that long?
2: About the dataset. In the original paper, the authors mention that the real videos are augmented four times and that the face is cropped in each video. But I am not sure whether my own way of preprocessing the videos is right. Could you share some information about this?
Thanks a lot again.

If possible, could you share your training config?

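Thanks for your code. While trying to reproduce your results, I would like to know the config you trained with: the optimizer (your implementation uses Adam, while the paper uses SGD with cosine annealing), the scheduler (the code does not seem to have one, while the paper uses cosine annealing), the batch size, and the number of epochs (the paper only mentions iterations, while your code seems to be controlled by epochs). Could you share these? Thanks!
Also, looking at evaluate in your code, it seems to shuffle the frames of each video and then take the first 50. Wouldn't that make the result fluctuate slightly from one evaluate run to the next?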
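If the evaluation does shuffle frames before taking the first 50, one way to make repeated runs comparable is to draw the sample from a seeded generator. This is a minimal sketch under that assumption; `sample_frames`, its parameters, and the file names are hypothetical and not part of the repository.

```python
import random


def sample_frames(frame_paths, k=50, seed=0):
    """Return the same k frames for the same input set and seed."""
    frames = sorted(frame_paths)   # fix the starting order first
    rng = random.Random(seed)      # local generator, global random state untouched
    rng.shuffle(frames)
    return frames[:k]


frames = [f"frame_{i:04d}.png" for i in range(300)]
picked = sample_frames(frames)
print(len(picked))  # 50
```

Using a different seed per video (for example derived from the video name) would keep the sample varied across videos while still being repeatable.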

If the image width and height differ, how should the DCT and IDCT be done?
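For a non-square input, the separable 2D DCT can use one transform matrix per axis: an H×H matrix on the left and a W×W matrix on the right. The sketch below uses NumPy and the same matrix formula as DCT_mat quoted in this thread; `dct2` and `idct2` are illustrative names, and this is not how the repository (which uses a single square size) does it.

```python
import numpy as np


def dct_mat(size):
    # Orthonormal DCT-II matrix, same formula as DCT_mat quoted in this thread.
    return np.array([[(np.sqrt(1. / size) if i == 0 else np.sqrt(2. / size))
                      * np.cos((j + 0.5) * np.pi * i / size)
                      for j in range(size)] for i in range(size)])


def dct2(x):
    # x: [..., H, W]; transform rows with an HxH matrix and columns with a WxW one.
    h, w = x.shape[-2:]
    return dct_mat(h) @ x @ dct_mat(w).T


def idct2(x_freq):
    # The matrices are orthonormal, so the inverse is the transpose.
    h, w = x_freq.shape[-2:]
    return dct_mat(h).T @ x_freq @ dct_mat(w)


x = np.random.default_rng(0).random((3, 6, 10))  # 3 channels, H=6, W=10
x_freq = dct2(x)
x_rec = idct2(x_freq)
print(x_freq.shape, bool(np.allclose(x, x_rec)))
```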

Would you mind offering the trained weight of F3Net on low quality?

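Hello, I am running your code on my own split of FF++, sampling about 60 frames per video. After 17k iterations the AUC of the Both model still shows no clear improvement and is stuck around 0.75. Is this related to the dataset size? If convenient, could you provide your training configuration and the final saved model weights?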

Dataset split need help!!!

Hi, yyk. Thanks for your code.
I'm just getting started in this field. Could you share the split code with me? :)

test

How can I use the pretrained model to test?
I use
model = Trainer(gpu_ids, mode, pretrained_path)
model.load(pretrained_path)
but I get the error
'NoneType' object has no attribute 'forward'

thanks

About the balance of true and false video proportion

The paper mentions that real videos are augmented to balance the ratio of real to fake samples. Did you do the same in the data augmentation section?

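Will some fake data not be trained on?

Hello, while reading your code I found that you read real and fake data through two separate dataloaders and concatenate them within a batch to achieve balancing. There is one part of the code I don't understand and hope you can explain. In train.py, I see that len_dataloader = dataloader_real.__len__(). Won't this mean that, inside your while loop, the extra fake data never gets trained on? (Because real:fake is 1:1.)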
    while i < len_dataloader:

        i += 1
        model.total_steps += 1

        try:
            data_real = real_iter.next()
            data_fake = fake_iter.next()
        except StopIteration:
            break
        # -------------------------------------------------
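The effect asked about here can be illustrated with a toy simulation. It assumes that both loaders are reshuffled at the start of every epoch, which the snippet above does not show; the batch names and `balanced_epoch` are hypothetical.

```python
import random


def balanced_epoch(real, fake, rng):
    """Pair one real batch with one fake batch until the shorter list runs out."""
    real, fake = real[:], fake[:]
    rng.shuffle(real)   # assumption: loaders reshuffle every epoch
    rng.shuffle(fake)
    return list(zip(real, fake))  # zip stops at the shorter sequence


real_batches = [f"real_{i}" for i in range(4)]
fake_batches = [f"fake_{i}" for i in range(16)]
rng = random.Random(0)

first_epoch = balanced_epoch(real_batches, fake_batches, rng)

# Track which fake batches are ever used across several epochs.
seen_fake = set()
for _epoch in range(20):
    seen_fake.update(f for _r, f in balanced_epoch(real_batches, fake_batches, rng))

print(len(first_epoch), len(seen_fake), len(fake_batches))
```

Within a single epoch only as many fake batches as real batches are consumed. Under the reshuffling assumption, different fake batches are drawn in later epochs, so coverage grows over time; without reshuffling, the same subset would be reused.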

About dataset split!

The FaceForensics++ dataset offers an official dataset split, but I am confused by the performance of Xception on the official split, which is only about 75% ~ 80% on c40. However, with the same training code, if I train Xception on my own split (72%-14%-14%, where for example FaceSwap/000_001.mp4 and NeuralTextures/000_001.mp4 might end up in the train set and the val set respectively), the performance is close to the results reported in all those papers, about 86% ~ 89%. So could you please tell me which split you are using in this repo?
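The gap described above is consistent with leakage: if the same source pair appears under different manipulation methods in both train and val, the validation score can be optimistic. A minimal sketch of a split that keeps every video built from the same ID pair in one subset is shown below. The path layout `Method/AAA_BBB.mp4`, the ratios, and the function names are assumptions for illustration; this is not the official split.

```python
import random
from collections import defaultdict


def pair_key(path):
    # "FaceSwap/000_001.mp4" -> ("000", "001"); "001_000" maps to the same key.
    stem = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return tuple(sorted(stem.split("_")))


def split_by_pair(video_paths, ratios=(0.72, 0.14, 0.14), seed=0):
    groups = defaultdict(list)
    for path in video_paths:
        groups[pair_key(path)].append(path)

    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(len(keys) * ratios[0])
    n_val = int(len(keys) * ratios[1])
    parts = {
        "train": keys[:n_train],
        "val": keys[n_train:n_train + n_val],
        "test": keys[n_train + n_val:],
    }
    # Every path of a given ID pair lands in exactly one subset.
    return {name: [p for k in ks for p in groups[k]] for name, ks in parts.items()}


ids = [(f"{2 * i:03d}", f"{2 * i + 1:03d}") for i in range(10)]
paths = [f"{m}/{a}_{b}.mp4"
         for m in ("FaceSwap", "NeuralTextures")
         for x, y in ids
         for a, b in ((x, y), (y, x))]
splits = split_by_pair(paths)
print({name: len(items) for name, items in splits.items()})
```

Real videos named by a single ID (for example 000.mp4) get their own key in this sketch, so they would still need to be mapped to the pair they belong to.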

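Hello, I have some questions about the code

Hello, I am a telecommunications student and my coding skills are fairly weak. I ran into some problems while reproducing details of the paper, such as the DCT plots of the FAD module. Do you have WeChat? I would like to ask you a few questions.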

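FAD branch

Hello, I would like to ask about the FAD branch. The paper says that three base filters and three learnable filters are defined to process the input x, but in the code I see that you define four filters, low_filter, middle_filter, high_filter, and all_filter, so the output list has four elements.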

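Some questions about real video augmentation

Hello, thank you very much for your work! The paper says that the real video samples are augmented four times: 'The size of the real videos is augmented four times to solve, category imbalance between the real and fake data.' Did you try data augmentation such as rotation or random cropping in your reimplementation?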

Some question about the test result

Thanks for your work. @yyk-wew
Because the FF++ dataset is too big to download, I downloaded Celeb-DF and organized the dataset as described in the README to train the model.
I set the max epoch to 25 and then got a strange result:

[2021-04-18 01:00:02,327][DEBUG] (Test @ epoch 24) auc: 0.9596398896802123, r_acc: 0.8729227761485826, f_acc:0.9508928571428571
[2021-04-18 01:00:14,197][DEBUG] loss: 0.001041474868543446 at step: 40960
[2021-04-18 01:00:26,063][DEBUG] loss: 0.00437780749052763 at step: 41000
[2021-04-18 01:00:37,918][DEBUG] loss: 0.00041091835009865463 at step: 41040
[2021-04-18 01:00:49,778][DEBUG] loss: 0.0002549117198213935 at step: 41080
[2021-04-18 01:01:14,447][DEBUG] (Val @ epoch 24) auc: 0.9573432691169508, r_acc: 0.8555871212121212, f_acc:0.9265625
[2021-04-18 01:01:29,453][DEBUG] (Test @ epoch 24) auc: 0.9628076560536238, r_acc: 0.8797653958944281, f_acc:0.9419642857142857
[2021-04-18 01:01:39,855][DEBUG] loss: 0.00019928392430301756 at step: 41120
[2021-04-18 01:01:51,724][DEBUG] loss: 0.00016936950851231813 at step: 41160
[2021-04-18 01:02:03,599][DEBUG] loss: 0.00015224494563881308 at step: 41200
[2021-04-18 01:02:15,466][DEBUG] loss: 0.0005151446093805134 at step: 41240
[2021-04-18 01:02:33,977][DEBUG] (Test @ epoch 25) auc: 0.4082194351347577, r_acc: 0.458455522971652, f_acc:0.49375

The test result at epoch 25 is quite different from what I expected.
I selected the 'Both' mode to train the model.
I wonder what the reason is. Or should the epoch 24 result be taken as the final one? Thank you.

About transforms.Normalize

Hi!
I found that your code (util.py) does not use transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) to normalize the raw data. Does the paper really not need this operation, or was it forgotten in the code?
Thank you for your reply!

about dataset

Excuse me, did the author train on the LQ dataset and then test on LQ, or train on the HQ dataset and then test on LQ?
thanks!!!

Where can I download the FF++ dataset?

Where can I download the FF++ dataset? Thanks.

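Three-channel input for FAD and grayscale input for LFS

What confuses me is that in FAD the DCT is applied to the 3-channel image:
x_freq = self._DCT_all @ x @ self._DCT_all_T # [N, 3, 299, 299]

but in LFS the input is first converted to grayscale:
x_gray = 0.299*x[:,0,:,:] + 0.587*x[:,1,:,:] + 0.114*x[:,2,:,:]
x = x_gray.unsqueeze(1)

x = (x + 1.) * 122.5

In the paper, both modules should take RGB input. Why does LFS take a grayscale input?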
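For reference, the shapes and value range produced by the quoted LFS lines can be traced with a small NumPy sketch. It assumes inputs scaled to [-1, 1] in RGB channel order, which the quoted snippet suggests but does not state; `to_gray` and `rescale` are illustrative names, and NumPy's `[:, None]` stands in for `unsqueeze(1)`. It does not settle why grayscale was chosen.

```python
import numpy as np


def to_gray(x):
    # x: [N, 3, H, W]; luma weights as in the quoted line (RGB order assumed).
    gray = 0.299 * x[:, 0] + 0.587 * x[:, 1] + 0.114 * x[:, 2]
    return gray[:, None]  # [N, 1, H, W], like unsqueeze(1)


def rescale(x):
    # Quoted line: maps -1 -> 0 and +1 -> 245 (a factor of 127.5 would give 255).
    return (x + 1.) * 122.5


x = np.ones((2, 3, 4, 4))   # brightest possible input under the [-1, 1] assumption
bright = rescale(to_gray(x))
dark = rescale(to_gray(-x))
print(bright.shape, float(bright.max()), float(dark.min()))
```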

Environment

Can you provide a version of the environment?
thanks

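Could these two modules be used for localization?

As the title says: I see that an fc layer is used at the end for classification. If the features were concatenated along the channel dimension, the output kept at the original image size, and fed into a decoder for segmentation, could this localize the tampered regions?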

RuntimeError

Excuse me, I got this error after training on 2070 super, how can I solve it?

Augment True!
Augment True!
Augment True!
Augment True!
Augment True!
[2022-04-28 17:51:57,044][DEBUG] No 0
Traceback (most recent call last):
File "train.py", line 94, in
loss = model.optimize_weight()
File "C:\Users\VMLab\Desktop\F3Net\trainer.py", line 36, in optimize_weight
stu_fea, stu_cla = self.model(self.input)
File "C:\Users\VMLab\Anaconda3\envs\F3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\VMLab\Anaconda3\envs\F3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 165, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\VMLab\Anaconda3\envs\F3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\VMLab\Desktop\F3Net\models.py", line 212, in forward
fea_LFS = self.LFS_head(x)
File "C:\Users\VMLab\Anaconda3\envs\F3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\VMLab\Desktop\F3Net\models.py", line 116, in forward
y = torch.log10(y + 1e-15)
RuntimeError: CUDA out of memory. Tried to allocate 34.00 MiB (GPU 0; 8.00 GiB total capacity; 1.24 GiB already allocated; 4.80 GiB free; 1.33 GiB reserved in total by PyTorch)

About the distribution of the DCT power spectrum

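Hello!
In the paper I saw the operation that flattens the 2D DCT spectrum into 1D (Figure 3). My current research also needs this operation, but I ran into a problem: the resulting 1D spectrum fluctuates sharply and is not smooth. Have you encountered a similar problem? If convenient, could you share the relevant code or your experience?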
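One possible way to flatten a 2D DCT spectrum into 1D is to average the power of all coefficients that share the same frequency band index i + j. The sketch below does this in NumPy. It is an assumption about what such a profile could look like, not the procedure used for the figure in the paper; `dct_band_profile` is a hypothetical name.

```python
import numpy as np


def dct_mat(size):
    # Orthonormal DCT-II matrix, same formula as DCT_mat quoted in this thread.
    return np.array([[(np.sqrt(1. / size) if i == 0 else np.sqrt(2. / size))
                      * np.cos((j + 0.5) * np.pi * i / size)
                      for j in range(size)] for i in range(size)])


def dct_band_profile(img):
    """Mean power per band i + j of the 2D DCT of a single-channel image [H, W]."""
    h, w = img.shape
    coeff = dct_mat(h) @ img @ dct_mat(w).T
    power = coeff ** 2
    band = np.add.outer(np.arange(h), np.arange(w)).ravel()  # band index per coefficient
    sums = np.bincount(band, weights=power.ravel())
    counts = np.bincount(band)
    return sums / counts                                     # length h + w - 1


profile = dct_band_profile(np.ones((4, 6)))  # constant image: all power in the DC band
print(profile.shape)
```

A profile from a single image is typically noisy. Averaging the per-band power over many images before taking a logarithm is one thing to try if the curve fluctuates sharply.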

Some questions about Test Results

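In the Result section of your README.md, especially in the Valid(Mine) and Test(Mine) columns, the Baseline performs better than the models with FAD and LFS added. Why does this happen? Does it mean that FAD and LFS are not having the effect they should?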
