supernotman / retinaface_pytorch

Reimplement RetinaFace with Pytorch

Python 100.00%

retinaface_pytorch's Issues

Landmark won't converge

I am currently trying to train this myself with Caffe, but the landmark regression is very poor. Do you have any experience or tips to share? 🙏

Model file hopenet_robust_alpha1.pkl cannot be downloaded

The Google download always fails. Could you upload it to Baidu?

Why not use the mmdetection framework?

It has DCNv2 and more.

The annotation link is not available. Could you update it?

Please help: I get this error when I try to run train.py
Evaluating epoch 0
0%| | 0/3226 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 151, in
main()
File "train.py", line 136, in main
recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
File "C:\Desktop\RetinaFace_super\eval_widerface.py", line 74, in evaluate
for data in tqdm(iter(val_data)):
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\tqdm\std.py", line 1099, in iter
for obj in iterable:
File "C:AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data_utils\worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data_utils\worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "C:\Desktop\RetinaFace_super\dataloader.py", line 348, in getitem
annotation[0,0] = label[0] # x1
ValueError: setting an array element with a sequence.

0%| | 0/3226 [00:00<?, ?it/s]

Evaluation problem

Hello,
I am trying to run your code, but I hit a problem that I cannot solve.
Can you please help me?
I downloaded the WIDER FACE dataset as you explained and ran this command on Windows:
set CUDA_VISIBLE_DEVICES=0 & python train.py --data_path dataset/widerface --batch 1 --save_path ./out
but I get this problem:
Namespace(batch=1, data_path='dataset/widerface', depth=50, epochs=1, eval_step=3, img_size=512, save_path='./out', save_step=10, shuffle=True, verbose=10)
Traceback (most recent call last):
File "train.py", line 151, in
main()
File "train.py", line 55, in main
dataset_val = ValDataset(val_path,transform=transforms.Compose([RandomCroper()]))
File "C:\Desktop\RetinaFace_super\dataloader.py", line 332, in init
label = [float(x) for x in line]
File "C:\Desktop\RetinaFace_super\dataloader.py", line 332, in
label = [float(x) for x in line]
ValueError: could not convert string to float: '/24--Soldier_Firing/24_Soldier_Firing_Soldier_Firing_24_329.jpg'

When I replace the val images with the train images, training starts, but then I get this error:

---- [Epoch 0/1, Batch 12870/12880] ----
+----------------+---------------------+
| loss name | value |
+----------------+---------------------+
| total_loss | 2.6635076999664307 |
| classification | 1.5447975397109985 |
| bbox | 0.34370726346969604 |
| landmarks | 0.7750030159950256 |
+----------------+---------------------+
-------- RetinaFace Pytorch --------
Evaluating epoch 0
0%| | 0/12880 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 151, in
main()
File "train.py", line 136, in main
recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
File "C:\Desktop\RetinaFace_super\eval_widerface.py", line 74, in evaluate
for data in tqdm(iter(val_data)):
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\tqdm\std.py", line 1099, in iter
for obj in iterable:
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data_utils\worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data_utils\worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "C:\Desktop\RetinaFace_super\dataloader.py", line 347, in getitem
annotation[0,0] = label[0] # x1
ValueError: setting an array element with a sequence.

0%|

I don't know what is going on.
I would really appreciate your help.

Have you ever used another backbone?

Thanks for your great job!
I used MobileNet V1 0.25 to replace your ResNet; however, I found it really hard to get it to converge.
Although the loss was quite low even in the first several epochs, it just stayed that way forever.
Have you tried any other lightweight backbone with your code? Could you share some details of your training?
Also, I am trying to increase the number of landmarks to 68 using the 300W dataset with your code. Have you ever tried that?
Thanks!

About data augmentation

Hello
Which data augmentation did you use in your actual training? I saw several methods that you had commented out, but I am not sure which ones you actually used.

BTW, many of them do not work and have bugs.

For example,

add this at line 297 in dataloader.py

pad = torch.from_numpy(np.array(pad))
before this line:
padded_img = F.pad(img, pad, "constant", value=0)

Otherwise it raises

TypeError: narrow(): argument 'start' (position 2) must be int, not numpy.int64
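
The TypeError above says an index arrived as numpy.int64 where a built-in int was required. As an alternative to wrapping the padding in a tensor, each value could be cast to int before the call. The sketch below is only an illustration: `square_padding` is a hypothetical helper, not a function in dataloader.py, and it assumes the padding is centered and ordered (left, right, top, bottom), which is the order F.pad uses for the last two dimensions.

```python
def square_padding(height, width):
    """Return (left, right, top, bottom) padding that makes an image square.

    Hypothetical helper, not part of dataloader.py. Every value is a built-in
    int, so the tuple can be passed to torch.nn.functional.pad even when
    height/width arrive as numpy.int64.
    """
    diff = abs(int(height) - int(width))
    before, after = diff // 2, diff - diff // 2
    if height <= width:
        # Image is wider than tall: pad top and bottom.
        return (0, 0, before, after)
    # Image is taller than wide: pad left and right.
    return (before, after, 0, 0)
```

It would be used as `padded_img = F.pad(img, square_padding(h, w), "constant", value=0)`, assuming `img` is a channels-first tensor.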

ModuleNotFoundError: No module named 'torchvision.models._utils'

Loading the model fails at inference time?

RuntimeError: Error(s) in loading state_dict for RetinaFace
Missing key(s) in state_dict: "body.conv1.weight", "body.bn1.weight", "body.bn1.bias",....
Unexpected key(s) in state_dict: "module.body.conv1.weight", "module.body.bn1.weight",...
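
The "module." prefix on the unexpected keys is what torch.nn.DataParallel adds to parameter names, so the checkpoint was probably saved from a wrapped model and is being loaded into an unwrapped one. A minimal sketch of one workaround follows; `strip_module_prefix` is a hypothetical helper name, not something the repository provides.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading DataParallel prefix removed."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix at the start of the key; leave other keys alone.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned
```

Usage would look like `model.load_state_dict(strip_module_prefix(torch.load(path)))`. Wrapping the model in torch.nn.DataParallel before calling load_state_dict should also make the keys match.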

Where are the landmark labels?

Hi, I could not find the landmarks in your annotation data. I'm training a model with resnet18, and the landmark loss does not decrease. Are landmarks and bboxes trained separately?

FPS of the model and small faces

Thanks for your work!
Is there any speed test (FPS) for your model? Thanks!

Will you upload your pretrained model?

Great job! Could you upload your pretrained model?
Or could you send it to me by email? Thank you!

About labels and training

Hello,
I am following your instructions to train the network. However, the label file on the website is not in the format your instructions describe. I renamed the bounding box and annotations txt file to label.txt, and the dataloader.py code cannot read it. What is the solution to this problem?
To be clearer, the file from the WIDER FACE website looks like this:

0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0
0--Parade/0_Parade_marchingband_1_799.jpg
21
78 221 7 8 2 0 0 0 0 0
78 238 14 17 2 0 0 0 0 0
113 212 11 15 2 0 0 0 0 0
134 260 15 15 2 0 0 0 0 0
163 250 14 17 2 0 0 0 0 0
201 218 10 12 2 0 0 0 0 0
182 266 15 17 2 0 0 0 0 0

And the output of train.py is:

Traceback (most recent call last):
File "train.py", line 150, in
main()
File "train.py", line 53, in main
dataset_train = TrainDataset(train_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
File "/home/barkntuncer/RetinaFace_Pytorch/dataloader.py", line 45, in init
label = [float(x) for x in line]
File "/home/barkntuncer/RetinaFace_Pytorch/dataloader.py", line 45, in
label = [float(x) for x in line]
ValueError: could not convert string to float: '0--Parade/0_Parade_marchingband_1_849.jpg'
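
The traceback shows the loader calling float() on every token of a line, so it cannot be pointed directly at the official ground-truth file quoted above, where each record is an image path, a face count, and then one row per face. A rough sketch of reading that official layout is below. It assumes the first four numbers of each row are x, y, width and height, and `parse_wider_gt` is a hypothetical name. It does not produce landmarks, so it is not a drop-in replacement for the label.txt the instructions describe.

```python
def parse_wider_gt(lines):
    """Parse WIDER FACE ground-truth text into {image_path: [[x, y, w, h], ...]}."""
    rows = [line.strip() for line in lines if line.strip()]
    boxes_by_image = {}
    i = 0
    while i < len(rows):
        image_path = rows[i]
        count = int(rows[i + 1])
        i += 2
        boxes = []
        for _ in range(count):
            # Keep only the box; the remaining columns are attribute flags.
            x, y, w, h = (float(v) for v in rows[i].split()[:4])
            boxes.append([x, y, w, h])
            i += 1
        # Assumption: an image with zero faces may still carry one all-zero row.
        if count == 0 and i < len(rows) and not rows[i].lower().endswith(".jpg"):
            i += 1
        boxes_by_image[image_path] = boxes
    return boxes_by_image
```

It could be called as `parse_wider_gt(open(gt_path))`, where `gt_path` points at the downloaded bounding-box text file.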

How can I crop an image from video_detect.py?

Allow for dynamic input sizes / anchor sizes

Currently when tracing the model, the following two warnings apply:

/d/dev/RetinaFace_Pytorch/anchors.py:27: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
image_shape = np.array(image_shape)
/d/dev/RetinaFace_Pytorch/anchors.py:40: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
return torch.from_numpy(all_anchors.astype(np.float32)).cuda()

The traced model then uses a hardcoded 640x640 input size and hardcoded anchors, whereas the input size should be dynamic.

No prior_box?

There seems to be no prior_box part in this code. Is it unnecessary?

What is the average inference speed per image?

Has anyone tested the speed on an image with a decent GPU, such as a GTX 1080 Ti?

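Can the input image be of any size?

In detect.py there is a padded-image step. Have you considered that for images that are not 640×640, the face boxes and landmark positions produced by the model after padding and resizing will be offset relative to the original image? Should this offset be corrected when the results are displayed?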
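
One way to correct such an offset is to undo the resize and then subtract the padding. The sketch below assumes the image is first padded to a centered square and then resized to the model input size. That is an assumption about detect.py, not a description of it, and `to_original_coords` is a hypothetical helper.

```python
def to_original_coords(points, orig_h, orig_w, model_size=640):
    """Map (x, y) points from the padded-and-resized input back to the original image.

    Assumes centered square padding followed by a resize to
    model_size x model_size; adjust if the preprocessing differs.
    """
    side = max(orig_h, orig_w)
    scale = side / model_size      # model pixels -> padded-square pixels
    pad_left = (side - orig_w) // 2
    pad_top = (side - orig_h) // 2
    return [(x * scale - pad_left, y * scale - pad_top) for x, y in points]
```

Box corners and the five landmarks are all (x, y) pairs, so the same function would apply to both.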

The pre-trained model

Hi, I can't reproduce the reported precision. Can you give me model_epoch_200.pt? Thanks.

How should the scales be calculated, or should the default (1.0) be used?

Using the same picture at different scales gives different results. The model cannot adapt to different picture sizes, and it cannot detect both small faces and large faces. Is there any idea for dealing with this?

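Is multi-class detection feasible with RetinaFace?

Hello, I am using your code for face detection. I had a sudden idea: I would like to use it to detect human bodies with body keypoints as well as faces with facial landmarks. Is this feasible?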

Fine tune pre-trained model

I was trying to fine-tune a pre-trained model, but I think your current code does not provide this facility. I added a few lines to train.py; have a look at the following code. If you think it should be part of the project, kindly add it in the next commit. Thanks for your good work.


import argparse
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms
from dataloader import TrainDataset, ValDataset, collater, RandomCroper, RandomFlip, Resizer, PadToSquare
from torch.utils.data import Dataset, DataLoader
from terminaltables import AsciiTable, DoubleTable, SingleTable
from tensorboardX import SummaryWriter
from torch.optim import lr_scheduler
import torch.distributed as dist
import eval_widerface
import torchvision
import model
import os
from torch.utils.data.distributed import DistributedSampler
import torchvision_model

def get_args():
    parser = argparse.ArgumentParser(description="Train program for retinaface.")
    parser.add_argument('--data_path', type=str, help='Path for dataset,default WIDERFACE')
    parser.add_argument('--batch', type=int, default=16, help='Batch size')
    parser.add_argument('--epochs', type=int, default=200, help='Max training epochs')
    parser.add_argument('--shuffle', type=bool, default=True, help='Shuffle dataset or not')
    parser.add_argument('--img_size', type=int, default=640, help='Input image size')
    parser.add_argument('--verbose', type=int, default=10, help='Log verbose')
    parser.add_argument('--save_step', type=int, default=10, help='Save every save_step epochs')
    parser.add_argument('--eval_step', type=int, default=3, help='Evaluate every eval_step epochs')
    parser.add_argument('--save_path', type=str, default='./out', help='Model save path')
    parser.add_argument('--depth', help='Resnet depth, must be one of 18, 34, 50, 101, 152', type=int, default=50)
    parser.add_argument('--pretrained_model_path', type=str, default='./out', help='Pre-Trained Model Path')
    args = parser.parse_args()
    print(args)
    return args


def main():
    args = get_args()
    if not os.path.exists(args.save_path):
        os.mkdir(args.save_path)
    log_path = os.path.join(args.save_path,'log')
    if not os.path.exists(log_path):
        os.mkdir(log_path)

    writer = SummaryWriter(log_dir=log_path)

    data_path = args.data_path
    train_path = os.path.join(data_path,'train/label.txt')
    val_path = os.path.join(data_path,'val/label.txt')
    # dataset_train = TrainDataset(train_path,transform=transforms.Compose([RandomCroper(),RandomFlip()]))
    dataset_train = TrainDataset(train_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
    dataloader_train = DataLoader(dataset_train, num_workers=8, batch_size=args.batch, collate_fn=collater,shuffle=True)
    # dataset_val = ValDataset(val_path,transform=transforms.Compose([RandomCroper()]))
    dataset_val = ValDataset(val_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
    dataloader_val = DataLoader(dataset_val, num_workers=8, batch_size=args.batch, collate_fn=collater)
    
    total_batch = len(dataloader_train)

    # Create the model
    # if args.depth == 18:
    #     retinaface = model.resnet18(num_classes=2, pretrained=True)
    # elif args.depth == 34:
    #     retinaface = model.resnet34(num_classes=2, pretrained=True)
    # elif args.depth == 50:
    #     retinaface = model.resnet50(num_classes=2, pretrained=True)
    # elif args.depth == 101:
    #     retinaface = model.resnet101(num_classes=2, pretrained=True)
    # elif args.depth == 152:
    #     retinaface = model.resnet152(num_classes=2, pretrained=True)
    # else:
    #     raise ValueError('Unsupported model depth, must be one of 18, 34, 50, 101, 152')

    # Create torchvision model
    return_layers = {'layer2':1,'layer3':2,'layer4':3}
    retinaface = torchvision_model.create_retinaface(return_layers)


    retinaface = retinaface.cuda()
    retinaface = torch.nn.DataParallel(retinaface).cuda()
    retinaface.training = True
    
    try:
        pretrained_model_path = args.pretrained_model_path
        with open(pretrained_model_path, "rb") as f:
            state_dict = torch.load(f)
        retinaface.load_state_dict(state_dict)
        print("Previous model loaded successfully :)")
    except Exception as e:
        # Report why loading failed instead of hiding the cause
        print("Error while loading previous model :(", e)

    optimizer = optim.Adam(retinaface.parameters(), lr=1e-3)
    # optimizer = optim.SGD(retinaface.parameters(), lr=1e-2, momentum=0.9, weight_decay=0.0005)
    # scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3, verbose=True)
    # scheduler  = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    #scheduler  = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10,30,60], gamma=0.1)

    print('Start to train.')

    epoch_loss = []
    iteration = 0

    for epoch in range(args.epochs):
        retinaface.train()

        # Training
        for iter_num,data in enumerate(dataloader_train):
            optimizer.zero_grad()
            classification_loss, bbox_regression_loss,ldm_regression_loss = retinaface([data['img'].cuda().float(), data['annot']])
            classification_loss = classification_loss.mean()
            bbox_regression_loss = bbox_regression_loss.mean()
            ldm_regression_loss = ldm_regression_loss.mean()

            # loss = classification_loss + 1.0 * bbox_regression_loss + 0.5 * ldm_regression_loss
            loss = classification_loss + bbox_regression_loss + ldm_regression_loss

            loss.backward()
            optimizer.step()
            
            if iter_num % args.verbose == 0:
                log_str = "\n---- [Epoch %d/%d, Batch %d/%d] ----\n" % (epoch, args.epochs, iter_num, total_batch)
                table_data = [
                    ['loss name','value'],
                    ['total_loss',str(loss.item())],
                    ['classification',str(classification_loss.item())],
                    ['bbox',str(bbox_regression_loss.item())],
                    ['landmarks',str(ldm_regression_loss.item())]
                    ]
                table = AsciiTable(table_data)
                log_str +=table.table
                print(log_str)
                # write the log to tensorboard
                writer.add_scalar('losses:',loss.item(),iteration*args.verbose)
                writer.add_scalar('class losses:',classification_loss.item(),iteration*args.verbose)
                writer.add_scalar('box losses:',bbox_regression_loss.item(),iteration*args.verbose)
                writer.add_scalar('landmark losses:',ldm_regression_loss.item(),iteration*args.verbose)
                iteration +=1

        # Eval
        if epoch % args.eval_step == 0:
            print('-------- RetinaFace Pytorch --------')
            print ('Evaluating epoch {}'.format(epoch))
            recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
            print('Recall:',recall)
            print('Precision:',precision)

            writer.add_scalar('Recall:', recall, epoch*args.eval_step)
            writer.add_scalar('Precision:', precision, epoch*args.eval_step)

        # Save model
        if (epoch + 1) % args.save_step == 0 or iter_num>=100:
            torch.save(retinaface.state_dict(), args.save_path + '/model_epoch_{}.pt'.format(epoch + 1))

    writer.close()


if __name__=='__main__':
    main()

Out of memory

How much memory do you estimate this project needs?
I'm using a Titan V with 12GB and this goes out of memory with a batch size of 16 (default was 32), which seems quite small for WIDER face.

I had to use a batch size of 8, which used 10GB.

Question about focal loss

@supernotman Hi, thank you for this great project.
May I ask why you use cross-entropy loss for the classification head rather than focal loss? Focal loss is the key feature of RetinaNet.

About context module

I think there may be a mistake in the channel counts of the context module:

x1 = self.det_conv1(x) # 256 channels
x_ = self.det_context_conv1(x) # 128 channels
x2 = self.det_context_conv2(x_) # 128 channels
x3_ = self.det_context_conv3_1(x_) # 128 channels
x3 = self.det_context_conv3_2(x3_) # 128 channels

After concatenating x1, x2 and x3 I get 512 channels. This is inconsistent with the paper (256 channels).
Am I misunderstanding something?
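
Concatenation adds the channel counts of its inputs, so 256 + 128 + 128 does give the 512 reported above. One split that would concatenate to 256 is a half, a quarter and a quarter. Whether that matches the paper's intent is an assumption here, and `context_branch_channels` is a hypothetical helper used only to show the arithmetic.

```python
def context_branch_channels(out_channels=256):
    """Split out_channels across three branches as 1/2, 1/4 and 1/4.

    Illustrative assumption about the intended widths, not the repository's code.
    """
    if out_channels % 4 != 0:
        raise ValueError("out_channels must be divisible by 4")
    return out_channels // 2, out_channels // 4, out_channels // 4
```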

How to install cpools==0.0.0

Anaconda and pip can't install cpools. Can you help me?

where is the Dense Regression Loss? I can not find it.

Unable to detect rotated faces / landmarks inaccurate

These two test images look better in other implementations of RetinaFace for PyTorch, e.g.
https://github.com/bogireddytejareddy/retinaface-pytorch/blob/master/test_results/t1.jpg
https://github.com/bogireddytejareddy/retinaface-pytorch/blob/master/test_results/t4.jpg

focal_loss = False

        focal_loss = False
        # focal loss
        if focal_loss:
            alpha = 0.25
            gamma = 2.0
            alpha_factor = torch.ones(targets.shape).cuda() * alpha

            alpha_factor = torch.where(torch.eq(targets, 1.), alpha_factor, 1. - alpha_factor)
            focal_weight = torch.where(torch.eq(targets, 1.), 1. - classification, classification)
            focal_weight = alpha_factor * torch.pow(focal_weight, gamma)

            bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification))

            cls_loss = focal_weight * bce

            cls_loss = torch.where(torch.ne(targets, -1.0), cls_loss, torch.zeros(cls_loss.shape).cuda())

            classification_losses.append(cls_loss.sum()/torch.clamp(num_positive_anchors.float(), min=1.0))
        else:
            if positive_indices.sum() > 0:
                classification_losses.append(positive_losses.mean() + sorted_losses.mean())
            else:
                classification_losses.append(torch.tensor(0).float().cuda())

Is focal loss never used???
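
For reference, the focal branch quoted above can be checked by hand with a scalar version. This is a sketch that mirrors the snippet's alpha, gamma and ignore label for a single anchor. The `eps` clamp is an added assumption to keep the logarithm finite and does not appear in the snippet.

```python
import math


def focal_loss_term(p, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one anchor; p is the predicted face probability.

    target is 1 (face), 0 (background) or -1 (ignored), mirroring the snippet.
    eps is an added assumption that keeps log() finite at p == 0 or p == 1.
    """
    if target == -1:
        return 0.0
    p = min(max(p, eps), 1.0 - eps)
    if target == 1:
        # Positive anchor: weight alpha, down-weighted when p is already high.
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    # Negative anchor: weight (1 - alpha), down-weighted when p is already low.
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
```

A confidently correct positive (p close to 1) contributes far less than a badly misclassified one, which is the down-weighting of easy examples that the focal term is meant to provide.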

element 0 of tensors does not require grad and does not have a grad_fn

Thank you for open-sourcing this. I ran into the following problem at epoch 104 of training. Can you help me? Thanks.

Traceback (most recent call last):
File "train.py", line 156, in
main()
File "train.py", line 111, in main
loss.backward()
File "/home/boyun/.conda/envs/retinaface/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/boyun/.conda/envs/retinaface/lib/python3.6/site-packages/torch/autograd/init.py", line 93, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
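
This error is raised when backward() is called on a tensor that is not connected to any parameter. One possible cause, offered only as a guess, is a batch in which every loss term is a freshly created constant, for example when no anchor matches a face. A minimal guard that skips such batches could look like the sketch below; `backward_if_possible` is a hypothetical helper, not part of train.py.

```python
import torch


def backward_if_possible(loss, optimizer):
    """Run one optimisation step, skipping losses that carry no graph.

    Returns True if a step was taken. Hypothetical helper, not part of train.py.
    """
    if not loss.requires_grad:
        # Every loss term was a constant, so there is nothing to backpropagate.
        return False
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return True
```

Skipping the batch hides the symptom only. If it happens often, the label matching for those images is worth inspecting.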

Have you tested on widerface val?

I got the following results with image size (1200, 1200):
Easy Val AP: 0.721983363755764
Medium Val AP: 0.742308954563704
Hard Val AP: 0.6196879642610857

Is there something wrong?
