supernotman / retinaface_pytorch
Reimplement RetinaFace with Pytorch
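I am currently trying to train with Caffe myself, but the landmark regression is very poor. Do you have any experience or tips to share? 🙏
The Google download always fails. Could you upload it to Baidu?
It has DCNv2, and more.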
hello everyone
Please, I need help. I get this error when I try to run train.py:
Evaluating epoch 0
0%| | 0/3226 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 136, in main
    recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
  File "C:\Desktop\RetinaFace_super\eval_widerface.py", line 74, in evaluate
    for data in tqdm(iter(val_data)):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\tqdm\std.py", line 1099, in __iter__
    for obj in iterable:
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 348, in __getitem__
    annotation[0,0] = label[0] # x1
ValueError: setting an array element with a sequence.
0%| | 0/3226 [00:00<?, ?it/s]
Hello,
I tried to run your code but hit a problem that I can't find a solution to.
Can you please help me?
I downloaded the WIDER FACE dataset as you explained and ran this command on Windows:
set CUDA_VISIBLE_DEVICES=0 & python train.py --data_path dataset/widerface --batch 1 --save_path ./out
but I get this problem:
Namespace(batch=1, data_path='dataset/widerface', depth=50, epochs=1, eval_step=3, img_size=512, save_path='./out', save_step=10, shuffle=True, verbose=10)
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 55, in main
    dataset_val = ValDataset(val_path,transform=transforms.Compose([RandomCroper()]))
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 332, in __init__
    label = [float(x) for x in line]
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 332, in <listcomp>
    label = [float(x) for x in line]
ValueError: could not convert string to float: '/24--Soldier_Firing/24_Soldier_Firing_Soldier_Firing_24_329.jpg'
When I replace the val images with the train images, training starts, but then I get this error:
---- [Epoch 0/1, Batch 12870/12880] ----
+----------------+---------------------+
| loss name | value |
+----------------+---------------------+
| total_loss | 2.6635076999664307 |
| classification | 1.5447975397109985 |
| bbox | 0.34370726346969604 |
| landmarks | 0.7750030159950256 |
+----------------+---------------------+
-------- RetinaFace Pytorch --------
Evaluating epoch 0
0%| | 0/12880 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 136, in main
    recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
  File "C:\Desktop\RetinaFace_super\eval_widerface.py", line 74, in evaluate
    for data in tqdm(iter(val_data)):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\tqdm\std.py", line 1099, in __iter__
    for obj in iterable:
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\AppData\Local\Continuum\anaconda3\envs\supernotman\lib\site-packages\torch\utils\data\_utils\worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "C:\Desktop\RetinaFace_super\dataloader.py", line 347, in __getitem__
    annotation[0,0] = label[0] # x1
ValueError: setting an array element with a sequence.
0%|
I don't know what's going on.
I would really appreciate your help.
Thanks for your great job!
I used MobileNet V1 0.25 to replace your ResNet; however, I found it really hard to get it to converge.
Although the loss was quite low even in the first several epochs, it just stayed that way forever.
Have you tried other lightweight backbones with your code? Could you share some details of your training?
Also, I am trying to increase the number of landmarks to 68 using the 300W dataset with your code. Have you ever tried that?
Thanks!
Hello
Which data augmentation did you use in your actual training? I saw several methods that you had commented out, but I'm not sure which ones you actually used.
BTW, many of them are not working and have bugs.
For example, add this at line 297 in dataloader.py:
pad = torch.from_numpy(np.array(pad))
before this:
padded_img = F.pad(img, pad, "constant", value=0)
Otherwise it will show:
TypeError: narrow(): argument 'start' (position 2) must be int, not numpy.int64
RuntimeError: Error(s) in loading state_dict for RetinaFace
Missing key(s) in state_dict: "body.conv1.weight", "body.bn1.weight", "body.bn1.bias",....
Unexpected key(s) in state_dict: "module.body.conv1.weight", "module.body.bn1.weight",...
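The "module." prefix on the unexpected keys is what torch.nn.DataParallel adds when the state_dict of a wrapped model is saved, so the checkpoint and the unwrapped model disagree only in key names. A minimal sketch of one common workaround follows, assuming the checkpoint is a plain mapping of parameter names to tensors. The helper name strip_module_prefix is hypothetical and is not part of this repository.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper: keys without the prefix are copied unchanged.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = value
    return cleaned


# Stand-in values for illustration; a real checkpoint would hold tensors.
saved = OrderedDict([
    ("module.body.conv1.weight", 1),
    ("module.body.bn1.bias", 2),
])
print(list(strip_module_prefix(saved)))
```

With PyTorch available, the result can be passed to load_state_dict on the unwrapped model, for example model.load_state_dict(strip_module_prefix(torch.load(path))). The alternative is to wrap the model in torch.nn.DataParallel before loading, so that its keys carry the same prefix.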
Hi, I could not find the landmarks in your annotation data. I'm training a model with ResNet-18, and the landmark loss does not decline. Are the landmarks and bboxes trained separately?
Thanks for your work!
Is there any speed test (FPS) for your model? Thanks!
Great job! And could you upload your pretrained model?
Or could you send me by mail? Thank you!
Hello,
I am following your instructions to train the network. However, the label file on the website is not like the one you described in the instructions. I renamed the bounding-box annotation txt file to label.txt, but dataloader.py cannot read it. What is the solution to this problem?
To be clearer, the file from the WIDER FACE website looks like this:
0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0
0--Parade/0_Parade_marchingband_1_799.jpg
21
78 221 7 8 2 0 0 0 0 0
78 238 14 17 2 0 0 0 0 0
113 212 11 15 2 0 0 0 0 0
134 260 15 15 2 0 0 0 0 0
163 250 14 17 2 0 0 0 0 0
201 218 10 12 2 0 0 0 0 0
182 266 15 17 2 0 0 0 0 0
And the output of train.py looks like this:
Traceback (most recent call last):
  File "train.py", line 150, in <module>
    main()
  File "train.py", line 53, in main
    dataset_train = TrainDataset(train_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
  File "/home/barkntuncer/RetinaFace_Pytorch/dataloader.py", line 45, in __init__
    label = [float(x) for x in line]
  File "/home/barkntuncer/RetinaFace_Pytorch/dataloader.py", line 45, in <listcomp>
    label = [float(x) for x in line]
ValueError: could not convert string to float: '0--Parade/0_Parade_marchingband_1_849.jpg'
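The traceback shows dataloader.py calling float() on a file-name line, so the loader apparently expects a different layout from the official annotation file quoted above. The layout it does expect is not shown in this thread, so the sketch below only covers the first half of a conversion: reading the official-style file (image path, face count, then one row per face starting with x, y, w, h) into a dictionary. The function name parse_wider_gt is hypothetical.

```python
def parse_wider_gt(lines):
    """Group official-style WIDER FACE annotation lines by image.

    Returns {image_path: [(x, y, w, h), ...]}. Only the first four fields
    of each face row are kept; the attribute flags are ignored.
    """
    boxes_by_image = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.lower().endswith((".jpg", ".jpeg", ".png")):
            # A new image entry starts here.
            current = line
            boxes_by_image[current] = []
            continue
        fields = line.split()
        if len(fields) == 1:
            continue  # face-count line; the rows themselves are counted instead
        if current is None:
            raise ValueError("face row found before any image path")
        x, y, w, h = (float(v) for v in fields[:4])
        if w > 0 and h > 0:  # skip degenerate or placeholder boxes
            boxes_by_image[current].append((x, y, w, h))
    return boxes_by_image


sample = """0--Parade/0_Parade_marchingband_1_849.jpg
1
449 330 122 149 0 0 0 0 0 0
0--Parade/0_Parade_Parade_0_904.jpg
1
361 98 263 339 0 0 0 0 0 0
""".splitlines()

parsed = parse_wider_gt(sample)
for path, boxes in parsed.items():
    print(path, boxes)
```

Writing the parsed boxes back out in whatever layout label.txt requires is left open here, since that depends on the repository's instructions. The official file also carries no landmark coordinates, so a landmark loss would need annotations from another source.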
Currently when tracing the model, the following two warnings apply:
/d/dev/RetinaFace_Pytorch/anchors.py:27: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
image_shape = np.array(image_shape)
/d/dev/RetinaFace_Pytorch/anchors.py:40: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
return torch.from_numpy(all_anchors.astype(np.float32)).cuda()
The model is then using a hardcoded 640x640 input size and anchors whereas the input size should be dynamic.
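There seems to be no prior_box part in this code. Is it unnecessary?
Has anyone tested the speed on an image with a decent GPU, such as a GTX 1080 Ti?
In detect.py there is a padded-image step. Have you considered that, for images that are not 640×640, the face boxes and landmarks produced after padding and resizing the input will be offset relative to the original image? Should this offset be corrected when displaying the results?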
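On the question of mapping detections from the padded, resized input back to the original image, the correction is an inverse scale plus, if the padding is centered, a subtraction of the pad offset. The sketch below assumes the image is padded to a square with side max(width, height) and then resized to input_size. How detect.py actually pads is not shown in this thread, so the centered flag and the function name to_original_coords are assumptions for illustration.

```python
def to_original_coords(points, orig_w, orig_h, input_size=640, centered=False):
    """Map (x, y) points from the network input back to original-image pixels.

    Assumes: pad to a square of side max(orig_w, orig_h), then resize to
    input_size x input_size. With centered=False the padding is on the
    right/bottom only, so no offset has to be removed.
    """
    side = max(orig_w, orig_h)
    scale = input_size / side            # resize factor applied after padding
    pad_x = (side - orig_w) / 2 if centered else 0.0
    pad_y = (side - orig_h) / 2 if centered else 0.0
    return [(x / scale - pad_x, y / scale - pad_y) for x, y in points]


# A 1280x720 image: the padded square has side 1280, so scale = 0.5.
print(to_original_coords([(320, 160)], 1280, 720))
print(to_original_coords([(320, 160)], 1280, 720, centered=True))
```

Box corners and landmark points can all be passed through the same function, since each is just an (x, y) pair in input coordinates.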
Hi, I can't reproduce the reported precision. Can you give me model_epoch_200.pt? Thanks.
Running the same picture at different scales gives different results. The model can't adapt to different picture sizes, and it can't detect both small and large faces. Is there any idea for dealing with this?
Hello, I'm using your code for face detection. I had a sudden idea: using it to detect human bodies and body keypoints as well as faces and face landmarks. Is this feasible?
I was trying to fine-tune a pre-trained model, but I think your current code does not provide this facility. I added a few lines to train.py; have a look at the following code. If you think it should be part of the project, kindly add it in the next commit. Thanks for your good work.
import argparse
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms
from dataloader import TrainDataset, ValDataset, collater, RandomCroper, RandomFlip, Resizer, PadToSquare
from torch.utils.data import Dataset, DataLoader
from terminaltables import AsciiTable, DoubleTable, SingleTable
from tensorboardX import SummaryWriter
from torch.optim import lr_scheduler
import torch.distributed as dist
import eval_widerface
import torchvision
import model
import os
from torch.utils.data.distributed import DistributedSampler
import torchvision_model
def get_args():
    parser = argparse.ArgumentParser(description="Train program for retinaface.")
    parser.add_argument('--data_path', type=str, help='Path for dataset,default WIDERFACE')
    parser.add_argument('--batch', type=int, default=16, help='Batch size')
    parser.add_argument('--epochs', type=int, default=200, help='Max training epochs')
    parser.add_argument('--shuffle', type=bool, default=True, help='Shuffle dataset or not')
    parser.add_argument('--img_size', type=int, default=640, help='Input image size')
    parser.add_argument('--verbose', type=int, default=10, help='Log verbose')
    parser.add_argument('--save_step', type=int, default=10, help='Save every save_step epochs')
    parser.add_argument('--eval_step', type=int, default=3, help='Evaluate every eval_step epochs')
    parser.add_argument('--save_path', type=str, default='./out', help='Model save path')
    parser.add_argument('--depth', help='Resnet depth, must be one of 18, 34, 50, 101, 152', type=int, default=50)
    parser.add_argument('--pretrained_model_path', type=str, default='./out', help='Pre-Trained Model Path')
    args = parser.parse_args()
    print(args)
    return args

def main():
    args = get_args()
    if not os.path.exists(args.save_path):
        os.mkdir(args.save_path)
    log_path = os.path.join(args.save_path,'log')
    if not os.path.exists(log_path):
        os.mkdir(log_path)

    writer = SummaryWriter(log_dir=log_path)

    data_path = args.data_path
    train_path = os.path.join(data_path,'train/label.txt')
    val_path = os.path.join(data_path,'val/label.txt')
    # dataset_train = TrainDataset(train_path,transform=transforms.Compose([RandomCroper(),RandomFlip()]))
    dataset_train = TrainDataset(train_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
    dataloader_train = DataLoader(dataset_train, num_workers=8, batch_size=args.batch, collate_fn=collater,shuffle=True)
    # dataset_val = ValDataset(val_path,transform=transforms.Compose([RandomCroper()]))
    dataset_val = ValDataset(val_path,transform=transforms.Compose([Resizer(),PadToSquare()]))
    dataloader_val = DataLoader(dataset_val, num_workers=8, batch_size=args.batch, collate_fn=collater)

    total_batch = len(dataloader_train)

    # Create the model
    # if args.depth == 18:
    #     retinaface = model.resnet18(num_classes=2, pretrained=True)
    # elif args.depth == 34:
    #     retinaface = model.resnet34(num_classes=2, pretrained=True)
    # elif args.depth == 50:
    #     retinaface = model.resnet50(num_classes=2, pretrained=True)
    # elif args.depth == 101:
    #     retinaface = model.resnet101(num_classes=2, pretrained=True)
    # elif args.depth == 152:
    #     retinaface = model.resnet152(num_classes=2, pretrained=True)
    # else:
    #     raise ValueError('Unsupported model depth, must be one of 18, 34, 50, 101, 152')

    # Create torchvision model
    return_layers = {'layer2':1,'layer3':2,'layer4':3}
    retinaface = torchvision_model.create_retinaface(return_layers)

    retinaface = retinaface.cuda()
    retinaface = torch.nn.DataParallel(retinaface).cuda()
    retinaface.training = True

    # Added for fine-tuning: load weights from a previous run, if available.
    # The model is already wrapped in DataParallel here, so the checkpoint
    # keys are expected to carry the "module." prefix.
    try:
        with open(args.pretrained_model_path, "rb") as f:
            state_dict = torch.load(f)
        retinaface.load_state_dict(state_dict)
        print("Previous model loaded successfully :)")
    except Exception as e:
        # Report the reason instead of silently swallowing it.
        print("Error while loading previous model :(", e)

    optimizer = optim.Adam(retinaface.parameters(), lr=1e-3)
    # optimizer = optim.SGD(retinaface.parameters(), lr=1e-2, momentum=0.9, weight_decay=0.0005)
    # scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3, verbose=True)
    # scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    # scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10,30,60], gamma=0.1)

    print('Start to train.')

    epoch_loss = []
    iteration = 0

    for epoch in range(args.epochs):
        retinaface.train()

        # Training
        for iter_num,data in enumerate(dataloader_train):
            optimizer.zero_grad()
            classification_loss, bbox_regression_loss,ldm_regression_loss = retinaface([data['img'].cuda().float(), data['annot']])
            classification_loss = classification_loss.mean()
            bbox_regression_loss = bbox_regression_loss.mean()
            ldm_regression_loss = ldm_regression_loss.mean()

            # loss = classification_loss + 1.0 * bbox_regression_loss + 0.5 * ldm_regression_loss
            loss = classification_loss + bbox_regression_loss + ldm_regression_loss

            loss.backward()
            optimizer.step()

            if iter_num % args.verbose == 0:
                log_str = "\n---- [Epoch %d/%d, Batch %d/%d] ----\n" % (epoch, args.epochs, iter_num, total_batch)
                table_data = [
                    ['loss name','value'],
                    ['total_loss',str(loss.item())],
                    ['classification',str(classification_loss.item())],
                    ['bbox',str(bbox_regression_loss.item())],
                    ['landmarks',str(ldm_regression_loss.item())]
                ]
                table = AsciiTable(table_data)
                log_str += table.table
                print(log_str)
                # write the log to tensorboard
                writer.add_scalar('losses:',loss.item(),iteration*args.verbose)
                writer.add_scalar('class losses:',classification_loss.item(),iteration*args.verbose)
                writer.add_scalar('box losses:',bbox_regression_loss.item(),iteration*args.verbose)
                writer.add_scalar('landmark losses:',ldm_regression_loss.item(),iteration*args.verbose)
                iteration += 1

        # Eval
        if epoch % args.eval_step == 0:
            print('-------- RetinaFace Pytorch --------')
            print('Evaluating epoch {}'.format(epoch))
            recall, precision = eval_widerface.evaluate(dataloader_val,retinaface)
            print('Recall:',recall)
            print('Precision:',precision)

            writer.add_scalar('Recall:', recall, epoch*args.eval_step)
            writer.add_scalar('Precision:', precision, epoch*args.eval_step)

        # Save model
        if (epoch + 1) % args.save_step == 0 or iter_num>=100:
            torch.save(retinaface.state_dict(), args.save_path + '/model_epoch_{}.pt'.format(epoch + 1))

    writer.close()


if __name__=='__main__':
    main()
How much memory do you estimate this project needs?
I'm using a Titan V with 12GB and this goes out of memory with a batch size of 16 (default was 32), which seems quite small for WIDER face.
I had to use a batch size of 8, which used 10GB.
@supernotman Hi, thank you for this great project.
May I ask why you use cross-entropy loss for the classification head rather than focal loss? Focal loss is the key feature of RetinaNet.
I think there may be a mistake in the channel counts in the context module:
x1 = self.det_conv1(x) # 256 channels
x_ = self.det_context_conv1(x) # 128 channels
x2 = self.det_context_conv2(x_) # 128 channels
x3_ = self.det_context_conv3_1(x_) # 128 channels
x3 = self.det_context_conv3_2(x3_) # 128 channels
After concatenating x1, x2, and x3 (256 + 128 + 128), I get 512 channels. This is inconsistent with the paper (256 channels).
Am I misunderstanding something?
Anaconda and pip can't install cpools. Can you help me?
These two test images look better in other implementations of RetinaFace for PyTorch, e.g.:
https://github.com/bogireddytejareddy/retinaface-pytorch/blob/master/test_results/t1.jpg
https://github.com/bogireddytejareddy/retinaface-pytorch/blob/master/test_results/t4.jpg
focal_loss = False
# focal loss
if focal_loss:
    alpha = 0.25
    gamma = 2.0
    alpha_factor = torch.ones(targets.shape).cuda() * alpha
    alpha_factor = torch.where(torch.eq(targets, 1.), alpha_factor, 1. - alpha_factor)
    focal_weight = torch.where(torch.eq(targets, 1.), 1. - classification, classification)
    focal_weight = alpha_factor * torch.pow(focal_weight, gamma)
    bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification))
    cls_loss = focal_weight * bce
    cls_loss = torch.where(torch.ne(targets, -1.0), cls_loss, torch.zeros(cls_loss.shape).cuda())
    classification_losses.append(cls_loss.sum()/torch.clamp(num_positive_anchors.float(), min=1.0))
else:
    if positive_indices.sum() > 0:
        classification_losses.append(positive_losses.mean() + sorted_losses.mean())
    else:
        classification_losses.append(torch.tensor(0).float().cuda())
So focal loss is never used???
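Because focal_loss is hard-coded to False in the quoted snippet, only the else branch runs. To make clear what the disabled branch would compute, here is a scalar sketch of the same per-anchor formula in plain Python, with no tensors involved. The function names focal_loss_single and bce_single are hypothetical, and the sketch assumes a target of 1 (face), 0 (background) or -1 (ignored), as the torch.where calls in the snippet suggest.

```python
import math


def bce_single(p, target):
    """Binary cross-entropy for one anchor with predicted face probability p."""
    return -math.log(p) if target == 1 else -math.log(1.0 - p)


def focal_loss_single(p, target, alpha=0.25, gamma=2.0):
    """Focal loss for one anchor, mirroring the tensor code term by term."""
    if target == -1:
        return 0.0  # ignored anchors contribute nothing
    alpha_factor = alpha if target == 1 else 1.0 - alpha
    focal_weight = (1.0 - p) if target == 1 else p
    return alpha_factor * focal_weight ** gamma * bce_single(p, target)


# An easy positive (p = 0.9) versus a hard positive (p = 0.1).
for p in (0.9, 0.1):
    print(p, bce_single(p, 1), focal_loss_single(p, 1))
```

The factor (1 - p) ** gamma shrinks the loss of well-classified anchors far more than that of misclassified ones, which is how focal loss handles the large number of easy background anchors. The else branch in the snippet instead averages positive_losses and sorted_losses, and how sorted_losses is selected is not shown here.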
Thank you for open-sourcing this, but I encountered the following problem at epoch 104 of training. Can you help me? Thanks.
Traceback (most recent call last):
  File "train.py", line 156, in <module>
    main()
  File "train.py", line 111, in main
    loss.backward()
  File "/home/boyun/.conda/envs/retinaface/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/boyun/.conda/envs/retinaface/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I get the following results with image size (1200, 1200):
Easy Val AP: 0.721983363755764
Medium Val AP: 0.742308954563704
Hard Val AP: 0.6196879642610857
Is there something wrong?