haofanwang / accurate-head-pose
Pytorch code for Hybrid Coarse-fine Classification for Head Pose Estimation
Part of the loss is MSE(predicted * idx_tensor, yaw):

idx_tensor = [idx for idx in xrange(66)]
idx_tensor = Variable(torch.FloatTensor(idx_tensor)).cuda(gpu)
yaw_predicted = torch.sum(yaw_predicted * idx_tensor, 1)
loss_reg_yaw = reg_criterion(yaw_predicted, label_yaw_cont)

But doesn't torch.sum(yaw_predicted * idx_tensor, 1) have infinitely many solutions? For example, for a target angle of 45, bins 44 = 0.5 and 46 = 0.5 are also a correct answer. So what is the point of such an approach?
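The non-uniqueness being asked about is easy to reproduce numerically: the expectation sum(p * idx) maps many different probability vectors to the same value. A minimal numpy sketch (assuming 66 bins, as in the quoted code):

```python
import numpy as np

idx = np.arange(66, dtype=np.float64)  # bin indices, as in idx_tensor above

# Two different probability vectors with the same expectation:
p_sharp = np.zeros(66)
p_sharp[45] = 1.0                      # all mass on bin 45

p_split = np.zeros(66)
p_split[44] = 0.5
p_split[46] = 0.5                      # mass split across bins 44 and 46

print(np.sum(p_sharp * idx))           # 45.0
print(np.sum(p_split * idx))           # 45.0
```

In isolation the regression term indeed cannot distinguish the two. The usual answer is that it is never used in isolation: the combined objective also has a cross-entropy term against the binned label, which penalizes the split distribution, so the expectation term only refines the prediction to sub-bin precision rather than selecting the distribution on its own.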
@haofanwang, hello. I have read your paper, and I would like to ask: what is the basis for the weight parameters 7, 5, 3, 11? Were they determined experimentally?
My understanding is that during backpropagation the regression loss and the classification loss are kept at roughly a 1:1 ratio, and then fine-tuned experimentally. Is that right?
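For concreteness, the weighting being asked about amounts to a scalar combination of the per-granularity classification losses with the regression loss. A hedged sketch (the weights 7, 5, 3, 11 are the ones mentioned above; which granularity each applies to, and the dummy loss values, are assumptions for illustration only):

```python
# Hypothetical combination of coarse-to-fine classification losses with a
# regression loss. Weight-to-granularity assignment is assumed, not the
# repo's actual code; loss values are dummies.
cls_weights = [7, 5, 3, 11]        # one weight per classification granularity
cls_losses = [0.9, 0.7, 0.5, 0.3]  # dummy per-granularity cross-entropy losses
reg_loss = 2.0                     # dummy MSE (regression) loss

total = sum(w * l for w, l in zip(cls_weights, cls_losses)) + reg_loss
print(total)  # 16.6
```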
How do you process the BIWI data set?
Hey there,
I have a problem here. In your dataset code:

bins_2 = np.array(range(-99, 102, 66))
binned_pose_2 = np.digitize([yaw, pitch, roll], bins_2) - 1

This means -99 to 99 is divided into 3 parts, i.e. 3 labels. But your network's labels_2 output has 6 units:

self.fc_pitch_3 = nn.Linear(512 * block.expansion, 6)

so the labels should be divided into 6 classes (as in the paper).
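The reported mismatch can be checked directly by running the binning. A minimal numpy sketch (the `bins_6` variant with step 33 is an assumption about what a 6-class binning might look like, not code from the repo):

```python
import numpy as np

# Bin edges as written in the repo: four edges -> only three bins.
bins_2 = np.array(range(-99, 102, 66))          # [-99, -33, 33, 99]
labels = np.digitize([-90.0, 0.0, 90.0], bins_2) - 1
print(labels)                                   # [0 1 2] -> 3 reachable classes

# Hypothetical fix: a step of 33 gives seven edges -> six bins,
# matching nn.Linear(512 * block.expansion, 6).
bins_6 = np.array(range(-99, 100, 33))          # [-99, -66, -33, 0, 33, 66, 99]
labels_6 = np.digitize([-90.0, 0.0, 90.0], bins_6) - 1
print(len(bins_6) - 1)                          # 6 classes
print(labels_6)                                 # [0 3 5]
```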
Hello haofanwang, I trained with Pose_300W_LP_multi and 300W_LP_filename_filtered.txt, but the loss came out as follows. I haven't changed anything. Do you know why?
torch is 1.0.1
python 2.7
Loading data.
Ready to train network.
Epoch [1/25], Iter [100/1912] Losses: Yaw 161.4367, Pitch 180.8725, Roll 41.3196
161.436706543
Epoch [1/25], Iter [200/1912] Losses: Yaw 124.8338, Pitch 489.4776, Roll 246.7312
124.83379364
Epoch [1/25], Iter [300/1912] Losses: Yaw 147.8351, Pitch 60.8210, Roll 62.3920
147.83505249
Epoch [1/25], Iter [400/1912] Losses: Yaw 158.3398, Pitch 139.8713, Roll 43.4810
158.339782715
Epoch [1/25], Iter [500/1912] Losses: Yaw 894.2488, Pitch 936.0706, Roll 331.0779
894.248779297
Epoch [1/25], Iter [600/1912] Losses: Yaw 157.3936, Pitch 44.9767, Roll 61.1448
157.393615723
Epoch [1/25], Iter [700/1912] Losses: Yaw 189.1127, Pitch 605.8845, Roll 48.3591
189.112686157
Epoch [1/25], Iter [800/1912] Losses: Yaw 188.4961, Pitch 258.0078, Roll 55.0151
188.49609375
Epoch [1/25], Iter [900/1912] Losses: Yaw 298.6048, Pitch 60.1183, Roll 77.0532
298.604797363
Epoch [1/25], Iter [1000/1912] Losses: Yaw 107.6561, Pitch 78.1015, Roll 55.0624
107.656112671
Epoch [1/25], Iter [1100/1912] Losses: Yaw 154.2444, Pitch 479.2390, Roll 328.9452
154.244354248
Epoch [1/25], Iter [1200/1912] Losses: Yaw 643.8879, Pitch 2594.8455, Roll 3647.3457
643.887878418
Epoch [1/25], Iter [1300/1912] Losses: Yaw 82.8954, Pitch 194.4952, Roll 152.2793
82.8953704834
Epoch [1/25], Iter [1400/1912] Losses: Yaw 81.9724, Pitch 168.7021, Roll 279.5479
81.9723968506
Epoch [1/25], Iter [1500/1912] Losses: Yaw 367.3708, Pitch 193.3879, Roll 79.1561
367.370758057
Epoch [1/25], Iter [1600/1912] Losses: Yaw 180.7048, Pitch 1052.1819, Roll 48.1278
180.704772949
Epoch [1/25], Iter [1700/1912] Losses: Yaw 139.1630, Pitch 122.4742, Roll 91.7871
139.163024902
Epoch [1/25], Iter [1800/1912] Losses: Yaw 159.9963, Pitch 180.5340, Roll 41.4436
Is it possible to store your pretrained models on a hosting service other than Baidu? Baidu requires a network-disk client to download them, which I do not have and cannot install.
When trying to train with the BIWI and AFLW datasets using the code proposed here, I need to extract bbox files, but the downloaded datasets do not include them. How can I convert them into something like the 300W_LP or AFLW2000 datasets, so that this code can be used directly? Looking forward to your reply, thanks.
I am trying out my own idea for predicting pitch, yaw, and roll:

input: an aligned face, plus a mask of it derived from 68 landmarks
model1: encoder -> latent(512) -> decoder
model2: latent(512) -> dense... -> pitch, yaw, roll (hybrid coarse-fine)

model1 training:
rec = model1( [randomly warped and transformed inp_face] )
loss = DSSIM( transformed inp_face * inp_mask, rec * inp_mask )

model2 training:
latent space of model1(inp_face) -> pitch, yaw, roll (hybrid coarse-fine)

The mask excludes the background. DSSIM reconstructs an image faster by pushing attention to edges first. model2 is trained to extract the pitch/yaw/roll information from model1's latent space. It seems to work.
model2 config:

x = Dense(4096, activation='relu')(latent)
output = []
for class_num in class_nums:  # one coarse-to-fine head per bin count
    pitch = Dense(class_num, activation='softmax')(x)
    yaw = Dense(class_num, activation='softmax')(x)
    roll = Dense(class_num, activation='softmax')(x)
    output += [pitch, yaw, roll]
return output
Any advice? Should I add more dense layers and/or dropout?
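One thing the head list above leaves open is how the per-granularity softmax outputs are turned back into a single angle at inference time. A minimal numpy sketch of expectation (soft-argmax) decoding for the finest head; the 66-bin layout and the [-99, 99] degree range are assumptions borrowed from this repo's conventions, not from your model:

```python
import numpy as np

def decode_angle(probs, lo=-99.0, hi=99.0):
    """Expectation decoding: probability-weighted average of bin centers."""
    n = len(probs)
    width = (hi - lo) / n                       # 3 degrees per bin for n=66
    centers = lo + width * (np.arange(n) + 0.5) # center of each bin
    return float(np.sum(probs * centers))

probs = np.zeros(66)
probs[55] = 1.0                                 # all mass on bin 55
print(decode_angle(probs))                      # 67.5
```

The same function works for the coarser heads by passing their (shorter) softmax vectors, which is one way to average the coarse and fine predictions into one estimate.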
Hi @haofanwang !
Many thanks for sharing your work! However, right now this code is impossible to run due to missing files.
All "multi" datasets require labels in txt format, which are missing. Can you provide them please?
Also if you could make the files available on some other location than baidu (which is inaccessible from outside of China) that would be much appreciated :)
Thank you for your code! I retrained the model on 300W-LP and evaluated it on AFLW2000; the MAE is 5.71, which is worse than the 5.39 reported in the paper. I only changed the dataset to 300WLP-multi. Can you tell me why? Thank you!
Hello,
Thanks for sharing the code. I want to test the model on my own dataset, which is not on the dataset list.