accurate-head-pose's People

Contributors

haofanwang, lucaskyle

accurate-head-pose's Issues

What does the MSE loss actually do?

Part of the loss is mse(predicted * idx_tensor, yaw):

idx_tensor = [idx for idx in xrange(66)]
idx_tensor = Variable(torch.FloatTensor(idx_tensor)).cuda(gpu)

yaw_predicted = torch.sum(yaw_predicted * idx_tensor, 1)
loss_reg_yaw = reg_criterion(yaw_predicted, label_yaw_cont)

But doesn't torch.sum(yaw_predicted * idx_tensor, 1) have infinitely many solutions?
For example, for a target angle of 45, the distribution with bins 44 = 0.5 and 46 = 0.5 also gives the correct answer.

So what is the point of using this approach?
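
The ambiguity raised here can be checked numerically. The sketch below is not the repository's code; it is a minimal NumPy illustration in which `NUM_BINS`, `expected_bin`, and `cross_entropy` are hypothetical names. It compares a one-hot distribution at bin 45 with the 44/46 split from the question. It also assumes that the other part of the loss is a per-bin classification term, which the phrase "part of loss" suggests but the issue does not show.

```python
import numpy as np

NUM_BINS = 66  # matches the 66 entries of idx_tensor quoted in the issue
idx = np.arange(NUM_BINS, dtype=np.float64)


def expected_bin(probs):
    """Expectation over bin indices, i.e. sum(probs * idx)."""
    return float(np.sum(probs * idx))


def cross_entropy(probs, target_bin, eps=1e-12):
    """Negative log-probability of the target bin (assumed classification term)."""
    return float(-np.log(probs[target_bin] + eps))


# Distribution A: all mass on bin 45.
one_hot = np.zeros(NUM_BINS)
one_hot[45] = 1.0

# Distribution B: the example from the question, half on 44 and half on 46.
split = np.zeros(NUM_BINS)
split[44] = 0.5
split[46] = 0.5

target = 45.0
mse_one_hot = (expected_bin(one_hot) - target) ** 2
mse_split = (expected_bin(split) - target) ** 2

ce_one_hot = cross_entropy(one_hot, 45)
ce_split = cross_entropy(split, 45)

print("MSE  one-hot / split:", mse_one_hot, mse_split)
print("CE   one-hot / split:", ce_one_hot, ce_split)
```

Both distributions give the same squared error, so the MSE term alone does not pin down a unique distribution. Under the assumption above, a classification term on the bins is what separates the two cases.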

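What is the basis for the weight parameters 7 5 3 11?

@haofanwang, hello. I read your paper and would like to ask what the basis is for setting the weight parameters to 7 5 3 11. Did you arrive at them through experiments?
My understanding is that, in backpropagation, the regression loss and the classification loss are set to a ratio of roughly 1:1 and then fine-tuned experimentally. Is that right?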

Number of data input labels doesn't match the network output labels

Hey there,

I have a problem here.
In your dataset code:

    bins_2 = np.array(range(-99, 102, 66))
    binned_pose_2 = np.digitize([yaw, pitch, roll], bins_2) - 1

That means -99 to 99 is divided into 3 parts, i.e. 3 labels.
But your network's labels_2 output has 6 labels:

    self.fc_pitch_3 = nn.Linear(512 * block.expansion, 6)

That means the labels should be divided into 6 classes (as in the paper).
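
The bin count described above can be verified directly with NumPy. The sketch below is only an illustration; `count_classes` and the sample angles are hypothetical. It counts the classes produced by the quoted step of 66 and, for comparison, by a step of 33. Whether 33 is the step the author intended is an assumption and is not confirmed by the issue.

```python
import numpy as np


def count_classes(step, lo=-99, hi=102):
    """Return the bin edges and the number of in-range classes for a given step."""
    edges = np.array(range(lo, hi, step))
    return edges, len(edges) - 1


# Step quoted in the issue: edges -99, -33, 33, 99.
edges_66, n_66 = count_classes(66)

# Hypothetical alternative step: edges -99, -66, ..., 66, 99.
edges_33, n_33 = count_classes(33)

# Hypothetical sample angles near both ends and the middle of the range.
samples = [-98.0, 0.0, 98.0]
labels_66 = np.digitize(samples, edges_66) - 1
labels_33 = np.digitize(samples, edges_33) - 1

print(edges_66, n_66, labels_66)
print(edges_33, n_33, labels_33)
```

With a step of 66 the labels for in-range angles run from 0 to 2, while a step of 33 yields labels from 0 to 5, which would line up with a 6-way output layer.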

Regression loss is too large (torch 1.0.1, Python 2.7)

Hello haofanwang, I use Pose_300W_LP_multi with 300W_LP_filename_filtered.txt to train,
but the loss comes out as follows. I haven't changed the code.
Do you know why?
torch is 1.0.1
python 2.7

Loading data.
Ready to train network.
Epoch [1/25], Iter [100/1912] Losses: Yaw 161.4367, Pitch 180.8725, Roll 41.3196
161.436706543
Epoch [1/25], Iter [200/1912] Losses: Yaw 124.8338, Pitch 489.4776, Roll 246.7312
124.83379364
Epoch [1/25], Iter [300/1912] Losses: Yaw 147.8351, Pitch 60.8210, Roll 62.3920
147.83505249
Epoch [1/25], Iter [400/1912] Losses: Yaw 158.3398, Pitch 139.8713, Roll 43.4810
158.339782715
Epoch [1/25], Iter [500/1912] Losses: Yaw 894.2488, Pitch 936.0706, Roll 331.0779
894.248779297
Epoch [1/25], Iter [600/1912] Losses: Yaw 157.3936, Pitch 44.9767, Roll 61.1448
157.393615723
Epoch [1/25], Iter [700/1912] Losses: Yaw 189.1127, Pitch 605.8845, Roll 48.3591
189.112686157
Epoch [1/25], Iter [800/1912] Losses: Yaw 188.4961, Pitch 258.0078, Roll 55.0151
188.49609375
Epoch [1/25], Iter [900/1912] Losses: Yaw 298.6048, Pitch 60.1183, Roll 77.0532
298.604797363
Epoch [1/25], Iter [1000/1912] Losses: Yaw 107.6561, Pitch 78.1015, Roll 55.0624
107.656112671
Epoch [1/25], Iter [1100/1912] Losses: Yaw 154.2444, Pitch 479.2390, Roll 328.9452
154.244354248
Epoch [1/25], Iter [1200/1912] Losses: Yaw 643.8879, Pitch 2594.8455, Roll 3647.3457
643.887878418
Epoch [1/25], Iter [1300/1912] Losses: Yaw 82.8954, Pitch 194.4952, Roll 152.2793
82.8953704834
Epoch [1/25], Iter [1400/1912] Losses: Yaw 81.9724, Pitch 168.7021, Roll 279.5479
81.9723968506
Epoch [1/25], Iter [1500/1912] Losses: Yaw 367.3708, Pitch 193.3879, Roll 79.1561
367.370758057
Epoch [1/25], Iter [1600/1912] Losses: Yaw 180.7048, Pitch 1052.1819, Roll 48.1278
180.704772949
Epoch [1/25], Iter [1700/1912] Losses: Yaw 139.1630, Pitch 122.4742, Roll 91.7871
139.163024902
Epoch [1/25], Iter [1800/1912] Losses: Yaw 159.9963, Pitch 180.5340, Roll 41.4436

Model idea

I am trying out my own idea for predicting pitch, yaw, roll.

Input: an aligned face and its mask, built from 68 landmark points.

model1: encoder -> latent(512) -> decoder
model2: latent(512) -> dense... -> pitch,yaw,roll (hybrid coarse-fine)

model1 training:

rec = model1( [randomly warped and transformed inp_face] )

DSSIM ( transformed inp_face*inp_mask, rec*inp_mask )

model2 training:

latent space of model1(inp_face) -> pitch,yaw,roll (hybrid coarse-fine)

The mask excludes the background.
DSSIM reconstructs an image faster by pushing attention to edges first.
model2 is trained to extract pitch, yaw, roll information from the latent space of model1.

It seems to work.
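
The masked DSSIM term in the model1 step can be sketched as follows. This is not the poster's implementation. It uses DSSIM = (1 - SSIM) / 2 with a simplified SSIM computed from global image statistics (a single window), whereas typical implementations use sliding windows. `global_ssim`, `masked_dssim`, and the synthetic arrays are hypothetical names and data chosen for illustration.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM from whole-image statistics (no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def masked_dssim(target, rec, mask):
    """DSSIM between target*mask and rec*mask, so the background is ignored."""
    return (1.0 - global_ssim(target * mask, rec * mask)) / 2.0


rng = np.random.default_rng(0)
face = rng.random((64, 64))          # stand-in for an aligned face image
mask = np.zeros((64, 64))
mask[16:48, 16:48] = 1.0             # stand-in for the landmark-based face mask

# Reconstruction that differs from the target only outside the mask.
background_only = face.copy()
background_only[mask == 0] = 0.0

# Reconstruction that is noisy everywhere, including inside the mask.
noisy = np.clip(face + 0.3 * rng.standard_normal((64, 64)), 0.0, 1.0)

loss_same = masked_dssim(face, face, mask)
loss_bg = masked_dssim(face, background_only, mask)
loss_noisy = masked_dssim(face, noisy, mask)

print(loss_same, loss_bg, loss_noisy)
```

Because both inputs are multiplied by the mask before comparison, a reconstruction that is wrong only in the background is not penalised, which is the effect the note "mask excludes background" describes.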

model2 config

    x = latent

    x = Dense(4096, activation='relu')(latent)

    output = []
    for class_num in class_nums:
        pitch = Dense(class_num, activation='softmax')(x)
        yaw = Dense(class_num, activation='softmax')(x)
        roll = Dense(class_num, activation='softmax')(x)
        output += [pitch, yaw, roll]

    return output

Any advice? Should I add more dense layers and/or dropout?

Txt Annotations for multi datasets

Hi @haofanwang !

Many thanks for sharing your work! However, right now this code is impossible to run due to missing files.
All "multi" datasets require labels in txt format, which are missing. Can you provide them, please?

Also, if you could make the files available somewhere other than Baidu (which is inaccessible from outside China), that would be much appreciated :)

Test Images not Listed

Hello,
Thanks for sharing the code. I want to test the model on my own dataset, which is not in the dataset list.