optimizer's Issues

Estimated standard deviations calculation

Hi,
In Table 2 of the paper you say: "We also show the estimated standard deviations of the averages computed over 175 random data split and training seeds."
But the 'mnist_guess.yaml' parameters are target_model_count=200 and target_model_count_subrun=10, which means that for each combination of (cur_num_samples, cur_loss_bin) there are 20 records in the database used for the standard deviation calculation (each record contains a test_acc averaged over 10 model accuracies). So to my understanding, the standard deviation is calculated over 10 random data splits and training seeds.
I would appreciate it if you could clarify this issue for me, to make sure I understand the way you calculated the standard deviations.
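To make my question concrete, here is a minimal sketch of the bookkeeping as I understand it (the counts come from 'mnist_guess.yaml'; the accuracy values are made up):

```python
import numpy as np

# Counts taken from configs/table2/mnist_guess.yaml
target_model_count = 200
target_model_count_subrun = 10

# For one (cur_num_samples, cur_loss_bin) combination:
n_records = target_model_count // target_model_count_subrun  # 20 records in the database

# Each record's test_acc is itself an average over 10 model accuracies (made-up values here)
rng = np.random.default_rng(0)
record_test_accs = rng.uniform(0.6, 0.9, size=n_records)

# Is the reported standard deviation taken over these 20 record-level averages,
# or over 175 individual data splits / training seeds as stated in the paper?
print(n_records, record_test_accs.std(ddof=1))
```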

Thanks,
Tal

'forward_normalize' function

Hi,
In Section 4.2 ("Results on 2-Class CIFAR-10/MNIST") of the paper you say: "we compare the performance of the models across different train loss levels after the model's weights have been normalized".
The 'LeNetModels' class contains two functions related to normalization: 'normalize' and 'forward_normalize'.
You don't use the 'normalize' function during training with the G&C optimizer, but you do use 'forward_normalize'.
I have some questions regarding this function:

  1. Can you explain the idea behind normalizing the forward pass by the 'cum_norm' of the model's weights? I don't see how this is equivalent to normalizing the model's weights themselves (see the sketch after this list).

  2. I saw that you are not using 'Softmax' at the end of your model, as is usually done in classification problems. Why is that? Is the 'forward_normalize' function a substitute for 'Softmax'? I am guessing that applying 'Softmax' at the end of the forward pass on top of 'forward_normalize' would result in outputs that are close to zero.

  3. In line 509 of 'train_distributed.py' you pass 'forward_normalize' as the forward-pass function for the train loss and accuracy calculation, but in line 687 you pass 'forward' as the forward-pass function for the test loss and accuracy calculation. Can you explain this difference?
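Regarding question 1, here is the toy comparison I tried: a bias-free two-layer ReLU net (my own sketch, not your code), where I assume 'cum_norm' means the product of the per-layer weight norms:

```python
import torch

# Toy bias-free two-layer ReLU net, just to illustrate question 1.
# 'cum_norm' here is my assumption: the product of the per-layer weight norms.
torch.manual_seed(0)
w1 = torch.randn(16, 8)
w2 = torch.randn(2, 16)
x = torch.randn(4, 8)

def forward(w1, w2, x):
    return torch.relu(x @ w1.t()) @ w2.t()

# (a) normalize the weights themselves, then run a plain forward pass
out_normalized_weights = forward(w1 / w1.norm(), w2 / w2.norm(), x)

# (b) run a plain forward pass, then divide the output by the cumulative norm
cum_norm = w1.norm() * w2.norm()
out_forward_normalize = forward(w1, w2, x) / cum_norm

print(torch.allclose(out_normalized_weights, out_forward_normalize))  # True in this bias-free case
```

For this bias-free case the two outputs coincide, but LeNet normally has bias terms, and that is where I lose the equivalence.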

Thanks!

Reopening: Estimated standard deviations calculation

Hi,
I am sorry for reopening this issue, but something in your calculation of the standard deviation of the estimated mean looks odd to me.
From my understanding, for each combination of (num_train_samples, loss_bin) you calculate 's' using the 175 test accuracies found during the run.
Why do you calculate 's_mean' and treat it as the standard deviation of the estimated mean?
Isn't 's' the result we are looking for?
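To make sure we are talking about the same quantities, this is how I read the two (made-up accuracies; treating 's_mean' as the standard error of the mean is my assumption about what the code computes):

```python
import numpy as np

# Made-up test accuracies for one (num_train_samples, loss_bin) combination,
# 175 values as in the paper.
rng = np.random.default_rng(0)
test_accs = rng.uniform(0.6, 0.9, size=175)

s = test_accs.std(ddof=1)             # sample standard deviation of the 175 accuracies
s_mean = s / np.sqrt(len(test_accs))  # standard error of the estimated mean (what I think 's_mean' is)

print(f"s = {s:.4f}, s_mean = {s_mean:.4f}")
```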

Thanks!

Originally posted by @talrub in #4 (comment)

Problem reproducing Table 2 while working on Google Colab

Hi,

I uploaded the paper's code to Google Colab in order to reproduce the Table 2 results for Arch: LeNet, Optimizer: G&C.
I ran the script: !python train_distributed.py -C configs/table2/mnist_guess.yaml
I got results for:
num_samples=2, loss_bin=(0.3, 0.35) after 320,000 tested models
num_samples=4, loss_bin=(0.3, 0.35) after 1,760,000 tested models

But for num_samples=8, loss_bin=(0.3, 0.35), I am not getting a result even after 100,000,000 tested models (the trials continue in an infinite loop).
I was wondering how many tested models you tried before determining that there are no solutions with 100% training accuracy in a certain loss bin (I saw in your paper that you hit this case for the linear models).
I did not install your environment.yml on Google Colab, but I don't think that is what is causing the problem.
It would be great if you could advise what I should do in this situation. In the meantime, the workaround I am considering is a simple cap on the number of tested models per bin, sketched below.
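This is only a hypothetical guard on my side, not code from the repo; the names are placeholders:

```python
# Hypothetical per-bin budget; the names below are placeholders,
# not identifiers from train_distributed.py.
MAX_TESTED_MODELS = 100_000_000

def should_give_up(total_tested_models: int, found_perfect_model: bool) -> bool:
    """Stop sampling a (num_samples, loss_bin) combination once the trial budget is exhausted."""
    return (not found_perfect_model) and total_tested_models >= MAX_TESTED_MODELS

# Example: after 100,000,000 trials with no 100%-train-accuracy model, give up on the bin.
print(should_give_up(total_tested_models=100_000_000, found_perfect_model=False))  # True
```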

Thanks,
Tal

train_distributed.py - line 526

Hi,
In line 526 you wrote: perfect_model_weights.append(model_result.get_weights_by_idx(perfect_model_idxs))

Shouldn't it be: perfect_model_weights.append(model_result.get_weights_by_idx(perfect_model_idxs.nonzero().squeeze(1)))
(similar to line 524), so that only the models satisfying the condition ((es_l < train_loss) & (train_loss <= es_u) & (train_acc == 1.0)) are taken, instead of all models? A toy illustration of the difference I mean is below.
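Toy illustration (made-up tensors, not the repo's; I am assuming 'get_weights_by_idx' indexes by integer positions, as line 524 suggests):

```python
import torch

# Made-up per-model statistics for four candidate models
train_loss = torch.tensor([0.32, 0.50, 0.34, 0.31])
train_acc = torch.tensor([1.0, 1.0, 0.9, 1.0])
es_l, es_u = 0.30, 0.35

perfect_model_idxs = (es_l < train_loss) & (train_loss <= es_u) & (train_acc == 1.0)
print(perfect_model_idxs)                       # boolean mask: tensor([ True, False, False,  True])
print(perfect_model_idxs.nonzero().squeeze(1))  # integer indices: tensor([0, 3])

# If 'get_weights_by_idx' expects integer positions, passing the raw boolean mask
# (whose entries become 0s and 1s under integer indexing) would select the wrong models.
```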

Thanks
