l3's People

Contributors

jacobandreas

l3's Issues

Reproduce L3 PbD results

Hi Jacob,

Which exp/* directory reproduces the L3 programming by demonstration results (80 Val, 76 Test)? I haven't been able to find those numbers in any of the train.out files.

Thanks!

Why is the size of the featurized images 4608, and not either 8192 or 2048?

Hi Jacob,

Working my way slowly through L3 :) It looks like the featurized inputs are vectors of length 4608?

In [1]: import numpy as np

In [2]: d = np.load('train/inputs.feats.npy')

In [3]: d.shape
Out[3]: (9000, 4608)

However, VGG16 appears to consist of three types of layers:

  • max-pooling, which halves the width and the height
  • relu, which doesn't change any dimension
  • padded conv, which also doesn't change the width or the height, and sets the number of channels to a fixed value

Depending on whether we use the maxpooling after the final conv, there are either 4 max-poolings, or 5, meaning that the width and height will each be divided either by 2^4 = 16 or 2^5 = 32? The number of channels in either case is set by the final conv, which is 512?

Then, given an input that is 3 x 64 x 64, the output will be either:

  • 512 * 4 * 4 (if we ignore the pooling after the final conv), or
  • 512 * 2 * 2 (if we include it)

Then, the flattened vector size in each case would be 8192 or 2048?

It looks like in order to obtain a size of 4608, if we assume there are 512 channels, we'd need a final output dimension of 512 x 3 x 3?

What am I missing in the above analysis?
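
The arithmetic in this issue can be checked with a short sketch. It assumes square inputs, 2x2 max-pooling with stride 2 (floor division on odd sizes), padded convolutions that preserve width and height, and 512 channels after the final conv. The function names `flat_feature_size` and `input_sides_for` are hypothetical helpers, not part of the l3 repository, and the sketch does not establish which input size or layer the repository actually uses to produce the 4608-dimensional features.

```python
def flat_feature_size(input_side, n_pools, channels=512):
    """Flattened feature size after n_pools 2x2, stride-2 max-poolings.

    Assumes padded convs keep width/height and the final conv has `channels` outputs.
    """
    side = input_side
    for _ in range(n_pools):
        side //= 2  # each pooling halves width and height (floor on odd sizes)
    return channels * side * side


def input_sides_for(target, n_pools, channels=512, max_side=256):
    """Square input sides (up to max_side) whose flattened size equals target."""
    return [s for s in range(1, max_side + 1)
            if flat_feature_size(s, n_pools, channels) == target]


print(flat_feature_size(64, 4))  # 8192 (512 * 4 * 4)
print(flat_feature_size(64, 5))  # 2048 (512 * 2 * 2)
print(input_sides_for(4608, 5))  # input sides that would give 512 * 3 * 3
```

Under these assumptions a 64 x 64 input cannot yield 4608. A 3 x 3 final map would require an input side of 96 to 127 with five poolings, or 48 to 63 with four, which suggests checking the image size actually fed to the feature extractor.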

Shapeworld link is not valid

Thanks for open sourcing this work! Is there an updated link for http://people.eecs.berkeley.edu/~jda/data/shapeworld.tar.gz ?

Code to generate additional shapeworld data?

Hi Jacob,

I really like your shapeworld dataset. Question: how can I go about creating additional data, potentially with tweaked characteristics, e.g. the number of distractors?

Hugh

How to run this to reproduce Table 1 results?

Hi Jacob,

Awesome paper :)

Quite hard to figure out how to run this :)

  • it looks like cls.py is the entry point, and ClsModel is the model that corresponds to the L3 "Learning with latent language" paper Table 1, is this a fair impression?
  • I've tried running with:
python cls.py -train -n_epochs 10000

which gives outputs like:

[iter]    774
[loss]    10.2767
[trn_acc] 0.9700
[val_acc] 0.4980
[val_same_acc] 0.5280
[val_mean_acc] 0.5130

This looks like the results of training the interpretation model, is that right?

  • How can I also train the proposal model?
  • How can I train and run the full model, as in Table 1?

Model selection

Hi @jacobandreas, this is very interesting work. I wanted to understand how you selected the model for reporting the numbers in the paper, specifically for the few-shot classification task. Was it the best model on the validation set? How many epochs of training did you run?

License?

Hi Jacob,

Question: what license is this code provided under?
