Project repository for Fall 2023.
Additional project updates and ideas are tracked here in this Google document.
Need to debug why the models' performance is drastically different from that in notebooks 4, 6, and 7.
Results are located in the sandbox4 notebook. At the moment, not all tasks assigned on Mon., 14 Aug 2023 have been completed. So far:
- [ ] Add an option to store the gradient norm of each layer, stored separately (see the gradient-norm sketch after this list)
- Change the linear layers to: CNN + 1 linear layer (see the architecture sketch after this list)
- Make a deep model (5 layers) and train it to near perfection (99% or higher train accuracy)
- Save the model (we’ll call this the “ground model”) (if time, create 5 ground models)
- [ ] Then, create 10 models per noise level: pick 10 noise levels ranging from basically no impact to totally destroyed, and also loop over which layer is perturbed → this turns into 500 models. Make them noisy and measure all of the things above (robustness, generalization.1, try generalization.2); see the noise-injection sketch after this list
- [ ] Can experiment with Grad-CAM (interesting but not the most important)
- Try training with 32 all the way through (in the conv layers) and see if the model can still reach 99% accuracy
- Use a smaller model, i.e., the smallest non-trivial model
- Reduce number of linear layers
- Start profiling (draw it out on a piece of paper)
- [ ] Look up the number of weights in each model
- [ ] Get model training accuracy up to 100%
- Add option to store gradient norm of each layer, stored separately
- Create a table: row → model, column → specs (grad norm, layer-wise norm, train/test accuracy), and list the number of tunable parameters for each model (see the table-building sketch after this list)
- Add the norms of the total and/or per-layer parameters to the table.
- [ ] Grad-CAM (wishlist or next step)
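For the "CNN + 1 linear layer" item above, here is a minimal sketch of what such a 5-layer MNIST model could look like, assuming 1×28×28 inputs. The class name `GroundNet`, the 32-channel widths (echoing the "training with 32 all the way" item), and the layer names `conv1`–`conv4`/`fc` are illustrative choices, not the repo's actual model.

```python
import torch.nn as nn

class GroundNet(nn.Module):
    """Illustrative 5-layer MNIST model: 4 conv layers + 1 linear layer, 32 channels throughout."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 32, 3, padding=1)
        self.conv3 = nn.Conv2d(32, 32, 3, padding=1)
        self.conv4 = nn.Conv2d(32, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.act = nn.ReLU()
        self.fc = nn.Linear(32 * 7 * 7, 10)  # the single linear layer

    def forward(self, x):
        x = self.pool(self.act(self.conv2(self.act(self.conv1(x)))))  # 28x28 -> 14x14
        x = self.pool(self.act(self.conv4(self.act(self.conv3(x)))))  # 14x14 -> 7x7
        return self.fc(x.flatten(1))
```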
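For the per-layer gradient-norm item, a minimal sketch assuming a PyTorch model; the function name `collect_grad_norms` is made up for illustration.

```python
import torch

def collect_grad_norms(model: torch.nn.Module) -> dict:
    """L2 norm of each parameter's gradient, keyed by parameter name.

    Call after loss.backward(), while .grad is still populated; storing one
    dictionary per step keeps each layer's norm separate.
    """
    return {
        name: param.grad.detach().norm(2).item()
        for name, param in model.named_parameters()
        if param.grad is not None
    }
```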
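One way the 500 noisy models (10 noise levels × 10 copies × 5 layers) could be produced is sketched below. The Gaussian noise, the logspace noise levels, and the helper name `make_noisy_copy` are assumptions about the intended procedure, not a prescribed method.

```python
import copy
import torch

def make_noisy_copy(ground_model: torch.nn.Module, layer_name: str, sigma: float) -> torch.nn.Module:
    """Deep-copy the ground model and add Gaussian noise (std = sigma) to one layer's parameters."""
    noisy = copy.deepcopy(ground_model)
    with torch.no_grad():
        for name, param in noisy.named_parameters():
            if name.startswith(layer_name):
                param.add_(torch.randn_like(param) * sigma)
    return noisy

# 10 noise levels spanning "basically no impact" to "totally destroyed" (exact values are a guess)
noise_levels = torch.logspace(-3, 0, steps=10).tolist()
layers_to_perturb = ["conv1", "conv2", "conv3", "conv4", "fc"]  # names from the GroundNet sketch above

# 10 levels x 10 copies x 5 layers -> 500 models, each then measured for robustness/generalization:
# noisy_models = [make_noisy_copy(ground_model, layer, sigma)
#                 for layer in layers_to_perturb
#                 for sigma in noise_levels
#                 for _ in range(10)]
```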
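For the model/spec table and the parameter-count items, a sketch assuming pandas is available; the column names and helper functions are illustrative.

```python
import pandas as pd
import torch

def count_tunable_params(model: torch.nn.Module) -> int:
    """Number of trainable (requires_grad) parameters in the model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def layerwise_param_norms(model: torch.nn.Module) -> dict:
    """L2 norm of each parameter tensor, keyed by name."""
    return {name: p.detach().norm(2).item() for name, p in model.named_parameters()}

def spec_row(model_name: str, model: torch.nn.Module, train_acc: float, test_acc: float) -> dict:
    """One table row: model id, tunable-parameter count, accuracies, and parameter norms."""
    per_layer = layerwise_param_norms(model)
    return {
        "model": model_name,
        "tunable_params": count_tunable_params(model),
        "train_acc": train_acc,
        "test_acc": test_acc,
        "total_param_norm": sum(v ** 2 for v in per_layer.values()) ** 0.5,
        **{f"norm/{k}": v for k, v in per_layer.items()},
    }

# Per-layer gradient norms from collect_grad_norms(...) can be merged in as extra columns.
# rows = [spec_row(name, model, train_acc, test_acc) for each model in the sweep]
# df = pd.DataFrame(rows)
```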
Results are located in the sandbox4 notebook. At the moment, not all tasks assigned from last week have been completed. So far:
- Add option to store gradient norm of each layer, stored separately
- Change linear layers to: CNN + 1 linear layer
- Make a deep model (5 layers) and train it to near perfection (99% or higher train accuracy)
- Save the model (we’ll call this the “ground model”) (if time, create 5 ground models)
- Then, create 10 models per noise level: pick 10 noise levels ranging from basically no impact to totally destroyed, and also loop over which layer is perturbed → this turns into 500 models. Make them noisy and measure all of the things above (robustness, generalization.1, try generalization.2)
- Can experiment with Grad-CAM (interesting but not the most important)
Current results are located in the sandbox3 notebook. The results mainly include a function that takes a model as input and returns a dictionary of gradients for both the weights and the biases. The dataset used was MNIST, and the model used was a simple 5-layer neural network.
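A minimal sketch of the kind of function described above, assuming a PyTorch model; the name `gradients_by_kind` and the weight/bias split based on the parameter-name suffix are assumptions about the sandbox3 implementation.

```python
import torch

def gradients_by_kind(model: torch.nn.Module) -> dict:
    """Return gradients split into 'weights' and 'biases', keyed by parameter name.

    Assumes a backward pass has already populated .grad on the parameters.
    """
    grads = {"weights": {}, "biases": {}}
    for name, param in model.named_parameters():
        if param.grad is None:
            continue
        kind = "biases" if name.endswith("bias") else "weights"
        grads[kind][name] = param.grad.detach().clone()
    return grads
```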