inside-deep-learning's Issues

Chapter 2 "optimizer.zero_grad()"

At the beginning of chapter 2,
2.1.3 The training loop,

you used the following steps in your train_simple_network function:
optimizer.zero_grad()
loss.backward()
optimizer.step()

but at the end of the chapter, in the run_epoch function, you used a different order:

2.4.2 Training and testing passes

if model.training:
loss.backward()
optimizer.step()
optimizer.zero_grad()

Is there any reason for switching the order? I thought we had to zero the gradients first thing at every epoch.
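
For what it's worth, here is a minimal toy sketch (illustrative names and data, not the book's code) showing that the two orderings are interchangeable: as long as zero_grad() runs at some point between one backward() call and the next, step() sees exactly the same gradients either way.

import torch
import torch.nn as nn

# Toy setup (illustrative): a linear model on random data.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_func = nn.MSELoss()
X, y = torch.randn(8, 4), torch.randn(8, 1)

# Ordering used in train_simple_network: zero the gradients first.
for _ in range(3):
    optimizer.zero_grad()              # clear gradients from the previous iteration
    loss_func(model(X), y).backward()  # accumulate fresh gradients
    optimizer.step()                   # update the parameters

# Ordering used in run_epoch: zero the gradients last.
# step() still sees only the gradients from the most recent backward(),
# because zero_grad() runs before the next backward() call.
for _ in range(3):
    loss_func(model(X), y).backward()
    optimizer.step()
    optimizer.zero_grad()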

Custom Dataset class on Chapter_1

Hi, I was getting the following error when executing this code:

import torch
from torch.utils.data import Dataset
from sklearn.datasets import fetch_openml

X, y = fetch_openml("mnist_784", version=1, return_X_y=True)

class SimpleDataset(Dataset):
    def __init__(self, X, y):
        super(SimpleDataset, self).__init__()
        self.X = X
        self.y = y
    
    def __getitem__(self, index):
        inputs = torch.tensor(self.X[index, :], dtype=torch.float32)
        targets = torch.tensor(int(self.y[index]), dtype=torch.int64)
        return inputs, targets

    def __len__(self):
        return self.X.shape[0]

dataset = SimpleDataset(X, y)
example, label = dataset[0]
InvalidIndexError: (tensor(0), slice(None, None, None))

The error was fixed when I changed the fetch_openml call to:

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)

The problem is that without as_frame=False, scikit-learn now returns the data as a pandas DataFrame rather than a NumPy array.
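
An alternative sketch (hypothetical, not from the book) keeps the default DataFrame output and instead indexes positionally with .iloc inside __getitem__:

import torch
import pandas as pd
from torch.utils.data import Dataset

class SimpleDataFrameDataset(Dataset):
    def __init__(self, X: pd.DataFrame, y: pd.Series):
        super().__init__()
        self.X = X
        self.y = y

    def __getitem__(self, index):
        row = self.X.iloc[index].to_numpy()  # positional row access works on a DataFrame
        inputs = torch.tensor(row, dtype=torch.float32)
        targets = torch.tensor(int(self.y.iloc[index]), dtype=torch.int64)
        return inputs, targets

    def __len__(self):
        return len(self.X)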

Chapter_6.ipynb wrong test data

test_data = torchvision.datasets.FashionMNIST("./", train=True, transform=transforms.ToTensor(), download=True)

should use train=False
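
That is, the corrected line would be:

# Load the actual test split (train=False) rather than a second copy of the training data.
test_data = torchvision.datasets.FashionMNIST("./", train=False, transform=transforms.ToTensor(), download=True)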

This changes the figures in this chapter significantly.

a question about autograd, thanks

Dear Edward,

From page 21 to page 23, when we are talking about autograd,

we test whether the condition ||prev - cur|| < epsilon is satisfied to check whether we have reached the minimum.

My question is: why not just test whether the gradient at cur is zero or not?

That is to say, can

while torch.linalg.norm(x_cur - x_prev) > epsilon:

be replaced by

epsilon = 1e-12 # a small enough value

while abs(cur.grad) > epsilon:

?

thanks a lot !
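
For reference, a minimal toy sketch (not the book's code; the function and names are illustrative) of gradient descent that stops on the gradient magnitude instead of the step size. In practice the gradient rarely becomes exactly zero, so it is compared against a small epsilon; for vector-valued x one would test torch.linalg.norm(x.grad) instead of abs:

import torch

x = torch.tensor(0.0, requires_grad=True)
eta, epsilon = 0.1, 1e-6

while True:
    loss = (x - 3.0) ** 2              # toy function with its minimum at x = 3
    loss.backward()                    # populates x.grad
    if torch.abs(x.grad) < epsilon:    # stop when the gradient is (nearly) zero
        break
    with torch.no_grad():
        x -= eta * x.grad              # gradient-descent step
    x.grad.zero_()                     # clear the gradient before the next backward()

print(x.item())                        # ~3.0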

A suggestion for ch1

Backpropagation is really an important and fundamental topic in deep learning.

Yes, I admit that backpropagation is a little math-heavy and a little hard for newcomers,

but I also cannot imagine that someone who cannot understand backpropagation can really understand deep learning.

Backpropagation hurts, but it is a good hurt and a necessary one.

You cannot omit it, so please add backpropagation to ch1.

Error in Chapter_6.ipynb

When running in Colab (using a GPU), I got the following error in this cell:

rnn_3layer = nn.Sequential( #Simple old style RNN 
  EmbeddingPackable(nn.Embedding(len(all_letters), 64)), #(B, T) -> (B, T, D)
  nn.RNN(64, n, num_layers=3, batch_first=True), #(B, T, D) -> ( (B,T,D) , (S, B, D)  )
  LastTimeStep(rnn_layers=3), #We need to take the RNN output and reduce it to one item, (B, D)
  nn.Linear(n, len(name_language_data)), #(B, D) -> (B, classes)
)

#Apply gradient clipping to maximize its performance
for p in rnn_3layer.parameters():
    p.register_hook(lambda grad: torch.clamp(grad, -5, 5))

rnn_results = train_network(rnn_3layer, loss_func, train_lang_loader, val_loader=test_lang_loader, score_funcs={'Accuracy': accuracy_score}, device=device, epochs=10)

Error is:

/usr/local/lib/python3.6/dist-packages/torch/nn/utils/rnn.py in pack_padded_sequence(input, lengths, batch_first, enforce_sorted)

    242 
    243     data, batch_sizes = \
--> 244         _VF._pack_padded_sequence(input, lengths, batch_first)
    245     return _packed_sequence_init(data, batch_sizes, sorted_indices, None)
    246 
RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor
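
A possible workaround (a sketch, assuming the error comes from the pack_padded_sequence call made inside EmbeddingPackable) is to keep the lengths tensor on the CPU before packing, since pack_padded_sequence in PyTorch >= 1.7 requires a 1D CPU int64 tensor:

import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Illustrative tensors: embedded is (B, T, D) and may live on the GPU,
# while lengths must be moved to the CPU before packing.
device = "cuda" if torch.cuda.is_available() else "cpu"
embedded = torch.randn(4, 7, 64, device=device)
lengths = torch.tensor([7, 5, 3, 2], device=device)

packed = pack_padded_sequence(embedded, lengths.cpu(),  # .cpu() avoids the RuntimeError
                              batch_first=True, enforce_sorted=False)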

some typos in ch 2

  • p47: you write W_{d,c} instead of W^{d,c}

  • p46: your comment about Y_pred.ravel() could have been made earlier, on p40 where it was first introduced

a typo on page 36

In the code snippet 'train_simple_network'

optimizer.step() # updates all the parameters theta(k+1) = theta(k) eta gradient

it should have been theta(k+1) = theta(k) - eta * gradient

Adding mounting file info

Hi,
Thanks for the examples.
I think it would make sense to add the following to the notebooks:

from google.colab import drive
drive.mount('/content/drive')

and

# Here you want to customize the path to the right location in your drive
!cp drive/MyDrive/Inside\ Deep\ Learning/idlmam.py .

AttributeError in Chapter_2.ipynb

Hi,

when executing the 2nd cell:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import * 
from idlmam import *

I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-376bfb908340> in <module>()
      2 import torch.nn as nn
      3 import torch.nn.functional as F
----> 4 from torch.utils.data import *
      5 from idlmam import *

AttributeError: module 'torch.utils.data' has no attribute 'BatchSamplerDistributedSamplerDataset'

It is solved by importing only the needed classes explicitly:

from torch.utils.data import Dataset, DataLoader, TensorDataset

I'm not sure if it has something to do with my setup on Colab, but based on this post, it seems to be related to version 1.7.0 of PyTorch.

Thanks

Sliding filter, Code snippet in Chapter 3.2.1

In Chapter 3.2.1, there is an implementation of sliding the filter over the input:

filter = [1, 0, -1]
input = [1, 0, 2, -1, 1, 2]
output = []
for i in range(len(input) - len(filter)):
    result = 0
    for j in range(len(filter)):
        result += input[i+j] * filter[j]
    output.append(result) 

The outer loop does not reach the last possible filter position, so it should be:

filter = [1, 0, -1]
input = [1, 0, 2, -1, 1, 2]
output = []
for i in range(len(input) - len(filter) + 1):
    result = 0
    for j in range(len(filter)):
        result += input[i+j] * filter[j]
    output.append(result) 
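
For the example above, the corrected loop produces len(input) - len(filter) + 1 = 4 outputs, [-1, 1, 1, -3], whereas the original loop stops one position early and only yields [-1, 1, 1].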

PS: @EdwardRaff your book is absolutely brilliant!

Chapter 3.4.4 - Dimensions after the nn.Flatten module

In Chapter 3.4.4 the code for creating a first CNN is shown. For the nn.Flatten layer before the last layer, the code comment (point 10 in the book) says, "Converts from (B, C, W, H) -> (B, D) so we can use a Linear layer".
Shouldn't it actually be (B, filters, C, W, H) -> (B, filters*D)?
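
For reference, a small sketch (shapes are illustrative) of how nn.Flatten with its default start_dim=1 reshapes a convolutional feature map; note that the output of a conv layer is already (B, C, H, W), where C is the number of filters of that layer, so there is no separate filters axis:

import torch
import torch.nn as nn

x = torch.randn(32, 16, 14, 14)   # (B=32, C=16, H=14, W=14)
flat = nn.Flatten()(x)            # collapses everything after the batch dimension
print(flat.shape)                 # torch.Size([32, 3136]), i.e. (B, C*H*W)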

typos and suggestions

Figure 6.2: the 4th "high complexity" function figure is not a function because the figure suggests that a particular x value can map to multiple distinct y values.

Section 6.3.2: in the algorithm annotations on page 231, "cat=3" should be "dim=3"

Section 6.6.2: "anything time 0" should be "anything times 0"

Section 6.6.2: "forget fate" -> "forget gate" in the second to last paragraph in that section.

Section 9.0: "GANS" -> "GANs" right before the start of section 9.1.

Section 9.5.1: in the algorithm annotations on page 382 "inear" -> "linear"

Section 9.5.3: "which is now always a possibility" -> "which is not always a possibility"

Section 11.2: in the first figure describing the data: "journxe" -> "journée"

Also in the last paragraph of Section 11.2: "almond" translates to "amande" not "amende". Perhaps a homograph example might be clearer, like "avocat" which means both "lawyer" and "avocado" in French.

In Figure 11.6, it is unclear why there is an arrow from the "z hat" box to the "Attention" box.

Section 12.2.1: "EmbeddingAttentionBad" -> "EmbeddingAttentionBag"

Section 13.1.1: The cats and dogs dataset is no longer downloadable from the link assigned to "data_url_zip" in the code example.

Section 14.3.1: It might be informative to show some additional parameter settings for the Beta distribution. In particular, settings for which the distribution is not U-shaped, is asymmetrical, or looks like a uniform distribution.

There is a random figure on the next to last page in the book. Is this normal?

Use padding mask for attention in SimpleTransformerClassifier

I think in the forward pass of the TransformerEncoder a padding mask for the attention should be used.
The padding tokens need to be excluded when calculating the attention weights. This is related to Chapter 12.2.1.

See cell 33 here. See also the PyTorch docs for reference.

It should be changed into something like this (the src_key_padding_mask needs to be True for the values that need to be masked out):

def forward(self, input):
    if self.padding_idx is not None:
        mask = input != self.padding_idx
        src_key_padding_mask = torch.logical_not(mask)
    else:
        mask = input == input
        src_key_padding_mask = None
    x = self.embd(input) #(B, T, D)
    x = self.position(x) #(B, T, D)
    #Because the result of our code is (B, T, D), but transformers
    #take input as (T, B, D), we will have to permute the order
    #of the dimensions before and after
    x = self.transformer(x.permute(1,0,2), src_key_padding_mask=src_key_padding_mask) #(T, B, D)
    x = x.permute(1,0,2) #(B, T, D)
    #average over time
    context = x.sum(dim=1)/mask.sum(dim=1).unsqueeze(1)
    return self.pred(self.attn(x, context, mask=mask))
