rasbt / deeplearning-models

A collection of various deep learning architectures, models, and tips

License: MIT License

Jupyter Notebook 99.81% Python 0.19% Shell 0.01%

deeplearning-models's Issues

Question about conditional autoencoder

Hi Sebastian,
I have some questions about the CVAE and hope you can help me understand.
Rather than training the CVAE on images, I am trying to train it on a list of NumPy arrays, where each array contains 45 parameters.
My condition variable consists of 33 parameters.
In total, the training X has size 1500x43 and the condition variable has size 33x3.
Do you have any ideas on what modifications I need to make to run the CVAE on my example? Basically, I am trying to generate new values for each of the 45 parameters given the values in the condition variable.
Also, would a CVAE be a suitable choice for my case?

pre-trained model

Thanks a lot for an amazing project. I want to ask whether there is a pretrained ResNet-18 model trained on CelebA.

how to choose the parameter num_epochs

First, thanks for your excellent project. It's very friendly for beginners.
I noticed that on the dog-vs-cats dataset the num_epochs hyperparameter is 100, which is larger than for the CIFAR dataset, even though CIFAR has more data and more classes. So why do we need to train the network for more epochs on this dataset? How can I choose a proper value for num_epochs? Looking forward to your response.
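
One common way to avoid hand-picking num_epochs is to set a generous upper bound and stop once the validation loss no longer improves. The following is a minimal sketch of that idea, not something taken from the notebooks: the function name, the `patience` and `min_delta` settings, and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve by
    more than `min_delta` for `patience` consecutive epochs. If that
    never happens, all epochs are used.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stop_at = early_stopping_epoch(losses, patience=3)
```

With this approach num_epochs only acts as a ceiling, so a small dataset that needs many passes and a large one that converges quickly can share the same training loop.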

TSV to HDF5 converter on a very large dataset

Hi,
I'm trying to convert several TSV files from the C4 200M dataset into HDF5 format, and I based my conversion on your notebook.

The dataset is composed of 10 files, each containing approximately 18 million records with 2 string columns.
Given the size of the dataset, I thought that converting it to HDF5 would let me know the shape of each file and give a significant performance boost when reading chunks of the dataset.

In a first trial I converted 1 million records in about 3 minutes; however, when I tried to convert all 18 million records, it took more than 6 hours per file.
I am currently loading my TSV in the following way:

def csv_to_hf5(csv_path, num_lines=1000000, chunksize=100000, columns=None):
    if columns is None:
        columns = ['input', 'labels']
    csv_path = pl.Path(csv_path)

    hdf_filename = csv_path.parent / pl.Path(csv_path).name.replace('.tsv', '.hf5')

    # suppose this is a large CSV that does not
    # fit into memory:

    # Get number of lines in the CSV file if it's on your hard drive:
    # num_lines = subprocess.check_output(['wc', '-l', in_csv])
    # num_lines = int(nlines.split()[0])
    # use 10,000 or 100,000 or so for large files

    dt = h5py.special_dtype(vlen=str)

    # this is your HDF5 database:
    with h5py.File(hdf_filename, 'w') as h5f:

        # use num_features-1 if the csv file has a column header
        dset1 = h5f.create_dataset('input',
                                   shape=(num_lines,),
                                   compression=9,
                                   dtype=dt
                                   )
        dset2 = h5f.create_dataset('labels',
                                   shape=(num_lines,),
                                   compression=9,
                                   dtype=dt
                                   )

        # change range argument from 0 -> 1 if your csv file contains a column header
        for i in tqdm(range(0, num_lines, chunksize)):
            df = pd.read_csv(csv_path,
                             sep='\t',
                             names=columns,
                             header=None,  # no header, define column header manually later
                             nrows=chunksize,  # number of rows to read at each iteration
                             skiprows=i,
                             )  # skip rows that were already read

            features = df.input.values.astype(str)
            labels = df.labels.values.astype(str)

            # use i-1 and i-1+10 if csv file has a column header
            dset1[i:i + chunksize] = features
            dset2[i:i + chunksize] = labels

where I set num_lines equal to the total number of lines in each file and chunksize = 10000.

I did not expect this performance degradation. Have you ever used your code to convert a dataset of a similar size?

Thanks in advance.
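
One plausible explanation for the slowdown is the `skiprows=i` pattern: every `pd.read_csv` call reopens the file and has to scan past the first `i` rows before reading, so the total work grows roughly quadratically with the file length. The gzip level 9 requested by `compression=9` may add further cost per write. The sketch below uses only the standard library to show a single sequential pass over the file; `iter_tsv_chunks` and `rows_skipped` are hypothetical helper names, and quoting is assumed to be absent from the TSV.

```python
import csv
import itertools


def iter_tsv_chunks(path, chunksize):
    """Yield (start_row, rows) pairs while reading the file exactly once."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        start = 0
        while True:
            # islice continues from where the previous chunk ended,
            # so no row is ever read twice.
            rows = list(itertools.islice(reader, chunksize))
            if not rows:
                break
            yield start, rows
            start += len(rows)


def rows_skipped(num_lines, chunksize):
    """Total rows skipped by the skiprows=i loop over the whole file."""
    return sum(range(0, num_lines, chunksize))
```

Inside the loop, each chunk could then be written with `dset1[start:start + len(rows)] = [r[0] for r in rows]`, and likewise for the second column. pandas offers the same single-pass behaviour through `pd.read_csv(..., chunksize=...)`, which returns an iterator of DataFrames. For 18 million lines and a chunk size of 10,000, `rows_skipped` gives about 16 billion skipped rows, compared with 18 million rows actually converted.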

Feature Request: FCN with Vgg 16 and Resnet Backbones

Hello Sebastian, could you please also include FCN with VGG16 and ResNet backbones sometime in the near future? Currently I can only find PyTorch implementations with VGG16 backbones, not ResNet. Also, many repos don't explain the appropriate cropping and pixel alignment with the VGG16 backbone.

multi label

Hello,

I appreciate your work. I would like to extend it from binary labels to multiple labels.

class CelebaDataset(Dataset):
    """Custom Dataset for loading CelebA face images"""

    def __init__(self, csv_path, img_dir, transform=None):
    
        df = pd.read_csv(csv_path, index_col=0)
        self.img_dir = img_dir
        self.csv_path = csv_path
        self.img_names = df.index.values
        self.y = df['Male'].values # <-- this needs to be changed  with other labels...
        self.transform = transform

What should I do if I want to change it to several classes?

Missing file "pytorch_ipynb/mlp/mlp-sequential.ipynb"

In the folder pytorch_ipynb/mlp, I can't find the file for "Sequential API and hooks [PyTorch]".

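Autonomous driving update notes

Hello, the content you have compiled is very comprehensive. Could you recommend my notes? I would like to share my understanding of autonomous driving with everyone, and I hope others will join me in continuously improving the content. Thank you.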
Autopilot-Updating-Notes

Inconsistent implementation between TF and PyTorch

In the following implementations, the max_pool op in PyTorch uses stride 2, whereas the max_pool op in TensorFlow uses stride 1.
I am confused by this, because max_pool is supposed to subsample the signal, but with stride 1 this implementation cannot do so.
https://github.com/rasbt/deeplearning-models/blob/master/pytorch_ipynb/cnn/cnn-basic.ipynb
https://github.com/rasbt/deeplearning-models/blob/master/tensorflow1_ipynb/cnn/cnn-basic.ipynb

Thank you in advance. @rasbt
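
The effect of the stride on the output size can be checked with the standard pooling size formulas. This is a small sketch under stated assumptions: a 28x28 input, a kernel size of 2, and 'SAME' padding on the TensorFlow side are assumed here and were not verified against the two notebooks. The function names are hypothetical.

```python
import math


def pool_out_torch(n, kernel, stride, padding=0):
    """Output length along one axis for torch.nn.MaxPool2d
    (dilation=1, ceil_mode=False)."""
    return (n + 2 * padding - kernel) // stride + 1


def pool_out_tf_same(n, stride):
    """Output length along one axis for TensorFlow pooling
    with padding='SAME'."""
    return math.ceil(n / stride)


torch_stride2 = pool_out_torch(28, kernel=2, stride=2)
tf_stride1 = pool_out_tf_same(28, stride=1)
tf_stride2 = pool_out_tf_same(28, stride=2)
```

Under these assumptions, stride 2 halves each spatial dimension while stride 1 with 'SAME' padding leaves it unchanged, which is the difference the issue describes.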

Imbalanced Classes

Hi,

To handle the imbalance, I tried duplicating the rows of the minority class in the train.csv file itself. But I am getting an error while joining the paths:

Traceback (most recent call last):
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 302, in <module>
    model_conv = train_model(model_conv, optimizer_conv, exp_lr_scheduler, num_epochs=10)
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 207, in train_model
    for inputs, labels in dataloaders[phase]:
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 69, in __getitem__
    img = Image.open(os.path.join(self.img_dir, self.img_names[index]))
  File "/home/dataset/packages/python/3.7/lib/python3.7/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/home/dataset/packages/python/3.7/lib/python3.7/genericpath.py", line 149, in _check_arg_types
    (funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'int64'
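
The last frame shows that `os.path.join` received an integer, which means `self.img_names[index]` holds a number rather than a file name. One possible cause, stated here only as an assumption, is that the edited train.csv was saved with a numeric index as its first column, so `index_col=0` now picks up row numbers instead of image names. The sketch below reproduces the error with the standard library and shows a defensive cast; `image_path` is a hypothetical helper.

```python
import os


def image_path(img_dir, img_name):
    """Join a directory and an image name, tolerating non-string names."""
    return os.path.join(img_dir, str(img_name))


# Passing an integer straight to os.path.join raises the TypeError
# seen in the traceback.
try:
    os.path.join("img_dir", 123)
except TypeError as exc:
    error_message = str(exc)

safe_path = image_path("img_dir", 123)
```

The cast only avoids the exception. If the values really are row numbers, the resulting path will not point at an image, so it is worth checking which column the CSV index comes from first.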

CNN Link

The "Convolutional Neural Network in TensorFlow" link does not open (page not found).

Understanding the models

Thank you SO MUCH for organizing the ever-growing corpus. HOWEVER, it would be nice if there were an algorithm that would help me choose the right one. For example, if I wanted to do reverse style transfer (convert a Monet painting into a photograph), which model should I use? And obviously I would first need a training set (which I could generate by processing a billion images from Google with a generic style transfer app).

In cnn-vgg16.ipynb, the accuracy has remained unchanged at 10%!

The results are as follows.

# Epoch: 001/010 | Train: 10.000% | Loss: 2.520%
# Time elapsed: 2.24 min
# Epoch: 010/010 | Train: 10.000% | Loss: 2.303%
# Time elapsed: 22.42 min
# Total Training Time: 22.42 min
# Test accuracy: 10.00%

I have tried increasing the learning rate and adding Dropout and nn.AdaptiveAvgPool2d, but the accuracy has remained unchanged!
Thanks!
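
The reported numbers are what a network that predicts every class with equal probability would produce. A quick check, assuming a natural-log cross-entropy loss and a balanced 10-class dataset (both assumptions, since the issue does not state them):

```python
import math

num_classes = 10

# Cross-entropy of a uniform prediction over all classes: -ln(1/10).
uniform_loss = -math.log(1.0 / num_classes)

# Accuracy of guessing on a balanced dataset.
chance_accuracy = 1.0 / num_classes
```

`uniform_loss` rounds to 2.303, matching the loss logged at epoch 10, and `chance_accuracy` is the 10% shown for both training and test. This suggests the weights are not being updated in a useful way, rather than the model simply needing more epochs.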

missing data folder

I tried to run the perceptron model locally after cloning, but the data folder is missing from the repo.
I got this error: OSError: ../../ch02_perceptron/perceptron_toydata.txt not found.

HTTPError: HTTP Error 503: Service Unavailable

deeplearning-models/pytorch_ipynb/cnn/cnn-resnet50-mnist-dataparallel.ipynb

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
0it [00:00, ?it/s]
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-4-38058be550e3> in <module>
      5 # Note transforms.ToTensor() scales input images
      6 # to 0-1 range
----> 7 train_dataset = datasets.MNIST(root='data', 
      8                                train=True,
      9                                transform=transforms.ToTensor(),

MNIST download not working...

Imbalanced Classes

Hi,
Does this implementation handle class-imbalanced datasets? Since some of the features of the CelebA dataset have a high class imbalance, does your solution address this issue?

Perceptron TF v1 code: SavedModel graph says it has no operation

Hi, I am using this code from the repo:

g = tf.Graph()
with g.as_default():
    # Graph Inputs
    features = tf.placeholder(dtype=tf.float32,
                              shape=[None, 2], name='features')
    targets = tf.placeholder(dtype=tf.float32,
                             shape=[None, 1], name='targets')

    # Model Parameters
    weights = tf.Variable(tf.zeros(shape=[2, 1],
                                   dtype=tf.float32), name='weights')
    bias = tf.Variable([[0.]], dtype=tf.float32, name='bias')

    # Forward Pass
    linear = tf.add(tf.matmul(features, weights), bias, name='linear')
    ones = tf.ones(shape=tf.shape(linear))
    zeros = tf.zeros(shape=tf.shape(linear))
    prediction = tf.where(condition=tf.less(linear, 0.),
                          x=zeros,
                          y=ones,
                          name='prediction')

    # Backward Pass
    errors = targets - prediction
    weight_update = tf.assign_add(weights,
                                  tf.reshape(errors * features, (2, 1)),
                                  name='weight_update')
    bias_update = tf.assign_add(bias, errors,
                                name='bias_update')

    train = tf.group(weight_update, bias_update, name='train')

    saver = tf.train.Saver(name='saver')

and I save it using

inputs = dict([(features.name, features)])
outputs = dict([(prediction.name, prediction)])
tf.saved_model.simple_save(sess, "my_path", inputs, outputs)

I can inspect the model with saved_model_cli; the following is part of its output:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['features:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 2)
        name: features:0

But when I use TF2's tf.keras.models.load_model("my_path"), it raises the error KeyError: "The name 'features:0' refers to a Tensor which does not exist. The operation, 'features', does not exist in the graph." Using the Java API raises a similar error.

Can this model be saved as a SavedModel? How should I do it correctly?

Questions on notebook wgan-1.ipynb

Hi,

I am going through the notebook wgan-1.ipynb and upon running the script as is I encounter the following error:

Traceback (most recent call last):
  File "3_wgan.py", line 162, in <module>
    real_loss = wasserstein_loss(valid, disc_pred_real)
  File "3_wgan.py", line 76, in wasserstein_loss
    return torch.mean(y_true * y_pred)
RuntimeError: The size of tensor a (128) must match the size of tensor b (100352) at non-singleton dimension 0

As a result, I updated the line in the discriminator training part to:

real_loss = wasserstein_loss(valid, disc_pred_real[:128])

so that the dimensions match.

However, the last batch contains only 96 images and labels, so the dimensions are mismatched again. I am wondering whether there is a better way to approach this?

Thank you
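
The two sizes in the error message are related by a round number, which hints at where the mismatch comes from. The arithmetic below uses only the values quoted in the issue. Reading 784 as 28 x 28 pixels assumes MNIST-sized images, which the issue does not state.

```python
batch_size = 128
last_batch_size = 96
pred_elements = 100352

# How many prediction values there are per sample in a full batch.
elements_per_sample = pred_elements // batch_size
remainder = pred_elements % batch_size

# The same ratio applied to the final, smaller batch.
last_batch_elements = last_batch_size * elements_per_sample
```

Since 100352 = 128 x 784, the discriminator output appears to hold 784 values per image rather than one score per image. Slicing with `[:128]` makes the shapes agree but keeps only the first 128 of those values, and it cannot fit a batch of 96. A more robust direction would be to check where the per-sample dimension gets flattened, and to build the `valid` labels from the actual batch size instead of a fixed 128.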

Questions about the book

Dear author, thank you for offering the code.

But I can't open the ebook link https://leanpub.com/ann-and-deeplearning

Could you please send me the pdf of the book?
My email address is [email protected]

Thank you very much!

test_dataset with train=True in cnn-densenet121-mnist.ipynb

test_dataset = datasets.MNIST(root='data', 
                              train=True, 
                              transform=transforms.ToTensor(),
                              download=True)

Validation ACC: 98.50%
Test ACC: 99.91%

https://github.com/rasbt/deeplearning-models/blob/master/pytorch_ipynb/cnn/cnn-densenet121-mnist.ipynb

Question regarding gradient checkpointing

Hello,

I am trying to understand gradient checkpointing and found your explanation in gradient-checkpointing-nin.ipynb very helpful. I cloned the repo and tried rerunning the experiments. However, I was unable to reproduce the result mentioned in your conclusion.

When I run the notebook with the vanilla NiN, my memory consumption (current, peak) is 413527 and 154049604, with a runtime of 109.1s.
For the checkpointed version (segments=1) of the model, the memory consumption is 402938 and 154064699, with a runtime of 110.14s.
From these tests, I was not able to observe the significant memory improvement that the notebook states (22% memory improvement with a 14% runtime sacrifice).

I've tried running with multiple seeds and checkpoint segment sizes, and was not able to see a significant memory improvement either.

I'm not sure why this is and could use a bit of help. Could it be that the network is relatively small, so the effects are less obvious? Or could it be that the checkpointing implementation in PyTorch has changed over the years? I would appreciate any insight you could provide.
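
To compare the measurements above with the 22% and 14% figures from the notebook, the relative changes can be computed directly. The helper name is hypothetical, and the inputs are the numbers quoted in this issue (memory units are not given there).

```python
def relative_change(baseline, value):
    """Fractional change of `value` relative to `baseline`."""
    return (value - baseline) / baseline


# Vanilla NiN as baseline, checkpointed run (segments=1) as value.
current_mem_change = relative_change(413527, 402938)
peak_mem_change = relative_change(154049604, 154064699)
runtime_change = relative_change(109.1, 110.14)
```

Current memory drops by about 2.6%, peak memory is essentially unchanged, and the runtime rises by about 1%, all far from the figures in the notebook's conclusion.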

question about features = (features - 0.5)*2 in GAN model

Hi.
I noticed the operation features = (features - 0.5)*2 in the Generative Adversarial Networks (GAN) notebook. I don't understand why we need to do this here. The mean and standard deviation of the MNIST dataset are 0.1307 and 0.3081. Can you please explain the purpose of this step? Looking forward to your reply.
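
The expression is a linear rescale rather than a standardization with the dataset mean and standard deviation. As a minimal sketch, assuming the pixel values arrive in the 0-1 range (as they do after transforms.ToTensor()), it maps them onto the -1 to 1 range; the function name is hypothetical.

```python
def rescale_to_symmetric_range(x):
    """Map a value from [0, 1] to [-1, 1]."""
    return (x - 0.5) * 2


darkest = rescale_to_symmetric_range(0.0)
mid_gray = rescale_to_symmetric_range(0.5)
brightest = rescale_to_symmetric_range(1.0)
```

This kind of rescaling is commonly paired with a generator whose last layer is a tanh, so that real and generated images share the same value range. Whether that is the reason in this particular notebook is an assumption.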

A problem with the LeNet-5 classifier

Hi,
The LeNet-5 model contains no nonlinearity. Is that a mistake?

Bidirectional Multi-layer RNN with LSTM with Own Dataset in CSV Format (AG News)

In this notebook, while defining the model you didn't apply self.fc2()
So the shape of the returned tensor is (128, 64) instead of (128, 4).
But the interesting part is that it still achieves great accuracy. Now I'm just wondering why it is working.
Also, model(text, text_lengths) returns a tensor of size (128, 64), so why are we calling .squeeze(1) on that tensor?
Since the second dimension is not 1, the tensor shape remains the same. Is there any edge case we are dealing with?

NameError: name 'custom_where' is not defined

Hello, I ran into the following problem in PyCharm with
deeplearning-models/pytorch_ipynb/basic-ml/perceptron.ipynb

Traceback (most recent call last):
  File "/home/aibc/Desktop/DL/pytorch_ML/traditional_ml/preceptron/preceptron.py", line 99, in <module>
    ppn.train(X_train_tensor, y_train_tensor, epochs=5)
  File "/home/aibc/Desktop/DL/pytorch_ML/traditional_ml/preceptron/preceptron.py", line 84, in train
    errors = self.backward(x[i].view(1, self.num_features), y[i]).view(-1)
  File "/home/aibc/Desktop/DL/pytorch_ML/traditional_ml/preceptron/preceptron.py", line 75, in backward
    predictions = self.forward(x)
  File "/home/aibc/Desktop/DL/pytorch_ML/traditional_ml/preceptron/preceptron.py", line 71, in forward
    predictions = custom_where(linear > 0., 1, 0).float()
NameError: name 'custom_where' is not defined

I looked up the definition of the function, but nothing came of it.
I hope you can help me. Thank you

The out channel number in cnn_basic pytorch

28x28x1 => 28x28x4

    self.conv_1 = torch.nn.Conv2d(in_channels=1,
                                  out_channels=8,

With out_channels = 8, shouldn't the output become 28x28x8?
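
The shape can be checked with the usual convolution size formula: the number of output feature maps always equals out_channels. In this sketch the kernel size of 3, stride of 1, and padding of 1 are assumptions, because the snippet above is cut off before those arguments; the function name is hypothetical.

```python
def conv2d_output_shape(height, width, out_channels,
                        kernel_size, stride=1, padding=0):
    """Return (height, width, channels) of a 2D convolution output."""
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    return out_h, out_w, out_channels


# 28x28x1 input, 8 output channels, assumed 3x3 kernel with padding 1.
shape = conv2d_output_shape(28, 28, out_channels=8,
                            kernel_size=3, stride=1, padding=1)
```

Under these assumptions the result is 28x28x8, which supports the reading that the "28x28x4" comment is out of date with the code.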
