ildoonet / pytorch-randaugment

Unofficial PyTorch Reimplementation of RandAugment.

License: MIT License

Python 100.00%

pytorch autoaugment augmentation computer-vision imagenet cifar classification deep-learning convolutional-neural-networks

pytorch-randaugment's Issues

the pretrained model

Would you upload the pretrained models behind your reported results? Thanks.

Hi there, thank you for your effort in trying to replicate the experimental results of RandAugment. I've cloned the repository, and I am trying to rerun RandAugment on Cifar10, but the following error comes up:

python RandAugment/train.py -c confs/wresnet28x10_cifar10_b256.yaml --save cifar10_wres28x10.pth
  File "RandAugment/train.py", line 13, in <module>
    from theconf import Config as C, ConfigArgumentParser
ImportError: cannot import name 'Config'

I've installed theconf from https://github.com/ildoonet/theconf, but it seems that it is an outdated version of what you are using, as there is no Config class.

Can I use this project to generate a strategy for my own dataset?

Unable to reproduce Cifar-10 results for WRN-28-10

Hi,

I am unable to reproduce cifar-10 results for the model WRN-28-10. The accuracy I get stays as low as 96.7% but the reported accuracy is 97.4% (this repo) / 97.3% (in paper). Anyone else facing similar issues? @ildoonet Is your reported accuracy the average of multiple runs or only for a single run?

I am using the following command:

python RandAugment/train.py -c confs/wresnet28x10_cifar10_b256.yaml

Thanks

It seems that you don't use the Linear Evaluation Protocol?

    top1, top5 = accuracy(logits, labels, topk=(1, 5))

The top-1 accuracy doesn't look like LEP accuracy; it looks like the contrastive learning accuracy. Could you please tell me where the LEP code is?

Can you please clarify the idea behind it

By tuning only two hyperparameters (N, M), you can achieve performance competitive with AutoAugment

Does RandAugment randomly choose the augmentation types and their parameters on every iteration, and does the model still improve in that case?
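
Going by the `__call__` loop quoted in another issue on this page (`ops = random.choices(self.augment_list, k=self.n)`), the operations appear to be redrawn on every call, that is, for every image, while the magnitude M stays fixed. A minimal sketch of that sampling scheme follows; the stand-in operations act on a number instead of an image and are hypothetical, as is the class name:

```python
import random

# Hypothetical stand-in ops as (function, minval, maxval); real ops act on PIL images.
def shift(x, v):
    return x + v

def scale(x, v):
    return x * v

AUGMENT_LIST = [(shift, 0.0, 10.0), (scale, 1.0, 2.0)]

class RandAugmentSketch:
    def __init__(self, n, m, rng=None):
        self.n = n          # number of ops applied per sample
        self.m = m          # shared magnitude on a [0, 30] scale
        self.rng = rng or random.Random()
        self.last_ops = []  # names of the ops used in the most recent call

    def __call__(self, x):
        # Ops are drawn afresh (with replacement) on every call.
        ops = self.rng.choices(AUGMENT_LIST, k=self.n)
        self.last_ops = [op.__name__ for op, _, _ in ops]
        for op, minval, maxval in ops:
            # Same linear magnitude mapping as the loop quoted on this page.
            val = (self.m / 30) * (maxval - minval) + minval
            x = op(x, val)
        return x
```

Calling the same instance repeatedly can therefore apply a different pair of operations each time; only N and M are fixed in advance.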

How to change the architecture?

Hi. Thanks for the great work here.

Actually, I am trying to change a bit of your implementations.

However, whenever I try, I get an error showing that my changes to the architecture file are not being picked up.
The changes I make to files in the RandAugment directory do not take effect, even though the code has been edited.

I think this is related to how the package was installed with pip...

How can I make a change to implementations in RandAugment directory and run with that?
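
A quick way to check which copy of the package Python actually imports is sketched below, using only the standard library. `module_origin` is a hypothetical helper name, and the package name `RandAugment` is taken from the tracebacks on this page:

```python
import importlib.util

def module_origin(name):
    """Return the file Python would import for `name`, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

print(module_origin("RandAugment"))
```

If the printed path points into site-packages, an installed copy is shadowing the working tree. Reinstalling in development mode (for example `pip install -e .` from the repository root, assuming the repository ships a setup script) is a common remedy.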

Small remark on --cv-ratio (i.e. validation set)

Hey,

I am not sure whether --cv-ratio was intended for use, but I believe the wrong set of transformations is given to the respective dataloader, i.e. the validation dataloader is given the transformations of the training set (including the RandAugment transformations).

The prob of applying augmentation?

Hi and thanks for this awesome repo.

I just checked the original TensorFlow implementation and found a part that differs from this repo. In the original implementation, each augmentation is applied only with some probability, but I did not find this in this repo.

The link for TensorFlow version: https://github.com/tensorflow/tpu/blob/5144289ba9c9e5b1e55cc118b69fe62dd868657c/models/official/efficientnet/autoaugment.py#L532

Original:
with tf.name_scope('randaug_layer_{}'.format(layer_num)):
    for (i, op_name) in enumerate(available_ops):
        prob = tf.random_uniform([], minval=0.2, maxval=0.8, dtype=tf.float32)
        func, _, args = _parse_policy_info(op_name, prob, random_magnitude,
                                           replace_value, augmentation_hparams)

This repo:
ops = random.choices(self.augment_list, k=self.n)
# print (ops)
for op, minval, maxval in ops:
    val = (float(self.m) / 30) * float(maxval - minval) + minval
    img = op(img, val)

May I ask whether there is a reason for this? Or is there something I am missing?

Thanks in advance
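
One way an application probability could be added to this repo's loop is sketched below. This is an illustration only: `apply_prob`, `apply_ops_with_prob`, and the stand-in op are hypothetical, and the code is not a port of the TensorFlow snippet quoted above.

```python
import random

def apply_ops_with_prob(x, ops, m, apply_prob, rng):
    """Apply each (op, minval, maxval) entry with probability `apply_prob`."""
    applied = []
    for op, minval, maxval in ops:
        if rng.random() >= apply_prob:
            continue  # skip this op and leave x unchanged
        val = (m / 30) * (maxval - minval) + minval
        x = op(x, val)
        applied.append(op.__name__)
    return x, applied

def add(x, v):  # hypothetical stand-in for an image op
    return x + v
```

With `apply_prob=1.0` this reduces to the loop quoted for this repo; smaller values leave some of the sampled ops unapplied.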

Posterize function error

Hello, and thanks for your repo!

I found a minor error in the Posterize function when using the RandAugment class.
Although the comment says # [4, 8],
the range actually passed is [0, 4].

When v=0, the image has no levels left, so it comes out all black.
I think v should be more than 2.

Anyway, thank you for implementing RandAugment in PyTorch!
It is helpful!
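
Why v=0 produces a black image can be seen from the usual bit-masking definition of posterization, sketched here on a single 8-bit channel value. `posterize_value` is a hypothetical helper written for illustration, not the PIL function itself:

```python
def posterize_value(value, bits):
    """Keep only the top `bits` bits of an 8-bit channel value."""
    mask = ~(2 ** (8 - bits) - 1) & 0xFF
    return value & mask

for bits in (0, 2, 4, 8):
    print(bits, posterize_value(200, bits))
```

With `bits=0` the mask is zero, so every pixel value maps to 0; with `bits=8` the value is unchanged.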

Why set depth=272 in pyramidnet?

An experiment with such a huge network takes 10 days on 4 GPUs.

SmoothCrossEntropyLoss


1. class SmoothCrossEntropyLoss(Module):
2.     def __init__(self, label_smoothing=0.0, size_average=True):
3.         super().__init__()
4.         self.label_smoothing = label_smoothing
5.         self.size_average = size_average
6. 
7.     def forward(self, input, target):
8.         if len(target.size()) == 1:
9.             target = torch.nn.functional.one_hot(target, num_classes=input.size(-1))
10.             target = target.float().cuda()
11.         if self.label_smoothing > 0.0:
12.             s_by_c = self.label_smoothing / len(input[0])
13.             smooth = torch.zeros_like(target)
14.             smooth = smooth + s_by_c
15.             target = target * (1. - s_by_c) + smooth
16. 
17.         return cross_entropy(input, target, self.size_average)

It seems that this does not perform label smoothing as I know it.

(With 7 classes) printing the output of line 15 gives:

[1.0000, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143]

The label smoothing formula is:

y_ls = y_k * (1 - a) + a / K

but line 15 computes:

y_ls = y_k * (1 - a / K) + a / K

Corrected code and result:

15. target = target * (1. - self.label_smoothing) + smooth

[0.9143, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143]

Maybe I am misunderstanding something?
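
The two formulas can be compared in plain Python. This is a sketch assuming a = 0.1 (which matches the printed 0.0143 = a / 7); the function names are hypothetical:

```python
def smooth_as_written(one_hot, a):
    """Line 15 as written: scales the target by (1 - a / K)."""
    k = len(one_hot)
    s_by_c = a / k
    return [y * (1.0 - s_by_c) + s_by_c for y in one_hot]

def smooth_standard(one_hot, a):
    """Standard label smoothing: y_k * (1 - a) + a / K."""
    k = len(one_hot)
    return [y * (1.0 - a) + a / k for y in one_hot]

one_hot = [1.0] + [0.0] * 6  # 7 classes, true class first
a = 0.1                      # assumed smoothing value

print([round(v, 4) for v in smooth_as_written(one_hot, a)])
print([round(v, 4) for v in smooth_standard(one_hot, a)])
```

Only the standard form yields targets that sum to 1; the as-written targets sum to 1 + a(K - 1)/K.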

RandAugment hyperparameters for markdown tables

Hello,

It would be nice if you could add the RandAugment hyperparameters N and M that were used to obtain the results for each dataset.

Can not reproduce the result (both paper & yours)

I reran your code, and the results do not look as good as reported.
My CIFAR-10 (Wide-ResNet 28x10) result:

{
    "loss_train": 0.46477456643031195,
    "loss_valid": 0.0,
    "loss_test": 0.10433908870220185,
    "top1_train": 0.8350560897435897,
    "top1_valid": 0.0,
    "top1_test": 0.9663,
    "top5_train": 0.971133814102564,
    "top5_valid": 0.0,
    "top5_test": 0.9991,
    "epoch": 200
}

Can you tell me about your running environment, such as the number of GPUs and the GPU model?

One more question: why is the top1_train accuracy so low?

Questions about pyramidNet

Hi, I would like to ask how long it took you to train PyramidNet+ShakeDrop on CIFAR-100.
What were your experimental settings? How many nodes and GPUs did you use? Did you use Horovod for distributed training? I am also wondering whether the batch size of 64 is the total batch size or the batch size per GPU.

Thank you !

Reproducibility Problem(ResNet-50 on ImageNet)

Below are the top-1 errors from a grid search over N and M, with epoch=180.

M\N	1	2	3
5	0.2318	0.2303	0.2327
7	0.2281	0.2294	0.2323
9	0.2292	0.2289	0.2301
11	0.2264	0.2284	0.2320
13	0.2282	0.2294	0.2294
15	0.2258	0.2265	0.2297

However, when I changed the number of epochs from 180 (the paper's) to 270 (AutoAugment's), the result with N=2, M=9 was similar to the reported value (top-1 error = 22.4).

About the 'PermissionError'?

Thank you for the repo! I tried to reproduce the results but ran into a problem:

Traceback (most recent call last):
  File "RandAugment/train.py", line 267, in <module>
    result = train_and_eval(args.tag, args.dataroot, test_ratio=args.cv_ratio, cv_fold=args.cv, save_path=args.save, only_eval=args.only_eval, metric='test')
  File "RandAugment/train.py", line 93, in train_and_eval
    trainsampler, trainloader, validloader, testloader_ = get_dataloaders(C.get()['dataset'], C.get()['batch'], dataroot, test_ratio, split_idx=cv_fold)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/RandAugment/data.py", line 79, in get_dataloaders
    total_trainset = torchvision.datasets.CIFAR10(root=dataroot, train=True, download=True, transform=transform_train)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/cifar.py", line 64, in __init__
    self.download()
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/cifar.py", line 148, in download
    download_and_extract_archive(self.url, self.root, filename=self.filename, md5=self.tgz_md5)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/utils.py", line 248, in download_and_extract_archive
    download_url(url, download_root, filename, md5)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/utils.py", line 74, in download_url
    makedir_exist_ok(root)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/utils.py", line 50, in makedir_exist_ok
    os.makedirs(dirpath)
  File "/home/davis/anaconda3/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/davis/anaconda3/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/davis/anaconda3/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'

If anybody knows how to deal with this, please leave a comment here.
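
The traceback shows the dataset root resolving to '/data', which the current user is not allowed to create. A minimal sketch of falling back to a writable directory before building the dataset follows; `pick_dataroot` is a hypothetical helper, not part of this repo:

```python
import os

def pick_dataroot(preferred, fallback):
    """Return the first of the two directories that can be created and written to."""
    for path in (preferred, fallback):
        try:
            os.makedirs(path, exist_ok=True)
        except OSError:  # PermissionError is a subclass of OSError
            continue
        if os.access(path, os.W_OK):
            return path
    raise PermissionError(f"neither {preferred!r} nor {fallback!r} is writable")
```

For example, `pick_dataroot("/data", os.path.expanduser("~/data"))` would return a directory under the home folder when '/data' cannot be created. The result can then be supplied as the dataset root, which the traceback shows arriving through `args.dataroot`.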

epoch= 1800 in shake26_2x96d_cifar10_b512.yaml?

cannot import name 'Config' from 'theconf' (/home/../python3.7/site-packages/theconf/__init__.py)

Any suggestion?

Question about the modified operation types

I noticed that the augment_list is not the same as in the paper. Have you evaluated the individual operations and settled on a more effective set?
Moreover, would it be better to include Cutout among the operators instead of applying it as a standalone step?

Man, where is the implementation?

Why do you use M scale [0, 30] instead of [0, 10]?

Why do you use M scale [0, 30] instead of [0, 10]? I think the original paper uses [0, 10]. Maybe this should be made clear in README.md?
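
If both scales are read as linear fractions of the same per-operation range (an assumption; the per-operation ranges in the paper and in this repo may differ), converting between them is a simple rescaling. `rescale_magnitude` is a hypothetical helper:

```python
def rescale_magnitude(m, from_max=10, to_max=30):
    """Linearly map a magnitude from [0, from_max] to [0, to_max]."""
    return m * to_max / from_max

print(rescale_magnitude(9))          # a [0, 10] magnitude expressed on [0, 30]
print(rescale_magnitude(27, 30, 10)) # and back again
```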

CIFAR10(0) w/ batch size 128

In your configs for both CIFAR-10 and CIFAR-100 you use larger batch sizes than reported in the paper, but the rest of the hyperparameters are the same. I am rather memory-limited, since I want to save extra statistics, so I was wondering whether you tried smaller batch sizes. If so, did it work? Did you choose that batch size simply because it fit on the GPU?

Best regards and Thank You,

Sam

TypeError: 'NoneType' object is not callable

Here is the augmentation function.

def rand_augmentation():
    aug=transforms.Compose([
        transforms.RandomResizedCrop(248, scale=(0.08, 1.0), interpolation=Image.BICUBIC),
        transforms.RandomHorizontalFlip(1),
        transforms.RandomVerticalFlip(1),
        transforms.RandomRotation(degrees=30),
        transforms.ColorJitter(brightness=0.4,contrast=0.4,saturation=0.4),
        transforms.RandomPerspective(distortion_scale=0.1), 
        transforms.RandomAffine(degrees=10),
        transforms.ToTensor(),
        transforms.RandomErasing(p=0.5), 
        transforms.Normalize((0.5, ), (0.5, )),
                          ])
    return aug.transforms.insert(0, RandAugment(4, 3))

Here is the data loader:

def load_data(df,batchsize=8):
    data = SiameseNetworkDataset(df,image_D='2D',transform=(0,rand_augmentation()))
    loader = DataLoader(data,shuffle=True,num_workers=0,batch_size=batchsize)
    return loader

Here is the dataset's __getitem__:

def __getitem__(self,index):
  
        if self.transform[0]==2:
            img0 = self.transform[1](image=np.array(img0))['image']   
            img1 = self.transform[1](image=np.array(img1))['image']  
        else:
            img0=self.transform[1](img0) 
            img1=self.transform[1](img1) 

        return img0, img1 ,label

If I return only aug instead of aug.transforms.insert(0, RandAugment(4, 3)), there is no error.

Error:

TypeError                                 Traceback (most recent call last)
<timed exec> in <module>

D:\Datasets\Image dataset\Xray\SIAMESE-classifier\src\cross_vals.py in kfoldcv(model, data, epochs, n_splits, lr, batchsize, skip_tuning, aug)
     70 
     71         #train on all train images
---> 72         model=train_dl(train_loader,epochs,model,"cuda",criterion,opt)
     73         train_features,train_labels=get_features(train,model)
     74          #now get embeddings of test data

D:\Datasets\Image dataset\Xray\SIAMESE-classifier\src\dl_training.py in train_dl(loader, epochs, model, device, criterion, opt)
    120     model=model.to(device)
    121     for _epoch in range(epochs):
--> 122         for batch in loader:
    123             img1,img2,label=batch
    124             img1_emb,img2_emb=model(img1.to(device)),model(img2.to(device))

C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
    433         if self._sampler_iter is None:
    434             self._reset()
--> 435         data = self._next_data()
    436         self._num_yielded += 1
    437         if self._dataset_kind == _DatasetKind.Iterable and \

C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
    473     def _next_data(self):
    474         index = self._next_index()  # may raise StopIteration
--> 475         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    476         if self._pin_memory:
    477             data = _utils.pin_memory.pin_memory(data)

C:\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
     42     def fetch(self, possibly_batched_index):
     43         if self.auto_collation:
---> 44             data = [self.dataset[idx] for idx in possibly_batched_index]
     45         else:
     46             data = self.dataset[possibly_batched_index]

C:\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in <listcomp>(.0)
     42     def fetch(self, possibly_batched_index):
     43         if self.auto_collation:
---> 44             data = [self.dataset[idx] for idx in possibly_batched_index]
     45         else:
     46             data = self.dataset[possibly_batched_index]

D:\Datasets\Image dataset\Xray\SIAMESE-classifier\src\dataloader.py in __getitem__(self, index)
     49             img1 = self.transform[1](image=np.array(img1))['image']
     50         else:
---> 51             img0=self.transform[1](img0)
     52             img1=self.transform[1](img1)
     53 

TypeError: 'NoneType' object is not callable
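
A likely cause, judging from the code above: `list.insert` mutates the list in place and returns `None`, so `rand_augmentation()` returns `None` and the later call `self.transform[1](img0)` fails. A minimal sketch of the difference, using a hypothetical stand-in for `transforms.Compose` that only mimics its `transforms` list attribute:

```python
class ComposeStandIn:
    """Hypothetical stand-in that mimics Compose's `transforms` list."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x

def build_buggy():
    aug = ComposeStandIn([str.strip])
    return aug.transforms.insert(0, str.lower)  # insert() returns None

def build_fixed():
    aug = ComposeStandIn([str.strip])
    aug.transforms.insert(0, str.lower)  # mutate the list in place first
    return aug                           # then return the pipeline itself
```

Applied to the issue's function, this would mean calling `aug.transforms.insert(0, RandAugment(4, 3))` on its own line and then returning `aug`.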

Brightness, Sharpness, Contrast, Color at magnitude 0 should have v=1?

It seems that some of the PIL functions are not centered. This would likely affect results, because lower magnitudes would lead to stronger distortions.

For example, here is a function that appears centered:

def Rotate(img, v):  # [-30, 30]  NOTE: should this be [0, 30]?
    assert -30 <= v <= 30 
    if random.random() > 0.5:
        v = -v 
    return img.rotate(v)

versus the Color function, which is not centered:

def Color(img, v):  # [0.1,1.9]
    assert 0.1 <= v <= 1.9
    return PIL.ImageEnhance.Color(img).enhance(v)

Perhaps it should be this instead?

def Color(img, v):  # [0,0.9]
    assert 0 <= v <= 0.9
    if random.random() > 0.5:
        v = -v
    return PIL.ImageEnhance.Color(img).enhance(1+v)

Thanks for sharing this repo! It's very readable :)
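
To make the concern concrete: assuming the magnitude is mapped with the formula quoted in another issue on this page, `val = (m / 30) * (maxval - minval) + minval`, and Color uses the range [0.1, 1.9], the identity factor 1.0 is reached at M = 15 rather than at M = 0. A small check in plain Python (`magnitude_to_value` is a hypothetical helper name):

```python
def magnitude_to_value(m, minval, maxval, max_m=30):
    """Linear mapping from magnitude m in [0, max_m] to [minval, maxval]."""
    return (m / max_m) * (maxval - minval) + minval

for m in (0, 15, 30):
    print(m, round(magnitude_to_value(m, 0.1, 1.9), 3))
```

Under these assumptions M = 0 maps to an enhancement factor of 0.1 (a strongly desaturated image), while the unchanged image sits in the middle of the scale.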

The order of augmentation

Hi, thank you for the awesome repo.

I have a question about the order of augmentation.

In the original code, https://github.com/tensorflow/tpu/blob/c61a451165ba643ac9e0ae94448821ca71745ebf/models/official/efficientnet/preprocessing.py#L171, the preprocessing (e.g., random crop) is applied first, and then the RandAug policy is applied.

In contrast, your code (pytorch-randaugment/RandAugment/data.py, line 69 in 48b8f50)

    transform_train.transforms.insert(0, RandAugment(C.get()['randaug']['N'], C.get()['randaug']['M']))

applies the RandAug policy first and the preprocessing afterwards.

Is there a reason for this modification?
Thanks in advance.

view size is not compatible with input tensor's size and stride

I am trying to reproduce the output for
! python RandAugment/train.py -c confs/wresnet28x10_svhn_b256.yaml --save svhn_wres28x10.pth
The error is raised in metrics.py at
correct_k = correct[:k].view(-1).float().sum(0)

Error message: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces)

The same error arises when testing on CIFAR-10.

Reproducing CIFAR-10 results

Hey @ildoonet. Thanks for your clarification on theconf. I've managed to run Wideresnet28_10 on CIFAR10 so far but our results don't match. I got:

"loss_train": 0.7089337987777514,
"loss_valid": 0.0,
"loss_test": 0.10971159720420838,
"top1_train": 0.7357572115384615,
"top1_valid": 0.0,
"top1_test": 0.9627,
"top5_train": 0.8983573717948717,
"top5_valid": 0.0,
"top5_test": 0.9992,
"epoch": 200

Python version: 3.6.9
Cuda version: 10.0
Pytorch version: 1.3.1

What could be the issue?

About the performance of RandAugment and the baseline

I ran wresnet28x10_cifar10_b256 with the YAML file you provide and got a top1_test of 0.9706. When I set aug to 'default' and retrained the model, I got 0.9694.
Did I do something wrong that makes RandAugment perform worse than you reported and the baseline perform better?

Reproducing CIFAR-100 results

Hello, I reproduced CIFAR-100 and got a test accuracy of 81.7%. I also saw in others' questions that you changed your code for the ImageNet dataset, but I cannot find those changes. I would be very grateful if you could point me to them. I'm a beginner, thank you.
