ildoonet / pytorch-randaugment
Unofficial PyTorch Reimplementation of RandAugment.
License: MIT License
Would you upload the pretrained models behind your results? Thanks.
Hi there, thank you for your effort in trying to replicate the experimental results of RandAugment. I've cloned the repository, and I am trying to rerun RandAugment on Cifar10, but the following error comes up:
python RandAugment/train.py -c confs/wresnet28x10_cifar10_b256.yaml --save cifar10_wres28x10.pth
  File "RandAugment/train.py", line 13, in <module>
    from theconf import Config as C, ConfigArgumentParser
ImportError: cannot import name 'Config'
I've installed theconf from https://github.com/ildoonet/theconf, but it seems that it is an outdated version of what you are using, as there is no Config class.
Hi,
I am unable to reproduce the CIFAR-10 results for WRN-28-10. The accuracy I get stays as low as 96.7%, but the reported accuracy is 97.4% (this repo) / 97.3% (paper). Is anyone else facing similar issues? @ildoonet Is your reported accuracy the average of multiple runs or from a single run?
I am using the following command:
python RandAugment/train.py -c confs/wresnet28x10_cifar10_b256.yaml
Thanks
top1, top5 = accuracy(logits, labels, topk=(1, 5))
The top1 accuracy here doesn't look like LEP accuracy; it looks like the contrastive learning accuracy. Could you please tell me where the LEP code is?
"By only tuning two hyperparameters (N, M), you can achieve performance competitive with AutoAugment."
Does RandAugment randomly choose the augmentation types and their parameters on every iteration, and does the model still improve in that case?
Hi. Thanks for the great work here.
I am trying to modify parts of your implementation.
However, whenever I run it, the error I get shows that my changes to the architecture file have not taken effect.
The problem is that the changes I make to files in the RandAugment directory are not picked up, even though the code on disk has changed.
I think this is related to how the package was installed with pip.
How can I change the implementation in the RandAugment directory and run with those changes?
Hey,
I am not sure if --cv-ratio was intended for use, but I believe the wrong set of transformations is given to the respective dataloaders, i.e. the validation dataloader is given the transformations of the training set (including the RandAugment transformations).
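A minimal sketch of one way to keep the two pipelines apart: wrap the same underlying samples in a small view class so that each split carries its own transform. `SplitView`, `train_tf` and `eval_tf` are hypothetical names used only for illustration; they are not taken from the repository, and the data here are plain integers rather than images.

```python
class SplitView:
    """Expose a subset of `base` (a sequence of (image, label)) with its own transform."""

    def __init__(self, base, indices, transform):
        self.base = base
        self.indices = list(indices)
        self.transform = transform

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        image, label = self.base[self.indices[i]]
        return self.transform(image), label


# Stand-in data and transforms; real code would use PIL images and torchvision pipelines.
base = [(x, x % 2) for x in range(10)]
train_tf = lambda img: ("augmented", img)  # hypothetical: would include RandAugment
eval_tf = lambda img: ("plain", img)       # hypothetical: evaluation-time preprocessing only

train_view = SplitView(base, range(0, 8), train_tf)
valid_view = SplitView(base, range(8, 10), eval_tf)
```

With this layout the validation split never sees the training-time augmentation, even though both views share one underlying dataset object.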
Hi and thanks for this awesome repo.
I just checked the original TensorFlow implementation and found one part that differs from this repo. In the original implementation there is a probability of applying or not applying the augmentation, but I did not find it in this repo.
The link for TensorFlow version: https://github.com/tensorflow/tpu/blob/5144289ba9c9e5b1e55cc118b69fe62dd868657c/models/official/efficientnet/autoaugment.py#L532
Original:
with tf.name_scope('randaug_layer_{}'.format(layer_num)):
  for (i, op_name) in enumerate(available_ops):
    prob = tf.random_uniform([], minval=0.2, maxval=0.8, dtype=tf.float32)
    func, _, args = _parse_policy_info(op_name, prob, random_magnitude,
                                       replace_value, augmentation_hparams)
This repo:
ops = random.choices(self.augment_list, k=self.n)
# print (ops)
for op, minval, maxval in ops:
    val = (float(self.m) / 30) * float(maxval - minval) + minval
    img = op(img, val)
May I ask whether there is a reason for this? Or is there something I am missing?
Thanks in advance
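For illustration, here is a small sketch of what a per-op apply probability could look like on top of the sampling loop quoted above. The `p` and `max_level` parameters and the class name `RandAugmentSketch` are assumptions made for this example, not part of the repo, and whether the TensorFlow code actually uses `prob` to skip operations is not established here. The ops are numeric stand-ins so the sketch runs without PIL.

```python
import random


class RandAugmentSketch:
    """Pick `n` ops per call; optionally skip each one with probability 1 - p."""

    def __init__(self, n, m, augment_list, p=1.0, max_level=30):
        self.n = n
        self.m = m
        self.augment_list = augment_list  # entries: (op, minval, maxval)
        self.p = p                        # hypothetical apply probability
        self.max_level = max_level

    def __call__(self, img):
        # The ops are resampled on every call, i.e. for every image.
        ops = random.choices(self.augment_list, k=self.n)
        for op, minval, maxval in ops:
            if random.random() >= self.p:
                continue  # leave the image unchanged for this slot
            val = (float(self.m) / self.max_level) * float(maxval - minval) + minval
            img = op(img, val)
        return img


# Numeric stand-ins for image ops.
toy_ops = [
    (lambda x, v: x + v, 0.0, 3.0),
    (lambda x, v: x * (1.0 + v), 0.0, 0.3),
]
always = RandAugmentSketch(n=2, m=30, augment_list=toy_ops, p=1.0)
never = RandAugmentSketch(n=2, m=30, augment_list=toy_ops, p=0.0)
```

With `p=1.0` the behavior matches the loop from this repo; with `p=0.0` the input passes through untouched.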
Hello, and thanks for your repo!
I found a minor error in the Posterize function when using the RandAugment class.
Although the comment says # [4, 8],
the values actually passed are in [0, 4].
When v=0, no bits are kept, so the image is all black.
I think v must be more than 2.
Anyway, thank you for bringing RandAugment to PyTorch!
It is helpful!
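To make the all-black case concrete, here is a sketch of the bit-masking idea behind posterization on a single 8-bit channel value, together with one possible mapping of the magnitude onto the [4, 8] range named in the comment. Both function names, the `max_level=30` default and the direction of the mapping are assumptions for illustration only.

```python
def posterize_value(pixel, bits):
    """Keep only the top `bits` bits of an 8-bit channel value."""
    mask = ~(2 ** (8 - bits) - 1) & 0xFF
    return pixel & mask


def level_to_bits(m, max_level=30, low=4, high=8):
    """Map magnitude m in [0, max_level] to an integer bit count in [low, high]."""
    return int(round((m / max_level) * (high - low) + low))


print(posterize_value(200, 0))  # 0: no bits kept, the pixel goes black
print(posterize_value(200, 4))  # 192: top four bits kept
print(posterize_value(200, 8))  # 200: unchanged
```

Clamping the bit count to at least `low` keeps every magnitude away from the zero-bit case described above.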
An experiment with such a huge network takes 10 days on 4 GPUs.
1. class SmoothCrossEntropyLoss(Module):
2.     def __init__(self, label_smoothing=0.0, size_average=True):
3.         super().__init__()
4.         self.label_smoothing = label_smoothing
5.         self.size_average = size_average
6.
7.     def forward(self, input, target):
8.         if len(target.size()) == 1:
9.             target = torch.nn.functional.one_hot(target, num_classes=input.size(-1))
10.             target = target.float().cuda()
11.         if self.label_smoothing > 0.0:
12.             s_by_c = self.label_smoothing / len(input[0])
13.             smooth = torch.zeros_like(target)
14.             smooth = smooth + s_by_c
15.             target = target * (1. - s_by_c) + smooth
16.
17.         return cross_entropy(input, target, self.size_average)
It seems that this is not the label smoothing I know.
(With 7 classes) printing the output of line 15 gives:
[1.0000, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143]
The label smoothing formula is:
y_ls = y_k * (1 - a) + a / K
but line 15 computes:
y_ls = y_k * (1 - a / K) + a / K
Corrected code and result:
15. target = target * (1. - self.label_smoothing) + smooth
[0.9143, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143, 0.0143]
Or am I misunderstanding something?
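The two formulas can be checked with plain Python lists. This is a sketch only: the function names are made up for the example, and `alpha=0.1` is an assumption inferred from the printed value 0.0143 with 7 classes.

```python
def smooth_labels(one_hot, alpha):
    """Standard label smoothing: y * (1 - alpha) + alpha / K."""
    k = len(one_hot)
    return [y * (1.0 - alpha) + alpha / k for y in one_hot]


def smooth_labels_as_posted(one_hot, alpha):
    """The variant from line 15 above: y * (1 - alpha / K) + alpha / K."""
    k = len(one_hot)
    return [y * (1.0 - alpha / k) + alpha / k for y in one_hot]


target = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
fixed = smooth_labels(target, alpha=0.1)
posted = smooth_labels_as_posted(target, alpha=0.1)

print([round(v, 4) for v in fixed])
print([round(v, 4) for v in posted])
print(sum(fixed), sum(posted))
```

Only the standard formula yields a target that sums to 1; the posted variant leaves the true class at 1.0 and adds mass to the others, so its total exceeds 1.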
Hello,
It would be nice if you could add the RandAugment hyperparameters N and M that were used to get the results for each dataset.
I reran your code and the results are not as good as reported.
My CIFAR_10(Wide-ResNet 28x10) result:
{
"loss_train": 0.46477456643031195,
"loss_valid": 0.0,
"loss_test": 0.10433908870220185,
"top1_train": 0.8350560897435897,
"top1_valid": 0.0,
"top1_test": 0.9663,
"top5_train": 0.971133814102564,
"top5_valid": 0.0,
"top5_test": 0.9991,
"epoch": 200
}
Can you tell me about your running environment, such as the number of GPUs and the GPU model?
One more question: why is the top1_train accuracy so low?
Hi, how long did it take you to train PyramidNet+ShakeDrop on CIFAR-100?
And what were your experimental settings? How many nodes and GPUs did you use? Did you use Horovod for distributed training? I'm also wondering whether the batch size of 64 is the total batch size or the batch size per GPU.
Thank you!
Below are the top-1 errors from a grid search over N and M, with epoch=180.
M\N | 1 | 2 | 3 |
---|---|---|---|
5 | 0.2318 | 0.2303 | 0.2327 |
7 | 0.2281 | 0.2294 | 0.2323 |
9 | 0.2292 | 0.2289 | 0.2301 |
11 | 0.2264 | 0.2284 | 0.2320 |
13 | 0.2282 | 0.2294 | 0.2294 |
15 | 0.2258 | 0.2265 | 0.2297 |
However, when I changed the number of epochs from 180 (the paper's) to 270 (AutoAugment's), the result with N=2, M=9 was similar to the reported value (top-1 error = 22.4).
Thank you for the repo! I tried to reproduce the results but ran into a problem:
Traceback (most recent call last):
  File "RandAugment/train.py", line 267, in <module>
    result = train_and_eval(args.tag, args.dataroot, test_ratio=args.cv_ratio, cv_fold=args.cv, save_path=args.save, only_eval=args.only_eval, metric='test')
  File "RandAugment/train.py", line 93, in train_and_eval
    trainsampler, trainloader, validloader, testloader_ = get_dataloaders(C.get()['dataset'], C.get()['batch'], dataroot, test_ratio, split_idx=cv_fold)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/RandAugment/data.py", line 79, in get_dataloaders
    total_trainset = torchvision.datasets.CIFAR10(root=dataroot, train=True, download=True, transform=transform_train)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/cifar.py", line 64, in __init__
    self.download()
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/cifar.py", line 148, in download
    download_and_extract_archive(self.url, self.root, filename=self.filename, md5=self.tgz_md5)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/utils.py", line 248, in download_and_extract_archive
    download_url(url, download_root, filename, md5)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/utils.py", line 74, in download_url
    makedir_exist_ok(root)
  File "/home/davis/anaconda3/lib/python3.7/site-packages/torchvision/datasets/utils.py", line 50, in makedir_exist_ok
    os.makedirs(dirpath)
  File "/home/davis/anaconda3/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/davis/anaconda3/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/davis/anaconda3/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'
If anybody knows how to deal with this, please leave a comment here.
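The traceback shows torchvision trying to create a directory under '/data', and the path arrives through `args.dataroot`, so pointing that option at a writable directory should avoid the error. As a sketch, a small helper can pick a usable location before the dataset is constructed. The helper name and the fallback location are assumptions for illustration, not part of the repo.

```python
import os
import tempfile


def pick_writable_root(preferred, fallback=None):
    """Return `preferred` if it can be created and written to, else a fallback directory."""
    try:
        os.makedirs(preferred, exist_ok=True)
        if os.access(preferred, os.W_OK):
            return preferred
    except OSError:  # PermissionError is a subclass of OSError
        pass
    # Hypothetical default: a "datasets" folder inside the system temp directory.
    fallback = fallback or os.path.join(tempfile.gettempdir(), "datasets")
    os.makedirs(fallback, exist_ok=True)
    return fallback
```

The returned path can then be passed wherever the training script expects its data root.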
Any suggestion?
I note that the augment_list is not the same as the paper. Have you evaluated the components and proposed a more effective one?
Moreover, would it be better to include Cutout among the operators instead of applying it as a standalone step?
Why do you use an M scale of [0, 30] instead of [0, 10]? I think the original paper uses [0, 10]. Maybe this should be made clear in README.md?
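As a sketch of how the two scales relate under plain linear interpolation (the form used in this repo's `val = (float(self.m) / 30) * ...` line), the same relative strength can be written on either scale. The function name, the `max_level` parameter and the 0..30 example range are assumptions for illustration; whether a given config matches the paper after such rescaling also depends on each op's range.

```python
def magnitude_to_value(m, minval, maxval, max_level=30):
    """Linearly interpolate an op parameter from magnitude m in [0, max_level]."""
    return (float(m) / max_level) * float(maxval - minval) + minval


# The same relative strength (90% of the range) on two scales,
# for a hypothetical op whose parameter ranges from 0 to 30.
on_30_scale = magnitude_to_value(27, 0.0, 30.0, max_level=30)
on_10_scale = magnitude_to_value(9, 0.0, 30.0, max_level=10)
print(on_30_scale, on_10_scale)
```

Under this mapping, M=9 on a [0, 10] scale and M=27 on a [0, 30] scale select the same parameter value.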
In your configs for both CIFAR-10 and CIFAR-100 you use larger batch sizes than reported in the paper, while the rest of the hyperparameters are the same. I am rather memory-limited because I want to save extra statistics, so I was wondering whether you tried smaller batch sizes. If so, did it work? Did you choose that batch size simply because it fit on the GPU?
Best regards and Thank You,
Sam
Here is the augmentation function.
def rand_augmentation():
    aug = transforms.Compose([
        transforms.RandomResizedCrop(248, scale=(0.08, 1.0), interpolation=Image.BICUBIC),
        transforms.RandomHorizontalFlip(1),
        transforms.RandomVerticalFlip(1),
        transforms.RandomRotation(degrees=30),
        transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
        transforms.RandomPerspective(distortion_scale=0.1),
        transforms.RandomAffine(degrees=10),
        transforms.ToTensor(),
        transforms.RandomErasing(p=0.5),
        transforms.Normalize((0.5, ), (0.5, )),
    ])
    return aug.transforms.insert(0, RandAugment(4, 3))
Here is the data loader.
def load_data(df, batchsize=8):
    data = SiameseNetworkDataset(df, image_D='2D', transform=(0, rand_augmentation()))
    loader = DataLoader(data, shuffle=True, num_workers=0, batch_size=batchsize)
    return loader
Here is the relevant part of the dataset's __getitem__.
def __getitem__(self, index):
    if self.transform[0] == 2:
        img0 = self.transform[1](image=np.array(img0))['image']
        img1 = self.transform[1](image=np.array(img1))['image']
    else:
        img0 = self.transform[1](img0)
        img1 = self.transform[1](img1)
    return img0, img1, label
If I return only aug instead of aug.transforms.insert(0, RandAugment(4, 3)), there is no error.
Error
TypeError Traceback (most recent call last)
<timed exec> in <module>
D:\Datasets\Image dataset\Xray\SIAMESE-classifier\src\cross_vals.py in kfoldcv(model, data, epochs, n_splits, lr, batchsize, skip_tuning, aug)
70
71 #train on all train images
---> 72 model=train_dl(train_loader,epochs,model,"cuda",criterion,opt)
73 train_features,train_labels=get_features(train,model)
74 #now get embeddings of test data
D:\Datasets\Image dataset\Xray\SIAMESE-classifier\src\dl_training.py in train_dl(loader, epochs, model, device, criterion, opt)
120 model=model.to(device)
121 for _epoch in range(epochs):
--> 122 for batch in loader:
123 img1,img2,label=batch
124 img1_emb,img2_emb=model(img1.to(device)),model(img2.to(device))
C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
433 if self._sampler_iter is None:
434 self._reset()
--> 435 data = self._next_data()
436 self._num_yielded += 1
437 if self._dataset_kind == _DatasetKind.Iterable and \
C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
473 def _next_data(self):
474 index = self._next_index() # may raise StopIteration
--> 475 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
476 if self._pin_memory:
477 data = _utils.pin_memory.pin_memory(data)
C:\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]
C:\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in <listcomp>(.0)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]
D:\Datasets\Image dataset\Xray\SIAMESE-classifier\src\dataloader.py in __getitem__(self, index)
49 img1 = self.transform[1](image=np.array(img1))['image']
50 else:
---> 51 img0=self.transform[1](img0)
52 img1=self.transform[1](img1)
53
TypeError: 'NoneType' object is not callable
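A likely cause is that Python's `list.insert` mutates the list in place and returns None, so returning its result hands None to the dataset as the transform. The sketch below reproduces this with a stand-in `Compose` class that, like a compose-style pipeline, keeps its steps in a `.transforms` list; the class and the `double` step are stand-ins written for this example, not the torchvision or RandAugment objects themselves.

```python
class Compose:
    """Stand-in for a compose-style pipeline that stores its steps in `.transforms`."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x


def broken_pipeline(first_step):
    aug = Compose([lambda x: x + 1])
    return aug.transforms.insert(0, first_step)  # list.insert returns None


def fixed_pipeline(first_step):
    aug = Compose([lambda x: x + 1])
    aug.transforms.insert(0, first_step)  # mutate the step list in place...
    return aug                            # ...then return the pipeline itself


double = lambda x: x * 2  # stand-in for RandAugment(4, 3)

print(broken_pipeline(double))    # None, which is not callable
print(fixed_pipeline(double)(3))  # (3 * 2) + 1 = 7
```

Splitting the insert and the return into two statements is the only change needed in the function from the question.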
It seems like some of the PIL functions are not centered. This would likely affect results, because lower magnitudes would lead to higher distortions.
For example, here is a function that appears centered:
def Rotate(img, v):  # [-30, 30] NOTE: should this be [0, 30]?
    assert -30 <= v <= 30
    if random.random() > 0.5:
        v = -v
    return img.rotate(v)
versus the Color function, which is not centered:
def Color(img, v):  # [0.1,1.9]
    assert 0.1 <= v <= 1.9
    return PIL.ImageEnhance.Color(img).enhance(v)
Perhaps it should be this instead?
def Color(img, v):  # [0,0.9]
    assert 0 <= v <= 0.9
    if random.random() > 0.5:
        v = -v
    return PIL.ImageEnhance.Color(img).enhance(1+v)
Thanks for sharing this repo! It's very readable :)
Hi, thank you for the awesome repo.
I have a question about the order of augmentation.
In the original code, https://github.com/tensorflow/tpu/blob/c61a451165ba643ac9e0ae94448821ca71745ebf/models/official/efficientnet/preprocessing.py#L171 the preprocessing (e.g., random crop) is applied first, and then RandAug policy is applied.
In contrast, your code appears to use a different order (pytorch-randaugment/RandAugment/data.py, line 69 in 48b8f50).
Is there a reason behind this modification?
Thanks in advance.
I am trying to reproduce the output of
! python RandAugment/train.py -c confs/wresnet28x10_svhn_b256.yaml --save svhn_wres28x10.pth
The error is raised in metrics.py:
correct_k = correct[:k].view(-1).float().sum(0)
Error message: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces)
The same error arises when testing on CIFAR-10.
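This message is typically raised when `.view` is called on a non-contiguous tensor, and the usual replacement is `.reshape(-1)` (or `.contiguous().view(-1)`). Below is a sketch of a top-k accuracy helper written that way. It follows the common top-k pattern and is not copied from the repo's metrics.py; returning fractions rather than percentages is an assumption based on the result values quoted elsewhere on this page.

```python
import torch


def accuracy(output, target, topk=(1,)):
    """Return the top-k accuracies (as fractions) for logits `output` and labels `target`."""
    maxk = max(topk)
    batch_size = target.size(0)

    _, pred = output.topk(maxk, 1, True, True)  # (batch, maxk) class indices
    pred = pred.t()                             # (maxk, batch), non-contiguous
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    res = []
    for k in topk:
        # reshape copies when needed, so it also works on non-contiguous slices
        correct_k = correct[:k].reshape(-1).float().sum(0)
        res.append(correct_k.mul_(1.0 / batch_size))
    return res


logits = torch.tensor([
    [0.10, 0.90, 0.05, 0.04, 0.03, 0.02],
    [0.80, 0.10, 0.05, 0.02, 0.01, 0.00],
])
labels = torch.tensor([1, 2])
top1, top5 = accuracy(logits, labels, topk=(1, 5))
```

In this toy batch only the first sample is right at top-1, while both labels fall within the top five predictions.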
Hey @ildoonet. Thanks for your clarification on theconf. I've managed to run Wideresnet28_10 on CIFAR10 so far but our results don't match. I got:
"loss_train": 0.7089337987777514,
"loss_valid": 0.0,
"loss_test": 0.10971159720420838,
"top1_train": 0.7357572115384615,
"top1_valid": 0.0,
"top1_test": 0.9627,
"top5_train": 0.8983573717948717,
"top5_valid": 0.0,
"top5_test": 0.9992,
"epoch": 200
Python version: 3.6.9
Cuda version: 10.0
Pytorch version: 1.3.1
What could be the issue?
I ran wresnet28x10_cifar10_b256 with the YAML file you provide and got a top1_test of 0.9706. When I set aug to 'default' and retrained the model, I got 0.9694.
Did I do something wrong, given that RandAugment performs worse than you report and the baseline performs better?
Hello, I reproduced CIFAR-100 and got a test accuracy of 81.7%. I also saw in others' questions that you changed your code for the ImageNet dataset, but I can't find those changes. I would be very grateful if you could point me to them. I'm a beginner, thank you.