qinenergy / cotta
[CVPR 2022] Official CoTTA Code for our paper Continual Test-Time Domain Adaptation
License: Other
Thanks for your work, which inspired me a lot. This work is closely related to continual learning / lifelong learning. Have you compared with other continual learning methods?
Hi, thanks for your nice work.
I find inference with the semantic segmentation code a little slow: less than 1 task/s on a single V100.
Is this normal, or is there anything I can do to speed it up?
Thanks :-)
Hello, I have another inquiry: is it possible to run the semantic segmentation code with the DistributedDataParallel option?
Hi, thanks for releasing the code.
When I reproduce the Table 6 ImageNet-C results in your paper (severity level 5), the source error is 82.4 and BN adapt is 72.1, the same as in your paper.
However, my Tent result is 69.2 and the CoTTA result is about 71.6, which is inconsistent with your paper, where Tent is 66.5 and CoTTA is 63.
Could you explain why the result is different?
Thanks
I get this error when I run the code. Could you please send us the link to this package? I cannot find it on the internet.
The error is "_pickle.UnpicklingError: invalid load key, '<'." It occurs when checkpoint = torch.load(model_path, map_location=torch.device('cpu')) at line 127 in utils.py is executed. The reported size of Standard.pt is 2.2K. Is this file complete?
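For what it's worth, an invalid load key of '<' usually means the download saved an HTML page (e.g. a Google Drive warning page) instead of the checkpoint, which would also explain the 2.2K file size. A minimal diagnostic sketch (the helper name is my own; torch.save output is a zip archive starting with the bytes b'PK'):

```python
def looks_like_html(path):
    """Return True if the file starts like an HTML page rather than a
    torch checkpoint (torch.save writes a zip archive, magic b'PK')."""
    with open(path, "rb") as f:
        head = f.read(16).lstrip()
    return head.startswith(b"<")
```

If this returns True for Standard.pt, re-downloading the checkpoint (e.g. via the browser link) should fix the UnpicklingError.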
Hi, author! Thanks again for sharing your code. I've fixed the problems I had reproducing the CIFAR-10C results by switching the model back to the standard one. Now I have some questions about your proposed restoration design.
We know that "Stochastic Restoration" in CoTTA is designed to avoid catastrophic forgetting. What bothers me is this: in continual learning, avoiding catastrophic forgetting matters when you want a model to solve all the tasks/domains seen previously. But when all you need is high performance on the current task/domain (i.e., online performance), what is the reason for using restoration to preserve previous knowledge? I know it achieves better results, and as the paper mentions, it helps the model recover after encountering hard examples. I'd still appreciate any answer to this puzzle (it is also totally okay not to :) ).
Also, is there any advice on setting the restoration factor (why does CoTTA use 0.01)? :)
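Not the authors, but mechanically the restoration step is just a per-weight Bernoulli reset toward the pretrained source weights. A plain-Python sketch (the function name is mine; the real code operates on torch tensors with a random mask):

```python
import random

def stochastic_restore(current_w, source_w, p=0.01, rng=random):
    """With probability p, reset each weight to its pretrained source value;
    otherwise keep the adapted value. p is the restoration factor."""
    return [s if rng.random() < p else c for c, s in zip(current_w, source_w)]
```

With p = 0.01, roughly 1% of the weights snap back to the source model at every step, which bounds how far the model can drift over a long test stream; p = 0 recovers plain continual adaptation, while p = 1 freezes the model at the source weights.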
I've noticed that you used segformer.b5.1024x1024.city.160k.pth as the pre-trained model in your segmentation experiments. How was this model obtained? I want to pretrain a model for my own setting, so I would appreciate it if you could provide the code for training this pre-trained model. Thanks!
I suspect there is an error in your code/experiment: acdc-submission\mmseg\apis\test.py line 106

```python
with torch.no_grad():
    result, probs, preds = ema_model(return_loss=False, **data)
    _, probs_, _ = anchor_model(return_loss=False, **data)
```

It seems the model's own prediction should be used here instead of anchor_model's.
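For context, my reading of the paper is that this is intentional: the frozen anchor (source) model's confidence decides whether to use augmentation-averaged predictions, while the EMA model supplies the predictions themselves. A toy sketch of that gating (the function is my own; the 0.74 default is the Cityscapes threshold discussed elsewhere in these issues):

```python
def gated_prediction(ema_pred, aug_avg_pred, anchor_confidence, p_th=0.74):
    """Use the augmentation-averaged prediction only when the frozen source
    (anchor) model is unconfident; otherwise trust the plain EMA prediction."""
    return aug_avg_pred if anchor_confidence < p_th else ema_pred
```

Under this reading, probs_ from anchor_model is only used to compute the confidence gate, not as the final prediction.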
I would like to know: does step t in your paper correspond to self.steps in the code?
It is simply set to 1 in the config file, which doesn't seem adaptive.
Hi! Thanks for sharing your code. This work is interesting! I have some questions about the updating of the teacher model. It would be great if you could answer them.
The original Mean Teacher is trained with both labeled and unlabeled data. The consistency loss then measures the difference between the outputs of the student and the teacher model. Its author was not confident that Mean Teacher could be trained without any labeled data. CuriousAI/mean-teacher#24
In your code, the student model is updated with the softmax entropy loss and the teacher model accumulates the historical weights of the student model. I wonder how you avoid performance degradation of the teacher model when there is no other guidance on the student model.
I have tried implementing Mean Teacher without labeled training data for the student model. The results show that the teacher model's performance dropped when it was updated with the exponential moving average at every step. Have you ever encountered this problem?
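For reference, the teacher update in question is the standard EMA of student weights; a plain-Python sketch (the real code updates torch parameters in place):

```python
def ema_update(teacher_w, student_w, alpha=0.999):
    """One EMA step per weight: teacher <- alpha*teacher + (1-alpha)*student."""
    return [alpha * t + (1 - alpha) * s for t, s in zip(teacher_w, student_w)]
```

With alpha close to 1 the teacher moves very slowly, so a brief degradation of the student only leaks into the teacher gradually; combined with the augmentation-averaged pseudo-labels, this is presumably what keeps the teacher stable without labels, but that is my interpretation, not the authors' statement.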
Hi, authors, thanks for sharing the code. I found a minor typo in the paper (https://arxiv.org/pdf/2203.13591.pdf). In the caption of Table 2, "Tesults are evaluated" — "Tesults" should be "Results"?
Hi, for some reason I can't use requests to download the base model from Google Drive, but I can download it by opening the Google Drive link in Chrome: "https://docs.google.com/uc?export=download&id=1t98aEuzeTL8P7Kpd5DIrCoCL21BNZUhC". The file in the link is named "natural.pt.tar"; how do I change it to "Standard.pt"? Can I simply rename it? Hope to get your help.
Hello author, when I run several of the CIFAR main programs, I get the error SystemExit: 1. Have you ever encountered this situation?
Tent uses the entropy of the predicted class probabilities as its loss for classification. How is this loss calculated in the segmentation task?
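For segmentation the entropy loss generalizes naturally: compute the softmax entropy per pixel and average over all pixels. A plain-Python sketch of the idea (the real code operates on B x C x H x W logit tensors; function names are mine):

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    """Shannon entropy of one probability distribution (nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def seg_entropy_loss(logit_map):
    """Mean per-pixel softmax entropy over an H x W map of class logits."""
    ents = [entropy(softmax(px)) for row in logit_map for px in row]
    return sum(ents) / len(ents)
```

A confidently classified pixel contributes near-zero entropy, while a pixel with a uniform prediction over C classes contributes log C, so minimizing this loss sharpens the per-pixel predictions just as in classification.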
Hi Qin, I encounter an error (pip failed) while creating the conda environment for the "Segformer" segmentation code. Is it possible to re-export your environment from a local workstation, i.e. locally rather than on a cluster?
Hi,
I have a query regarding the batch size.
In the paper you have written, "We update the model for one step at each iteration (i.e. one gradient step per test point).". This seems to suggest that the BATCH_SIZE = 1.
But in the config file, it is BATCH_SIZE = 200. [See here: link]
So is the actual batch size 200 (and not 1) for the experiments in your paper?
I have tried setting the batch size to 1, but the results deteriorate drastically.
A clarification would be helpful.
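Not the authors, but one plausible reason for the drastic drop at batch size 1 is the batch-normalization statistics: during test-time adaptation they are re-estimated from the test batch itself, and a single image yields a very noisy estimate. A pure-Python illustration (the sizes are made up; a real feature map has many more channels and pixels):

```python
import random

def channel_stats(batch_size, h=8, w=8, seed=0):
    """Per-channel mean/variance a BN layer would estimate from one test
    batch of standard-normal activations (pure-Python sketch)."""
    rng = random.Random(seed)
    n = batch_size * h * w          # values available per channel
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var
```

With BATCH_SIZE = 200 each channel's statistics here come from 200*8*8 = 12800 values, versus only 64 for batch size 1, so the normalization itself becomes unreliable at batch size 1 even before any gradient step is taken.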
Thank you.
requests.exceptions.SSLError: HTTPSConnectionPool(host='docs.google.com', port=443): Max retries exceeded with url: /uc?export=download&confirm=t&id=1t98aEuzeTL8P7Kpd5DIrCoCL21BNZUhC (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))
Dear author, thank you for releasing the code. However, when I try to reproduce the segmentation code, I can't open the URL you provided. I hope you can resolve my doubts. Thank you again!
Hi, can you provide the code for this part?
Hi,
Thanks for patiently answering all the questions in previous issues. The paper is interesting, congratulations! I had a few questions and it would be great if you could help me understand this better :
Dear author:
Thanks for your code for semantic segmentation adaptation from Cityscapes -> ACDC. However, I cannot find the code for BN Stats Adapt there. Could you tell me where this part of the code is? Thanks
Hello, will you also upload scripts for Cityscapes-to-ACDC experiments? Thanks a lot!
This link cannot be accessed now, can you fix it? @qinenergy
Hi, thanks for your nice work!
I downloaded your segmentation code acdc-submission, which you kindly provided in an earlier issue.
I successfully installed mmcv for CUDA 11.1 + torch 1.8.0 using those commands.
However, the mmcv version I ended up with is MMCV==1.7.1.
As a result, I got an error like the one below:
AssertionError: MMCV==1.7.1 is used but incompatible. Please install mmcv>=[1, 1, 4], <=[1, 3, 0].
I searched here for an mmcv version satisfying mmcv>=[1, 1, 4], <=[1, 3, 0], but such a low version of mmcv is not served for CUDA 11.1.
So how can I deal with this problem?
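That assertion comes from a version gate like the one sketched below (a simplified illustration of the kind of check mmseg performs at import time; names are mine). Since it only accepts mmcv in [1.1.4, 1.3.0], the practical fix is to install a pinned old mmcv-full build matching your exact CUDA/torch combination rather than the latest release:

```python
def digit_version(version_str):
    """'1.3.0' -> [1, 3, 0], so versions compare lexicographically."""
    return [int(x) for x in version_str.split(".")[:3]]

MMCV_MIN, MMCV_MAX = "1.1.4", "1.3.0"

def mmcv_is_compatible(installed):
    """True if the installed mmcv version falls inside the accepted range."""
    return digit_version(MMCV_MIN) <= digit_version(installed) <= digit_version(MMCV_MAX)
```

mmcv_is_compatible('1.7.1') is False, hence the AssertionError; a version such as 1.2.7 or 1.3.0 would pass the check, provided a prebuilt wheel exists for your CUDA/torch versions.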
Hey, why do I get the error "KeyError: Caught KeyError in DataLoader worker process" (KeyError: 0) when I run cotta.sh? My dataset is Cityscapes and I set the directories as you did. The error occurs in our.py and api\test.py.
Please tell me how to inspect the keys in the dataloader.
Dear author, thanks for sharing! I am currently reading your paper and have a question about the student model's initialization. Specifically, how is the student model initialized in your experiments? (I can't fully understand the code and have difficulty finding this part TT)
If you have time, I would greatly appreciate it if you could provide some clarification on this point.
Thank you for your time and consideration^^
Awesome work, but a small question:
> Therefore, we use a threshold $p_{th}$ to filter the images, and do not apply augmentations on those with high confidence. More specifically, we design $p_{th} = conf^S - \delta$, where $conf^S$ is the 5% quantile for the softmax predictions' confidence on the source images from the source model $f_\theta$.
In your code, 0.74 is the 5% quantile for Cityscapes. So how is this value calculated?
Kind regards.
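Not the authors, but per the quoted passage the 0.74 should simply be an empirical statistic: run the source model on source (Cityscapes) images, collect each prediction's max-softmax confidence, and take the 5% quantile. A plain-Python sketch (the helpers are mine; real code would use torch.quantile on the collected confidences):

```python
def quantile(values, q):
    """Nearest-rank q-quantile of a list of numbers (0 <= q <= 1)."""
    xs = sorted(values)
    return xs[int(q * (len(xs) - 1))]

def confidence_threshold(source_confidences, q=0.05):
    """conf^S: only 5% of source predictions fall below this confidence."""
    return quantile(source_confidences, q)
```

Under this reading, 0.74 would mean that on clean Cityscapes, 95% of the source model's predictions have a max-softmax confidence above 0.74.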