wilds's People

Contributors

b-akshay, bearnshaw, etiennedavid, henrikmarklund, keawang, kohpangwei, michiyasunaga, rlphilli, ssagawa, teetone, weihua916

wilds's Issues

Cannot fetch 'ogb-molpcba' dataset due to missing arg

dataset = get_dataset(dataset='ogb-molpcba', download=True, root_dir='../data/')

Results in the following error:

--------------------------------------------------------------------
TypeError                          Traceback (most recent call last)
<ipython-input-2-c369817b9157> in <module>
----> 1 dataset = get_dataset(dataset='ogb-molpcba', download=True, root_dir='../data/')

~/anaconda3/envs/benchmark/lib/python3.7/site-packages/wilds/get_dataset.py in get_dataset(dataset, version, **dataset_kwargs)
     51     elif dataset == 'ogb-molpcba':
     52         from wilds.datasets.ogbmolpcba_dataset import OGBPCBADataset
---> 53         return OGBPCBADataset(version=version, **dataset_kwargs)
     54 
     55     elif dataset == 'poverty':

~/anaconda3/envs/benchmark/lib/python3.7/site-packages/wilds/datasets/ogbmolpcba_dataset.py in __init__(self, version, root_dir, download, split_scheme)
     88             download_url('https://snap.stanford.edu/ogb/data/misc/ogbg_molpcba/scaffold_group.npy', os.path.join(self.ogb_dataset.root, 'raw'))
     89         self._metadata_array = torch.from_numpy(np.load(metadata_file_path)).reshape(-1,1).long()
---> 90         self._collate = PyGCollater(follow_batch=[])
     91 
     92         self._metric = Evaluator('ogbg-molpcba')

TypeError: __init__() missing 1 required positional argument: 'exclude_keys'

Versions:

wilds 1.1.0
torch_geometric 1.7.0
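For anyone hitting this before an upstream fix, a version-tolerant construction of the collater might look like the sketch below. This is untested and assumes the PyG 1.x import path that wilds 1.1.0 uses; the helper name is made up for illustration.

import inspect
from torch_geometric.data.dataloader import Collater as PyGCollater  # PyG 1.x path

def make_collater():
    # Pass exclude_keys only if this torch_geometric version requires it.
    params = inspect.signature(PyGCollater.__init__).parameters
    if 'exclude_keys' in params:
        return PyGCollater(follow_batch=[], exclude_keys=[])
    return PyGCollater(follow_batch=[])

collate_fn = make_collater()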

[Question] Easily accessible pre-trained models

Hi, is there any way to easily access pretrained models for quick evaluation?

For instance, something like the following:

| Algorithm | Model    | Parameters |
|-----------|----------|------------|
| ERM       | ResNet50 | Weights50  |
| ...       | ...      | ...        |

n_groups_per_batch does not work for Poverty

Hi WILDS Team,
I am currently working with the WILDS repository, specifically with the poverty dataset. I've run the script run_expt.py in the examples folder with the argument --n_groups_per_batch=3 (or a different number). However, per batch, I get samples from more than 3 different groups. Am I using this argument incorrectly? I understood --n_groups_per_batch to be the number of different environments from which samples appear in one batch.

The command line reads:
python examples/run_expt.py --dataset poverty --algorithm ERM --root_dir data --n_epochs=200 --seed=0 --log_every=200 --batch_size=64 --n_groups_per_batch=2 --progress_bar True

The output when I use the n_groups variable defined in IRM.py:
n groups: 13
groups: tensor([ 3, 5, 7, 9, 10, 11, 13, 14, 16, 19, 20, 21, 22], device='cuda:0')
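For reference, a minimal way to check how many distinct groups land in a single batch might look like the sketch below; the grouping field ('country') and the loader arguments are assumptions, not necessarily the exact setup used in the paper.

from wilds import get_dataset
from wilds.common.grouper import CombinatorialGrouper
from wilds.common.data_loaders import get_train_loader

# Assumes the poverty data is already downloaded under 'data'.
dataset = get_dataset(dataset='poverty', root_dir='data', download=False)
train_data = dataset.get_subset('train')
grouper = CombinatorialGrouper(dataset, ['country'])

loader = get_train_loader('group', train_data, batch_size=64,
                          grouper=grouper, n_groups_per_batch=2)
x, y, metadata = next(iter(loader))
print(grouper.metadata_to_group(metadata).unique())  # expect at most 2 distinct groups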

In addition, is the flag --uniform_over_groups valid for Poverty, given that the samples are not uniformly distributed over the different environments used in the training split?

Thanks in advance for your help.

Niels

pre-trained SwAV model weights for Camelyon17

Hi,

  1. Are pre-trained SwAV model weights for Camelyon17 publicly shared?
    I refer to this file used in the fine-tuning step: "--pretrained_model_path pretrained/checkpoints/ckp-55.pth"

  2. Also, may I know which commands were used for the SwAV-Camelyon17 results in Table 2. of the paper [1]? I can find three sets of commands (camelyon17_swav55_ermaugment_seed, camelyon17_swav55_ermaugment_val_seed, camelyon17_swav55_ermaugment_train_seed) at this link: https://worksheets.codalab.org/worksheets/0xb148346a5e4f4ce9b7cfc35c6dcedd63.

    I am not sure which ones were used as I get slightly different results when I calculate the results using the logs.

Thanks!

[1] Extending the WILDS benchmark for unsupervised adaptation

Figuring out the log files

I am trying to understand the log output.
After running a training command, for instance python examples/run_expt.py --dataset camelyon17 --algorithm ERM --root_dir data, I get a log folder with many files. What is the difference between test_algo.csv and test_eval.csv? I have seen that they are related to two loggers:

datasets[split]['eval_logger'] = BatchLogger(
            os.path.join(config.log_dir, f'{split}_eval.csv'), mode=mode, use_wandb=(config.use_wandb and verbose))
datasets[split]['algo_logger'] = BatchLogger(
            os.path.join(config.log_dir, f'{split}_algo.csv'), mode=mode, use_wandb=(config.use_wandb and verbose))

What is the difference between the algo and eval logs?

What are the random seeds used in the paper

Could you please share the random seeds used for the experiments in the paper? I think using the same set of random seeds for our own experiments would allow a fairer comparison of results.

Downloading FMoW dataset

Hello authors,
Thanks for releasing the code for your paper. I love your work.

The download gets stuck partway through when calling wilds.get_dataset to download the FMoW dataset.
Could you look into this?

Thank you in advance.

How do I access data from only one group?

Hello, Thanks for the fantastic library!

I have two questions:

  1. Is there any way I can get a per-group dataloader in wilds? This would help with, for instance, training a separate model for each group of data (see the sketch after this list).
  2. Can I change the split of data for each dataset? My application requires 50% of the data for each group/domain for testing.
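Regarding the first question, a rough sketch of one way to build a per-group loader on top of the metadata is below. Assumptions: camelyon17 with 'hospital' as the grouping field, and a plain torch Subset/DataLoader rather than the WILDS train loaders.

import torch
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Subset
from wilds import get_dataset
from wilds.common.grouper import CombinatorialGrouper

dataset = get_dataset(dataset='camelyon17', root_dir='data', download=False)
train_data = dataset.get_subset('train', transform=transforms.ToTensor())

# Map every training example to its group id, then keep only one group's indices.
grouper = CombinatorialGrouper(dataset, ['hospital'])
groups = grouper.metadata_to_group(train_data.metadata_array)
target_group = 0
idx = torch.nonzero(groups == target_group, as_tuple=True)[0].tolist()

loader = DataLoader(Subset(train_data, idx), batch_size=32, shuffle=True)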

Thanks!

Question about the creation of WILDS-FMoW subset

Hi,

In your paper, you have mentioned that you have used a subset of FMoW. However, in the rgb_metadata.csv file provided, you analyse the entire fmow dataset and I couldn't find where in the code you are creating the subset (sampling from the rgb_metadata.csv file). I have also looked at the parameter frac which was equal to 1.0 in the config file as well as the worksheet (https://worksheets.codalab.org/rest/bundles/0x20182ee424504e4a916fe88c91afd5a2/contents/blob/log.txt). Therefore, I would greatly appreciate it if you could kindly let me know how you created the subset.

Thank you.

Sara A. Al-Emadi

Waterbirds give 0 worst-group accuracy

Training on the Waterbirds dataset out-of-the-box gives 0 worst-group accuracy. Digging deeper, I noticed that all the predictions immediately collapse to the 0 label. Any advice would be helpful. Thanks in advance.

Unable to retrieve CodaLab experiment outputs

Hello,

I am trying to download the trained models using the link provided (CodaLab).

When clicking on any of the iWildCAM v2.0 (or any other dataset) experiment results in CodaLab, I get a page with the command line to run (to train the model myself) and a loading logo underneath it.

It seems like it is trying to load something, but I have had this page open for hours and it still isn't showing anything. When I click the 'download' button on the left, it leads me to an error page.
Is there a way I can get the results, like the best_model.pth file that the CodaLab page describes?

Thank you!

Data loader for PovertyMap is very slow

Hi -

Ran into a bit of an issue with data loading for the PovertyMap dataset: loading a single minibatch of 128 examples takes about 5-6 seconds. This is not a huge deal, but it is slow enough to make me curious whether there's a faster way of doing this.

Digging into the code a bit, it looks like the slowdown is mostly due to the array copy on line 239 of poverty_dataset.py

img = self.imgs[idx].copy()

FWIW it looks like this is a known issue for memory-mapped numpy arrays on Linux systems (https://stackoverflow.com/questions/42864320/numpy-memmap-performance-issues).

I'm not sure if there are any recommendations for getting around this, or if there's another way the data could be loaded in? Or let me know if I'm totally off-base here. Thanks!
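Not an official recommendation, but one workaround that might be worth trying is to read the memory-mapped array fully into RAM once, so that __getitem__ no longer pays the per-item memmap copy cost. The file name below is an assumption, and this of course requires enough memory for the whole array.

import numpy as np

imgs = np.load('data/poverty_v1.1/landsat_poverty_imgs.npy', mmap_mode='r')
imgs = np.asarray(imgs)  # one sequential read into RAM instead of many random memmap reads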

Label Description

Thanks for sharing the dataset. I could not find the label description in the code; could you point me to it?

Error loading the ogb-molpcba dataset

In ogbmolpcba_dataset.py, line 96 (and similarly line 98), a PyGCollater object is initialized without passing a required positional argument (dataset), which raises an error when calling the get_dataset function for the molecule dataset. I think this can be fixed by replacing the call with
self._collate = PyGCollater(self.ogb_dataset, follow_batch=[], exclude_keys=[])
Or is this how it is supposed to be and I am missing something?

Model loaded from a .pth predicts only zeros

Hello!

I downloaded your trained model for the Camelyon17 dataset from CodaLab (ERM, seed 0). I have installed all packages according to your README and load the model as follows:

path = "/best_model.pth"
state = torch.load(path)['algorithm']

state_dict = {}
 
for key in list(state.keys()):
    state_dict[key.replace('model.', '')] = state[key]

model.load_state_dict(state_dict)

model.eval()

I initialize the dataset I use for testing the model as follows:

import datasets_load  # from wilds package
dataset = datasets_load.Dataset('camelyon17', 32, '/data', 0.75, False)

For the prediction I used the following piece of code:

from wilds.common.data_loaders import get_eval_loader

test_data = dataset.test_set
test_loader = get_eval_loader('standard', test_data, batch_size=32)

with torch.no_grad():
    for x, y_true, metadata in test_loader:
          y_pred = model(x)
          labels = y_true
          _, predicted = torch.max(y_pred, 1)
          # print statements to check the output
          print("Labels: ", labels)
          print("Predicted: ", predicted)
          print("Correct: ", (predicted == labels).sum().item())

So far so good. When I run the code, the labels are printed (which are all 1 at the beginning, because shuffle=False), along with the predictions, which always consist of 0 values.

Labels:  tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
Predicted:  tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Correct:  0

I would appreciate any advice or assistance. Many thanks in advance.
Tim
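For what it's worth, a common cause of constant predictions with a bare state dict is that the network was never constructed with the architecture the checkpoint expects. A sketch of loading the weights into the default Camelyon17 architecture is below; the DenseNet-121 / 2-class assumption should be double-checked against the WILDS config.

import torch
import torchvision

model = torchvision.models.densenet121(num_classes=2)  # assumed architecture of the checkpoint

state = torch.load('/best_model.pth', map_location='cpu')['algorithm']
state_dict = {k.replace('model.', ''): v for k, v in state.items()}
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print('missing:', missing)        # sanity-check that the keys actually matched
print('unexpected:', unexpected)

model.eval()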

Dataset Split Size

I noticed that some of the datasets have been updated, e.g. iwildcam. Where can I find the latest information about the dataset split sizes?

Failure to download the ogb-molpcba dataset caused by the torch_geometric version

I ran python wilds/wilds/download_datasets.py --root_dir data --datasets ogb-molpcba and got the following error message:

Traceback (most recent call last):
  File "wilds/wilds/download_datasets.py", line 34, in <module>
    main()
  File "wilds/wilds/download_datasets.py", line 27, in main
    wilds.get_dataset(
  File "..../wilds/get_dataset.py", line 52, in get_dataset
    from wilds.datasets.ogbmolpcba_dataset import OGBPCBADataset
  File "..../wilds/datasets/ogbmolpcba_dataset.py", line 7, in <module>
    from torch_geometric.data.dataloader import Collater as PyGCollater
ModuleNotFoundError: No module named 'torch_geometric.data.dataloader'

I found that this is caused by torch_geometric renaming the module and moving the class.
I fixed it by changing `from torch_geometric.data.dataloader import Collater as PyGCollater` to `from torch_geometric.loader.dataloader import Collater as PyGCollater`, and then the data downloaded successfully.

I guess you could check the dependency and fix this. My torch_geometric version is 2.0.2.
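A version-tolerant import along these lines might avoid pinning to a single module path (a sketch, untested across versions):

try:
    # torch_geometric >= 2.0 moved the Collater here
    from torch_geometric.loader.dataloader import Collater as PyGCollater
except ModuleNotFoundError:
    # older torch_geometric versions
    from torch_geometric.data.dataloader import Collater as PyGCollater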

By the way, this benchmark is very useful. It would be nice to have a TensorFlow version; I am looking forward to it.

Could you provide the trained weights?

Hello,

I am training BERT+ERM on the Amazon dataset, but it is very time-consuming. Would it be possible to provide the best trained parameters to users? (Just as BERT provides pretrained weights, maybe you could add another folder under examples containing all the weights for users.) It would save users about a week (and the compute). Thank you!

Map for adding cross validation training and evaluation

Hello and thank you for this amazing package.

Instead of using replicates, I would be interested in adding a cross validation training and evaluation scheme based on the domain metadata.

Say a dataset has domains A, B, and C. I would like to:

  • train on 70% of the data sampled from A and B, and evaluate in-distribution on the remaining 30% from A and B and out-of-distribution on C.
  • train on 70% of the data sampled from B and C, and evaluate in-distribution on the remaining 30% from B and C and out-of-distribution on A.
  • train on 70% of the data sampled from C and A, and evaluate in-distribution on the remaining 30% from C and A and out-of-distribution on B.

Finally, average the in-distribution and out-of-distribution metrics to obtain the final performance.

Here the 70-30 split is arbitrary and should be modifiable.

I am just starting to explore the package, having only replicated the ERM result on the camelyon17 dataset.

It seems that the grouper object might be a good starting point for implementing the above procedure, but I am still lacking a high-level overview of the code. How would you do this?
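Not an official answer, but a rough sketch of how such a leave-one-domain-out rotation could be built on top of the metadata array is below. Assumptions: a single domain field such as camelyon17's 'hospital', and index tensors that you then wrap in your own subsets and loaders.

import torch
from wilds import get_dataset
from wilds.common.grouper import CombinatorialGrouper

dataset = get_dataset(dataset='camelyon17', root_dir='data', download=False)
grouper = CombinatorialGrouper(dataset, ['hospital'])
groups = grouper.metadata_to_group(dataset.metadata_array)

def make_fold(held_out_group, train_frac=0.7, seed=0):
    """Return (train_idx, id_eval_idx, ood_eval_idx) for one fold."""
    g = torch.Generator().manual_seed(seed)
    in_dist = torch.nonzero(groups != held_out_group, as_tuple=True)[0]
    ood_idx = torch.nonzero(groups == held_out_group, as_tuple=True)[0]
    perm = in_dist[torch.randperm(len(in_dist), generator=g)]
    n_train = int(train_frac * len(perm))
    return perm[:n_train], perm[n_train:], ood_idx

for held_out in groups.unique().tolist():
    train_idx, id_idx, ood_idx = make_fold(held_out)
    # wrap each index tensor with torch.utils.data.Subset(dataset, idx.tolist()),
    # train and evaluate, then average the ID and OOD metrics over the folds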

camelyon17 split scheme: in-dist

I am not able to run Camelyon17 with --split_scheme in-dist (I'm assuming this corresponds to the setting with ID val data).

Any pointers on how to run this, or in general how to run camelyon with the ID val data?

Thank you for the help!

Unable to Train ERM model with civilcomments

Hi,

I am having trouble running the code with the command
python3 wilds/examples/run_expt.py --dataset civilcomments --algorithm ERM --root_dir data --download
Everything is stuck: no error is reported, and neither the GPU nor the CPU is being used.

If I press Ctrl+C, it shows the traceback in the attached screenshot.

The same thing didn't happen when I tried to run the same script but with groupDRO.

Any pointers would be very helpful. Thank you for your amazing, well-developed code!

Understanding the prediction_dir format for leaderboard submission

I wonder if the log folder used during training is the prediction_dir described in Get Started: Evaluating trained models.

I tried to reproduce the ERM result on a subset of camelyon with the following command:

python examples/run_expt.py --dataset camelyon17 --algorithm ERM --root_dir data --frac 0.1 --log_dir log_erm_01

Training goes well.

But my file camelyon17_split:id_val_seed:0_epoch is empty.

Then I ran the following command:
python examples/evaluate.py log_erm_01 erm_01_output --root-dir data --dataset camelyon17

And I got this:

Traceback (most recent call last):
  File "examples/evaluate.py", line 282, in <module>
    main()
  File "examples/evaluate.py", line 244, in main
    evaluate_benchmark(
  File "examples/evaluate.py", line 136, in evaluate_benchmark
    predictions_file = get_prediction_file(
  File "examples/evaluate.py", line 89, in get_prediction_file
    raise FileNotFoundError(
FileNotFoundError: Could not find CSV or pth prediction file that starts with camelyon17_split:id_val_seed:0.

So my question is whether the log folder is the prediction_dir described in Get Started?

Support for faster model training

May I check if there's any plan to support: 1) multi-gpu parallel training; 2) fp16; 3) gradient accumulation. These functions would allow us to train models much faster and with larger batch size (especially for large models like BERT).

`assert` error in new wilds version with FMoW

Hello, I am using the new version of WILDS and getting the error:

... wilds/common/utils.py", line 86, in avg_over_groups
    assert v.numel()==g.numel()

Any ideas? It may be a bug on my end; if I track it down, I'll update here.

Calculation of OOD within the paper

Hello everyone,

First of all, I would like to thank you for making the paper and code for "WILDS: A Benchmark of in-the-Wild Distribution Shifts" publicly available. What caught my interest when reading the paper was the in-distribution (ID) and out-of-distribution (OOD) performance evaluated using empirical risk minimization (Table 1, page 20). My question is how the ID and OOD numbers were calculated. Did you use the softmax with temperature scaling from the paper "Enhancing the reliability of out-of-distribution image detection in neural networks"? If not, can you give a reference for the way you tackled this problem?

Thank you in advance for your kind reply.

Error with example fMOW command: incorrect value of "unlabeled_n_groups_per_batch"

Hello,
If I directly run this command suggested in the README:
python examples/run_expt.py --dataset fmow --algorithm DANN --unlabeled_split test_unlabeled --root_dir data

I get the following exception:

Traceback (most recent call last):
  File "/mnt/beegfs/bulk/mirror/jyf6/datasets/wilds/examples/run_expt.py", line 491, in <module>
    main()
  File "/mnt/beegfs/bulk/mirror/jyf6/datasets/wilds/examples/run_expt.py", line 454, in main
    train(
  File "/mnt/beegfs/bulk/mirror/jyf6/datasets/wilds/examples/train.py", line 114, in train
    run_epoch(algorithm, datasets['train'], general_logger, epoch, config, train=True, unlabeled_dataset=unlabeled_dataset)
  File "/mnt/beegfs/bulk/mirror/jyf6/datasets/wilds/examples/train.py", line 38, in run_epoch
    unlabeled_data_iterator = InfiniteDataIterator(unlabeled_dataset['loader'])
  File "/mnt/beegfs/bulk/mirror/jyf6/datasets/wilds/examples/utils.py", line 393, in __init__
    self.iter = iter(self.data_loader)
  File "/home/fs01/jyf6/miniconda3/envs/ponds/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 442, in __iter__
    return self._get_iterator()
  File "/home/fs01/jyf6/miniconda3/envs/ponds/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/home/fs01/jyf6/miniconda3/envs/ponds/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1085, in __init__
    self._reset(loader, first_iter=True)
  File "/home/fs01/jyf6/miniconda3/envs/ponds/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1118, in _reset
    self._try_put_index()
  File "/home/fs01/jyf6/miniconda3/envs/ponds/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1352, in _try_put_index
    index = self._next_index()
  File "/home/fs01/jyf6/miniconda3/envs/ponds/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 624, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "/mnt/beegfs/bulk/mirror/jyf6/datasets/wilds/wilds/common/data_loaders.py", line 131, in __iter__
    groups_for_batch = np.random.choice(
  File "mtrand.pyx", line 984, in numpy.random.mtrand.RandomState.choice
ValueError: Cannot take a larger sample than population when 'replace=False'

I think this occurs because there are only 2 unique years in the test_unlabeled split, but unlabeled_n_groups_per_batch is set to 8, so it tries to sample 8 years without replacement.

I was able to fix this by changing the argument unlabeled_n_groups_per_batch to 2, here: https://github.com/p-lambda/wilds/blob/main/examples/configs/datasets.py#L220
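Alternatively, if run_expt.py exposes that config key as a command-line flag (I have not verified this), overriding it directly might avoid editing the source, e.g.:

python examples/run_expt.py --dataset fmow --algorithm DANN --unlabeled_split test_unlabeled --root_dir data --unlabeled_n_groups_per_batch 2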

It would be great if this can be fixed. Thank you so much for releasing these wonderful datasets and baseline algorithms!

Oracle results for UDA tasks

Hi,

Could you please share the Oracle results, i.e. training on the labeled target domain, for the Camelyon17 and iWildCam datasets in "Extending the WILDS benchmark for unsupervised adaptation"? Oracle results for the other datasets would be appreciated too.

If Oracle results are not available, could you please share the commands that can be used to obtain them?

Thanks.

Issue in OOD data distribution when Grouper is set to "regions" for FMoW

Hi,

I am trying to change the groupby from "year" to "region". I have followed the instructions in the README and am currently using the following command:
python3 wilds/examples/run_expt.py --dataset fmow --algorithm ERM --groupby_fields region --root_dir wilds_fmow/

However, the issue is that the training dataset is not being separated into distinct regions in an ID vs. OOD manner; that is, all regions are included in ID as well as OOD. Here is a screenshot of the output:
[screenshot of the split summary]

Therefore, I was wondering whether this is a bug in the code or whether I am missing something.

Thanks
Sara A. Al-Emadi

Obtaining (full) model predictions for trained models

Hello and thank you for all your work on this important project!

I am wondering if it's possible to share the full predictions of the trained models on the val/test data.

If I understand correctly, this is similar to the output files
{dataset}_split:{split}_seed:{seed}_epoch:{epoch}.csv (e.g. this file for Rxrx1), but where the information in the csv is not only the argmax class prediction, but the entire logits vectors (e.g. in this case a vector in R^1139).

I think this would be useful as it will allow people (myself included) to evaluate trained models using a variety of custom metrics, but without actually downloading the data and doing the evaluation (which could be prohibitive for some of the larger datasets).

Thanks!
Gal

fmow and Pandas 2.0.0 datetime conversion

I'm getting an error when initializing the "fmow" dataset; the conversion of the timestamp column to datetime with Pandas fails with:

ValueError: time data "2011-02-07T02:48:56.643Z" doesn't match format "%Y-%m-%dT%H:%M:%S%z", at position 92. You might want to try:
- passing format if your strings have a consistent format;
- passing format='ISO8601' if your strings are all ISO8601 but not necessarily in exactly the same format;
- passing format='mixed', and the format will be inferred for each element individually. You might want to use dayfirst alongside this.

I noticed I was using Pandas 2.0.0 (presumably the most recent version) and when I reverted to Pandas 1.5.3, the issue seemed to go away. I'm guessing the datetime formatting was changed in version 2 and it might be good to update WILDS to still work with the new version. Thanks!
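For anyone hitting this before a fix lands, the error message itself points at a possible patch: pass an explicit format to pd.to_datetime where fmow_dataset.py parses the timestamps. The column name below is an assumption, and the snippet is untested.

import pandas as pd

metadata = pd.read_csv('data/fmow_v1.1/rgb_metadata.csv')
# Pandas 2.x is stricter about mixed ISO 8601 strings; 'ISO8601' (or 'mixed') relaxes this.
metadata['timestamp'] = pd.to_datetime(metadata['timestamp'], format='ISO8601')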

run_expt.py: --device argument doesn't set the device

Hey, I'm running Wilds on a p2.8xlarge AWS EC2 instance with 8 K80 GPUs. I noticed that when I try to run run_expt.py and use the --device argument to divide the jobs I'm trying to run between the GPUs, they all end up running on GPU 0. I verified this by the memory usage in nvidia-smi as well as printing the device used by torch using torch.cuda.current_device(). My guess is that the CUDA_VISIBLE_DEVICES environment variable, set here, is set too late and PyTorch just defaults to device 0.

I've worked around this by setting the CUDA_VISIBLE_DEVICES variable manually, before running the script. I just thought I'd let you know I encountered this issue.
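For example (device index chosen arbitrarily):

CUDA_VISIBLE_DEVICES=3 python examples/run_expt.py --dataset camelyon17 --algorithm ERM --root_dir data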

Really appreciate the project by the way! Being able to access multiple datasets for domain generalization with the same interface is really useful, and I managed to use run_expt pretty easily to run my own experiments.

Installing via pip seems to miss `torch_scatter` dependency

Hey,

I noticed that the installation via pip install wilds seems to miss the torch_scatter dependency that is also listed in the README. When e.g. trying to do from wilds.datasets.amazon_dataset import AmazonDataset I got

from wilds.datasets.amazon_dataset import AmazonDataset
  File "/Users/deul/Desktop/wilds/wilds/datasets/amazon_dataset.py", line 6, in <module>
    from wilds.common.utils import map_to_id_array
  File "/Users/deul/Desktop/wilds/wilds/common/utils.py", line 1, in <module>
    import torch, torch_scatter
ModuleNotFoundError: No module named 'torch_scatter'

As far as I can see, the solution should be as easy as adding torch_scatter>=2.0.5 to the install_requires attribute in setup.py. In my case, the error was resolved after installing torch_scatter separately.

ModuleNotFoundError: No module named 'transformers'

Hello, in several of your files in examples (e.g. optimizer.py and transforms.py), you import functions from transformers, but the module is not provided in the current version. Could you please upload the module file? Thanks!

releasing smaller subsets of the datasets

Hi, thanks for releasing the benchmark and the datasets. I was wondering if it would be possible to release smaller subsets of the datasets (e.g. similar in size to CIFAR, MNIST, etc.) in order to allow for rapid prototyping? As it stands, it currently takes more than 2 days just to download one dataset, which can also occupy the majority of disk space. This alone could deter people from trying out and exploring the datasets.

Also, it would be nice if you could list, on the info page here on GitHub, the download size of each dataset and its actual size on disk.

Thanks!

algorithm.eval() vs. algorithm.model.eval()

Hi,

Really nice job with this repo! I had a small comment on the use of algorithm.eval() vs. algorithm.model.eval() in the wilds/examples/train.py file that might be useful to others.

I wasn't able to find this in the code, but how does algorithm.eval() differ from algorithm.model.eval()?

I ask because algorithm.model.eval() preserves the grad_fn attribute on the model output, while algorithm.eval() does not. This was unexpected behavior since pytorch's .eval() function doesn't do this. This is important for my use case, since I'm trying to evaluate the gradients when the model is in eval mode. If this does not break behavior elsewhere, I'd suggest switching to algorithm.model.eval().

Happy to explain more if this was confusing!

Poverty Map: Unable to map the image_id to its corresponding wealth_pooled and country domain from the dhs_meta.csv file.

Hi, in the train_mixup(train_loader, epoch, agg) function, for the i-th sample of a batch I got image id = 5863 and domain = 13 with wealth_pooled = -0.8209. Upon looking into the dhs_meta.csv file, I found a matching wealth_pooled value and country domain, but the corresponding image_id is not 5863. Could you please help me map the image_id to its corresponding country and wealth_pooled value? Thanks.

The Waterbirds dataset's link is invalid

Hi,

The Waterbirds dataset with UUID '0x505056d5cdea4e4eaa0e242cbfe2daa4' on CodaLab is currently invalid, returning Error 404 (it cannot be downloaded manually from the link or the page). It would be greatly appreciated if you could kindly fix it.

Thank you very much!

Replicating CivilComments results with standard deviation using the Group DRO (label) algorithm

Dear Team,

I am trying to replicate the leaderboard results for the CivilComments dataset using Group DRO with the label grouping (i.e. group by = 'Y').
The test average accuracy is listed as 90.2 (0.3) and the validation average accuracy as 90.4 (0.4). I do not understand how these values were obtained: for each seed, is the maximum average accuracy over the 5 epochs used, or only the average accuracy from the last (5th) epoch?

I tried taking the average over all 5 seeds using the 5th-epoch average accuracy, but I did not get 90.2 for the test average accuracy. (I used the average value from the test_eval.csv of each of the 5 seeds published in the notebook.)

Could you please help me with how to replicate the results including the standard deviation?
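In case it helps, a minimal sketch of the mean/std computation over seeds is below; the log-directory layout, the 'acc_avg' column name, and the choice of which epoch/row to read are all assumptions (the official protocol may select the epoch differently, e.g. by validation performance).

import pandas as pd

accs = []
for seed in range(5):
    # assumed layout: one log directory per seed, each containing a test_eval.csv
    df = pd.read_csv(f'logs/civilcomments_seed{seed}/test_eval.csv')
    accs.append(df['acc_avg'].iloc[-1])  # placeholder choice: last logged row

s = pd.Series(accs)
print(f'{100 * s.mean():.1f} ({100 * s.std():.1f})')  # prints something like 90.2 (0.3)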

"Corrupt" image in dataset iWildCam version 2.0

Hi and thanks for sharing the code.

I found an issue when training on dataset iWildCam version 2.0. The problem doesn't exist in iWildCam v1.0.

I think that there is at least 1 corrupt image in iWildCam v2.0. When I train on this dataset, the training breaks because it cannot open the image.

The corrupt image is: /iwildcam_v2.0/train/8ad9843e-21bc-11ea-a13a-137349068a90.jpg
I also tried to open this image with Image Viewer on Ubuntu and it didn't work.

Other people on the internet encountered the problem as well:
https://www.kaggle.com/c/iwildcam-2020-fgvc7/discussion/134923

I'm not exactly sure what the best solution is here, but maybe temporarily making "v1.0" the default again would reduce the number of people who stumble over this.

Thanks,
George
