Comments (6)
Thanks for the suggestion! We didn't have a lot of prior experience with SSL so we chose to match the defaults of the original SimCLR/MoCo papers. Do you know of any papers that demonstrate that no weight decay works better? I'm surprised Google/FAIR didn't find this during their hyperparameter tuning.
from torchgeo.
I don't think it is mentioned in the SimCLR paper but it is in the code here: https://github.com/google-research/simclr/blob/383d4143fd8cf7879ae10f1046a9baeb753ff438/tf2/model.py#L40-L42
BYOL does the same: https://github.com/google-deepmind/deepmind-research/blob/f5de0ede8430809180254ee957abf36ed62579ef/byol/byol_experiment.py#L191-L195
But I just noticed that you are not using the LARS optimizer, and in SimCLR they only did this for LARS. For the other optimizers they didn't use weight decay in the optimizer at all, but I am not sure whether they benchmarked their code with these settings.
from torchgeo.
Yeah, PyTorch doesn't have a LARS optimizer. Let me do some digging and figure out where I found these weight decay values.
from torchgeo.
Okay, finally had time to look into this.
SimCLR
"I don't think it is mentioned in the SimCLR paper"
Weight decay is mentioned in:
"For the other optimizers they didn't use weight decay at all"
You are correct that weight decay is not used in the optimizer, although it is used in the loss function.
MoCo
Weight decay is mentioned in:
It isn't mentioned in MoCo v2, although the code for v2 is largely the same as v1. The value of weight decay for v3 is not mentioned in the paper, just that it was used.
In the code base, weight decay is used with SGD in v1/v2, and with both LARS and AdamW in v3.
from torchgeo.
If you want to submit a PR that removes weight decay from our SimCLR optimizer and adds it to our loss function, I would be happy to accept it. I'm a little afraid to remove it entirely though.
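For reference, here is a minimal sketch of what "weight decay in the loss instead of the optimizer" could look like in PyTorch. The `l2_penalty` helper, the toy `nn.Linear` model, and the learning rate and decay values are all hypothetical and only for illustration; this is not torchgeo's or SimCLR's actual code. It assumes a penalty of the form 0.5 * weight_decay * sum(w^2) over all trainable parameters, with no exclusions for biases or normalization layers.

```python
import torch
from torch import nn


def l2_penalty(model: nn.Module, weight_decay: float) -> torch.Tensor:
    """Hypothetical helper: 0.5 * weight_decay * sum of squared parameters."""
    squared = sum(p.pow(2).sum() for p in model.parameters() if p.requires_grad)
    return 0.5 * weight_decay * squared


torch.manual_seed(0)
model = nn.Linear(4, 2)  # toy stand-in for the real backbone

# Weight decay is switched off in the optimizer...
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=0.0)

# ...and added to the loss as an explicit L2 term instead.
x, y = torch.randn(8, 4), torch.randn(8, 2)
loss = nn.functional.mse_loss(model(x), y) + l2_penalty(model, weight_decay=1e-4)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With `torch.optim.SGD`, this gives the same update as passing `weight_decay` to the optimizer, because both add `weight_decay * w` to the gradient. That equivalence should not be expected with AdamW, which decouples the decay from the gradient, or with LARS-style optimizers.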
from torchgeo.
I think this issue can be closed. If users want to reproduce the original MoCo/SimCLR papers, they can use our current defaults. If they want to try to improve performance, they can use weight_decay=0.
from torchgeo.