Comments (4)
Thanks for the issue @CedricLeon and for the link to wandb/wandb#6952 (I hadn't seen that). Unfortunately I don't have any solution for this at the moment.
It is really frustrating to be honest, especially if runs fail/get killed by a scheduler (in that case, no config is ever uploaded).
One thing I tried was to set the config via the wandb API during the run (with this script, which takes the hyperparameters from the Lightning config and sends them to wandb), but they seem to get overwritten at the next sync.
from wandb-offline-sync-hook.
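For anyone who wants to try the same approach, here is a minimal sketch of setting config values on an existing run through the wandb public API. It is not the script linked above: `push_config` is a hypothetical helper, the run path is a placeholder, and `api` is assumed to be a `wandb.Api()` instance (passed in so the helper can be exercised without a network connection). As noted in the comment, values written this way may be overwritten at the next sync.

```python
from typing import Any, Mapping


def push_config(api: Any, run_path: str, hparams: Mapping[str, Any]) -> None:
    """Merge `hparams` into the config of an existing run and save it.

    `api` is assumed to behave like `wandb.Api()`; `run_path` has the
    form "entity/project/run_id".
    """
    run = api.run(run_path)           # fetch the run from the server
    run.config.update(dict(hparams))  # the public-API config is a plain dict
    run.update()                      # persist the modified config


# Usage (assumes wandb is installed and you are logged in):
#   import wandb
#   push_config(wandb.Api(), "entity/project/run_id", {"lr": 1e-3})
```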
Obviously, this is more of a bug in wandb, though wandb-osh could include a fix/workaround if only we had one (but I don't know of any). Let me know if you discover something!
from wandb-offline-sync-hook.
Yes, I tried a trick similar to yours, but it seems to get overwritten at the next synchronization.
OK, thanks for your insights. We will see if they ever close wandb/wandb#6952...
So far, my "fix" is to run, from time to time, a script I wrote that scans my wandb project for runs that lack a specific key (coming from my Hydra config) on wandb. For each such run, I find the corresponding log folder in my workspace, do some manual interpolation on the config (because OS variables have changed), and then overwrite the wandb config and update the run.
If you are interested, I can try to clean up and publish this script somewhere.
from wandb-offline-sync-hook.
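The periodic backfill described in this comment could look roughly like the sketch below. This is not the commenter's script, which was not published in this thread. `runs_missing_key`, `backfill`, and `load_local_config` are hypothetical names, and locating the log folder and re-interpolating the Hydra config are left to the caller-supplied `load_local_config` function. The run objects are assumed to behave like those returned by `wandb.Api().runs(...)`, i.e. to expose `id`, a dict-like `config`, and `update()`.

```python
from typing import Any, Callable, Iterable, List, Mapping, Optional


def runs_missing_key(runs: Iterable[Any], key: str) -> List[Any]:
    """Return the runs whose uploaded config does not contain `key`."""
    return [run for run in runs if key not in run.config]


def backfill(
    runs: Iterable[Any],
    key: str,
    load_local_config: Callable[[str], Optional[Mapping[str, Any]]],
) -> int:
    """Upload the locally stored config for every run that lacks `key`.

    `load_local_config` (hypothetical) maps a run id to the config found
    in the local log folder, or None if no folder can be found.
    Returns the number of runs that were updated.
    """
    updated = 0
    for run in runs_missing_key(runs, key):
        local_cfg = load_local_config(run.id)
        if local_cfg is None:
            continue  # no local logs for this run, skip it
        run.config.update(dict(local_cfg))
        run.update()  # persist the modified config
        updated += 1
    return updated


# Usage (assumes wandb is installed and you are logged in):
#   import wandb
#   backfill(wandb.Api().runs("entity/project"), "some_key", my_loader)
```

Passing the runs and the loader in as arguments keeps the scanning logic independent of wandb and of the workspace layout, so it can be tried on stand-in objects before touching real runs.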
I described the full problem in a wandb issue: #7227
from wandb-offline-sync-hook.