Comments (2)
EDITED
It turns out the error still exists, so threading is not the issue. Still, at this point there is no harm in:
- getting the git commit hash before the run
- joining multiple decorators into one
In a0a3780, we use subprocess to talk to the system. But it appears that a threading issue might cause the run to be executed twice? Since ray's background mechanism is not clear, it's probably better to collect those tags before the trial runs and then pass them into the experiment runs.
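A rough sketch of that idea, not the code in the repository: the tags are collected once in the driver, and a single decorator applies them inside each trial, so no subprocess call happens during a trial. The names `collect_tags` and `with_tags` are hypothetical, and the `set_tags` argument is assumed to be `mlflow.set_tags` in the real workflow; it is injected here so the sketch runs without mlflow.

```python
import subprocess
from functools import wraps
from typing import Callable, Dict, List


def collect_tags(commands: Dict[str, List[str]]) -> Dict[str, str]:
    """Run each command once and map its tag name to the stripped output."""
    return {
        name: subprocess.check_output(argv).decode("ascii").strip()
        for name, argv in commands.items()
    }


def with_tags(tags: Dict[str, str], set_tags: Callable[[Dict[str, str]], None]):
    """Single decorator that applies precomputed tags, then calls the function.

    `set_tags` is injected by the caller; in the real workflow it would
    presumably be `mlflow.set_tags` (assumption, not taken from the repo).
    """

    def decorator(func: Callable):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Pass a copy of the tags; no subprocess runs inside the trial.
            set_tags(dict(tags))
            return func(*args, **kwargs)

        return wrapper

    return decorator
```

In the driver this would look something like `tags = collect_tags({"git.commit": ["git", "rev-parse", "--short", "HEAD"]})` before the trials start, followed by `@with_tags(tags, mlflow.set_tags)` on the experiment function. Whether this changes the duplicate-run behavior is untested.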
Affected Parts
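import os
import subprocess
from functools import wraps
from typing import Callable

import mlflow


def mlflow_auto_tags(func: Callable):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Shell commands whose output becomes the value of each tag.
        tags = {
            "docker.image.id": "hostname",
            "git.commit": f"git --git-dir {os.getenv('GITDIR')} "
            f"rev-parse --short HEAD",
        }
        # Run every command in a subprocess on each call and tag the active run.
        mlflow.set_tags(
            {
                k: subprocess.check_output(v.split()).decode("ascii").strip()
                for k, v in tags.items()
            }
        )
        func(*args, **kwargs)

    return wrapper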
Error Logs
Traceback (most recent call last):
  File "/home/mbl/mbl/workflow/grid_search.py", line 89, in wrapper
    func(*args, **kwargs)
  File "/home/mbl/mbl/workflow/grid_search.py", line 121, in experiment
    mlflow.log_params(config)
  File "/usr/local/lib/python3.9/site-packages/mlflow/tracking/fluent.py", line 639, in log_params
    MlflowClient().log_batch(run_id=run_id, metrics=[], params=params_arr, tags=[])
  File "/usr/local/lib/python3.9/site-packages/mlflow/tracking/client.py", line 918, in log_batch
    self._tracking_client.log_batch(run_id, metrics, params, tags)
  File "/usr/local/lib/python3.9/site-packages/mlflow/tracking/_tracking_service/client.py", line 292, in log_batch
    self.store.log_batch(run_id=run_id, metrics=metrics, params=params, tags=tags)
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/tracking/rest_store.py", line 309, in log_batch
    self._call_endpoint(LogBatch, req_body)
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/tracking/rest_store.py", line 56, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/usr/local/lib/python3.9/site-packages/mlflow/utils/rest_utils.py", line 256, in call_endpoint
    response = verify_rest_response(response, endpoint)
  File "/usr/local/lib/python3.9/site-packages/mlflow/utils/rest_utils.py", line 185, in verify_rest_response
    raise RestException(json.loads(response.text))
mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Param with key='n' was already logged with value='20' for run ID='d404ec5e9ad341feaade870dfca5af34'. Attempted logging new value '8'.
from mbl.
After hours of trial and error and a lot of panic, it's finally resolved with ray==1.12.1. The reason is not clear; we may have to report this behavior to the ray team.