Code Monkey home page Code Monkey logo

Comments (8)

mateiz avatar mateiz commented on May 19, 2024

I agree, we should add a way to get it directly by name in the AbstractStore class. This sounds like it will be a pretty common use case. We can also document that we expect experiments to have unique names (I'm not sure we're actually enforcing this yet).

from mlflow.

cryptexis avatar cryptexis commented on May 19, 2024

@mateiz thanks for the feedback. Currently, it throws an exception if one tries to create experiment with an existing name. Also, what would be the best way to tackle this issue? This kind of enhancement requires changes in several parts of the project. Should I commit the whole change at once or in step-by-step manner ?

from mlflow.

mateiz avatar mateiz commented on May 19, 2024

I think doing it in smaller patches is fine, as you did by starting with the protobuf. It would also be good to see a quick API sketch if possible -- what do we expect the Python API to be, as well as the command line interface (where you can currently pass an experiment ID, how would you pass a name).

from mlflow.

mparkhe avatar mparkhe commented on May 19, 2024

+1 to request of accessing experiment by name.

My recommendation however is that it should be a different API GetExperimentByName for instance. The way it is implemented in the proto -- means that both id and name arguments will be necessary inputs. Also, interface logic would need to ascertain that only 1 of them is provided.

You can share the response message of course like suggested here

Also as @mateiz mentioned, this would give us an opportunity to unify the store interfaces as well. As you must have seen FileStore already supports that API. Would be good to add this to AbstractStore as well. #68, seems to be creating an experiment. It would be clearer to continue with get_experiment_by_name() semantics.

I have left some comments in #122 as well.

from mlflow.

cryptexis avatar cryptexis commented on May 19, 2024

@mparkhe thanks for the suggestion. Definitely a way to go. What do you think, do we need correct the naming for the existing functionality e.g. GetExperiment -> GetExperimentById ?
@mateiz I have found couple of more touch points where experiment_id is used.

  1. As part of mlflow.tracking API: mlflow.tracking.start_run -> if we are getting the Experiment Object by calling GetExperimentByName we can also obtain experiment_id from it. e.g.
exp_obj = mlflow.tracking.GetExperimentByName("name")
with mlflow.start_run(experiment_id=exp_obj.experiment_id)

I don't think, that in this particular scenario we need to make changes in start_run function.
2) As part of mlflow project API: CLI interface allows to pass experiment_id as an argument, I think we should allow experiment_name also there. I see two possibilities there:
a) add experiment_name as parameter to every single function on the way and add parameter handling chunk where we check which parameter was actually passed and which one held to its default value.
b) get experiment object -> extract_id -> pass experiment_id to projects.run (cli.py line 97) and rely on previous logic
3) CLI experiments -> actually here I am not sure. At this moment it supports only FileStore backend. On the other hand it supports only two API calls: create and list, which are implemented also in RestStore class. Maybe there is a special reason for that which I don't see. Also, is there any intention to add more commands there such as GetExperimentById and GetExperimentByName. Could you please clarify a bit here ?

from mlflow.

mparkhe avatar mparkhe commented on May 19, 2024

One of the concerns about changing interfaces is that it affects a whole bunch of users. I would not recommending changing GetExperiment APIs name or what it does. We expect people to know their experiments by # more than names, because of the most the APIs use experiment_id (more on that in a bit). My input on your thoughts above

  1. I think that usage makes sense. Maybe users can add a check for None returned, but using the same Experiment object means will help.

  2. My concern with adding name as an additional argument to all CLI/python APIs is that each of them need to implement some boiler plate handling code: precedence checks between id vs name arguments, what if they dont match, what if they are both null ... etc? The other solution of possibly duplicating all APIs to handle id version (what we have currently) and new set for name version. The concerns I have with that is a lot of replicated code, and if we add a new unique way of identifying experiments like GetExperimentByArtifactLocation that would be another flavor.

It is a legit ask to be able to query experiments by id and name (for now). Those apis definitely could be added. For everything else, my recommendation is that we stay with the flow that does

  • get_experiment (by id) OR get_experiment_by_name
  • null and other checks on returned object
  • use experiment_id from returned object for all APIs.
  1. As above, it definitely makes sense to have a first class query level API to "get experiment by name". We should add that to AbstractStore interface and then add those all store implementations and then expose that out using both Python and REST API.

from mlflow.

cryptexis avatar cryptexis commented on May 19, 2024

Can't disagree, all makes sense! Thanks for the input :)

from mlflow.

cryptexis avatar cryptexis commented on May 19, 2024

@mparkhe @mateiz The following snippet helps to access experiment object either by name or by id.

    from mlflow.tracking import get_service
    mlflow.set_tracking_uri("http://localhost:5000")
    service = get_service()
    exp_id = service.get_experiment_by_id(ID).experiment_id
    with mlflow.start_run(experiment_id=exp_id):

or

    exp_id = service.get_experiment_by_name(NAME).experiment_id
    with mlflow.start_run(experiment_id=exp_id):

I tested it on tutorial scripts. Please let me know if anything is missing or should be corrected.
Also, in case everything is fine, shall I add the info about API calls to the documentation too ?

from mlflow.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.