
Comments (8)

wslulciuc commented on August 27, 2024

/cc @Liorba

from marquez.

ashulmanWeWork commented on August 27, 2024

Before coding against the API definition suggested above, I wanted to better understand the intent of the issue.

The main way to get info about a job's current runs today is this endpoint:
namespaces/{namespace}/jobs/{job}/runs

If we want to list the active job runs for each job when we list all jobs, that might involve changes to how we represent the Job object in the API; namely, we'd need to add the job's current runs to its description. This is problematic because the service layer would then also need to know about the current job runs for every job, so the change would permeate all the layers.

If the goal here is to get all the job runs in a namespace, could we instead create an endpoint for that? For example, /namespaces/:namespace/job_runs. Alternatively, we could create a new endpoint, /jobs/runs/, that lists all job runs and takes an optional namespace parameter to limit the list to that namespace.


wslulciuc commented on August 27, 2024

It might help to highlight the question we are looking to answer:

Given job K, return a list of runs L

So, let's say I have the job my_job under the namespace my_namespace. The API call:

GET namespaces/my_namespace/jobs/my_job/runs

will return the list of runs for my_job. Cool. Now, maybe I'm only interested in failed run attempts; we can and should add the filter run_state to limit our results:

GET namespaces/my_namespace/jobs/my_job/runs?run_state=FAILED
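As a rough client-side sketch of the two calls above: the helper name job_runs_path is hypothetical, and run_state is the filter proposed in this comment, not something the API is confirmed to accept yet.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def job_runs_path(namespace: str, job: str, run_state: Optional[str] = None) -> str:
    """Build the runs path for one job, optionally limited to a run state.

    Hypothetical helper: only the path layout comes from this thread.
    """
    # Percent-encode each path segment so names with spaces or slashes stay intact.
    path = f"namespaces/{quote(namespace, safe='')}/jobs/{quote(job, safe='')}/runs"
    if run_state is not None:
        # run_state is the proposed filter; its accepted values are an assumption.
        path += "?" + urlencode({"run_state": run_state})
    return path


print(job_runs_path("my_namespace", "my_job"))
print(job_runs_path("my_namespace", "my_job", run_state="FAILED"))
```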

Note that the jobs/runs/* endpoints were introduced to simplify interactions with a single job run instance, not a global list.

But now let's say we wanted to view runs for multiple jobs, my_job and my_other_job. That would require following the steps outlined above for each job.
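To make the per-job cost concrete, here is a minimal sketch of what the caller ends up doing. The function runs_for_jobs and the in-memory fake_fetch are both hypothetical stand-ins; a real client would issue one GET namespaces/{namespace}/jobs/{job}/runs request per job.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical canned data standing in for real API responses.
FAKE_RUNS = {
    ("my_namespace", "my_job"): ["run-1", "run-2"],
    ("my_namespace", "my_other_job"): ["run-3"],
}


def fake_fetch(namespace: str, job: str) -> List[str]:
    """Stand-in for one GET namespaces/{namespace}/jobs/{job}/runs call."""
    return FAKE_RUNS.get((namespace, job), [])


def runs_for_jobs(
    namespace: str,
    jobs: Iterable[str],
    fetch_runs: Callable[[str, str], List[str]],
) -> Dict[str, List[str]]:
    """Collect runs for several jobs by making one call per job."""
    return {job: fetch_runs(namespace, job) for job in jobs}


print(runs_for_jobs("my_namespace", ["my_job", "my_other_job"], fake_fetch))
```

The number of requests grows linearly with the number of jobs, which is the overhead a namespace-wide listing would avoid.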

The issue does outline (possibly) returning a list of runs when retrieving a job:

  ...
  "runs": [
    "/jobs/runs/cfc4b5e6-c630-48d4-ad19-f2bd16c93a9d",
    "/jobs/runs/d33ef190-73bd-4a65-ab59-1bbd65364d0b",
    "/jobs/runs/5ced1097-8d59-46d8-933e-c9a688be8b8c",
    ...
  ]

My thinking here is that it's more of an optimization for the caller. Maybe we return the last N completed runs or something similar, but not a feature we'd need to support in release 0.2.0.


wslulciuc commented on August 27, 2024

continued: I think it would be fine to define the endpoint:

GET /jobs/runs

returning a list of run IDs. But to learn more about the job run, the caller would have to make another API call:

GET /jobs/runs/cfc4b5e6-c630-48d4-ad19-f2bd16c93a9d
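A sketch of that two-step flow, under loud assumptions: the response shapes (a plain list of IDs, then a dict per run) and the field names id and state are invented for illustration, and fake_get replaces a real HTTP client.

```python
from typing import Any, Callable, Dict, List

# Hypothetical canned responses; the field names "id" and "state" are assumptions.
RESPONSES: Dict[str, Any] = {
    "/jobs/runs": [
        "cfc4b5e6-c630-48d4-ad19-f2bd16c93a9d",
        "d33ef190-73bd-4a65-ab59-1bbd65364d0b",
    ],
    "/jobs/runs/cfc4b5e6-c630-48d4-ad19-f2bd16c93a9d": {
        "id": "cfc4b5e6-c630-48d4-ad19-f2bd16c93a9d",
        "state": "COMPLETED",
    },
    "/jobs/runs/d33ef190-73bd-4a65-ab59-1bbd65364d0b": {
        "id": "d33ef190-73bd-4a65-ab59-1bbd65364d0b",
        "state": "FAILED",
    },
}

calls: List[str] = []


def fake_get(path: str) -> Any:
    """Stand-in for an HTTP GET; records each requested path."""
    calls.append(path)
    return RESPONSES[path]


def fetch_run_details(get: Callable[[str], Any]) -> List[Dict[str, Any]]:
    """List run IDs first, then make one follow-up call per ID."""
    run_ids = get("/jobs/runs")
    return [get(f"/jobs/runs/{run_id}") for run_id in run_ids]


details = fetch_run_details(fake_get)
print(len(details), "runs fetched with", len(calls), "calls")
```

The caller pays one extra request per run, which is the trade-off of returning only IDs from the list endpoint.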


ashulmanWeWork commented on August 27, 2024

I like the idea of calling GET /jobs/runs, and then potentially filtering by namespace or by namespace and job_name. For example, to find all the job runs for namespace finance:
GET /api/v1/jobs/runs?namespace=finance
To find info about any runs for a given job, say quarterly_billings, which resides in the finance namespace, we would do this:
GET /api/v1/jobs/runs?namespace=finance&job_name=quarterly_billings
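For illustration, a small sketch of how a client might assemble these query strings so the separators always come out right. The helper runs_query is hypothetical, and namespace and job_name are the filters proposed here, not parameters the endpoint is known to support.

```python
from typing import Optional
from urllib.parse import urlencode


def runs_query(namespace: Optional[str] = None, job_name: Optional[str] = None) -> str:
    """Build a /jobs/runs URL path from the proposed optional filters."""
    filters = {"namespace": namespace, "job_name": job_name}
    # Drop filters that were not supplied so they do not appear in the query.
    params = {key: value for key, value in filters.items() if value is not None}
    base = "/api/v1/jobs/runs"
    # urlencode joins parameters with "&" and percent-encodes the values.
    return f"{base}?{urlencode(params)}" if params else base


print(runs_query(namespace="finance"))
print(runs_query(namespace="finance", job_name="quarterly_billings"))
```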

Does this sound like a sensible way to proceed?


wslulciuc commented on August 27, 2024

The endpoint /jobs/runs/{id} has a fundamental assumption:

The caller may or may not know the namespace and/or the job associated with {id}.

That is, a run ID would encode both the namespace and job version associated with the run instance, but this is very much an internal association maintained by Marquez.

I guess I'm not really sure how adding the filters namespace or job to /jobs/runs is any different than:

GET namespaces/{namespace}/jobs/{job}/runs

The call above would return a list of runs, allowing the caller to filter runs by job under a given namespace. What it wouldn't allow you to do is filter runs only by job name, but that's not a feature we have thought about supporting.

/cc @sshah-wework @hougs


sshah-wework commented on August 27, 2024

Chiming in here: I would rather see

GET namespaces/{namespace}/jobs/{job}/runs

than

/jobs/runs?namespace=...&job=

since the first is more canonical. The /jobs/runs endpoint was just meant to be convenient for fetching a single run by ID, and I'd rather not support more functionality from that endpoint.

One day, we may want the equivalent lookup via an id filter (e.g. GET namespaces/{namespace}/jobs/{job}/runs?id=...) for consistency.


wslulciuc commented on August 27, 2024

Being added in #633

