Comments (18)

grondo commented on August 11, 2024

Let's make sure we all have the same big picture. Here's the set of building blocks I have in mind, though I may not have all the information so this is just a starting point for a discussion:

  • jobinfo db (flux-core): stores data for inactive jobs so they can be purged from memory. Enables out-of-band sql queries against completed job information, etc.
  • utilization reports (external): should be able to query jobinfo db directly
  • bank/accounting db (flux-accounting): stores users, banks, accounts, and "associations", uses jobinfo db to do necessary updates of current user/bank usage.
  • priority plugin (flux-core): a plugin in the job-manager used to adjust or supplement the primary job priority. A plugin may be a worker or set of workers, similar to the implementation of the job-ingest validator.
  • multi-factor priority plugin (flux-accounting): a job-manager priority plugin/script which calculates a multi-factor priority for jobs including fairshare priority

The flow of data for jobs might look like:

  1. inactive jobs are sucked into the jobinfo db, optionally purged from memory
  2. utilization reports are generated directly from this database when required
  3. accounting information is generated/derived from the jobinfo db and fed into the accounting/fairshare db on a periodic interval
  4. accounting/fairshare db is used to fetch or push fair-tree factor into multi-factor priority plugin/script
  5. priority plugin in job-manager runs multi-factor priority calculation on each job, possibly using worker script, like validator
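The flow above can be sketched in code. This is a rough illustration of step 3, deriving per-user/bank usage from the jobinfo db on a periodic interval; the SQLite `jobs` schema here (`userid`, `bank`, `t_run`, `t_inactive`, `nnodes`) is a hypothetical stand-in, not an actual flux-core schema:

```python
# Hypothetical sketch: aggregate node-seconds of usage per (user, bank)
# from a jobinfo SQLite database, to be fed into the accounting db.
# The "jobs" table layout here is an assumption for illustration only.
import sqlite3

def aggregate_usage(jobinfo_path):
    """Return {(userid, bank): node_seconds} for completed jobs."""
    conn = sqlite3.connect(jobinfo_path)
    cur = conn.execute(
        "SELECT userid, bank, t_run, t_inactive, nnodes FROM jobs "
        "WHERE t_inactive > 0"  # completed (inactive) jobs only
    )
    usage = {}
    for userid, bank, t_run, t_inactive, nnodes in cur:
        usage.setdefault((userid, bank), 0.0)
        usage[(userid, bank)] += (t_inactive - t_run) * nnodes
    conn.close()
    return usage
```

A periodic task in flux-accounting could run a query like this and add the deltas to each association's historical usage.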

This design can have 3 work streams going in parallel:

  • accounting/fairshare db and multi-factor priority script (flux-accounting)
  • jobinfo db (flux-core) -- we'll need this anyway for system instance
  • job-manager priority plugin (flux-core)

Each of these can go in parallel once the interfaces have been agreed upon. Interfaces include:

  • job-manager priority plugin: How does a script-based priority worker get jobspec, userid, t_submit, etc.? (JSON on stdin?)
  • jobinfo db: gather requirements from flux-accounting for query interface
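To make the first interface question concrete, here is one possible shape for a script-based priority worker, assuming a JSON-object-per-line protocol on stdin/stdout modeled loosely on the job-ingest validator. The field names (`id`, `t_submit`, `urgency`) and the age-based formula are illustrative assumptions, not a settled interface:

```python
# Hypothetical priority worker: reads one JSON job description per line
# on stdin, writes {"id": ..., "priority": ...} per line on stdout.
import json
import sys
import time

def compute_priority(job):
    """Toy calculation: urgency-scaled base plus a job-age bonus."""
    age = max(0.0, time.time() - job["t_submit"])
    return int(job.get("urgency", 16) * 1000 + age)

def main():
    for line in sys.stdin:
        if not line.strip():
            continue
        job = json.loads(line)
        resp = {"id": job["id"], "priority": compute_priority(job)}
        sys.stdout.write(json.dumps(resp) + "\n")
        sys.stdout.flush()  # respond promptly, one job at a time

# To run as a worker process, call main() here.
```

The job-manager side would spawn this as a subprocess and stream jobs through it, the same way the validator handles jobspec.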

from flux-accounting.

grondo commented on August 11, 2024

FYI - as a comparison, here is the list of factors Slurm uses in its multi-factor plugin:

https://slurm.schedmd.com/priority_multifactor.html#mfjppintro

Edit: note especially that fairshare is just one factor in a multi-factor priority calculation
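As a sketch of that point: in a Slurm-style calculation, each factor is normalized to the range [0.0, 1.0] and scaled by a configurable weight, with fairshare contributing just one term to the sum. The weights below are made-up values for illustration:

```python
# Hypothetical factor weights -- in Slurm these would correspond to the
# PriorityWeight* configuration parameters.
WEIGHTS = {"age": 1000, "fairshare": 10000, "jobsize": 500, "qos": 2000}

def multifactor_priority(factors, weights=WEIGHTS):
    """factors: factor name -> normalized value, clamped to [0, 1]."""
    return int(sum(weights[name] * min(max(value, 0.0), 1.0)
                   for name, value in factors.items()))
```

With a large fairshare weight, fairshare dominates but never fully overrides the other factors.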


grondo commented on August 11, 2024

> Can this be its own sub-project when some of the data sources that it requires would come from flux-core? For example, queue time?

As part of job-manager priority plugin development we would design an interface that would allow all known information to be shared, e.g. t_submit (queue time), primary priority, etc.


cmoussa1 commented on August 11, 2024

Here's a summary about what we talked about. If I missed anything/incorrectly summarized something, feel free to correct me:

Instead of defining partitions in their own tables (where limits would be defined in a second location, since they are also defined in a cluster_association_table), @SteVwonder had a good idea: we could instead provide a label that users specify when submitting jobs in order to associate the job with the maximum amount of resources it can utilize. Example: a debug label would limit a user to 30 minutes and to only half the nodes available on a cluster.
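A minimal sketch of that label idea, assuming each label maps to a set of limits checked when a job is submitted; the label name and limit fields below are hypothetical:

```python
# Hypothetical label table: each label caps job duration and the
# fraction of the cluster's nodes a single job may use.
LABELS = {
    "debug": {"max_minutes": 30, "max_node_fraction": 0.5},
}

def check_label(label, duration_minutes, nnodes, cluster_nodes):
    """Return True if the job fits within the label's limits."""
    limits = LABELS[label]
    if duration_minutes > limits["max_minutes"]:
        return False
    if nnodes > cluster_nodes * limits["max_node_fraction"]:
        return False
    return True
```

The nice property is that limits live in one place: the label definition, rather than being duplicated in a partition table and an association table.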

It's necessary to analyze where our gaps are in terms of tracking factors for a multi-factor job priority. I plan on doing this over the next couple of days, eventually posting a table containing all of the factors and where we would include them in our software architecture. This would help us narrow down the large scope that is user/job priority 😅.


SteVwonder commented on August 11, 2024

> Originally, I was under the impression that fairshare values were calculated by passing in a user id, fetching its association id from the accounting database, and performing a Level Fairshare calculation based on the user's association information and current jobs in the queue. Essentially, I had thought that fairshare calculations would be constantly querying information from the accounting database in order to generate a priority value.

FWIW, I think it is totally reasonable to start with this implementation as a proof-of-concept. Once you have a working version of this, you could then refactor for performance to cache certain historical values in memory, etc.
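A proof-of-concept version of that query-then-compute loop could use the Fair Tree style "level fairshare" value LF = S / U, where S is the association's shares normalized against its siblings and U is its usage normalized the same way. The association dicts below stand in for rows fetched from the accounting database:

```python
# Sketch of a level fairshare calculation. Association records are
# hypothetical stand-ins for accounting-database rows.
def level_fairshare(assoc, siblings):
    """LF = (shares / sibling shares) / (usage / sibling usage)."""
    total_shares = sum(a["shares"] for a in siblings)
    total_usage = sum(a["usage"] for a in siblings)
    s = assoc["shares"] / total_shares
    if total_usage == 0 or assoc["usage"] == 0:
        return float("inf")  # no recorded usage: highest possible value
    u = assoc["usage"] / total_usage
    return s / u
```

An LF above 1.0 means the association has used less than its share; sorting siblings by LF at each level of the hierarchy yields a fairshare ordering, which can later be cached rather than recomputed on every query.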


grondo commented on August 11, 2024

> If we decide to go with a unified database within flux-core, do we expect the user and account tables can be tracked there? Seems a bit monolithic...

No I think the flux-core job-info db could be used to store job accounting information, then the flux-accounting project would house the user/account hierarchy, and would query the job accounting db to update user banks, calculate historical usage to get fair-share priority, etc


dongahn commented on August 11, 2024

@grondo: Thank you for starting up the big picture architecture discussion! We definitely need this to push the discussion forward. I have a few questions to make sure we are on the same page.

  1. We still haven't decided whether the multi-factor priority plugin will sort jobs at the job-manager level or the external scheduler (e.g., flux-sched) level. While my preference is to do this at the job-manager level, we have to ensure this will not lead to an "ALLOC" thrashing problem. Let me open up a ticket and reason about whether the "ALLOC" thrashing will be a real issue or not.

  2. It is not immediately clear to me whether flux-accounting can provide a multi-factor priority plugin in its entirety. It will only have a subset of the data needed for the multi-factor priority calculation. I notice you mentioned "a job-manager priority plugin/script". So perhaps flux-accounting can provide a Python command that outputs some factors needed for the multi-factor priority plugin, and the plugin itself will be implemented at the level decided from the further discussion in point 1 above?

BTW, I love your idea of framing this as parallel work streams. We really need that in order to be effective on this item.


grondo commented on August 11, 2024

> While my preference is to do this at the job-manager level, we have to ensure this will not lead to an "ALLOC" thrashing problem. Let me open up a ticket and reason about whether the "ALLOC" thrashing will be a real issue or not.

Yeah, you are right. My thought is that we need to get started somewhere, and this choice has the benefit of dividing up the work even further, which may have a big benefit.

Another benefit is that this approach would allow a user to insert a custom priority plugin at runtime for a non-system flux instance. I'm not sure what exactly you could do with that, but it seems like it would be a nice feature.

> So perhaps flux-accounting can provide a Python command that outputs some factors needed for the multi-factor priority plugin, and the plugin itself will be implemented at the level decided from the further discussion in point 1 above?

That might be a good approach, though I think eventually maybe the advanced multi-factor priority plugin could either be its own sub-project or just included with flux-accounting...


chu11 commented on August 11, 2024

> jobinfo db (flux-core) -- we'll need this anyway for system instance

Had a side discussion with @grondo. In the past it was assumed that there would be two job history databases, a "core" one and a "sched" one, mostly so that we could work in parallel and not have development hindered on either path. Then we could "merge together" if necessary.

@grondo's feeling is that in order to save time, we should nix that, upping the "job-info" job history DB to a higher priority.


dongahn commented on August 11, 2024

> That might be a good approach, though I think eventually maybe the advanced multi-factor priority plugin could either be its own sub-project or just included with flux-accounting...

Can this be its own sub-project when some of the data sources that it requires would come from flux-core? For example, queue time?


dongahn commented on August 11, 2024

> Then we could "merge together" if necessary.

If we decide to go with a unified database within flux-core, do we expect the user and account tables can be tracked there? Seems a bit monolithic...


dongahn commented on August 11, 2024

> Yeah, you are right. My thought is that we need to get started somewhere, and this choice has the benefit of dividing up the work even further, which may have a big benefit.

> Another benefit is that this approach would allow a user to insert a custom priority plugin at runtime for a non-system flux instance. I'm not sure what exactly you could do with that, but it seems like it would be a nice feature.

Like I said, I certainly do hope that our reasoning on the ALLOC thrashing problem can lead us to this architecture.


chu11 commented on August 11, 2024

> No I think the flux-core job-info db could be used to store job accounting information, then the flux-accounting project would house the user/account hierarchy, and would query the job accounting db to update user banks, calculate historical usage to get fair-share priority, etc

Agreed. The job-info module's database effectively stores job history for its own purposes. Anyone else that wants to read from it can do so at their own discretion.

But of course if the internal database changes, any scripts / fair share calculations, etc. would have to adjust. This is the risk of having just 1 job history db.


cmoussa1 commented on August 11, 2024

> But of course if the internal database changes, any scripts / fair share calculations, etc. would have to adjust. This is the risk of having just 1 job history db.

This is a good point. But as long as the core information needed for fair share calculation remains attainable, even if the interface to get the data changes, I think it should be okay.


dongahn commented on August 11, 2024

> But of course if the internal database changes, any scripts / fair share calculations, etc. would have to adjust. This is the risk of having just 1 job history db.

> This is a good point. But as long as the core information needed for fair share calculation remains attainable, even if the interface to get the data changes, I think it should be okay.

Does this call for an RFC for job history database schema, then?


chu11 commented on August 11, 2024

> Does this call for an RFC for job history database schema, then?

Maybe ... after the coffee time talk a few questions came up. I'm putting together a discussion in flux-core.


dongahn commented on August 11, 2024

Sorry I couldn't join. Stuck creating a writeup.


cmoussa1 commented on August 11, 2024

I think we have pretty much settled on the design/implementation for calculating fairshare values now (a combination of using the weighted tree library introduced in #65 and fetching and calculating job usage values from the job-archive DB from #79), so I can close this issue. Don't mind re-opening if others feel otherwise.

