
Comments (19)

jeremyeder commented on August 27, 2024

@timothysc @smarterclayton @derekwaynecarr

Thanks for filing this. The idea for cells is to:

  • Retain the ability for developers to ship code quickly and easily
  • Decouple shipping code from the need to understand the infrastructure at its lowest levels.
    • Assuming this code has special requirements, such as:
      • Which geography it should run in
      • Which mix of hardware/software is required (containers don't fix everything!)
      • What SLA the application has
      • Any other business need/logic

Also I noted that you titled this "node classes". In a way, cells are directly analogous to storage classes that have already been merged.

from node-feature-discovery.

derekwaynecarr commented on August 27, 2024

A StorageClass is primarily for dynamic provisioning. The PVC itself is not mutated other than a reference to the StorageClass, and that reference is optional, and we still have issues that arise when the StorageClass changes or is removed while the PVC remains.

This appears more to modify the incoming pod with default labels/taints/tolerations/resource requests/etc which may or may not have its own problems, and the scope of things that you would want to modify may be quite large.


fabiand commented on August 27, 2024

Initially I was worried that exposing low-level features like CPU flags in annotations might quickly clutter those.
But by grouping these features in Cells (or maybe NodeFeatures?) this concern goes away.

But what about using the reverse fqdn notation - which makes it look nicer in a sorted fashion:

io.kubernetes.kernel.psap/flags: svm
io.kubernetes.net1.psap/name: data
io.kubernetes.net2.psap/name: control
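
A minimal sketch of the sorting argument, assuming the three keys proposed above sit next to two ordinary node labels (`kubernetes.io/hostname` and `beta.kubernetes.io/arch` are included only for contrast). The helper names are hypothetical and this is not NFD code; it only shows that keys sharing a reverse-FQDN prefix end up adjacent in a sorted listing.

```python
# Label keys as they might appear on a node, in arbitrary order.
LABEL_KEYS = [
    "kubernetes.io/hostname",
    "io.kubernetes.net2.psap/name",
    "beta.kubernetes.io/arch",
    "io.kubernetes.kernel.psap/flags",
    "io.kubernetes.net1.psap/name",
]


def sorted_keys(keys):
    """Return the label keys in plain lexicographic order."""
    return sorted(keys)


def is_grouped(keys, prefix):
    """Return True if all keys starting with prefix are adjacent once sorted."""
    ordered = sorted_keys(keys)
    idx = [i for i, key in enumerate(ordered) if key.startswith(prefix)]
    return bool(idx) and idx == list(range(idx[0], idx[-1] + 1))


if __name__ == "__main__":
    for key in sorted_keys(LABEL_KEYS):
        print(key)
```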


ConnorDoyle commented on August 27, 2024

@fabiand thanks for the feedback. See also #27 where we're bikeshedding the prefix for all labels published by NFD.


davidopp commented on August 27, 2024

Is there some way we can get the same benefits without having to introduce a new API object (Cell)?

You could put annotations directly on the daemonset pod, which could read them through the downward API (or even better, configure it using ConfigMap).

On the requesting pod, you could apply an annotation that is read by an admission controller (configured via the same ConfigMap as the daemonset pod uses) that adds the corresponding node selectors.

This isn't very different than what you have proposed, but the Cell just becomes something defined via a ConfigMap rather than as its own API object.
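
A rough sketch of that flow, assuming the ConfigMap data has already been parsed into a dict mapping cell names to node selectors. The annotation key `example.com/cell`, the cell name `virt`, and the function name are all hypothetical; pods are modeled as plain dicts rather than real API objects, and no actual admission controller interface is used.

```python
import copy

# Hypothetical annotation a pod would use to request a cell.
CELL_ANNOTATION = "example.com/cell"

# Hypothetical cell definitions, as they might be loaded from a ConfigMap.
CELLS = {
    "virt": {"example.com/virtualization": "true"},
}


def add_cell_node_selectors(pod, cells):
    """Return a copy of pod with the node selectors of its requested cell added.

    Pods without the annotation are returned unchanged.
    """
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    cell_name = annotations.get(CELL_ANNOTATION)
    if cell_name is None:
        return pod
    if cell_name not in cells:
        raise ValueError(f"unknown cell: {cell_name}")
    mutated = copy.deepcopy(pod)
    selector = mutated.setdefault("spec", {}).setdefault("nodeSelector", {})
    selector.update(cells[cell_name])
    return mutated
```

The input pod is left untouched, so the same function could be used to preview what a mutation would do.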


fabiand commented on August 27, 2024

@davidopp to me that sounds a little too generic. To me features look special enough (they will be valuable for any application having some kind of hardware dependence) that they should be exposed in a well-defined place.

I am not sure whether this was already discussed elsewhere, but why don't we expose the features of a host in the Node status? In that case we could reuse the existing object.

What actually speaks against this, from my POV, is that this information should probably be gathered outside of the core node, and might be an add-on. In that case we don't want add-ons to update the Node status, do we?

Besides that: I would not expose/list all the features in an entity directly, at least not low-level features like svm/vmx or whatever. I'd rather think that the features are either grouped (like in cells) and this group is referenced, or that some kind of controller derives more high-level features from a set of primitives (e.g. virtualization from svm or vmx), and this high-level feature can then be used for scheduling.
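
To make the second option concrete, here is a small sketch of deriving a high-level feature from primitives. The rule table and function name are hypothetical; the only rule taken from this thread is that virtualization follows from either svm or vmx.

```python
# Hypothetical rule table: a high-level feature is present when ANY of the
# listed low-level CPU flags is present.
DERIVED_FEATURES = {
    "virtualization": {"svm", "vmx"},
}


def derive_features(cpu_flags, rules=DERIVED_FEATURES):
    """Return the set of high-level features implied by the given CPU flags."""
    flags = set(cpu_flags)
    return {name for name, any_of in rules.items() if flags & any_of}
```

A scheduler-facing label would then only need to mention `virtualization`, not the individual flags.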


balajismaniam commented on August 27, 2024

@davidopp, I think ConfigMap might work. @ConnorDoyle, @jeremyeder and I were discussing using third party resources to define a Cell and associating the Cell with a node using labels. That way, we could select a node based on the label (or Cell). WDYT?


jeremyeder commented on August 27, 2024

I agree w/TPR primarily because it is a zero-friction method of prototyping the long term solution, which I'd hope is a new API object. There's nothing in the API that cleanly lets us do this now, and retains a simple UX for users.

We also want Cells tied to other features in the product such as admission control and RBAC. While we can do the first generation of this with TPR, I would at least like us to have a clear goal of promotion to a real API object that people can rally their design/architectures/security stance around.

As far as @davidopp's point about adding node selectors... I think we want users to be comfortably numb about individual nodes, and target "any node that can successfully run my magic-app", which is abstracted cleanly with a Cell object.


davidopp commented on August 27, 2024

I'm still not understanding why you need a new API object (or a TPR). Any kind of configuration that you can put in an API object you can put in a ConfigMap.


timothysc commented on August 27, 2024

> I'm still not understanding why you need a new API object (or a TPR).

Usability and sane RBAC rules.

> Any kind of configuration that you can put in an API object you can put in a ConfigMap.

Users already have a tough time; I think that would obfuscate it even more.


davidopp commented on August 27, 2024

Ah I see, you want to limit who can make changes to the Cell information, which I guess wouldn't be possible with a ConfigMap? (Though it seems that's a pretty big deficiency if we can't control who can modify ConfigMap, since we're moving towards doing all Kubernetes component config via ConfigMap)


ConnorDoyle commented on August 27, 2024

I've been meaning to comment since we chatted but have been on the road. Thanks Balaji for summarizing what we talked about. Just a couple of additional comments/ideas:

  • Another vote for TPR: it allows for a seamless UX between experimentation and a potential first-class resource in the future.
  • If we represent Cell membership as a special label on nodes, then we can schedule against them for free using affinity.
  • Node cell membership labels could be published by the NFD container (via a new "cell" source). The cell source could fetch the criteria for the defined cells via the API server and compare against the discovered features. If the NFD container is configured to use --sources=cpuid,cell then the --label-whitelist attribute could be used to filter out everything but the cell membership labels.
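
A minimal sketch of the last bullet, assuming each cell is defined as a list of required feature names and that the whitelist is applied as a regular expression over label names. The `cell-<name>` label format, the feature names, and both function names are hypothetical; fetching cell definitions from the API server is left out.

```python
import re


def cell_labels(discovered, cells):
    """Return membership labels for every cell whose criteria are all met."""
    features = set(discovered)
    return {
        f"cell-{name}": "true"
        for name, required in cells.items()
        if set(required) <= features
    }


def filter_labels(labels, whitelist_pattern):
    """Keep only the labels whose name matches the whitelist pattern."""
    pattern = re.compile(whitelist_pattern)
    return {key: value for key, value in labels.items() if pattern.search(key)}


if __name__ == "__main__":
    discovered = {"cpuid-AVX", "cpuid-SVM"}
    cells = {"virt": ["cpuid-SVM"], "hpc": ["cpuid-AVX", "cpuid-AVX512F"]}
    labels = {name: "true" for name in discovered}
    labels.update(cell_labels(discovered, cells))
    print(filter_labels(labels, r"^cell-"))
```

With the cell membership labels in place, pods could target a cell through ordinary node affinity on the `cell-<name>` label.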

Anyway, this seems to be shaping up. Should we collaborate on a design doc to work through the details?

On Nov 14, 2016, at 13:28, David Oppenheimer wrote:

Ah I see, you want to limit who can make changes to the Cell information, which I guess wouldn't be possible with a ConfigMap? (Though it seems that's a pretty big deficiency if we can't control who can modify ConfigMap, since we're moving towards doing all Kubernetes component config via ConfigMap)




fabiand commented on August 27, 2024

Was there any further movement on this topic outside of this thread? Our use case (http://kubevirt.io) also requires publishing node features.

I do like the separation of Cells as TPRs and associating them to pods using labels.

What would be the next step, @ConnorDoyle?


fejta-bot commented on August 27, 2024

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale


fabiand commented on August 27, 2024


fejta-bot commented on August 27, 2024

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale


fejta-bot commented on August 27, 2024

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten


fejta-bot commented on August 27, 2024

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close


k8s-ci-robot commented on August 27, 2024

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
