Code Monkey home page Code Monkey logo

Comments (9)

akrzos avatar akrzos commented on June 8, 2024 1

@akrzos based on the above, would you need anything else apart from by-path?

Every disk should have its by-path. Right now users have to determine this beforehand understanding what disk they see quads installed on prior to any building of an ocp cluster on multi-disk systems. Otherwise they risk install failure. Ideally included in the disk data the intended root install disk and/or bootable disk should also be "flagged" so it's easy to find.

For example this is what data I had to gather beforehand to build an ocp cluster and properly reference the disks:

control_plane_install_disk: /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:0:0
worker_install_disk: /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:0:0
worker_localstorage_device: /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:1:0
localvolume2_device: /dev/disk/by-path/pci-0000:67:00.0-scsi-0:2:2:0

That is with r650s.

from quads.

sadsfae avatar sadsfae commented on June 8, 2024 1

@akrzos I think you should consider gathering facts in the meantime for this since you're already using Ansible and setting a custom fact to assist with this more easily is likely an easy interim solution. Let us discuss what we're able to return via the APIv3 however but some level of conditional logic perhaps should be present in the playbook(s) to choose the right disk based on size or type even providing /dev/disk/by-path per block device.

We also have to consider keeping things rather generic and not overly OCP-specific too as it relates to QUADS. We need to consider that QUADS API might not be the right place to figure out this level of installation preparation metadata / ordering / classification wrt to disks, rather than just list what's available per disk, what those disks are, type, size etc that are more broadly applicable to anyone interested in the hardware.

There are cases as well where there isn't even 4-disks present and it could end up being overly complex to sort the "right" choice for those values given limited spindles or a combination of different disks without a weighting property or a decent amount of logic to figure that out in the codebase. Unless we seed these values somewhat manually that would be additional development burden. The upside is it should only need to happen once, but when you have 1,000+ of something that's still not trivial.

If we were just to list in the JSON API response like the following would this be useful assuming it always maps to the fixed disk id's we'd also present?

"disks": [
    {
      "count": 2,
      "disk_paths": ["by-path/pci0.0.0....", "by-path/pci0.0.1...."],
      "disk_type": "sata",
      "host_id": 2,
      "id": 21,
      "size_gb": 480
    }
  ],

Let us discuss this and see what we can do though, not ruling any of it out.

from quads.

grafuls avatar grafuls commented on June 8, 2024

Looking closer at the way the by-path id is constructed, we could potentially derive this one by the output on lshw where:

              "businfo" : "pci@0000:18:00.0",
              "children" : [
                {
                  ...
                  "businfo" : "scsi@0:2.0.0",
                  "logicalname" : "/dev/sda",
                  "children" : [
                    ...
                    {
                      "businfo" : "scsi@0:2.0.0,2",
                      "logicalname" : "/dev/sda2",
                      }
                    }
                  ]
                  ...

pci-0000:18:00.0-scsi-0:2:0:0-part2 -> ../../sda2

We could also store these values on the Quads host object under a new field on the DB although it might be a bit loosely coupled as the way we store the host disks is by grouping them by type and size relying on the count field on the disk object which would require storing the disk paths on an array like such:

  "disks": [
    {
      "count": 2,
      "disk_ids": ["by-path/pci0.0.0....", "by-path/pci0.0.1...."],
      "disk_type": "sata",
      "host_id": 2,
      "id": 21,
      "size_gb": 480
    }
  ],

Unless some additional data is required at an individual disk level, in which case we might want to consider removing the count field and creating a single disk entry for each disk with it's individual by-path value at the cost (perhaps negligible at the scale we use it) of storage space on DB.

from quads.

grafuls avatar grafuls commented on June 8, 2024

@akrzos based on the above, would you need anything else apart from by-path?

from quads.

sadsfae avatar sadsfae commented on June 8, 2024

Looking closer at the way the by-path id is constructed, we could potentially derive this one by the output on lshw where:

              "businfo" : "pci@0000:18:00.0",
              "children" : [
                {
                  ...
                  "businfo" : "scsi@0:2.0.0",
                  "logicalname" : "/dev/sda",
                  "children" : [
                    ...
                    {
                      "businfo" : "scsi@0:2.0.0,2",
                      "logicalname" : "/dev/sda2",
                      }
                    }
                  ]
                  ...

pci-0000:18:00.0-scsi-0:2:0:0-part2 -> ../../sda2

We could also store these values on the Quads host object under a new field on the DB although it might be a bit loosely coupled as the way we store the host disks is by grouping them by type and size relying on the count field on the disk object which would require storing the disk paths on an array like such:

  "disks": [
    {
      "count": 2,
      "disk_paths": ["by-path/pci0.0.0....", "by-path/pci0.0.1...."],
      "disk_type": "sata",
      "host_id": 2,
      "id": 21,
      "size_gb": 480
    }
  ],

Unless some additional data is required at an individual disk level, in which case we might want to consider removing the count field and creating a single disk entry for each disk with it's individual by-path value at the cost (perhaps negligible at the scale we use it) of storage space on DB.

A few options here, we do capture Serial Number and that's unique per physical block device - would that be a good organizing element in an array @grafuls ? lshw does capture this and we could probably walk the JSON structure to parse it.

We could prefix each item with child items based on type, count, serial as a string that's unique?

e.g. id#_nvme_$SERIAL:

I do like the simplicity of having them in list inside the dict though with disk_ids or disk_paths

"disks": [
    {
      "count": 2,
      "disk_paths": ["by-path/pci0.0.0....", "by-path/pci0.0.1...."],
      "disk_type": "sata",
      "host_id": 2,
      "id": 21,
      "size_gb": 480
    }
  ],

from quads.

akrzos avatar akrzos commented on June 8, 2024

@sadsfae All we need is the bootable disk's by-path, ideally the other disks by-paths as well. I do not believe we should be adding brittle fact gathering depending on the state of the systems that are given, rather we just want to query the api for the data.

from quads.

sadsfae avatar sadsfae commented on June 8, 2024

@sadsfae All we need is the bootable disk's by-path, ideally the other disks by-paths as well. I do not believe we should be adding brittle fact gathering depending on the state of the systems that are given, rather we just want to query the api for the data.

Understood, I checked and Ansible facts don't actually gather this anyway you'd need to create your own task to gather it and then set custom facts via set_fact module

I think our contention here is we're not going to know generally nor are we best suited to determine which disk(s) are best for what usage category / key value pair you described earlier across all the models and configs but perhaps its enough to start just having them all return and on the tenant deployment end you can figure that out.

from quads.

akrzos avatar akrzos commented on June 8, 2024

@sadsfae We really just need the default installation disk to kick start this. Since foreman already installs RHEL on every single machine in the scale lab, what 99% of the users need here is the by-path to that exact disk rhel was installed on for that machine. (With OCP 4.13 and newer we can not rely on symbolic links especially on a multi-disk system, infact you never should have been relying on symbolic links to begin with if we feel like opening that can of worms) If you want to make it a bit more robust, you could grab by-path for every disk, and also include by-id references as well. We want to query the api because sometimes machines have been used or already installed on, or are powered off in an allocation and we don't want a disjoint experience and extra steps of having to re-install rhel to discover the by-path when it stays the same for the majority of the life of a machine.

from quads.

sadsfae avatar sadsfae commented on June 8, 2024

@sadsfae We really just need the default installation disk to kick start this. Since foreman already installs RHEL on every single machine in the scale lab, what 99% of the users need here is the by-path to that exact disk rhel was installed on for that machine.

Ok that works, something like os_disk_path that can be updated when a machine is provisioned, this can likely be part of the /disks/{hostName} APIv3 return.

(With OCP 4.13 and newer we can not rely on symbolic links especially on a multi-disk system, infact you never should have been relying on symbolic links to begin with if we feel like opening that can of worms) If you want to make it a bit more robust, you could grab by-path for every disk, and also include by-id references as well.

Would you say it's more important to get the definitive OS disk /dev/disk/by-path than just a fixed list return of all /dev/disk/by-path? Remember that with the current WIP APIv3 you'll also get disk type (e.g. sata, nvme) size and id.

If definitive OS disk is ideal perhaps we aim for that first then see how useful it would be as a future RFE to add/categorize the rest.

We want to query the api because sometimes machines have been used or already installed on, or are powered off in an allocation and we don't want a disjoint experience and extra steps of having to re-install rhel to discover the by-path when it stays the same for the majority of the life of a machine.

That makes sense.

from quads.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.