ondat / discoblocks
Open Source declarative disk configuration system for Kubernetes
Home Page: https://discoblocks.io
License: Apache License 2.0
The StorageClass name should be optional, and it would be nice to create a default StorageClass per CSI driver.
The kuttl test fails on GitHub Actions:
2023-01-09T12:57:28.766132542Z stderr F ++ crictl --runtime-endpoint unix:///run/containerd/containerd.sock inspect --output go-template --template '{{.info.pid}}' c6498557205f69edad77ac4c8bdd929eef3032d9a6b32ecbfd7a5ebc37ba0972
2023-01-09T12:57:28.791778323Z stderr F time="2023-01-09T12:57:28Z" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
2023-01-09T12:57:28.79406194Z stderr F + PID=
But it works well on my local box, so maybe there is some version mismatch.
Busybox ships with lots of tools, and that increases the attack surface. We need to replace it with only the specific commands we require, maybe in a custom image.
Discoblocks creates StorageClasses on a topology basis. The created StorageClass has a finalizer, just like any other. But during DiskConfig deletion we remove the finalizers of those SCs. When a customer recreates the config with the same name and StorageClass, Discoblocks re-uses the existing SC but doesn't append the finalizer. Not a big drama, we just lose the protection of those SCs. See the sketch below.
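A minimal sketch of how the missing step could look with controller-runtime's controllerutil helpers; the finalizer name is a placeholder, not the identifier Discoblocks actually uses:

```go
package storageclass

import (
	"context"

	storagev1 "k8s.io/api/storage/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Hypothetical finalizer name, for illustration only.
const storageClassFinalizer = "discoblocks.io/storageclass"

// ensureFinalizer re-appends the finalizer when an already existing
// StorageClass is re-used, so the deletion protection isn't silently lost.
func ensureFinalizer(ctx context.Context, c client.Client, sc *storagev1.StorageClass) error {
	// AddFinalizer reports whether the object changed, so we only issue an
	// Update when the finalizer was actually missing.
	if controllerutil.AddFinalizer(sc, storageClassFinalizer) {
		return c.Update(ctx, sc)
	}
	return nil
}
```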
It would be nice to try out and document how to create a single-disk EBS backend for Ondat.
The current solution expects the mkdir, mknod and mount commands to be available in the container, but this isn't always the case.
https://github.com/ondat/discoblocks/blob/main/drivers/ebs.csi.aws.com/main.go#L66-L69
Currently Discoblocks uses one single cert: it creates a secret per namespace once and doesn't take care of cert rotation or changes.
New disk creation and all the related CRD changes must be documented somewhere. Related PR: #32
Tracking issue for:
It would be nice to detect upscaling issues and automatically pause autoscaling if we were not able to do it a few times in a row (see the sketch below).
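A minimal sketch of how the pausing could work, assuming a per-volume failure counter; the threshold and the key format are made up for illustration:

```go
package autoscaler

// Assumed threshold; the real value would need tuning.
const maxFailedUpscales = 3

// upscaleTracker counts consecutive failed upscale attempts per volume and
// reports when autoscaling should be paused for that volume.
type upscaleTracker struct {
	failures map[string]int // key: "namespace/pvc-name"
}

func newUpscaleTracker() *upscaleTracker {
	return &upscaleTracker{failures: map[string]int{}}
}

// record returns true when the volume has failed often enough in a row that
// autoscaling should be paused for it.
func (t *upscaleTracker) record(key string, err error) bool {
	if err == nil {
		delete(t.failures, key) // a successful upscale resets the counter
		return false
	}
	t.failures[key]++
	return t.failures[key] >= maxFailedUpscales
}
```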
It would be nice to create a pipeline to automatically publish images, manifests, and other release-related artifacts.
The system uses the coolDown time for many wait operations, so if the value is too low it kills the context of provisioning. Because the relation between timeout and cooldown matters and a very low cooldown doesn't make sense at all, I suggest validating and declining low values.
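A minimal sketch of such a validation, assuming the admission webhook can compare the configured cooldown against the time provisioning needs; the function name and the exact relation are assumptions, not the real schema:

```go
package validation

import (
	"fmt"
	"time"
)

// validateCoolDown declines cooldown values that are so low they would cancel
// the provisioning context before it has a chance to finish.
func validateCoolDown(coolDown, provisionTimeout time.Duration) error {
	if coolDown < provisionTimeout {
		return fmt.Errorf("coolDown %s is shorter than the provisioning timeout %s and would kill provisioning", coolDown, provisionTimeout)
	}
	return nil
}
```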
Currently, only the log tells if something fails. It would be nice to produce metrics about the different operations, including failed and successful tasks. Kubernetes events would also be nice.
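A minimal sketch of what that could look like, assuming the controller-runtime metrics registry and an EventRecorder from the manager; the metric name, labels, and event reason are invented for illustration:

```go
package observability

import (
	"github.com/prometheus/client_golang/prometheus"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
	"sigs.k8s.io/controller-runtime/pkg/metrics"
)

// Hypothetical counter, labelled by operation and result.
var operationsTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "discoblocks_operations_total",
		Help: "Number of operations, labelled by operation and result.",
	},
	[]string{"operation", "result"},
)

func init() {
	// Register with the controller-runtime registry so the counter shows up
	// on the manager's /metrics endpoint.
	metrics.Registry.MustRegister(operationsTotal)
}

// reportResize records both a metric and a Kubernetes event for a resize task.
func reportResize(recorder record.EventRecorder, pvc *corev1.PersistentVolumeClaim, err error) {
	result, eventType, msg := "success", corev1.EventTypeNormal, "volume resized"
	if err != nil {
		result, eventType, msg = "failure", corev1.EventTypeWarning, err.Error()
	}
	operationsTotal.WithLabelValues("resize", result).Inc()
	recorder.Event(pvc, eventType, "VolumeResize", msg)
}
```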
We have had design sessions and requirements-gathering meetings to pin down the requirements and design. We have decided to go the CSI driver route to make it possible for users to declaratively add disks from the backend on day 0, and also on day 2 for operations.
The Ondat driver finds the new disk by listing /var/lib/storageos by modification time. It picks the latest, but this could lead to concurrency issues in production. It would also be nice to make /var/lib/storageos configurable.
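A minimal sketch of making the directory configurable via an environment variable; DISCOBLOCKS_STORAGEOS_PATH is a hypothetical name:

```go
package driver

import (
	"os"
	"path/filepath"
)

// Current hard-coded default.
const defaultStorageOSPath = "/var/lib/storageos"

// storageOSPath returns the directory to scan for new devices, falling back
// to the default when the (hypothetical) variable is unset.
func storageOSPath() string {
	if p := os.Getenv("DISCOBLOCKS_STORAGEOS_PATH"); p != "" {
		return filepath.Clean(p)
	}
	return defaultStorageOSPath
}
```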
A Nixery image has been used for the mount and resize jobs and the metrics sidecar. For production it would be nice to replace it with a production-ready solution.
Autoscaling and Pod.Spec.HostPID are not supported together in Discoblocks. We have to figure out how to find, format, mount, and resize a filesystem when hostPID is true.
Currently the containerd and Docker sockets are hard-coded into the mount and resize jobs. It would be nice to make them configurable.
The polling interval has been hard-coded into the codebase. It would be nice to make it configurable.
Tracking issue for:
Automatic snapshots of volumes look like really low-hanging fruit.
Currently, the node exporter exposes HTTP only. It would be nice to use a secure connection and/or a network policy to protect the metrics endpoint.
Currently, the first PVC of a PVC group (/mountpoint-1, /mountpoint-2, /mountpoint-N) is resized by the CSI driver, and the others via a resize job.
It would be nice to double-check what happens when an additional volume needs to be resized after a restart. I guess the CSI driver also resizes the volume. What would happen with our resize job? Does it fail? Is it necessary at all? Should we detect this case to avoid unnecessary job executions?
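One way to detect the case, as a sketch only: compare the PVC's reported capacity with the requested size before launching the resize job; if the CSI driver has already expanded the volume, the job can be skipped. This is an assumption about how the check could look, not the current behaviour:

```go
package resize

import (
	corev1 "k8s.io/api/core/v1"
)

// needsResizeJob reports whether the volume still has to grow; when the
// reported capacity already covers the request, the expansion presumably
// happened elsewhere (e.g. via the CSI driver) and the job is unnecessary.
func needsResizeJob(pvc *corev1.PersistentVolumeClaim) bool {
	requested := pvc.Spec.Resources.Requests[corev1.ResourceStorage]
	actual := pvc.Status.Capacity[corev1.ResourceStorage]
	// Cmp returns -1 when actual < requested, i.e. the volume still has to grow.
	return actual.Cmp(requested) < 0
}
```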
The %d in the mount point makes Discoblocks inflexible. It would be nice to make it optional.
The PVC controller uses a cached client with a custom indexer to find persistent volumes by claim name. Maybe (this needs to be proven) an uncached client would have better performance.
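For comparison, a minimal sketch of the uncached variant with controller-runtime: the manager's APIReader bypasses the cache (and therefore the custom indexer), at the cost of listing and filtering on the client side. Whether this is actually faster would still need measuring:

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
)

// findPVByClaim looks up a PersistentVolume by its claim reference using the
// non-cached reader returned by GetAPIReader.
func findPVByClaim(ctx context.Context, mgr ctrl.Manager, claimNamespace, claimName string) (*corev1.PersistentVolume, error) {
	pvs := &corev1.PersistentVolumeList{}
	if err := mgr.GetAPIReader().List(ctx, pvs); err != nil {
		return nil, err
	}
	for i := range pvs.Items {
		ref := pvs.Items[i].Spec.ClaimRef
		if ref != nil && ref.Namespace == claimNamespace && ref.Name == claimName {
			return &pvs.Items[i], nil
		}
	}
	return nil, nil
}
```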
It would be nice to have a proper health check instead of healthz.Ping.
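A minimal sketch of a more meaningful check registered with the manager; what exactly should be probed is an open question, so the API-server call below is just an example:

```go
package health

import (
	"context"
	"net/http"
	"time"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// setupChecks replaces the always-healthy healthz.Ping with a probe that
// actually talks to the API server.
func setupChecks(mgr ctrl.Manager) error {
	apiServerCheck := func(req *http.Request) error {
		ctx, cancel := context.WithTimeout(req.Context(), 2*time.Second)
		defer cancel()
		nsList := &corev1.NamespaceList{}
		// A lightweight, limited list; any error marks the controller unhealthy.
		return mgr.GetAPIReader().List(ctx, nsList, client.Limit(1))
	}
	if err := mgr.AddHealthzCheck("apiserver", apiServerCheck); err != nil {
		return err
	}
	return mgr.AddReadyzCheck("apiserver", apiServerCheck)
}
```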
Currently, the pod mutation webhook has side effects, but we tell Kubernetes it hasn't. It would be nice to set the side effect of the MutatingWebhookConfiguration to Some.
Tracking issue for:
Currently only the develop version of Ondat supports online resize. Once we release the feature it would be nice to use the GA release instead of develop in the e2e tests.
ReadWriteDaemon mode gives back the same volume per node if it exists. If someone deletes the DiskConfig and then restarts the DaemonSet pods, Discoblocks gives the volume back, but that volume doesn't have a finalizer (because of the delete), so autoscaling would be skipped.
Currently, a service for metrics is created for each pod, and the volume monitor uses the service endpoints to fetch disk metrics. It would be nice to make the service optional (for customer usage) and change the monitor to use the pod IP instead.
Currently, the node exporter exposes all the drives. It would be nice to limit the drives with the ignored-mount-points option.
Only PVC names are updated at PVC.Status. It would be nice to update Conditions too.
Once the disk capacity reaches its maximum per PVC, it would be nice to create a new disk.
Initial creation and scaling are solved in the Kube-native way, but a new disk for a running pod isn't trivial.
We need to:
As I see it, this project does something similar with persistent volumes: https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner
Currently, there are two availability options: ReadWriteOnce and ReadWriteSame. In the case of a DaemonSet, ReadWriteSame isn't an option, because all pods would be scheduled to the same node. On the other hand, with ReadWriteOnce the DaemonSet pods get fresh new PVCs, so they lose the connection with the volume after a restart. It would be nice to support DaemonSets.
Currently, volumes created by Discoblocks stay mounted forever, so the termination of PersistentVolume objects gets stuck forever.
Currently, volume info travels in plain text. Some encryption would be nice. It would also be nice to use an RBAC proxy to protect the endpoint.
Currently, both the containerd and Docker sockets are hard-coded into the mount and resize jobs. It works nicely, but it creates unnecessary directories on the host if either of the sockets is missing. It would be nice to mount only the socket available on the host.
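One possible direction, sketched under assumptions: instead of probing the host filesystem, the controller could look at the node's reported container runtime and mount only the matching socket. The paths and the prefix matching are illustrative defaults:

```go
package jobs

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// runtimeSocketForNode returns the host socket path the mount/resize job
// should use on the given node, based on what the kubelet reports.
func runtimeSocketForNode(node *corev1.Node) string {
	// ContainerRuntimeVersion looks like "containerd://1.6.8" or "docker://20.10.21".
	runtime := node.Status.NodeInfo.ContainerRuntimeVersion
	if strings.HasPrefix(runtime, "docker://") {
		return "/var/run/docker.sock"
	}
	return "/run/containerd/containerd.sock"
}
```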