Would it make sense to treat a kubernetes cluster as a "cloud" and expose information


Feature request: kubernetes "cloud" about cloudquery HOT 14 CLOSED

cloudquery commented on May 18, 2024

Feature request: kubernetes "cloud"

from cloudquery.

Comments (14)

commented on May 18, 2024

I think it makes a lot of sense, and we should put it on the near-term roadmap. Can you elaborate a bit on the use case? Also, do you think it would make sense to connect directly to the k8s API, or to go through the GKE and AWS ECS APIs?

from cloudquery.

mkmik commented on May 18, 2024

Disclaimer: I didn't think this through, it's just the first thing I noticed after a colleague of mine shared the link to this project.

I have the feeling that many teams (like mine) have to deal with a heterogeneous collection of workloads scattered around various clouds. Kubernetes got added to this picture but didn't replace the "legacy". There are many companies that try to offer a solution to this, a way to present a unified dashboard (I wouldn't be surprised if the company I work for does it as well, but I'm not working on that and here I speak for myself as an engineer).

Perhaps it would be useful to have a tool that allows teams to gather an up-to-date "inventory" of what's out there and perhaps build their own dashboards/abstractions on top of it.

from cloudquery.

commented on May 18, 2024

Thanks for the input. This is definitely one of the use cases I had in mind: letting teams build custom inventories or dashboards on top of SQL tables using this tool.

What types of workloads would you like to see on k8s: clusters, pods, deployments, etc.?

Also, what other providers/integrations would you like to see? I'm trying to get a sense of the biggest pain points and prioritise the features.

from cloudquery.

jbianquetti-nami commented on May 18, 2024

I would love to see stuff like

select label from k8s_pods where label like...
select * from k8s_deployments where namespace like ...
select * from k8s_secrets where type like...

So I guess all standard k8s objects will be needed. Also, CRDs can be useful.
I guess that by taking advantage of the API call underlying kubectl api-resources, you can access all k8s features.

from cloudquery.

commented on May 18, 2024

@jbianquetti-nami Thanks for describing those. The only issue I'm having with kubernetes so far is that the underlying data is not relational but more like JSON. This will produce a lot of tables, and then a lot of joins when working with the data.

One possible solution is to limit the data to higher-level fields. This would save us from having many nested tables, but we would miss some data (maybe not very important or not frequently used).

What do you think?
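
To make the concern concrete, here is a minimal sketch of how a single nested pod object could fan out into several tables. The pod data is invented, and `k8s_pod_labels` and `k8s_pod_containers` are hypothetical table names chosen for illustration; this is not the schema the project ended up using.

```python
import sqlite3

# Hypothetical, trimmed-down pod object; real pods carry many more nested fields.
pod = {
    "metadata": {"name": "web-0", "namespace": "default",
                 "labels": {"app": "web", "tier": "frontend"}},
    "spec": {"containers": [
        {"name": "nginx", "image": "nginx:1.19"},
        {"name": "sidecar", "image": "envoy:1.16"},
    ]},
}

conn = sqlite3.connect(":memory:")
# Illustrative schema only: one parent table plus one child table per nested structure.
conn.executescript("""
    CREATE TABLE k8s_pods (id INTEGER PRIMARY KEY, name TEXT, namespace TEXT);
    CREATE TABLE k8s_pod_labels (pod_id INTEGER, key TEXT, value TEXT);
    CREATE TABLE k8s_pod_containers (pod_id INTEGER, name TEXT, image TEXT);
""")


def insert_pod(conn, pod):
    """Store one nested pod as a parent row plus child rows for labels and containers."""
    meta = pod["metadata"]
    cur = conn.execute(
        "INSERT INTO k8s_pods (name, namespace) VALUES (?, ?)",
        (meta["name"], meta["namespace"]),
    )
    pod_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO k8s_pod_labels VALUES (?, ?, ?)",
        [(pod_id, k, v) for k, v in meta.get("labels", {}).items()],
    )
    conn.executemany(
        "INSERT INTO k8s_pod_containers VALUES (?, ?, ?)",
        [(pod_id, c["name"], c["image"]) for c in pod["spec"]["containers"]],
    )
    return pod_id


insert_pod(conn, pod)

# Reading the data back needs one join per nested structure.
rows = conn.execute("""
    SELECT p.name, l.value, c.image
    FROM k8s_pods p
    JOIN k8s_pod_labels l ON l.pod_id = p.id AND l.key = 'app'
    JOIN k8s_pod_containers c ON c.pod_id = p.id
    ORDER BY c.name
""").fetchall()
```

Even this toy pod needs three tables and two joins, which is the growth pattern described above.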

from cloudquery.

obowersa commented on May 18, 2024

Happy to try and provide some information around this if I can. I know the kubernetes API pretty well (although I've got a couple of other projects I'm wrapped up in at the moment, so I might not be able to contribute from a code perspective for a month or two). Apologies if I'm retreading old ground for folk! Just getting my thoughts out.

A great starting point is the kubernetes API schema: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/

A few things to take into account:

There are differences between API versions
You can add custom API extensions.

The custom API extensions are an area where I could see support being super valuable in the future, especially if you are using an operator based pattern to create deployments etc.

I think a high-level approach might work as an initial approach. The difficulty with how extensible kubernetes is, is that it would potentially be a constantly moving target.

A minimal view could be the resource name, resource type (pod, deployment, service, etc.) and any labels associated with it. It wouldn't allow for drilling into further details (such as the schedule of a cron job), and might become tricky when you have resources which are high-level abstractions of other resources, like deployments/cronjobs/etc. containing a pod spec as a nested data structure.

Kube query does some interesting stuff around acting as a bridge between the kubernetes API and osquery (https://github.com/aquasecurity/kube-query), which might be useful as some inspiration.
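
As a rough sketch of that minimal view, the function below reduces any Kubernetes-style object to its kind, name and labels. `minimal_view` is a hypothetical helper and the CronJob is invented sample data; only the `kind`, `metadata.name` and `metadata.labels` fields are taken from the standard object layout.

```python
def minimal_view(obj):
    """Reduce a Kubernetes-style object to its kind, name and labels."""
    meta = obj.get("metadata", {})
    return {
        "kind": obj.get("kind", ""),
        "name": meta.get("name", ""),
        # Copy the labels so the row does not alias the source object.
        "labels": dict(meta.get("labels") or {}),
    }


# Invented sample object with a nested pod spec inside a job template.
cronjob = {
    "kind": "CronJob",
    "metadata": {"name": "nightly", "labels": {"team": "data"}},
    "spec": {
        "schedule": "0 2 * * *",
        "jobTemplate": {"spec": {"template": {"spec": {"containers": []}}}},
    },
}

row = minimal_view(cronjob)
```

The resulting row is flat and uniform across resource types, but the cron schedule and the nested pod spec are dropped, which is the trade-off noted above.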

from cloudquery.

commented on May 18, 2024

@obowersa Thanks. I agree that as an initial approach this could work. My main concern here: is SQL the right kind of database for storing highly nested k8s data, or should we use NoSQL for that case? We can release experimental support for k8s with a SQL backend and see if it is helpful to someone.

from cloudquery.

obowersa commented on May 18, 2024

Valid question! For me, a big thing would be to be able to maintain the same query syntax. As an example, in the future I'd love to be able to do the equivalent of querying Azure for load balancers with public IPs, and then matching those up to the services/pods which are behind those load balancers. That's a longer-term dream, but it gives an idea of where having both parts of the equation would be super useful.
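
A minimal sketch of what such a cross-provider query could look like if both providers landed in the same SQL database. The table names, columns and rows here are all hypothetical and only illustrate the join; they are not taken from the project's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas and sample rows, for illustration only.
    CREATE TABLE azure_load_balancers (name TEXT, public_ip TEXT);
    CREATE TABLE k8s_services (name TEXT, namespace TEXT, load_balancer_ip TEXT);

    INSERT INTO azure_load_balancers VALUES
        ('lb-public', '203.0.113.10'),
        ('lb-internal', NULL);
    INSERT INTO k8s_services VALUES
        ('web', 'default', '203.0.113.10'),
        ('db', 'default', NULL);
""")

# Match load balancers that have a public IP to the services exposed through them.
exposed = conn.execute("""
    SELECT lb.name, svc.namespace, svc.name
    FROM azure_load_balancers lb
    JOIN k8s_services svc ON svc.load_balancer_ip = lb.public_ip
    WHERE lb.public_ip IS NOT NULL
""").fetchall()
```

Because both sides are ordinary tables, the same query syntax covers the cloud resource and the cluster resource in one statement.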

from cloudquery.

commented on May 18, 2024

@obowersa This is an excellent example, and it makes a lot of sense to me now. We will try to schedule an initial version for k8s in a few weeks, as we roll out Azure next week.

from cloudquery.

yevgenypats commented on May 18, 2024

@obowersa @jbianquetti-nami @mkmik We've added basic support for k8s. Currently, only pods and services are supported, but I'd love to hear early feedback before we add more resources. Even with only those two resources, it creates about 32 tables: https://schema.cloudquery.io/tables/k8s_pods.html, https://schema.cloudquery.io/tables/k8s_services.html

from cloudquery.

dancompton commented on May 18, 2024

I haven't reviewed all of the above commentary, but I noticed this when I submitted a PR earlier. I found it odd that k8s was listed alongside AWS, GCP, and other cloud vendors.

Kubernetes is a container orchestration platform which could be offered as a PaaS on top of AWS or GCP, but is not by itself a PaaS. It does not belong on a level of a hierarchy which primarily represents orgs that provide IaaS primitives.

from cloudquery.

yevgenypats commented on May 18, 2024

Hi @dan-compton thanks for the feedback! I believe in the future the provider implementation will reside in different repositories to have a more pluggable architecture (kinda like in terraform).

from cloudquery.

dancompton commented on May 18, 2024

No problem. I suppose it wasn't particularly useful feedback for driving features :P Have you taken a look at github.com/google/go-cloud? It addresses a somewhat similar problem across platforms. Also it might be worth checking out instana's graph explorer if that is open sourced. I've only ever done this for AWS with graphql. Support for all of these services seems daunting.

from cloudquery.

yevgenypats commented on May 18, 2024

Closing - k8s provider moved to https://github.com/cloudquery/cq-provider-k8s

from cloudquery.
