Comments (14)
I think this makes a lot of sense, and we should put it on the near-term roadmap. Can you elaborate a bit on the use case? Also, do you think it would make sense to connect directly to the k8s API, or to use the GKE and AWS ECS APIs?
from cloudquery.
Disclaimer: I didn't think this through, it's just the first thing I noticed after a colleague of mine shared the link to this project.
I have the feeling that many teams (like mine) have to deal with a heterogeneous collection of workloads scattered around various clouds. Kubernetes got added to this picture but didn't replace the "legacy". There are many companies that try to offer a solution to this, a way to present a unified dashboard (I wouldn't be surprised if the company I work for does it as well, but I'm not working on that and here I speak for myself as an engineer).
Perhaps it would be useful to have a tool that allows teams to gather an up-to-date "inventory" of what's out there and perhaps build their own dashboards/abstractions on top of it.
Thanks for the input. This is definitely one of the use cases I had in mind: letting teams build custom inventories or dashboards on top of SQL tables using this tool.
What types of k8s workloads would you like to see? Clusters? Pods? Deployments? Something else?
Also, what other providers/integrations would you like to see? I'm trying to get a sense of the biggest pain points and prioritise the features.
I would love to see stuff like
- select label from k8s_pods where label like...
- select * from k8s_deployments where namespace like ...
- select * from k8s_secrets where type like...
So I guess all standard k8s objects will be needed. CRDs could be useful too.
I guess that by taking advantage of the API call underlying kubectl api-resources you can access all k8s features.
@jbianquetti-nami Thanks for describing those. The only issue I'm having with Kubernetes so far is that the underlying data is not relational; it is nested, more like JSON. Mapping it to SQL will produce a lot of tables, and then a lot of joins when working with the data.
One possible solution is to limit the data to higher-level fields. This would save us from having many nested tables, but we would miss some data (maybe not very important or not frequently used).
What do you think?
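To make the trade-off concrete, here is a minimal sketch of how one nested pod object fans out into several tables. It uses Python's built-in sqlite3 module only for illustration; the pod is a heavily trimmed, made-up example, and the child table names (k8s_pod_labels, k8s_pod_containers) and all column names are assumptions, not the actual schema.

```python
import sqlite3

# Made-up, heavily trimmed pod object; a real pod has many more nested fields.
pod = {
    "metadata": {
        "name": "web-0",
        "namespace": "default",
        "labels": {"app": "web", "tier": "frontend"},
    },
    "spec": {
        "containers": [
            {"name": "nginx", "image": "nginx:1.19"},
            {"name": "sidecar", "image": "envoy:1.16"},
        ]
    },
}

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one parent table plus one child table per nested collection.
conn.executescript("""
CREATE TABLE k8s_pods (id INTEGER PRIMARY KEY, name TEXT, namespace TEXT);
CREATE TABLE k8s_pod_labels (pod_id INTEGER, key TEXT, value TEXT);
CREATE TABLE k8s_pod_containers (pod_id INTEGER, name TEXT, image TEXT);
""")


def insert_pod(conn, pod):
    """Flatten one pod into the parent table and its child tables."""
    meta = pod["metadata"]
    cur = conn.execute(
        "INSERT INTO k8s_pods (name, namespace) VALUES (?, ?)",
        (meta["name"], meta["namespace"]),
    )
    pod_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO k8s_pod_labels VALUES (?, ?, ?)",
        [(pod_id, k, v) for k, v in meta.get("labels", {}).items()],
    )
    conn.executemany(
        "INSERT INTO k8s_pod_containers VALUES (?, ?, ?)",
        [(pod_id, c["name"], c["image"]) for c in pod["spec"]["containers"]],
    )
    return pod_id


insert_pod(conn, pod)

# Even a simple question ("which images run in pods labelled app=web?")
# already needs two joins.
rows = conn.execute("""
SELECT p.name, c.image
FROM k8s_pods p
JOIN k8s_pod_labels l ON l.pod_id = p.id
JOIN k8s_pod_containers c ON c.pod_id = p.id
WHERE l.key = 'app' AND l.value = 'web'
ORDER BY c.name
""").fetchall()
print(rows)
```

Two nested collections already mean three tables and two joins here; every further nested list (volumes, ports, env vars) would add another child table.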
Happy to try to provide some information around this if I can. I know the Kubernetes API pretty well (although I've got a couple of other projects I'm wrapped up in at the moment, so I might not be able to contribute code for a month or two). Apologies if I'm retreading old ground for folks! Just getting my thoughts out.
A great starting point is the kubernetes API schema: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/
A few things to take into account:
- There are differences between API versions
- You can add custom API extensions.
The custom API extensions are an area where I could see support being super valuable in the future, especially if you are using an operator based pattern to create deployments etc.
I think a high-level approach might work as a starting point. The difficulty with Kubernetes being so extensible is that it would potentially be a constantly moving target.
A minimal view could be the resource name, resource type (pod, deployment, service, etc.) and any labels associated with it. It wouldn't allow for drilling into further details (such as the schedule of a cron job), and might become tricky when you have resources which are high-level abstractions of other resources, like deployments/cronjobs/etc. containing a pod spec as a nested data structure.
kube-query does some interesting stuff around acting as a bridge between the Kubernetes API and osquery (https://github.com/aquasecurity/kube-query), which might be useful as inspiration.
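As a rough sketch of that minimal view, the snippet below reduces any object to kind, namespace, name and labels, and stores it in a single table. The table name k8s_resources, the "k=v" label encoding and the sample objects are all assumptions made up for illustration.

```python
import sqlite3


def minimal_row(obj):
    """Reduce any Kubernetes object to (kind, namespace, name, labels)."""
    meta = obj.get("metadata", {})
    # Labels are flattened into a sorted "k=v,k=v" string (an assumed encoding).
    labels = ",".join(f"{k}={v}" for k, v in sorted(meta.get("labels", {}).items()))
    return (obj["kind"].lower(), meta.get("namespace"), meta["name"], labels)


# Made-up sample objects, trimmed to the fields used here.
objects = [
    {"kind": "Pod",
     "metadata": {"name": "web-0", "namespace": "default", "labels": {"app": "web"}}},
    {"kind": "CronJob",
     "metadata": {"name": "nightly", "namespace": "batch", "labels": {"app": "report"}},
     "spec": {"schedule": "0 2 * * *"}},  # dropped by minimal_row
    {"kind": "Service",
     "metadata": {"name": "web", "namespace": "default", "labels": {"app": "web"}}},
]

conn = sqlite3.connect(":memory:")
# Hypothetical single-table inventory.
conn.execute(
    "CREATE TABLE k8s_resources (kind TEXT, namespace TEXT, name TEXT, labels TEXT)"
)
conn.executemany(
    "INSERT INTO k8s_resources VALUES (?, ?, ?, ?)",
    [minimal_row(o) for o in objects],
)

# One table, no joins, works for any kind; details like the schedule are gone.
matches = conn.execute(
    "SELECT kind, name FROM k8s_resources WHERE labels LIKE '%app=web%' ORDER BY kind"
).fetchall()
print(matches)
```

Note that the LIKE pattern is a crude substring match: it would also match a label such as app=webapp, which is one more limitation of keeping everything in a single flat table.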
@obowersa Thanks. I agree that this could work as an initial approach. My main concern here: is SQL the right kind of database for storing highly nested k8s data, or should we use NoSQL for that case? We can release experimental support for k8s with the SQL backend and see if it is helpful to someone.
Valid question! For me, a big thing would be being able to keep the same query syntax. As an example, in the future I'd love to be able to do the equivalent of querying Azure for load balancers with public IPs, and then matching those up to the services/pods behind those load balancers. That's a longer-term dream, but it gives an idea of where having both parts of the equation would be super useful.
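A rough sketch of what that kind of cross-provider query could look like, using Python's built-in sqlite3 with invented data. The azure_load_balancers table, every column name (public_ip, external_ip, and so on) and the IP addresses are assumptions for illustration only, not an existing schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and columns, populated with documentation-range IPs.
conn.executescript("""
CREATE TABLE azure_load_balancers (name TEXT, public_ip TEXT);
CREATE TABLE k8s_services (name TEXT, namespace TEXT, external_ip TEXT);

INSERT INTO azure_load_balancers VALUES
    ('lb-public', '203.0.113.10'),
    ('lb-internal', NULL);
INSERT INTO k8s_services VALUES
    ('web', 'default', '203.0.113.10'),
    ('db', 'default', NULL);
""")

# Same SQL syntax on both sides: find services reachable through a
# load balancer that has a public IP.
exposed = conn.execute("""
SELECT lb.name, s.namespace, s.name
FROM azure_load_balancers lb
JOIN k8s_services s ON s.external_ip = lb.public_ip
WHERE lb.public_ip IS NOT NULL
""").fetchall()
print(exposed)
```

The point is only that both sides live in the same database and can be joined with one query; how the service's external address would actually be exposed as a column is an open question.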
@obowersa This is an excellent example, and it makes a lot of sense to me now. We will try to schedule an initial version for k8s in a few weeks, as we roll out Azure next week.
@obowersa @jbianquetti-nami @mkmik We've added basic support for k8s. Currently, only pods and services are supported, but I'd love to hear early feedback before we add more resources. Even with only these two resources, it creates about 32 tables: https://schema.cloudquery.io/tables/k8s_pods.html, https://schema.cloudquery.io/tables/k8s_services.html
I haven't reviewed all of the above commentary, but I noticed this when I submitted a PR earlier. I found it odd that k8s was listed alongside AWS, GCP, and other cloud vendors.
Kubernetes is a container orchestration platform which could be offered as a PaaS on top of AWS or GCP, but is not by itself a PaaS. It does not belong on a level of a hierarchy which primarily represents orgs that provide IaaS primitives.
Hi @dan-compton thanks for the feedback! I believe in the future the provider implementation will reside in different repositories to have a more pluggable architecture (kinda like in terraform).
No problem. I suppose it wasn't particularly useful feedback for driving features :P Have you taken a look at github.com/google/go-cloud? It addresses a somewhat similar problem across platforms. Also it might be worth checking out instana's graph explorer if that is open sourced. I've only ever done this for AWS with graphql. Support for all of these services seems daunting.
Closing - k8s provider moved to https://github.com/cloudquery/cq-provider-k8s