k8sgpt-ai / k8sgpt
Giving Kubernetes Superpowers to everyone
Home Page: http://k8sgpt.ai
License: Apache License 2.0
Checklist:
It could be highly beneficial to analyze PodDisruptionBudget (PDB) resources to ensure that everything is configured correctly and functioning as expected.
My initial idea is to perform a simple check to ensure that the PDB applies to at least one existing resource.
Users could detect and troubleshoot PDB issues.
As discussed with @AlexsJones , this analyzer may not be enabled by default, but only if the user enables it by running k8sgpt filters add pdb
. This issue depends on #161
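The core check could be sketched in pure Go, with no client-go dependency; the function and data names here are hypothetical, but the logic mirrors what a PDB analyzer would do: the PDB's matchLabels selector should match at least one pod.

```go
package main

import "fmt"

// selectorMatchesAny reports whether every key/value pair in the PDB's
// matchLabels selector is present on at least one pod's label set.
func selectorMatchesAny(selector map[string]string, pods []map[string]string) bool {
	for _, podLabels := range pods {
		matched := true
		for k, v := range selector {
			if podLabels[k] != v {
				matched = false
				break
			}
		}
		if matched {
			return true
		}
	}
	return false
}

func main() {
	pods := []map[string]string{
		{"app": "web", "tier": "frontend"},
		{"app": "db"},
	}
	// A PDB selecting app=web applies to the first pod.
	fmt.Println(selectorMatchesAny(map[string]string{"app": "web"}, pods)) // true
	// A PDB selecting app=cache matches nothing and should be flagged.
	fmt.Println(selectorMatchesAny(map[string]string{"app": "cache"}, pods)) // false
}
```

A real analyzer would list pods in the PDB's namespace and feed their labels into a check like this.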
When you run k8sgpt analyze
and get multiple errors, the first error's number gets stuck at the end of the progress bar line. The same happens when running with output set to JSON, but with the opening curly brace instead.
❯ ./k8sgpt analyze
0% | | (0/4, 0 it/hr) [0s:0s]0
kube-system/kube-prometheus-stack-kube-scheduler(kube-prometheus-stack-kube-scheduler): Service has no endpoints, expected label component=kube-scheduler
1 kube-system/kube-prometheus-stack-kube-controller-manager(kube-prometheus-stack-kube-controller-manager): Service has no endpoints, expected label component=kube-controller-manager
2 kube-system/kube-prometheus-stack-kube-etcd(kube-prometheus-stack-kube-etcd): Service has no endpoints, expected label component=etcd
3 kube-system/kube-prometheus-stack-kube-proxy(kube-prometheus-stack-kube-proxy): Service has no endpoints, expected label k8s-app=kube-proxy
❯ ./k8sgpt analyze -o json
0% | | (0/4, 0 it/hr) [0s:0s]{
"kind":"Service","name":"kube-system/kube-prometheus-stack-kube-etcd","error":["Service has no endpoints, expected label component=etcd"],"details":"","parentObject":"kube-prometheus-stack-kube-etcd"}
{"kind":"Service","name":"kube-system/kube-prometheus-stack-kube-proxy","error":["Service has no endpoints, expected label k8s-app=kube-proxy"],"details":"","parentObject":"kube-prometheus-stack-kube-proxy"}
{"kind":"Service","name":"kube-system/kube-prometheus-stack-kube-scheduler","error":["Service has no endpoints, expected label component=kube-scheduler"],"details":"","parentObject":"kube-prometheus-stack-kube-scheduler"}
{"kind":"Service","name":"kube-system/kube-prometheus-stack-kube-controller-manager","error":["Service has no endpoints, expected label component=kube-controller-manager"],"details":"","parentObject":"kube-prometheus-stack-kube-controller-manager"}
This is on macOS Ventura 13.2; tried with both fish and bash, and with both the regular built-in Terminal and the Hyper terminal.
When I run brew install k8sgpt
, it gives me the following error:
==> Installing k8sgpt from k8sgpt-ai/k8sgpt
Error: The following formula cannot be installed from bottle and must be built from source.
  k8sgpt
Install Clang or run brew install gcc.
I did run brew install gcc
, but the error is still the same.
Checklist:
I think it would be useful if we were able to analyze Ingress resources to find out whether everything is configured correctly and whether it should, technically, be working.
As a first step, we could check whether the services referred to in the Ingress object exist, and maybe whether they are responsive.
Users could detect and troubleshoot Ingress issues.
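The first step described above, checking that referenced services exist, could look like this minimal sketch (pure Go, hypothetical names; a real analyzer would list Services in the Ingress's namespace):

```go
package main

import "fmt"

// missingBackends returns the ingress backend service names that do not
// exist in the set of services known to the namespace.
func missingBackends(backends []string, services map[string]bool) []string {
	var missing []string
	for _, b := range backends {
		if !services[b] {
			missing = append(missing, b)
		}
	}
	return missing
}

func main() {
	services := map[string]bool{"web": true, "api": true}
	// "legacy" is referenced by the Ingress but has no Service object.
	fmt.Println(missingBackends([]string{"web", "api", "legacy"}, services)) // [legacy]
}
```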
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
These updates are currently rate-limited. Click on a checkbox below to force their creation now.
These updates have all been created already. Click a checkbox below to force a retry/rebase of any.
k8s.io/api, k8s.io/apiextensions-apiserver, k8s.io/apimachinery, k8s.io/client-go
These are blocked by an existing closed PR and will not be recreated unless you click a checkbox below.
container/Dockerfile
golang 1.22-alpine3.19
.github/workflows/build_container.yaml
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
keptn/gh-action-extract-branch-name main
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
docker/setup-buildx-action v3@d70bba72b1f3fd22344832f00baa16ece964efeb
docker/build-push-action v5@af5a7ed5ba88268d5278f7203fb52cd833f66d6e
actions/upload-artifact v4@5d5d22a31266ced268874388b861e4b58bb5c2f3
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
docker/login-action v3@e92390c5fb421da1463c202d546fed0ec5c39f20
docker/setup-buildx-action v3@d70bba72b1f3fd22344832f00baa16ece964efeb
docker/build-push-action v5@af5a7ed5ba88268d5278f7203fb52cd833f66d6e
ubuntu 22.04
ubuntu 22.04
ubuntu 22.04
.github/workflows/golangci_lint.yaml
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
reviewdog/action-golangci-lint v2@00311c26a97213f93f2fd3a3524d66762e956ae0
.github/workflows/release.yaml
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
google-github-actions/release-please-action v4.0.2@cc61a07e2da466bebbc19b3a7dd01d6aecb20d1e
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
actions/setup-go v5@0c52d547c9bc32b1aa3301fd7a9cb496313a4491
anchore/sbom-action v0.15.10@ab5d7b5f48981941c4c5d6bf33aeb98fe3bae38c
goreleaser/goreleaser-action v5@7ec5c2b0c6cdda6e8bbb49444bc797dd33d74dd8
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
docker/setup-buildx-action v3@d70bba72b1f3fd22344832f00baa16ece964efeb
docker/login-action v3@e92390c5fb421da1463c202d546fed0ec5c39f20
docker/build-push-action v5@af5a7ed5ba88268d5278f7203fb52cd833f66d6e
anchore/sbom-action v0.15.10@ab5d7b5f48981941c4c5d6bf33aeb98fe3bae38c
softprops/action-gh-release v1@de2c0eb89ae2a093876385947365aca7b0e5f844
ubuntu 22.04
.github/workflows/semantic_pr.yaml
amannn/action-semantic-pull-request v5.4.0@e9fabac35e210fea40ca5b14c0da95a099eff26f
ubuntu 22.04
.github/workflows/test.yaml
actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
actions/setup-go v5@0c52d547c9bc32b1aa3301fd7a9cb496313a4491
codecov/codecov-action v3
go.mod
go 1.21
github.com/aquasecurity/trivy-operator v0.17.1
github.com/fatih/color v1.16.0
github.com/magiconair/properties v1.8.7
github.com/mittwald/go-helm-client v0.12.5
github.com/sashabaranov/go-openai v1.20.4
github.com/schollz/progressbar/v3 v3.14.2
github.com/spf13/cobra v1.8.0
github.com/spf13/viper v1.18.2
github.com/stretchr/testify v1.9.0
golang.org/x/term v0.18.0
helm.sh/helm/v3 v3.13.3
k8s.io/api v0.28.4
k8s.io/apimachinery v0.28.4
k8s.io/client-go v0.28.4
github.com/adrg/xdg v0.4.0
buf.build/gen/go/k8sgpt-ai/k8sgpt/grpc-ecosystem/gateway/v2 v2.19.1-20240213144542-6e830f3fdf19.1@6e830f3fdf19
buf.build/gen/go/k8sgpt-ai/k8sgpt/grpc/go v1.3.0-20240213144542-6e830f3fdf19.2@6e830f3fdf19
buf.build/gen/go/k8sgpt-ai/k8sgpt/protocolbuffers/go v1.33.0-20240406062209-1cc152efbf5c.1@1cc152efbf5c
cloud.google.com/go/storage v1.40.0
cloud.google.com/go/vertexai v0.7.1
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.5.1
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.3.1
github.com/aws/aws-sdk-go v1.51.21
github.com/cohere-ai/cohere-go/v2 v2.7.1
github.com/google/generative-ai-go v0.10.0
github.com/grpc-ecosystem/grpc-gateway/v2 v2.19.1
github.com/hupe1980/go-huggingface v0.0.15
github.com/olekukonko/tablewriter v0.0.5
github.com/prometheus/prometheus v0.49.1
github.com/pterm/pterm v0.12.79
google.golang.org/api v0.170.0
gopkg.in/yaml.v2 v2.4.0
sigs.k8s.io/controller-runtime v0.16.3
sigs.k8s.io/gateway-api v1.0.0
github.com/google/gnostic v0.7.0
github.com/prometheus/client_golang v1.19.0
github.com/robfig/cron/v3 v3.0.1
go.uber.org/zap v1.27.0
golang.org/x/net v0.23.0
google.golang.org/grpc v1.62.1
k8s.io/apiextensions-apiserver v0.28.4
k8s.io/utils v0.0.0-20240310230437-4693a0247e57@4693a0247e57
oras.land/oras-go v1.2.4
charts/k8sgpt/values.yaml
Checklist:
Relevant to issue #114 that was closed in PR #155.
Primary issue: the kubecontext flag has no effect. The current context is always used, regardless of any given input.
Secondary issue: the functionality isn't described in the docs - happy to submit chore PRs to document how this works if this is an issue on my end.
Client Version: v1.26.1
Kustomize Version: v4.5.7
Server Version: v1.24.10-eks-48e63af
macOS Ventura 13.2.1
Assumes you have a kubeconfig with a functional current context - e.g. you can run kubectl get nodes successfully.
1. ./k8sgpt auth - put in some random key, as we won't use it anyway.
2. ./k8sgpt analyze - works fine, uses your default context.
3. ./k8sgpt analyze --kubecontext aContextThatDoesNotExist - runs without issue, as it uses your default context despite any given value.
If I provide a context that exists, that context should be used for the analysis.
If I provide a context that does not exist, the tool should crash or fail (hopefully with a nice error message explaining why).
The current context of your kubeconfig is always used. Providing a context that doesn't exist will still use the default, leading to misleading behaviour as shown below.
# without context flag - uses current context, this cluster is healthy
❯ ~/Desktop/k8sgpt_Darwin_arm64_0.1.4/k8sgpt analyze
{ "status": "OK" }
# with context flag - current context is used anyway
❯ ~/Desktop/k8sgpt_Darwin_arm64_0.1.4/k8sgpt analyze --kubecontext aContextThatDoesNotExist
{ "status": "OK" } # this is a lie - the given context doesn't even exist
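The expected behaviour can be modelled in a few lines of pure Go (names hypothetical); in client-go this would typically be wired through clientcmd's ConfigOverrides, but the essential rule is: an explicit context must exist or the tool should fail loudly.

```go
package main

import "fmt"

// resolveContext models the expected flag behaviour: an explicit
// --kubecontext value must exist in the kubeconfig or resolution fails;
// otherwise the current context is used.
func resolveContext(contexts map[string]bool, current, requested string) (string, error) {
	if requested == "" {
		return current, nil
	}
	if !contexts[requested] {
		return "", fmt.Errorf("context %q does not exist in kubeconfig", requested)
	}
	return requested, nil
}

func main() {
	contexts := map[string]bool{"prod": true, "staging": true}
	ctx, err := resolveContext(contexts, "prod", "staging")
	fmt.Println(ctx, err) // staging <nil>
	_, err = resolveContext(contexts, "prod", "aContextThatDoesNotExist")
	fmt.Println(err)
}
```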
Checklist:
After changing the kubectl context to another cluster, it seems as if k8sgpt just responds with a cached result without actually checking. I get the same (identical) response even though the kubeconfig context was changed.
1. analyze
2. analyze
I expected to get a new analysis result from the new cluster context.
I got the identical response as when I first ran the analyze.
I've also tried deleting ~/.k8sgpt.yaml
as well as ~/.kube/config
, then set up a totally fresh kubeconfig context and re-initiated k8sgpt auth
. The response I got after a k8sgpt analyze
was the exact same as before, from the cluster I no longer have in my context.
The progress loading is stuck at 0%
and the time usage output says | (0/4, 0 it/hr) [0s:0s]0
Additionally, I also tried from a different terminal session, tried switching to Terminal.app instead of iTerm, and switched from oh-my-zsh to bash. I get the cached response immediately...
I verified that kubectl had correctly switched context between each step and in all the different shell sessions, and that k8sgpt
had not switched. It still gave the identical response.
I would like to run the analysis for a specific namespace instead of the entire cluster. A --filter by namespace, or just a --namespace flag, would be ideal.
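Since k8sgpt prints results as namespace/name, the requested flag could be reduced to a simple post-filter; this is a hedged sketch with hypothetical names, not the project's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// filterByNamespace keeps only results whose name is prefixed with
// "<namespace>/", matching how k8sgpt prints resource names.
func filterByNamespace(results []string, namespace string) []string {
	var out []string
	for _, r := range results {
		if strings.HasPrefix(r, namespace+"/") {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	results := []string{"kube-system/kube-proxy", "default/web", "default/api"}
	fmt.Println(filterByNamespace(results, "default")) // [default/web default/api]
}
```

In practice it would be cheaper to pass the namespace to the List calls themselves rather than filter afterwards.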
To enable people to file better issues, we need to use an issue template to help provide the right information.
There should be a section in the README on how to install the binaries for Windows and Linux from the releases.
This path is incorrectly printed out in the CLI and needs updating.
Create a shell script to act as an installer, either hosted directly on GitHub or remotely.
When using --explain
, the text from each issue is too close together. For some of us this gets problematic, so it makes sense to get a little bit of space between these issues by adding a newline.
Many (including myself) in organizations will be using ChatGPT as part of the Azure OpenAI Service, so the feature should also allow pointing to ChatGPT running on Azure OpenAI Service.
It uses the same API but a different endpoint, e.g. https://openaichatgpt.openai.azure.com - could this, for instance, be a feature flag to point it to another endpoint?
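The flag could boil down to an endpoint override with a sensible default; this is an illustrative sketch (the default URL is the public OpenAI API base; the function name is hypothetical). The go-openai client that k8sgpt uses does expose a configurable base URL, so the override could be passed straight through to it.

```go
package main

import "fmt"

const defaultOpenAIBase = "https://api.openai.com/v1"

// baseURL returns the API endpoint to use: an explicit override (e.g. an
// Azure OpenAI resource endpoint) wins, otherwise the public OpenAI API.
func baseURL(override string) string {
	if override != "" {
		return override
	}
	return defaultOpenAIBase
}

func main() {
	fmt.Println(baseURL(""))                                   // https://api.openai.com/v1
	fmt.Println(baseURL("https://myresource.openai.azure.com")) // https://myresource.openai.azure.com
}
```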
Since the refactor, it looks like the --explain time has skyrocketed, but the reality is that it's making all the API calls prior to displaying the results rather than incrementally. It might be worth adding a counter, or printing results incrementally somehow, e.g. a progress bar like
|---> |
Currently, k8sgpt does not understand British spelling, e.g. analyse vs analyze: https://prowritingaid.com/analyse-or-analyze
A --kube-context flag for quickly switching between clusters without globally updating the system kube context.
At the moment, only the affected resource is printed out when a problem is found. As the configuration often has to be changed in a parent object, we should investigate the path to the root and print it out afterward, e.g. pod -> replicaset -> deployment
Checklist:
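Walking to the root object is essentially following ownerReferences upward. A simplified stand-in (hypothetical names; real code would read metadata.ownerReferences from the API objects) could look like:

```go
package main

import (
	"fmt"
	"strings"
)

// ownerPath walks a child->owner map (a simplified stand-in for Kubernetes
// ownerReferences) from a resource up to its root and returns the chain.
func ownerPath(owners map[string]string, start string) string {
	path := []string{start}
	cur := start
	for i := 0; i < 32; i++ { // depth cap guards against reference cycles
		owner, ok := owners[cur]
		if !ok {
			break
		}
		path = append(path, owner)
		cur = owner
	}
	return strings.Join(path, " -> ")
}

func main() {
	owners := map[string]string{
		"pod/web-7d4b9":        "replicaset/web-7d4b9",
		"replicaset/web-7d4b9": "deployment/web",
	}
	fmt.Println(ownerPath(owners, "pod/web-7d4b9"))
	// pod/web-7d4b9 -> replicaset/web-7d4b9 -> deployment/web
}
```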
It could be highly beneficial to analyze HorizontalPodAutoscaler (HPA) resources to ensure that everything is configured correctly and functioning as expected.
Initially, we could focus on identifying whether the scaleTargetRef
mentioned in the HPA object exists.
Users could detect and troubleshoot HPA issues.
As discussed with @AlexsJones , this analyzer may not be enabled by default, but only if the user enables it by running k8sgpt filters add hpa
. This issue depends on #161
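The initial scaleTargetRef check could be as small as a kind/name lookup against the workloads that actually exist; a minimal sketch with hypothetical names:

```go
package main

import "fmt"

// targetExists checks whether an HPA's scaleTargetRef (kind/name) refers
// to a workload that actually exists in the namespace.
func targetExists(workloads map[string]bool, kind, name string) bool {
	return workloads[kind+"/"+name]
}

func main() {
	workloads := map[string]bool{"Deployment/web": true, "StatefulSet/db": true}
	fmt.Println(targetExists(workloads, "Deployment", "web"))  // true
	fmt.Println(targetExists(workloads, "Deployment", "gone")) // false
}
```

A real analyzer would populate the workloads set by listing Deployments, StatefulSets, etc. in the HPA's namespace.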
Error: fatal: repository 'https://github.com/k8sgpt-ai/k8sgpt/' not found
Checklist:
I noticed that when a filter is added during analysis, all analyzers are still executed, which makes the analysis very long even if we only want to retrieve the services.
The filter is only applied while merging errors, to display what the user indicated in the filter argument. Shouldn't the filter be applied upstream, so that only the analyzers we are interested in are executed?
Improvement of Performance in Adding Filters
Checklist:
As we start to build out more capabilities, there are those that are not necessarily catching "errors" but more best-practice or good guidance on how to operate with Kubernetes. One such example is PDB (Pod disruption budget): it is not an error not to have this on a deployment/sts/ds, but it is good practice for when you wish to roll a small deployment without loss of availability.
I propose that we have a mechanism to list/set default filters
e.g.
k8sgpt filters list
> Pod
> Service
> Ingress
etc..
Then you could add or remove a filter
k8sgpt filters add: <Resource>
// Check if it's found
k8sgpt filters remove: <Resource>
// Check if it's found
This would then form the basis of a new array we would pull from viper and loop over
e.g.
func RunAnalysis(ctx context.Context, filters []string, config *AnalysisConfiguration,
	client *kubernetes.Client,
	aiClient ai.IAI, analysisResults *[]Analysis) error {
	// HERE WE WOULD BUILD THE ANALYZER MAP BASED OFF OF viper.Get("default_filters")
	// If there are no filters selected, then run all of them.
	if len(filters) == 0 {
		for _, analyzer := range analyzerMap {
			if err := analyzer(ctx, config, client, aiClient, analysisResults); err != nil {
				return err
			}
		}
		return nil
	}
	for _, filter := range filters {
		if analyzer, ok := analyzerMap[filter]; ok {
			if err := analyzer(ctx, config, client, aiClient, analysisResults); err != nil {
				return err
			}
		}
	}
	return nil
}
I'm facing this issue while running the command k8sgpt analyze on macOS arm64 and connecting to GKE.
Error initialising kubernetes client: no Auth Provider found for name "gcp"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x101e6c600]
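For context, this 'no Auth Provider found for name "gcp"' error usually means the binary was built without client-go's auth provider plugins registered. Assuming the client is constructed with client-go directly, a common fix (for client-go versions of that era; the in-tree gcp provider has since been deprecated in favour of gke-gcloud-auth-plugin) is a blank import that registers the providers as a side effect:

```go
import (
	// Registers the in-tree client-go auth provider plugins
	// (gcp, azure, oidc, openstack) as a side effect.
	_ "k8s.io/client-go/plugin/pkg/client/auth"
)
```

The nil pointer panic that follows is likely a secondary failure from using the client despite the initialisation error.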
When deploying services, one of the things faced very often is that endpoints are not available, caused by pods that are not able to start or by incorrect labels on the Service object.
Create a brew installer for k8sgpt
e.g. brew tap k8sgpt-ai && brew install k8sgpt
Checklist:
I would like to know how my private cluster data is treated. I would not like to disclose any personal info to GPT/OpenAI. I miss some information about this topic in the README.
Is the collected data anonymized before sending it to GPT? Which data is collected? Are my private and sensitive clusters going to be kept safe? Why? Is this tool GDPR-compliant? Etc.
Checklist:
If you set deny-all network policies for your namespaces by default and need to define each allow-policy, it can be really frustrating when your frontend service cannot connect to the backend service and you need to check why.
In the first place we could check whether the "from" and "to" selectors in network policies match existing objects, probably also checking targetPorts.
Analyzing network policy issues gets simpler.
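The targetPort part of the check could be sketched like this (hypothetical names, pure Go): flag any port a policy allows traffic to that no container in the selected pods actually exposes.

```go
package main

import "fmt"

// unexposedPorts returns the ports a NetworkPolicy allows traffic "to"
// that no container in the selected pods actually listens on.
func unexposedPorts(policyPorts []int, containerPorts map[int]bool) []int {
	var missing []int
	for _, p := range policyPorts {
		if !containerPorts[p] {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	containerPorts := map[int]bool{8080: true, 9090: true}
	// The policy allows port 3000, but nothing listens there.
	fmt.Println(unexposedPorts([]int{8080, 3000}, containerPorts)) // [3000]
}
```

The "from"/"to" selector check would reuse the same label-matching idea as the other analyzers.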
We need to have at least the default model as a parameter in the config, for long-term support of models like davinci via OpenAI.
e.g.
ai:
  providers:
    - name: openai
      model: GPT3-turbo
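Reading that config value would come with a fallback when nothing is set; a minimal sketch (the function name and the fallback model string are illustrative assumptions, not the project's actual defaults):

```go
package main

import "fmt"

// modelOrDefault returns the configured model name, falling back to a
// default when the config leaves it empty.
func modelOrDefault(configured string) string {
	if configured == "" {
		return "gpt-3.5-turbo" // assumed default, for illustration only
	}
	return configured
}

func main() {
	fmt.Println(modelOrDefault(""))                 // gpt-3.5-turbo
	fmt.Println(modelOrDefault("text-davinci-003")) // text-davinci-003
}
```

In k8sgpt itself the configured value would come from viper rather than a function argument.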
Checklist:
At present, the link to the Slack channel mentioned in the README.md redirects to the old Slack channel. However, a new Slack workspace for K8sGPT has been created, ref: https://k8sgpt.slack.com. We need to update the README to refer to the new link.
Should we keep the old link as well, so that new users can refer to the old discussions?
Description
go modules are pointing to github.com/cloud-native-skunkworks/k8sgpt
. We should change this to point to the k8sgpt-ai org.
Checklist:
The proposed functionality is to add a check to the ingress analyzer that verifies whether the secret declared in the ingress exists on the cluster.
Users could detect and troubleshoot ingress issues
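The proposed secret check could follow the same shape as the backend-service check: collect the secret names from the Ingress's TLS section and report any that the namespace doesn't contain (hypothetical names, pure Go sketch):

```go
package main

import "fmt"

// missingTLSSecrets returns secret names referenced in an Ingress's TLS
// section that are absent from the namespace's secrets.
func missingTLSSecrets(tlsSecrets []string, secrets map[string]bool) []string {
	var missing []string
	for _, name := range tlsSecrets {
		if !secrets[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	secrets := map[string]bool{"web-tls": true}
	// "api-tls" is declared in the Ingress but does not exist.
	fmt.Println(missingTLSSecrets([]string{"web-tls", "api-tls"}, secrets)) // [api-tls]
}
```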