athenabot / k8s-issues

License: MIT License

Go 97.34% Dockerfile 2.66%


k8s-issues's Issues

Experiment with ML classification

Keyword matching is quite limited, as so much of determining issue ownership is contextual. For example:

A vSphere-specific network problem probably isn't actually a sig-network issue.
The kubelet is a sig-node component, but many other sigs own related code and features.
Is the mentioned component described as the root problem area, or just described as a symptom?

My skills in ML are very limited, so I would appreciate help if anyone feels like experimenting! Issue #5 may help.

Set lifecycle hold if user counters fejta-bot

fejta-bot automatically cycles issues through a lifecycle, which ends in closing. Users can manually reset the lifecycle, or set a hold (which stops the issue from degrading).

When users set a lifecycle to counter fejta-bot, they usually want to keep the issue open indefinitely. Athena should automatically put lifecycle/frozen on an issue if a user sets the lifecycle following a fejta-bot lifecycle action.
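A minimal sketch of that rule in Go, assuming the bot can see an ordered history of lifecycle changes. The `LifecycleEvent` type and `shouldFreeze` function are hypothetical names for illustration, not code from this repo or the GitHub API.

```go
package main

import "fmt"

// LifecycleEvent is a hypothetical, simplified view of one lifecycle change
// on an issue. It is not a type from the GitHub API or from this repo.
type LifecycleEvent struct {
	Actor string // login of whoever changed the lifecycle
}

// lifecycleBot is the login of the bot that ages issues.
const lifecycleBot = "fejta-bot"

// shouldFreeze reports whether the most recent lifecycle change was made by
// someone other than fejta-bot, directly after a fejta-bot lifecycle action.
// events must be ordered oldest first.
func shouldFreeze(events []LifecycleEvent) bool {
	n := len(events)
	if n < 2 {
		return false
	}
	last, prev := events[n-1], events[n-2]
	return prev.Actor == lifecycleBot && last.Actor != lifecycleBot
}

func main() {
	events := []LifecycleEvent{{Actor: "fejta-bot"}, {Actor: "alice"}}
	if shouldFreeze(events) {
		fmt.Println("apply lifecycle/frozen")
	}
}
```

This only looks at the last two events. Whether a longer history should count (for example, a user countering the bot twice) is left open here.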

Create a README

Differentiate keyword scoring on main text and code block text

Blocks of code throw off keyword analysis, especially when there are huge chunks of kubectl or YAML.
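One possible approach, sketched in Go: split the issue body into prose and fenced code, then count keyword hits with a lower weight for code. The function names and the weights are assumptions for illustration, and the splitter only recognizes triple-backtick fences.

```go
package main

import (
	"fmt"
	"strings"
)

// fence is the Markdown code-fence marker (three backticks).
var fence = strings.Repeat("`", 3)

// splitBody separates an issue body into prose text and fenced code text.
// A line that starts with the fence marker toggles code mode. An unclosed
// fence treats the rest of the body as code.
func splitBody(body string) (prose, code string) {
	var p, c strings.Builder
	inCode := false
	for _, line := range strings.Split(body, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), fence) {
			inCode = !inCode
			continue
		}
		if inCode {
			c.WriteString(line + "\n")
		} else {
			p.WriteString(line + "\n")
		}
	}
	return p.String(), c.String()
}

// scoreKeyword counts case-insensitive occurrences of keyword, weighting
// hits in prose and hits in code blocks separately.
func scoreKeyword(body, keyword string, proseWeight, codeWeight float64) float64 {
	prose, code := splitBody(body)
	kw := strings.ToLower(keyword)
	proseHits := strings.Count(strings.ToLower(prose), kw)
	codeHits := strings.Count(strings.ToLower(code), kw)
	return proseWeight*float64(proseHits) + codeWeight*float64(codeHits)
}

func main() {
	body := "The kubelet fails to start.\n" + fence + "\nkubelet log\nkubelet log\n" + fence + "\n"
	// Hypothetical weights: prose counts fully, code counts a quarter.
	fmt.Println(scoreKeyword(body, "kubelet", 1.0, 0.25))
}
```

Setting `codeWeight` to zero ignores code blocks entirely, which may be a reasonable first experiment.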

Add "anti-match" strings for SIGs

String matching leads to false positives.

E.g. "service account" matches the keyword "service".
E.g. cloud provider specifics mean the issue is more likely a cloud provider one, and less likely to be sig-network, sig-auth, etc.

Add anti-match fields to sigs, which subtract likeliness upon match.
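A rough sketch of what that could look like in Go. The `SIG` struct and its field names are hypothetical, and the scoring is a plain substring count, not the bot's actual algorithm.

```go
package main

import (
	"fmt"
	"strings"
)

// SIG is a hypothetical keyword definition for one SIG.
type SIG struct {
	Name        string
	Keywords    []string // each occurrence adds to the score
	AntiMatches []string // each occurrence subtracts from the score
}

// Score counts case-insensitive keyword occurrences in text and subtracts
// the anti-match occurrences.
func (s SIG) Score(text string) int {
	t := strings.ToLower(text)
	score := 0
	for _, k := range s.Keywords {
		score += strings.Count(t, strings.ToLower(k))
	}
	for _, a := range s.AntiMatches {
		score -= strings.Count(t, strings.ToLower(a))
	}
	return score
}

func main() {
	network := SIG{
		Name:        "sig-network",
		Keywords:    []string{"service"},
		AntiMatches: []string{"service account"},
	}
	fmt.Println(network.Score("The service account token is missing"))
	fmt.Println(network.Score("The service has no endpoints"))
}
```

Here "service account" cancels the "service" hit it contains, so the first text scores zero while the second still scores for sig-network.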

Clean up classification function flow

Clean up the structure of classification code. Functions should have a clear purpose, and not decide "too much". A good flow might be:

Check issue is open -> Decide scores -> Translate scores to labels -> Remove bad/duplicate labels
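The flow above might be sketched as four small functions in Go. All names here, the `Issue` type, the threshold, and the `sig/...` label strings are assumptions for illustration. The scoring step is passed in as a function so it can be swapped out.

```go
package main

import (
	"fmt"
	"sort"
)

// Issue is a hypothetical, trimmed-down issue representation.
type Issue struct {
	State  string   // "open" or "closed"
	Body   string   // issue text
	Labels []string // labels already on the issue
}

// isOpen is step 1: only open issues are classified.
func isOpen(i Issue) bool { return i.State == "open" }

// scoresToLabels is step 3: keep labels whose score meets the threshold.
// The result is sorted so output is deterministic.
func scoresToLabels(scores map[string]int, threshold int) []string {
	var labels []string
	for label, s := range scores {
		if s >= threshold {
			labels = append(labels, label)
		}
	}
	sort.Strings(labels)
	return labels
}

// removeExisting is step 4: drop duplicates and labels already on the issue.
func removeExisting(candidates, existing []string) []string {
	seen := make(map[string]bool, len(existing))
	for _, l := range existing {
		seen[l] = true
	}
	var out []string
	for _, l := range candidates {
		if !seen[l] {
			seen[l] = true
			out = append(out, l)
		}
	}
	return out
}

// classify wires the steps together. score is step 2 (decide scores).
func classify(issue Issue, score func(Issue) map[string]int, threshold int) []string {
	if !isOpen(issue) {
		return nil
	}
	return removeExisting(scoresToLabels(score(issue), threshold), issue.Labels)
}

func main() {
	// fixed stands in for a real scorer.
	fixed := func(Issue) map[string]int {
		return map[string]int{"sig/network": 5, "sig/node": 1}
	}
	fmt.Println(classify(Issue{State: "open"}, fixed, 3))
}
```

Each function makes one decision, so a change to scoring does not touch the open check or the label cleanup.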

Record introspection data to Firestore

I started a basic Google Firestore class for posting data, but I'm willing to take other integrations if they're easy to manage.

The k8s-issue bot should include classification debug data, issue data, and what the final comment was. Enough info that we can later pull stats and make comparisons w/r/t effectiveness. It would also be nice to check up to the "previously seen issue", rather than blindly running against the backlog.
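As a rough illustration of the data involved, here is a storage-agnostic sketch in Go. The `Record` fields, the `Sink` interface, and the in-memory sink are all hypothetical. A Firestore-backed sink would implement the same interface, but no Firestore calls are shown here.

```go
package main

import (
	"fmt"
	"time"
)

// Record is a hypothetical shape for one classification run on one issue.
type Record struct {
	IssueNumber int
	IssueTitle  string
	Scores      map[string]int // classification debug data
	Comment     string         // the final comment, empty if none was posted
	RecordedAt  time.Time
}

// Sink is anything that can persist a Record (Firestore or otherwise).
type Sink interface {
	Save(r Record) error
}

// MemorySink keeps records in memory. Useful for tests and local runs.
type MemorySink struct {
	Records []Record
}

// Save appends the record to the in-memory slice.
func (m *MemorySink) Save(r Record) error {
	m.Records = append(m.Records, r)
	return nil
}

// unseen returns the issue numbers greater than lastSeen. It relies on
// issue numbers increasing over time, so everything at or below lastSeen
// has already been processed.
func unseen(numbers []int, lastSeen int) []int {
	var out []int
	for _, n := range numbers {
		if n > lastSeen {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	var sink Sink = &MemorySink{}
	for _, n := range unseen([]int{101, 102, 103}, 101) {
		rec := Record{
			IssueNumber: n,
			Scores:      map[string]int{"sig/node": 2},
			RecordedAt:  time.Now(),
		}
		if err := sink.Save(rec); err != nil {
			fmt.Println("save failed:", err)
		}
	}
}
```

Keeping the sink behind an interface leaves room for other integrations without changing the classification code.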

Only comment on open issues

Ignore issues that already have a SIG

Just saw @athenabot comment on this issue: kubernetes/kubernetes#75263

While it's true that vSphere is a vmware product, I'm pretty sure the bug is Windows-related, and not vmware related. Maybe a general rule for the bot should be: "don't add sigs to anything that already has them"? It has the potential to be more annoying than useful if humans have already considered which SIG something belongs to and a lil bot comes along and disagrees :)

Keeps having false-positive matches for sig-node

The bot keeps matching sig-node on issues that are a tenuous match, or are completely incorrect. Contributing factors:

Other components (e.g. storage) are integrated with the kubelet, diluting that keyword.
"node" is very common terminology, and is frequently mentioned outside the scope of node management.

Don't comment if ANY sigs are already on the issue

When people categorize an issue, they usually do it correctly. Upwards of 80% of the bot's comments on issues that already have sigs are wrong.

The code already biases against commenting on issues with sigs, but that bias is not proving effective.
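A hard check could replace the bias. The sketch below is in Go and assumes SIG labels carry a `sig/` prefix, as they do on kubernetes/kubernetes. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sigPrefix is the assumed prefix of SIG labels, e.g. "sig/node".
const sigPrefix = "sig/"

// hasSIGLabel reports whether any label on the issue is a SIG label.
func hasSIGLabel(labels []string) bool {
	for _, l := range labels {
		if strings.HasPrefix(strings.ToLower(l), sigPrefix) {
			return true
		}
	}
	return false
}

// shouldComment allows a comment only on open issues with no SIG label.
func shouldComment(state string, labels []string) bool {
	return state == "open" && !hasSIGLabel(labels)
}

func main() {
	fmt.Println(shouldComment("open", []string{"kind/bug", "sig/windows"}))
	fmt.Println(shouldComment("open", []string{"kind/bug"}))
}
```

With this gate in front of classification, an issue that a human has already assigned to any SIG is skipped outright.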
