k8s-issues's Issues

Experiment with ML classification

Keyword matching is quite limited, as so much of determining issue ownership is contextual. For example:

  • A vSphere-specific network problem probably isn't actually a sig-network issue.
  • The kubelet is a sig-node component, but many other sigs own related code and features.
  • Is the mentioned component described as the root problem area, or just described as a symptom?

My skills in ML are very limited, so I would appreciate help if anyone feels like experimenting! Issue #5 may help.

Set lifecycle hold if user counters fejta-bot

fejta-bot automatically cycles issues through a lifecycle, which ends in closing. Users can manually reset the lifecycle, or set a hold (which stops the issue from degrading).

When users set a lifecycle to counter fejta-bot, they usually want to keep the issue open indefinitely. Athena should automatically put lifecycle/frozen on an issue if a user sets the lifecycle following a fejta-bot lifecycle action.
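
A minimal sketch of the rule above, assuming the issue's label changes are available as a chronologically ordered list of (actor, label) pairs. The event shape and the name `should_freeze` are hypothetical and are not taken from the bot's code.

```python
from typing import List, Tuple

BOT_LOGIN = "fejta-bot"
LIFECYCLE_PREFIX = "lifecycle/"
FROZEN_LABEL = "lifecycle/frozen"


def should_freeze(events: List[Tuple[str, str]]) -> bool:
    """Return True if a user set a lifecycle label right after fejta-bot did.

    `events` is an assumed, chronologically ordered list of
    (actor_login, label) pairs describing label changes on one issue.
    """
    # Only lifecycle label changes matter for this rule.
    lifecycle = [(actor, label) for actor, label in events
                 if label.startswith(LIFECYCLE_PREFIX)]
    if len(lifecycle) < 2:
        return False
    (prev_actor, _), (last_actor, last_label) = lifecycle[-2], lifecycle[-1]
    if last_label == FROZEN_LABEL:
        return False  # the user already placed the hold themselves
    # A human acted directly after the bot: treat it as "keep this open".
    return prev_actor == BOT_LOGIN and last_actor != BOT_LOGIN
```

For example, `should_freeze([("fejta-bot", "lifecycle/stale"), ("alice", "lifecycle/active")])` returns `True`, while a history that ends with a fejta-bot action returns `False`.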

Add "anti-match" strings for SIGs

String matching leads to false positives.

EG "service account" falsely matches "service".
EG cloud provider specifics mean the issue is more likely a cloud provider one, and less likely to be sig-network, sig-auth, etc.

Add anti-match fields to sigs, which subtract likelihood upon match.
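
One way the anti-match idea could look, as a rough sketch. The `Sig` class, its field names, and the weights are all assumptions made for illustration; the bot's real data model may differ.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Sig:
    name: str
    # phrase -> points added per occurrence (hypothetical field)
    match: Dict[str, float] = field(default_factory=dict)
    # phrase -> points subtracted per occurrence (hypothetical field)
    anti_match: Dict[str, float] = field(default_factory=dict)


def score(sig: Sig, text: str) -> float:
    """Sum match weights and subtract anti-match weights for one SIG."""
    lowered = text.lower()
    total = 0.0
    for phrase, weight in sig.match.items():
        total += weight * lowered.count(phrase)
    for phrase, weight in sig.anti_match.items():
        total -= weight * lowered.count(phrase)
    return total


sig_network = Sig(
    name="sig-network",
    match={"service": 1.0},
    # "service account" cancels the accidental "service" hit;
    # a cloud provider name makes sig-network less likely.
    anti_match={"service account": 1.0, "vsphere": 0.5},
)

print(score(sig_network, "The service account token is rejected"))  # 0.0
print(score(sig_network, "Service is unreachable from the pod"))    # 1.0
```

Keeping the anti-match weights separate from the match weights means each SIG can tune how strongly a misleading phrase counts against it.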

Clean up classification function flow

Clean up the structure of classification code. Functions should have a clear purpose, and not decide "too much". A good flow might be:

Check issue is open -> Decide scores -> Translate scores to labels -> Remove bad/duplicate labels
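
The flow above could be sketched as four small functions plus a thin driver. Every name here, the issue dict shape, and the `THRESHOLD` cutoff are assumptions for illustration, and the keyword-count scoring is only a stand-in for whatever scoring the bot actually uses.

```python
from typing import Dict, List

THRESHOLD = 1.0  # assumed minimum score for suggesting a label


def is_open(issue: dict) -> bool:
    return issue.get("state") == "open"


def decide_scores(issue: dict, keywords: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each SIG by counting its keywords in the title and body."""
    text = ((issue.get("title") or "") + " " + (issue.get("body") or "")).lower()
    return {sig: float(sum(text.count(word) for word in words))
            for sig, words in keywords.items()}


def scores_to_labels(scores: Dict[str, float]) -> List[str]:
    """Keep SIGs at or above the threshold, highest score first."""
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [sig for sig, value in ranked if value >= THRESHOLD]


def remove_bad_labels(labels: List[str], existing: List[str]) -> List[str]:
    """Drop duplicates and labels the issue already carries."""
    seen = set(existing)
    result = []
    for label in labels:
        if label not in seen:
            seen.add(label)
            result.append(label)
    return result


def classify(issue: dict, keywords: Dict[str, List[str]]) -> List[str]:
    # Each step does one job; the driver only wires them together.
    if not is_open(issue):
        return []
    scores = decide_scores(issue, keywords)
    labels = scores_to_labels(scores)
    return remove_bad_labels(labels, issue.get("labels", []))
```

Because each stage takes plain data and returns plain data, the stages can be tested and replaced independently.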

Record introspection data to Firestore

I started a basic Google Firestore class for posting data, but I'm willing to take other integrations if they're easy to manage.

The k8s-issue bot should include classification debug data, issue data, and what the final comment was. Enough info that we can later pull stats and make comparisons w/r/t effectiveness. It would also be nice to check up to the "previously seen issue", rather than blindly running against the backlog.
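
A possible shape for the recorded data, independent of the storage backend. The `ClassificationRecord` fields and the `unseen_issues` helper are hypothetical; the sketch deliberately makes no Firestore calls and assumes issue numbers increase over time.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List, Optional


@dataclass
class ClassificationRecord:
    issue_number: int
    issue_title: str
    scores: Dict[str, float]      # classification debug data, per SIG
    final_comment: Optional[str]  # None when the bot stayed silent


def unseen_issues(issue_numbers: List[int], last_seen: Optional[int]) -> List[int]:
    """Keep only issues newer than the last one processed."""
    if last_seen is None:
        return list(issue_numbers)  # first run: process everything
    return [n for n in issue_numbers if n > last_seen]


record = ClassificationRecord(
    issue_number=3,
    issue_title="example title",
    scores={"sig-node": 2.0},
    final_comment=None,
)
payload = asdict(record)  # plain dict, ready for whichever store is used
```

Storing the highest processed issue number alongside these records would let the next run call `unseen_issues` rather than rescanning the backlog.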

Ignore issues that already have a SIG

Just saw @athenabot comment on this issue: kubernetes/kubernetes#75263

While it's true that vSphere is a vmware product, I'm pretty sure the bug is Windows-related, and not vmware related. Maybe a general rule for the bot should be: "don't add sigs to anything that already has them"? It has the potential to be more annoying than useful if humans have already considered which SIG something belongs to and a lil bot comes along and disagrees :)

Keeps having false-positive matches for sig-node

The bot keeps matching sig-node on issues that are a tenuous match, or are completely incorrect. Contributing factors:

  • Other components (EG storage) are integrated with the kubelet, diluting that keyword.
  • "node" is very common terminology, and is frequently mentioned outside the scope of node management.

Don't comment if ANY sigs are already on the issue

When people categorize an issue, they usually do it correctly. Upwards of 80% of the bot's comments on issues that already have sigs are wrong.

The code already biases against commenting on issues with sigs, but that bias is not proving effective.
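
Replacing the bias with a hard rule could be as small as the sketch below. It assumes SIG labels start with `sig/`, as they do on kubernetes/kubernetes; the function names are hypothetical.

```python
from typing import Iterable

SIG_PREFIX = "sig/"  # assumed label prefix for SIG ownership


def has_sig_label(labels: Iterable[str]) -> bool:
    return any(label.startswith(SIG_PREFIX) for label in labels)


def should_comment(labels: Iterable[str]) -> bool:
    """Stay silent whenever a human (or anyone) has already assigned a SIG."""
    return not has_sig_label(labels)
```

With this check, `should_comment(["kind/bug", "sig/windows"])` is `False`, so the bot never second-guesses an existing SIG assignment.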
