kubernetes-sigs / apisnoop
⭕️ Snooping on the Kubernetes OpenAPI communications
Home Page: https://apisnoop.cncf.io
License: Apache License 2.0
As an APISnoop developer
I want to be able to utilize the CI infrastructure provided by k8s.io
And populate the GCS buckets used by test-grid and APISnoop
In order to innovate on the type of k8s conformance data available to our community
Given our repo
When we submit a PR
Then we should trigger the prow plugins configured for our repo
We'll document this as we go, so we can ensure others can do the same. The person currently on k8s oncall can be found at: https://go.k8s.io/oncall
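Prow jobs for a repo are declared in the kubernetes/test-infra config; a minimal presubmit entry for this repo might look like the sketch below (the job name, image, and command are hypothetical placeholders, not the real job definition):

```yaml
# Hypothetical presubmit entry for kubernetes-sigs/apisnoop in the
# kubernetes/test-infra prow config. Name, image, and command are
# placeholders for illustration only.
presubmits:
  kubernetes-sigs/apisnoop:
  - name: pull-apisnoop-verify
    always_run: true
    decorate: true
    spec:
      containers:
      - image: golang:1.11
        command: ["make", "verify"]
```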
Render all helm charts and generate usage statistics for each k8s Kind.
This will give us an idea, for the stable charts, of:
How many need PVCs, use beta/alpha objects, etc.
As a CNCF project admin
I would like to request CI automation from the CNCF
In order to benefit from the CI infrastructure maintained by our community
Given a repo/org that is included in the CNCF Landscape
When I create a ticket (within my own repo) and tag @cncf-ci
And I grant @cncf-ci admin to my repo (or org)
Then @cncf-ci will automatically configure my project for Prow
And other services
/cc @cncf/cncf-ci-working-group @cncf-ci
As an APISnoop contributor
I want to be able to create artifacts for test-grid / gcs
In order to provide interesting changes to data sets
Given a new style of data generation
When I create a new prow job for APISnoop
And it includes new forms of data (related to conformance)
Then it will be available for everyone to access
And perform their own analysis
In analyzing some of the existing e2e tests, specifically POD-related APIs, it has become evident that the e2e tests are not one-to-one with the APIs, as most of the well-written tests (emulating user stories) hit more than one API.
To assist with identifying e2e tests to upgrade to Conformance tests, we need to find a way to correlate our raw audit log entries to the tests that created them.
Some options to explore:
Open to other options; some type of injection seems cleanest.
I'll start by exploring what injection options are available.
When I visit the website, it will take a moment for the javascript to load, and then for all the graphs and data to appear. During this load time, I should still see something on the screen so I know the page loaded correctly, and now we are just loading the app. In other words, a loading screen that is rendered in static html, that is then replaced by the javascript-enabled visualizations.
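A sketch of the idea, assuming the app mounts into a single container element (the ids and file names are illustrative, not taken from the actual webui):

```html
<!-- Rendered in static html, so it appears before any javascript runs. -->
<div id="app">
  <p class="loading">Loading APISnoop data and visualizations…</p>
</div>
<!-- Once the bundle loads, it replaces the contents of #app. -->
<script src="bundle.js" async></script>
```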
The current sunburst visualisation does not provide any indication of the volume of hits an endpoint receives, meaning there is no easy way to prioritise or make decisions about where effort needs to be spent.
Adding a volume indicator to the sunburst allows us to identify which endpoints are receiving the most hits from the test suite (and apps in the future) and will allow prioritisation of effort, in particular if an app is hitting untested or non-conformant endpoints.
Further options once volume data can be charted could include filtering and/or sorting the sunburst by the volume information (a future consideration).
An approach for the volume bars is shown below:
At the moment, to update the graph we have to manually update both the image in the repo and the data on the web. This involves building from source, spinning up a K8s test cluster, running tests, etc.
On a regular basis, we could automatically run the e2e conformance suite and collect logs on K8s master.
We could also automatically create an image for README.md so its picture is always up to date.
Prerequisites: Sessions, easy deploy and log collection.
Interested people: oomichi
http://velodrome.k8s.io/dashboard
https://github.com/kubernetes/test-infra/tree/master/velodrome
Velodrome is the dashboard, monitoring and metrics for Kubernetes Developer Productivity
Could be nice to export conformance stats to velodrome
For popular helm charts whose deployed pods interact with the k8s API, generate audit-logs for loading into APISnoop.
As a community member interested in conformance coverage
I want to understand the changes in e2e coverage across releases (and master)
In order to have a simple measure of increasing coverage
Given I visit apisnoop.cncf.io
When I click on the drop-downs for 1.9, 1.10, 1.11, etc.
Then I will see a coverage chart for those releases
The existing APISnoop implementation illustrates the endpoints accessed by a single app (the e2e conformance tests). However, it is possible to capture audit information for all apps that are hitting endpoints, so we can load this data and then allow it to be filtered by user agent.
Introducing this filter will allow for more in depth data interrogation and provide valuable information to app developers about which endpoints they are hitting and their conformance status (in conjunction with additional visualisation features).
The functionality of the selector is illustrated below:
When visualizing API endpoint usage data, it would be useful to show which endpoints are accessed most frequently.
This could be visualized by drawing the height of the endpoint color bar in proportion to the number of API hits.
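A minimal sketch of that proportional scaling, assuming a linear scale against the busiest endpoint (the function and parameter names are illustrative):

```go
package main

import "fmt"

// barHeight maps an endpoint's hit count to a bar height in pixels,
// linear relative to the busiest endpoint (maxHits). The linear scale
// and maxPx cap are assumptions; a log scale may suit skewed hit counts.
func barHeight(hits, maxHits, maxPx int) int {
	if maxHits == 0 {
		return 0
	}
	return hits * maxPx / maxHits
}

func main() {
	// An endpoint with half the hits of the busiest one gets half height.
	fmt.Println(barHeight(50, 100, 40)) // 20
}
```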
There have been requests for loading various GCS buckets into APISnoop.
Being able to provide a bucket via the webui for loading would be useful.
This may require that we start basing our URLs on /gcs/bucket/name/jobid etc
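One way to sketch that URL mapping, assuming a simple gs:// prefix rewrite (the helper name is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// gcsPath rewrites a gs:// bucket URL into the /gcs/... route shape
// suggested above. Purely illustrative; real routing would also need
// validation and URL escaping.
func gcsPath(gsURL string) string {
	return "/gcs/" + strings.TrimPrefix(gsURL, "gs://")
}

func main() {
	fmt.Println(gcsPath("gs://kubernetes-jenkins/logs/some-job/1234"))
	// /gcs/kubernetes-jenkins/logs/some-job/1234
}
```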
Let's distinguish missing conformance tests that have underlying e2e tests both in our graphs and prioritized lists.
Combined with SIG to API mapping, we could provide each sig with a focused list and visualization of possibly 'free' conformance tests.
The CNCF is contracting help to write missing e2e tests for stable apis, specifically ones that make sense to promote to conformance tests.
POD API endpoints are extremely feature rich, have multiple implementations, and are used by everybody. This combination makes them of particular interest as we collect user journey data to prioritize which tests to write next (SIG-Architecture May 10th discussion).
The APISnoop team would like to provide a list of popular POD K8s API endpoints, and additionally the parameters/responses used by real world API consumers, at the next sig-node meeting on June 13th.
To do so, we need to collect audit-logs while driving applications that use the POD API to its fullest.
We are asking for help in identifying which applications we should focus on, in addition to help driving those applications to give us meaningful POD endpoint/verb parameters/responses.
Projects with existing e2e tests would be best, but we are willing to manually drive them if doing so can produce actionable data (based on real user journeys) to drive our conformance efforts.
/cc @cncf/kubernetes-conformance-team @kubernetes/sig-node-proposals
We collect a range of data within the audit logs that can provide further, detailed information about each of the endpoints i.e. what tests are accessing them, who the owners are and what SIGs they belong to.
In addition, for the E2E tests we can present more detailed information about each of the tests that are running against the endpoint and their purpose.
All of this information can be presented back to the user when they click on an endpoint within the sunburst visualisation. To do this we would present an overlay that loads as required; depending on the available data we can structure the screen in a number of ways, which is still being explored.
Illustration below:
Our sunburst does not currently allow the endpoint data to be filtered in any way, which can make interrogating the data a challenge when dealing with over 900 potential data points. A first step to helping users focus in on areas of interest is to allow them to select which high level category of endpoints (stable, beta, alpha) they'd like to view on the sunburst.
To enable this functionality, the top level categories will be shown as colour-coded checkboxes underneath the sunburst, allowing the user to select/unselect which ones appear on the visualisation. If an item is unchecked, that endpoint data is removed from the sunburst and the rings re-displayed accordingly, i.e. if alpha is unchecked then we remove this data and the inner ring will only show stable and beta, with the outer rings updated accordingly.
There must be one checkbox selected at all times to ensure the sunburst has data to present.
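The checkbox behaviour amounts to filtering the endpoint records by stability level before re-rendering. A minimal sketch in Go (types and names are illustrative; the real implementation lives in the webui):

```go
package main

import "fmt"

// endpoint is a minimal stand-in for an APISnoop endpoint record;
// Level is "stable", "beta", or "alpha".
type endpoint struct {
	Name  string
	Level string
}

// filterByLevel keeps only endpoints whose stability level is still
// checked, mirroring the checkbox behaviour described above.
func filterByLevel(eps []endpoint, checked map[string]bool) []endpoint {
	var out []endpoint
	for _, ep := range eps {
		if checked[ep.Level] {
			out = append(out, ep)
		}
	}
	return out
}

func main() {
	eps := []endpoint{{"listPod", "stable"}, {"fooAlpha", "alpha"}}
	// Alpha unchecked: its endpoints drop out before re-rendering.
	kept := filterByLevel(eps, map[string]bool{"stable": true, "beta": true})
	fmt.Println(len(kept), kept[0].Name) // 1 listPod
}
```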
When you hover over an endpoint, a summary appears to the right that shows the path and a complete list of test tags for the tests run on it. This list can be quite long, and when this happens it stretches in a thin line down the length of the page.
Instead, it should remain the same size no matter what; if the tests overflow, you should see a 'more' option that, upon clicking, grows the list to its entire size.
When I visit the page for the first time, without setting any additional filters, the sunburst section is sometimes cut off by the footer. Adding a margin to the bottom of our sunburst section would provide necessary spacing.
When you are scrolling through the sequence of tests, and your mouse hovers over one, there should be some indication that it was meant to be hovered-upon and clickable. Having the color dim slightly would achieve this.
To allow others to contribute their audit data to APISnoop, we need to provide an ability to upload audit-logs for analysis. These could then be loaded into the sunburst visualization, or other APISnoop tools, allowing for a deeper dive into their own data.
It might also be possible, once test-infra jobs support storing audit.logs, to use gs://bucket/path/to/audit.log urls as a source.
To allow this, I propose a drag/drop if the user wishes to upload a local file, or an alternative input for a URL/link to the data to be consumed by the APISnoop service, as illustrated below:
As we add user-agent to audit-logging and look into setting user-agent per test in the e2e framework, it got me thinking about wider-reaching approaches.
If we add support in kubernetes/client-go to set the user-agent to the calling src code file and line number via runtime.Caller, the resulting data would allow a community wide index of not only all API usage parameters and responses, but the links to the precise source code lines that were involved.
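A minimal sketch of deriving such an identifier with runtime.Caller (the helper is hypothetical, not part of client-go):

```go
package main

import (
	"fmt"
	"runtime"
)

// callerID returns "file.go:line" for the code that invoked it, the
// kind of value client-go could append to its user-agent so audit logs
// link back to the precise calling source line. Hypothetical helper.
func callerID() string {
	_, file, line, ok := runtime.Caller(1) // skip=1: our caller's frame
	if !ok {
		return "unknown"
	}
	return fmt.Sprintf("%s:%d", file, line)
}

func main() {
	// Prints the file and line of this call site.
	fmt.Println(callerID())
}
```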
At the moment we are just looking at methods and requestURIs of communications with kube-apiserver. We could expand this by looking at the request Podspec data submitted. Doing this we could:
There has been interest in using this feature to help efforts surrounding increasing conformance test coverage for POD APIs.
People interested: aish, zhenw
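A sketch of pulling the request body out of an audit log line alongside the method and requestURI, assuming the audit.k8s.io Event field names (the helper and sample line are illustrative):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// auditEvent captures just the fields APISnoop currently uses from a
// Kubernetes audit log line, plus the raw request body this issue
// proposes inspecting. Field names follow the audit.k8s.io Event schema.
type auditEvent struct {
	Verb          string          `json:"verb"`
	RequestURI    string          `json:"requestURI"`
	RequestObject json.RawMessage `json:"requestObject"`
}

// parseEvent decodes one line of an audit log.
func parseEvent(line string) (auditEvent, error) {
	var ev auditEvent
	err := json.Unmarshal([]byte(line), &ev)
	return ev, err
}

func main() {
	line := `{"verb":"create","requestURI":"/api/v1/namespaces/default/pods","requestObject":{"kind":"Pod"}}`
	ev, err := parseEvent(line)
	if err != nil {
		panic(err)
	}
	// With RequestObject in hand we can analyze the submitted PodSpec,
	// going beyond method and requestURI alone.
	fmt.Println(ev.Verb, ev.RequestURI) // create /api/v1/namespaces/default/pods
}
```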
This would allow us to narrow scope down to a SIG and give them a view into the specific data that interests them. We could possibly extend that to specific github users/owners.
This will require understanding the structure of the api-machinery mappings, likely beyond what is available in kubernetes/community/sigs.yaml
When I visit the site, I want it to be aesthetically pleasing. There is dissonance currently, where some sections are given a lot of white space and some are not given any at all. It is okay for the site to have varying amounts of padding, but it should be consistent and intentional.
Audit the site to make sure we have a set centering and spacing for all elements, to give visual consistency to the page.
Currently, when you click on a level or category, it 'zooms' into that section, where the section becomes the entire ring. When you click on an endpoint, it does the same thing. This behavior gives a nice appearance for level/category, but looks strange when zoomed to level/category/endpoint. A better interaction would be for level/category to zoom in, while clicking an endpoint just locks it into place.
[[screenshots go here]]
When you click on any release, in the heading of the sunburst you see the release's name, whether it's conformance or sig, and the date it was gathered. The master release is the exception: it shows the name and 'Gathered On', but then a blank line.
We should either remove the 'Gathered On' line for the master release, or add an explicit date to it so it can be displayed.
At the moment there is a discrepancy between the test coverage statistics from oomichi's tool and ours. Why this is occurring needs to be investigated.
Filtering for existing tests should show only tags we are interested in.
This would likely be combined with filtering endpoints by regexp #67
As a visitor to the apisnoop site, I would like to see the data and visualizations as quickly as possible. The biggest hurdle to this is the initial retrieving of our bundle.js file. Lowering this file's size will speed up the initial page load. We can do this through keeping fewer dependencies, and optimizing the underlying logic.
Advanced Auditing needs a policy file passed via --audit-policy-file to the apiserver.
Enabling a webhook also requires a configuration file via --audit-webhook-config-file.
This ticket is to track support for audit-webhook provisioning:
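For reference, a minimal audit policy file might look like the following; the single catch-all rule is illustrative, and the file passed to --audit-webhook-config-file uses the kubeconfig format:

```yaml
# Illustrative policy: record every request at the RequestResponse
# level so full request/response bodies reach APISnoop.
# Older clusters may need apiVersion: audit.k8s.io/v1beta1.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
```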
As a contributor to APISnoop
I want to develop / contribute to the webui
In order to provide alternative interpretations / visualizations of the data
Given a checkout of cncf/apisnoop
When I run the correct sequence of commands
Then I will be able to browse the latest versions of test-grid jobs
And modify my local copy of APISnoop to interpret that data differently
For untested API endpoints, we need to be able to filter without having any test tags or useragent data.
Add user-agent to audit-logging:
kubernetes/kubernetes#64791
kubernetes/kubernetes#64812
This should be approved after the code-freeze melts.
The "summary card" to the left of the sunburst graph, is currently aligned to match the top of the entire sunburst box, including the title. However, this causes the "summary card" look off-center on the page.
I think it would be more appropriate to align the "summary card" to the center of the sunburst graph itself.
Direct links to views lead to 404s, e.g. https://apisnoop.cncf.ci/master
The URIs seem to be loading correctly within the WebUI, but the direct link is broken.
Each line on the outer ring of the sunburst visualisation represents an endpoint, and when you roll over each available one with your mouse we show the name in a box at the top left. However, the name/label box is fixed in place, meaning there is a disconnect between the endpoints being rolled over and the information being shown.
In order to make it easier for the user to interrogate each endpoint on the outer ring, when they roll over one a label/name will appear connected to it in closer proximity, as per the illustration below.
When you browse an endpoint, it will often have tests you can look at too. This isn't immediately clear, since the list of tests is below the visible screen. There should be a tightening of whitespace so the test header is seen, but also a well-placed link that says 'check out tests' to accommodate different screen sizes.
Analyzing the parameters for the various kinds of API objects for patterns.
Some initial exploration is required before we can apply some logic / visualizations.
Currently the bottom of the sunburst graph is off-screen. We should align it to be always on-screen.
Currently, auditing webhooks are difficult to use, as they require configuration before kubernetes is started. There are plans to allow configuration of webhooks at runtime (maybe v1.12).
Audit logs require ssh access to the master and require manual collection.
And so, we are looking for other ways of collecting request information.
One possible way would be using Admission webhooks
https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized.
Admission webhooks are HTTP callbacks that receive admission requests and do something with them.
An admission webhook that captures and forwards these admission requests could be something worth exploring.
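Registering such a webhook would use an admissionregistration object along these lines (the names, URL, and catch-all scope are hypothetical; failurePolicy: Ignore keeps the cluster usable if the collector is down):

```yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: apisnoop-request-logger   # hypothetical name
webhooks:
- name: snoop.example.com         # hypothetical collector endpoint
  clientConfig:
    url: https://snoop.example.com/record
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["*"]
    resources: ["*"]
  failurePolicy: Ignore
```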
It's unclear to me whether there is an OpenAPI field for this purpose that kubernetes should be using, or if apisnoop should be looking for the word "deprecated" in descriptions.
Go beyond our client-go library and try to find the test name and step.
The current sunburst visualisation only shows (in colour) the endpoints that are 'hit' by an app or the E2E test suite with the remaining (non-hit) endpoints shown as solid grey blocks.
In order to allow us to filter and manipulate all of the endpoint data, each individual endpoint should be represented within the sunburst by an individual block/line that can be interacted with (and colour coded by category or other filters chosen by the user).
An example of how this could be visualised is below:
When you click on an individual test, you can see the url change, but it shows as #{test[test_item]}. This is the variable reference instead of the value of the variable; it should change to the actual value.
The latest job ids should be pulled semi-regularly, and the new audit logs processed and pushed.
When the sunburst visualisation is filtered to show only endpoints accessed by a particular app (besides the E2E test suite) it would be valuable to be able to see which of the accessed endpoints have been tested and which of those haven't.
Combined with volume of hits data this information would provide useful insights as to which endpoints are priorities for conformance testing based on their status and how they are being used.
To that end, we need to introduce a selector that allows the sunburst visualisation to be updated based on whether the user wishes to see All endpoints accessed by a particular app, only the Tested endpoints, or only the Untested endpoints.
This can be achieved through the addition of a radio button as below: