Comments (12)
@johnSchnake Thanks for pointing that out, I have found the root cause of the missing test cases: an unfortunate misconfiguration of our test case list file. I will fix it soon and re-upload the test results. Let me explain how these test results were obtained:
First, the description in sap-cp-aws of how to reproduce the test results is outdated. I will update it soon. Further below I also describe in detail how we currently run the conformance tests (and others).
We are continuously working on the test coverage of gardener, which is why README descriptions like the mentioned one get outdated pretty fast. At the beginning we were using Sonobuoy, which was fine for a while. But since we wanted to run additional k8s e2e test cases besides conformance, and we wanted to automate test runs to ensure quality, we introduced Test Machinery and a kubetest wrapper, which allow us to test clusters in a structured and consistent way. Currently we run over 260 e2e test suites a day (on different landscapes, k8s versions, cloud providers and operating systems), covering 1099 unique test cases and 113025 executed test cases in total per day. We upload conformance test results to testgrid (conformance, others) on a daily basis for k8s versions 1.10 to 1.16 and multiple cloud providers.
We were forced to build a kubetest wrapper to gain more control over test case grouping and test result evaluation. With it we can now:
- Define custom test case groups with individual tags like fast, slow, flaky, etc., because we sometimes want custom grouping and the existing tags in test case names are not reliable. E.g. some test cases are slow although they don't have a `[Slow]` tag, and the same goes for `[Flaky]`.
- Filter false positive test cases
- Parse resulting junit.xml files and e2e logs to our JSON format to ingest into Elasticsearch
- Annotate testcases to run only for dedicated cloud providers
- Merge junit.xml files of parallel kubetest executions (this is not used for conformance)
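As a rough illustration of the junit-to-JSON ingestion step above, here is a minimal sketch. The field names and schema are my own assumptions, not the actual gardener/test-infra format:

```python
import json
import xml.etree.ElementTree as ET

def junit_to_docs(junit_path):
    """Flatten a junit.xml file into one dict per test case."""
    root = ET.parse(junit_path).getroot()
    # kubetest may write either a bare <testsuite> or a <testsuites> wrapper
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    docs = []
    for suite in suites:
        for case in suite.findall("testcase"):
            status = "passed"
            if case.find("failure") is not None:
                status = "failed"
            elif case.find("skipped") is not None:
                status = "skipped"
            docs.append({
                "name": case.get("name"),
                "duration": float(case.get("time", 0)),
                "status": status,
            })
    return docs

def to_ndjson(docs):
    """Serialize as newline-delimited JSON, the shape Elasticsearch bulk APIs expect."""
    return "\n".join(json.dumps(d) for d in docs)
```

The same parse step is also the natural place to hook in the testcase annotation and merge features from the list above.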
Regarding this discussion here, it would be great to have some validation on the testgrid side, e.g. of test case counts, to avoid such incomplete executions.
You will be able to reproduce the test results if you run:
```sh
# first set KUBECONFIG to your cluster
docker run -ti --rm -v $KUBECONFIG:/mye2e/shoot.config golang:1.13 bash

# run all commands below within the container
export E2E_EXPORT_PATH=/tmp/export; export KUBECONFIG=/mye2e/shoot.config; export GINKGO_PARALLEL=false
go get github.com/gardener/test-infra; cd /go/src/github.com/gardener/test-infra
export GO111MODULE=on
go run -mod=vendor ./integration-tests/e2e -debug=true -k8sVersion=1.16.1 -cloudprovider=azure -testcasegroup="conformance"
echo "testsuite finished"
```
Which internally runs:

```sh
kubetest --provider=skeleton --extract=v1.16.2 --deployment=local --test --check-version-skew=false --test_args=--clean-start --ginkgo.dryRun=false --ginkgo.focus=\\[k8s\\.io\\]\\sSecurity\\sContext\\sWhen\\screating\\sa\\spod\\swith\\sprivileged\\sshould\\srun\\sthe\\scontain... -dump=/tmp/e2e.log
```
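The `--ginkgo.focus` value above is an escaped regex built from a literal test case name. A minimal sketch of that transformation (my assumption about the approach, not the actual kubetest wrapper code):

```python
import re

def to_ginkgo_focus(test_name):
    """Turn a literal e2e test name into a ginkgo focus regex."""
    # re.escape backslash-escapes regex metacharacters like [ ] . and,
    # on most Python versions, spaces; normalize spaces to \s either way
    escaped = re.escape(test_name)
    return escaped.replace("\\ ", "\\s").replace(" ", "\\s")
```

Matching spaces as `\s` keeps the regex robust when the name is passed through shells and config files that may mangle literal spaces.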
from k8s-conformance.
I don't have time to dig into why there is a delta. @johnSchnake, can you take a quick look to see what the delta is here?
Making notes as I review this:
- The line `Conformance test: not doing test setup.` comes from `hack/ginkgo_e2e.sh`, which would make me assume this wasn't generated via running Sonobuoy (which is fine, but just FYI). Sonobuoy is recommended but not required.
- The instructions for the first item in your list, sap-cp-aws, refer to a URL that gives me a 404. It indicates that they ran the tests via Sonobuoy using some stored YAML manifest. This, in theory, would be an acceptable way to run the tests (you don't have to use the CLI), but again the logs don't seem to indicate it was run via Sonobuoy (which doesn't use the ginkgo_e2e.sh script).
- I ran `sonobuoy run --plugin-env e2e.E2E_DRYRUN=true --kube-conformance-image-version=v1.15.2` to check the list of conformance tests for that version and confirmed (as stated in the ticket) that 215 tests are conformance tests, despite only 212 being run in these results.
- The missing tests in the first case (we'd have to confirm them for the others) are:
- [sig-apps] Job should delete a job [Conformance]
- [sig-apps] ReplicationController should surface a failure condition on a common issue like exceeded quota [Conformance]
- [sig-network] DNS should provide DNS for ExternalName services [Conformance]
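One way to automate finding such a delta is to diff the expected conformance test list (e.g. from a dry run like the one above) against the test case names in the submitted junit file. A hypothetical sketch:

```python
import xml.etree.ElementTree as ET

def missing_tests(expected_names, junit_path):
    """Return expected conformance tests absent from a junit results file."""
    root = ET.parse(junit_path).getroot()
    # handle both a bare <testsuite> root and a <testsuites> wrapper
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    ran = {case.get("name") for suite in suites for case in suite.findall("testcase")}
    return sorted(set(expected_names) - ran)
```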
I will take a look and see if I can find a reason why those wouldn't have been chosen, either by that hack script or for some other reason.
What needs to happen IMO going forward is:
- The logs should print the focus/skip values used for ginkgo. They are printed in the pod logs but not as part of the e2e.log output.
- A validation script should automate the checks that need to occur (the number of tests passed/skipped/etc. and that the server/test versions match).
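A validation script could look roughly like this. The expected-count table and the check details are illustrative assumptions, not an existing script:

```python
# Illustrative per-release counts; the real numbers come from the release's test list
EXPECTED_CONFORMANCE_TESTS = {"v1.15": 215}

def validate_submission(minor, executed, failed, server_version, test_version):
    """Return a list of human-readable validation errors (empty means OK)."""
    errors = []
    expected = EXPECTED_CONFORMANCE_TESTS.get(minor)
    if expected is not None and executed != expected:
        errors.append("expected %d conformance tests, found %d" % (expected, executed))
    if failed:
        errors.append("%d conformance tests failed" % failed)
    # server and e2e test binary should come from the same minor release
    if not (server_version.startswith(minor) and test_version.startswith(minor)):
        errors.append("server/test version mismatch for %s" % minor)
    return errors
```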
/assign @hh @johnSchnake
All 3 of those tests were also added to Conformance in v1.15; but the SHA in the logs for the test matches the SHA in my test run with 215 tests, so I don't think that could account for it.
@OlegLoewen You were the one to submit at least some of these; could you clarify how these runs were obtained? They indicate they used a Sonobuoy YAML script which hasn't existed for some time and also wouldn't have matched the version of tests used here.
@johnSchnake and I will take a closer look at this together tomorrow
> @johnSchnake and I will take a closer look at this together tomorrow

Let me know if I can be helpful in any way.
I looked through the docs for conformance tests and noticed that the Reviewing Guidelines only make loose statements about the number of test cases in item 5: https://github.com/cncf/k8s-conformance/blob/master/reviewing.md#technical-requirements
Would it make sense to fix the expected number of test cases? Changing the number of test cases would then also require an update to the reviewing guidelines. But some verification through CI, as you mentioned previously, would be superior.
Additionally I noticed that the wording around `sonobuoy` usage is quite vague. I understand from what @johnSchnake wrote that it is only recommended; however, the content description for a PR here makes it seem like `sonobuoy` is required (which was my understanding so far).
Some guidelines around which execution methods for the test cases are actually allowed would help in debugging situations like this, IMO. I am not sure how feasible that would be, as I have no overview of the currently used execution methods other than sonobuoy.
The config here: https://github.com/gardener/test-infra/blob/68c3d60171fcb7d36b39d86935d14f5ea55064d6/integration-tests/e2e/kubetest/description/1.16/working.json#L3 makes it look like this is the link between the `testcasegroup` you mention and how the focus/skip are set on the actual run; is that correct?
FWIW, for conformance certification runs you should just set the focus to `[Conformance]` and not set a `skip` value at all. I understand that in basic CI situations you may want to skip serial/slow tests though (Sonobuoy does something similar to avoid disruptive tests).
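For reference, ginkgo's focus/skip selection can be modeled simply: a test runs if its name matches the focus regex and does not match the skip regex. A small sketch of that semantics:

```python
import re

def selected(test_names, focus, skip=None):
    """Model ginkgo selection: a test runs iff it matches focus and not skip."""
    chosen = []
    for name in test_names:
        if focus and not re.search(focus, name):
            continue
        if skip and re.search(skip, name):
            continue
        chosen.append(name)
    return chosen
```

With focus `\[Conformance\]` and no skip, Serial/Slow conformance tests are included, which is what certification expects.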
Here is the dir for all the v1.15 tests: https://github.com/gardener/test-infra/tree/68c3d60171fcb7d36b39d86935d14f5ea55064d6/integration-tests/e2e/kubetest/description/1.15 and it seems like those tests aren't listed in any of the working/skip/false-positive files, so I must be misunderstanding a component of how the test list is built.
The mentioned test cases weren't conformance-tagged in the previous k8s release. Since I reused the description file of the previous release for 1.15, these test cases were already assigned to the fast/slow groups but not to conformance, which is why they were discarded during the test suite execution. This commit is the quick fix, but I will think of something to prevent this kind of issue in the future. The commit also helps somewhat in understanding the actual issue/bug.
Regarding the focus and skip fields in the description file:
Our main test case groups are fast, slow and conformance (which can be combined). Test cases of the fast group are executed in parallel with `--ginkgo-parallel=8` to get results faster. If we run only the fast group, we still want the fast conformance test cases to run as well, but we don't want any serial or slow tagged conformance tests. To accomplish this, we use the skip and focus fields.
If we run e2e with `-testcasegroup=conformance`, then these lines are taken into account to calculate the conformance test cases:
```json
{ "testcase": "[Conformance]", "focus": "Serial|Slow", "groups": ["slow", "conformance"]},
{ "testcase": "[Conformance]", "skip": "Serial|Slow", "groups": ["fast", "conformance"]},
```
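Under that description format, resolving a requested group presumably means selecting every entry whose `groups` field contains it, so `conformance` picks up both lines while `fast` picks up only the second. A sketch of that resolution (my assumption, not the actual test-infra code):

```python
# Mirrors the shape of the working.json entries quoted above
entries = [
    {"testcase": "[Conformance]", "focus": "Serial|Slow", "groups": ["slow", "conformance"]},
    {"testcase": "[Conformance]", "skip": "Serial|Slow", "groups": ["fast", "conformance"]},
]

def entries_for_group(entries, group):
    """Pick every description entry whose groups contain the requested group."""
    return [e for e in entries if group in e.get("groups", [])]
```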
> FWIW, for conformance certification runs you should be just setting the focus to [Conformance] and not setting a skip value at all [...]
If this is a requirement and using a list of test cases is not an alternative, we can do this as well.
I think this raises good questions about what is required for conformance certification.
As I read it, Sonobuoy is optional, but the conformance run should be executed as a single test run rather than multiple runs spliced together. I think that makes the most sense, as the instructions request a single log and a single junit result file.
You could argue that you want to merge the junit results together, but in that case you'd at least need to upload all of the log files separately.
In addition, allowing N runs at separate times, combined to meet certification, seems like a loosening of the requirements, since clusters could undergo changes between the different runs. Everything is much clearer and more standardized if we just require a single run (though again, parallelizing for CI makes sense).
I agree with @johnSchnake regarding enforcing a specific list of tests in a single test run.
I've created a ticket for creating a Prow Job to run before submissions can be accepted.
AFAICT test runs are now automatically checked for completeness. Should this issue be closed?