Comments (12)

OlegLoewen commented on July 17, 2024

@johnSchnake Thanks for pointing that out. I have found the root cause of the missing test cases: an unfortunate misconfiguration of our test case list file. I will fix it soon and reupload the test results. Let me explain how these test results were obtained:

First of all, the sap-cp-aws description of how to reproduce the test results is outdated. I will update it soon. Further below I also describe in detail how we currently run the conformance tests (and others).

We are continuously working on test coverage of gardener, which is why README descriptions like the one mentioned get outdated pretty fast. At the beginning we were using Sonobuoy, which was fine for a while. But since we wanted to run additional k8s e2e test cases besides conformance, and we wanted to automate test runs to ensure quality, we introduced Test Machinery and a kubetest wrapper, which allow us to test clusters in a structured and consistent way. Currently we are running over 260 e2e test suites a day (on different landscapes, k8s versions, cloud providers, and operating systems), covering 1099 unique test cases and 113025 executed test cases in total per day. We upload conformance test results to Testgrid (conformance, others) on a daily basis for k8s versions 1.10 to 1.16 and multiple cloud providers.

We were forced to build a kubetest wrapper to gain more control over test case grouping and test result evaluation. With it we now have (see the sketch after this list):

  • The possibility to define custom test case groups with individual tags like fast, slow, flaky, etc., because we sometimes want custom grouping and the existing tags in test case names are not reliable. E.g. some test cases are slow although they don't have a [Slow] tag, and the same goes for [Flaky].
  • Filtering out false-positive test cases
  • Parsing the resulting junit.xml files and e2e logs into our JSON format for ingestion into Elasticsearch
  • Annotating test cases to run only on dedicated cloud providers
  • Merging the junit.xml files of parallel kubetest executions (this is not used for conformance)
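As a rough illustration of the grouping mechanism, here is a minimal sketch (assuming a groups.json that is a plain array of entries like the ones quoted later in this thread; the real wrapper does more, e.g. regex-escaping and skip handling):

#!/usr/bin/env bash
# Sketch: turn a group name into a single ginkgo focus regex.
GROUP="conformance"

# Collect every test case pattern assigned to the requested group and
# join them with '|' so the result can serve as one focus regex.
FOCUS=$(jq -r --arg g "$GROUP" \
  '[.[] | select(.groups | index($g)) | .testcase] | join("|")' groups.json)

# A real implementation would regex-escape the patterns (e.g. the [ ] brackets).
echo "--ginkgo.focus=${FOCUS}"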

Regarding this discussion: it would be great to have some validation on the Testgrid side, e.g. of test case counts, to avoid such incomplete executions.

You will be able to reproduce the test results if you run:

# first set KUBECONFIG to point at your cluster's kubeconfig
docker run -ti --rm -v $KUBECONFIG:/mye2e/shoot.config golang:1.13 bash

# run all commands below within the container
export E2E_EXPORT_PATH=/tmp/export; export KUBECONFIG=/mye2e/shoot.config; export GINKGO_PARALLEL=false
go get github.com/gardener/test-infra; cd /go/src/github.com/gardener/test-infra
export GO111MODULE=on
go run -mod=vendor ./integration-tests/e2e -debug=true -k8sVersion=1.16.1 -cloudprovider=azure -testcasegroup="conformance"
echo "testsuite finished"

Which internally runs:

kubetest --provider=skeleton  --extract=v1.16.2 --deployment=local --test --check-version-skew=false --test_args=--clean-start --ginkgo.dryRun=false --ginkgo.focus=\\[k8s\\.io\\]\\sSecurity\\sContext\\sWhen\\screating\\sa\\spod\\swith\\sprivileged\\sshould\\srun\\sthe\\scontain... -dump=/tmp/e2e.log

timothysc commented on July 17, 2024

I don't have time to dig into why there is a delta. @johnSchnake, can you take a quick look to see what the delta is here?

johnSchnake commented on July 17, 2024

Making notes as I review this:

  • The line Conformance test: not doing test setup. comes from hack/ginkgo_e2e.sh, which makes me assume this wasn't generated by running Sonobuoy (which is fine, just FYI). Sonobuoy is recommended but not required.
  • The instructions for the first item in your list, sap-cp-aws, refer to a URL that gives me a 404. They indicate that the tests were run via Sonobuoy using some stored YAML manifest. This would, in theory, be an acceptable way to run the tests (you don't have to use the CLI), but again the logs don't seem to indicate it was run via Sonobuoy (which doesn't use the ginkgo_e2e.sh script).
  • I ran sonobuoy run --plugin-env e2e.E2E_DRYRUN=true --kube-conformance-image-version=v1.15.2 to check the list of conformance tests for that version and confirmed (as stated in the ticket) that 215 tests are conformance tests, despite only 212 being run here (a sketch for diffing the two sets follows after this list).
  • The missing tests in the first case (we'd have to confirm them for the others) are:
    • [sig-apps] Job should delete a job [Conformance]
    • [sig-apps] ReplicationController should surface a failure condition on a common issue like exceeded quota [Conformance]
    • [sig-network] DNS should provide DNS for ExternalName services [Conformance]
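One way to surface such gaps mechanically, as mentioned in the list above (a rough sketch; the tarball name and result paths are assumptions and vary by Sonobuoy version):

# run the dry run from above and pull down the results tarball
sonobuoy run --plugin-env e2e.E2E_DRYRUN=true --kube-conformance-image-version=v1.15.2 --wait
sonobuoy retrieve .
mkdir -p results && tar xzf *_sonobuoy_*.tar.gz -C results

# extract the [Conformance] test names from both junit files
# (name matching only; a thorough check would also exclude <skipped> entries)
grep -o 'name="[^"]*\[Conformance\][^"]*"' results/plugins/e2e/results/junit_01.xml | sort -u > expected.txt
grep -o 'name="[^"]*\[Conformance\][^"]*"' junit_01.xml | sort -u > submitted.txt

# tests that should have run but are absent from the submission
comm -23 expected.txt submitted.txt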

I will take a look and see if I can find a reason why those wouldn't have been chosen, either by that hack script or for some other reason.

What needs to happen IMO going forward is:

  • The logs should print the focus/skip values used for ginkgo. These are printed in the pod logs but not as part of the e2e.log output.
  • A validation script should automate the checks that need to occur (the number of tests passed/skipped/etc., and that the server and test versions match); a sketch follows this list.
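A minimal sketch of what such a validation could look like (the file name, the expected count, and the junit structure checks are all assumptions; this is not the actual check):

#!/usr/bin/env bash
# Hypothetical submission check: did the expected number of conformance tests run?
set -euo pipefail

EXPECTED=215   # expected [Conformance] count for this release, taken from a dry run

# grep -o prints one match per line, so this works even if the junit XML is a single line
TOTAL=$(grep -o '<testcase ' junit_01.xml | wc -l)    # all testcase entries
SKIPPED=$(grep -o '<skipped' junit_01.xml | wc -l)    # entries marked as skipped
RAN=$((TOTAL - SKIPPED))

if [ "$RAN" -ne "$EXPECTED" ]; then
  echo "FAIL: expected $EXPECTED executed conformance tests, found $RAN"
  exit 1
fi
echo "OK: $RAN conformance tests executed"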

hh commented on July 17, 2024

/assign @hh @johnSchnake

johnSchnake commented on July 17, 2024

All 3 of those tests were also added to Conformance in v1.15, but the SHA in the logs for the tests matches the SHA in my test run with 215 tests, so I don't think that could account for it.

@OlegLoewen You were the one to submit at least some of these; could you clarify how these runs were obtained? They indicate a Sonobuoy YAML manifest was used, which hasn't existed for some time and also wouldn't have matched the version of the tests used here.

hh commented on July 17, 2024

@johnSchnake and I will take a closer look at this together tomorrow

MarcelMue commented on July 17, 2024

> @johnSchnake and I will take a closer look at this together tomorrow

Let me know if I can be helpful in any way.

I looked through the docs for the conformance tests and noticed that the Reviewing Guidelines only make a loose statement about the number of test cases, in item 5: https://github.com/cncf/k8s-conformance/blob/master/reviewing.md#technical-requirements
Would it make sense to pin the exact number of test cases? Changing the number of test cases would then also require an update to the reviewing guidelines. But some verification through CI, as you mentioned previously, would be superior.

Additionally, I noticed that the wording around Sonobuoy usage is quite vague. From what @johnSchnake wrote, I understand that it is only recommended; however, the content description for a PR here makes it seem like Sonobuoy is required (which was my understanding so far).

Some guidelines on which execution methods for the test cases are actually allowed would help in debugging situations like this, IMO. I am not sure how feasible that would be, as I have no overview of the execution methods currently in use other than Sonobuoy.

johnSchnake commented on July 17, 2024

The config here: https://github.com/gardener/test-infra/blob/68c3d60171fcb7d36b39d86935d14f5ea55064d6/integration-tests/e2e/kubetest/description/1.16/working.json#L3 looks like the link between the testcasegroup you mention and how the focus/skip gets set on the actual run; is that correct?

FWIW, for conformance certification runs you should just set the focus to [Conformance] and not set a skip value at all. I understand that in basic CI situations you may want to skip serial/slow tests though (Sonobuoy does something similar to avoid disruptive tests).
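In terms of the kubetest command quoted earlier, a pure certification run would then look roughly like this (a sketch; the surrounding flags are copied from that command and the exact test_args are an assumption):

kubetest --provider=skeleton --extract=v1.16.2 --deployment=local --test \
  --check-version-skew=false \
  --test_args="--ginkgo.focus=\[Conformance\]"   # focus only, no --ginkgo.skip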

Here is the dir with all the v1.15 tests: https://github.com/gardener/test-infra/tree/68c3d60171fcb7d36b39d86935d14f5ea55064d6/integration-tests/e2e/kubetest/description/1.15. It just seems like those tests aren't listed in any of the working/skip/false-positive files, so I must be misunderstanding some component of how the test list is assembled.

OlegLoewen commented on July 17, 2024

The mentioned test cases weren't conformance-tagged in the previous k8s release. Since I reused the description file of the previous k8s release for 1.15, these test cases were already assigned to the fast/slow groups but not to conformance, which is why they were discarded during the test suite execution. This commit is the quick fix, but I will think of something to prevent this kind of issue in the future. The commit also helps in understanding the actual issue/bug.

Regarding the focus/skip fields in the description file:

Our main test case groups are fast, slow and conformance (which can be combined). Test cases in the fast group are executed in parallel with --ginkgo-parallel=8 to get faster results. If we run only the fast group, we still want the fast conformance test cases to run as well, but we don't want any serial- or slow-tagged conformance tests. To accomplish this, we use the skip and focus fields.

If we run e2e with -testcasegroup=conformance, then these lines are taken into account to calculate the conformance test cases:

  { "testcase": "[Conformance]", "focus": "Serial|Slow", "groups": ["slow", "conformance"]},
  { "testcase": "[Conformance]", "skip": "Serial|Slow", "groups": ["fast", "conformance"]},

> FWIW, for conformance certification runs you should just set the focus to [Conformance] and not set a skip value at all [...]

If this is a requirement and using a list of test cases is not an alternative, we can do this as well.

johnSchnake commented on July 17, 2024

I think this raises good questions about what is required for conformance certification.

As I read it, Sonobuoy is optional, but the conformance run should be executed as a single test run rather than multiple runs spliced together. I think that makes the most sense, as the instructions request a single log and a single junit result file.

You could argue that you want to merge the junit results together but in that case you'd at least need to upload all of the log files separately.

In addition, allowing N runs to be executed at separate times but combined to meet certification seems like a loosening of the requirements, since clusters could undergo changes between the different runs. Everything is much clearer and more standardized if we just require a single run (though again, parallelizing for CI makes sense).

hh commented on July 17, 2024

I agree with @johnSchnake regarding enforcing a specific list of tests in a single test run.

I've filed a ticket to create a Prow job that will run before submissions can be accepted:

#760

MarcelMue commented on July 17, 2024

AFAICT test runs are now automatically checked for completeness. Should this issue be closed?
