Cockpit Bots

These are automated bots and tools that work on Cockpit. This includes updating operating system images, updating translations or NPM modules, testing PRs, and more.

Images

In order to test Cockpit-related projects, they are staged into an operating system image. These images are tracked in the images/ directory. For example, you might want to test a scenario where Cockpit on one machine talks to FreeIPA on another, and you want those two machines to use different images.

This is handled by passing a specific image to image-create and other scripts that work with test machine images. Available images include:

  • fedora-*, rhel-*, debian-*, etc: Various operating systems for testing Cockpit related projects
  • services: Auxiliary network services for tests which are independent from the OS where Cockpit runs: FreeIPA, Samba AD, candlepin, Grafana

These well known image names are expected to contain no . characters and have no file name extension.
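As a rough illustration of that naming rule, a check could look like the sketch below. The helper name `is_well_known_image_name` is hypothetical and not part of bots; it encodes only the rule stated above.

```python
def is_well_known_image_name(name: str) -> bool:
    """Return True if `name` follows the rule for well-known image names.

    Hypothetical helper: a name without any "." cannot carry a file
    name extension, so a single check covers both conditions.
    """
    return bool(name) and "." not in name
```

With this sketch, `fedora-coreos` is accepted while `fedora-coreos.qcow2` is rejected.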

Individual projects are expected to build their code into packages locally and install them as an overlay on top of these pristine images, either with image-customize or through the machine Python API.

For managing these images:

  • image-download: Download selected or all test images
  • image-create: Create test machine images from scratch (usually through downloading a cloud image), with common build and test dependencies for Cockpit projects preinstalled
  • image-upload: Upload a locally built test image to the official image servers

For running and debugging the images:

  • image-customize: Install packages, upload files, or run commands in a test machine image; this keeps the original image intact and puts the changes into an image overlay in test/images/.
  • vm-run: Run a test machine image; by default this happens in an ephemeral overlay. You can use the --maintain option to write into the persistent overlay in test/images/ instead.
  • vm-reset: Remove all overlays from test/images/

Image location

Downloaded images are stored in ~/.cache/cockpit-images/ by default. To change that, set $COCKPIT_IMAGES_DATA_DIR, or set the cockpit.bots.images-data-dir variable with git config, to the directory in which the pristine virtual machine images should be stored. For example:

git config cockpit.bots.images-data-dir /srv/cockpit/images
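The lookup described above can be sketched as follows. The function name and the precedence (environment variable first, then the git config value, then the default) are assumptions for illustration; the text above does not say which setting wins when both are present.

```python
import os

DEFAULT_IMAGES_DIR = "~/.cache/cockpit-images"


def images_data_dir(environ, git_config_value=""):
    """Pick the directory that holds the pristine images.

    `environ` is a mapping such as os.environ. `git_config_value` is
    whatever the caller read from cockpit.bots.images-data-dir, or ""
    when that option is unset. The precedence here is an assumption.
    """
    from_env = environ.get("COCKPIT_IMAGES_DATA_DIR")
    if from_env:
        return from_env
    if git_config_value:
        return git_config_value
    return os.path.expanduser(DEFAULT_IMAGES_DIR)
```

A caller would typically pass `os.environ` and the output of `git config cockpit.bots.images-data-dir`.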

Tests

The bots automatically run the tests as needed on pull requests and branches. To check when and where tests will be run, use the tests-scan tool:

./tests-scan -vd

Note on eslintrc interaction

As eslint looks for additional configurations, eslintrc.(json|yaml) files, in parent directories, it is recommended to have "root": true in the eslint configuration of any project which is using eslint and is tested through cockpit-bots.

Integration with GitHub

A number of machines are watching our GitHub repositories and are executing tests for pull requests as well as making new images.

Most of this happens automatically, but you can influence their actions with the tests-trigger utility in this directory.

Setup

You need a GitHub token in ~/.config/cockpit-dev/github-token or from the GitHub CLI configuration in ~/.config/gh/config.yml. You can create one for your account at Developer Settings → Personal access tokens.

When generating a new personal access token, the scopes should contain repo:status and read:org. Note in particular, that repo and public_repo scopes each grant full push access, and should not be used.
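The scope rules above can be written down as a small check. This is only a sketch: `check_token_scopes` is a hypothetical helper, and how the list of scopes is obtained for a given token is left to the caller.

```python
# Scopes the text above asks for.
REQUIRED_SCOPES = {"repo:status", "read:org"}
# Scopes the text above warns against, since each grants full push access.
TOO_BROAD_SCOPES = {"repo", "public_repo"}


def check_token_scopes(scopes):
    """Return a list of problems; an empty list means the scopes look fine."""
    scopes = set(scopes)
    problems = []
    for missing in sorted(REQUIRED_SCOPES - scopes):
        problems.append(f"missing scope: {missing}")
    for broad in sorted(TOO_BROAD_SCOPES & scopes):
        problems.append(f"overly broad scope: {broad}")
    return problems
```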

You need at least "Write" access to the project for triggering statuses, either individually per repo (e.g. cockpit) or for all cockpit-project repos.

If you'd like to download Red Hat-only internal images from S3, you'll need to create a key file in ~/.config/cockpit-dev/s3-keys/[domain]. The [domain] can be any non-toplevel domain which contains the S3 URL in question. The contents of this file should be a single line containing the "access key" and the "secret key" separated by whitespace.

For the currently configured mirrors this means that you'd likely have the following file:

  • ~/.config/cockpit-dev/s3-keys/linodeobjects.com

For more control, you could also use the following:

  • ~/.config/cockpit-dev/s3-keys/cockpit-images.eu-central-1.linodeobjects.com
  • ~/.config/cockpit-dev/s3-keys/eu-central-1.linodeobjects.com
  • either of the above, with us-east instead of eu-central

Each file would contain a single line which looks like:

EEVIDIDFSOQ0ABJ2LGTT    009rKOypIoqO44Q3VQGRyYPfugi84zANHF0pOW9f

The "access key" and "secret key" are unique per developer and can be obtained by talking to Allison.
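To make the domain matching concrete, here is a minimal sketch of how a key file could be located for a given S3 host. The function names are hypothetical, and trying the most specific domain first is an assumption; the actual lookup in bots may differ.

```python
import os


def candidate_domains(hostname):
    """Yield the hostname and each parent domain, excluding the top-level one."""
    labels = hostname.split(".")
    # Stop before the last label so a bare top-level domain is never tried.
    for start in range(len(labels) - 1):
        yield ".".join(labels[start:])


def parse_key_file(text):
    """Split the single line into (access_key, secret_key)."""
    access_key, secret_key = text.split()
    return access_key, secret_key


def find_s3_keys(hostname, keys_dir="~/.config/cockpit-dev/s3-keys"):
    """Return the keys from the first matching file, or None if there is none."""
    keys_dir = os.path.expanduser(keys_dir)
    for domain in candidate_domains(hostname):
        try:
            with open(os.path.join(keys_dir, domain), encoding="utf-8") as f:
                return parse_key_file(f.read())
        except FileNotFoundError:
            continue
    return None
```

For `cockpit-images.eu-central-1.linodeobjects.com`, this tries the full host name, then `eu-central-1.linodeobjects.com`, then `linodeobjects.com`.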

Test contexts

For describing tests which we want to run we use contexts. A context has the form:

image[/scenario][@bots#bots_pr][@owner/project/ref]

where items have the following meaning:

  • image: Name of the image on which tests should run (e.g. 'fedora-coreos').
  • scenario: Name of a specific test. This is specific for each separate project and is passed verbatim to 'test/run' in $TEST_SCENARIO.
  • bots_pr: Number of pull request that exists in bots repository. When specified, bots from this PR would be used instead of main.
  • owner/project: Name of github project (e.g. 'cockpit-project/cockpit'). This part can be omitted when testing in the same project and no 'ref' is needed.
  • ref: Reference in the project (usually branch) (e.g. 'rhel-8.2'). Default is the project's primary branch.

For example, context for scenario 'firefox' on 'fedora-coreos' is:

fedora-coreos/firefox

If we want to trigger it on 'cockpit-project/cockpit':

fedora-coreos/firefox@cockpit-project/cockpit

If we want to run it on the 'rhel-8-0' branch instead of the primary branch:

fedora-coreos/firefox@cockpit-project/cockpit/rhel-8-0

If we want to run tests on 'fedora-coreos' but with bots from pull request '169':

fedora-coreos@bots#169
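The context syntax can be parsed with a few string splits. The sketch below is illustrative only: `ParsedContext` and `parse_context` are hypothetical names, and the handling of refs that contain "/" is an assumption.

```python
from typing import NamedTuple, Optional


class ParsedContext(NamedTuple):
    image: str
    scenario: Optional[str] = None
    bots_pr: Optional[int] = None
    repo: Optional[str] = None  # "owner/project"
    ref: Optional[str] = None


def parse_context(context: str) -> ParsedContext:
    """Parse image[/scenario][@bots#bots_pr][@owner/project/ref]."""
    # Everything before the first "@" is image[/scenario].
    target, *suffixes = context.split("@")
    image, _, scenario = target.partition("/")

    bots_pr = repo = ref = None
    for suffix in suffixes:
        if suffix.startswith("bots#"):
            bots_pr = int(suffix[len("bots#"):])
        else:
            # owner/project[/ref]; anything after the second "/" is the ref.
            owner, project, *rest = suffix.split("/", 2)
            repo = f"{owner}/{project}"
            ref = rest[0] if rest else None
    return ParsedContext(image, scenario or None, bots_pr, repo, ref)
```

Applied to the examples above, `fedora-coreos@bots#169` gives image `fedora-coreos` with `bots_pr` 169, and `fedora-coreos/firefox@cockpit-project/cockpit/rhel-8-0` gives the repo `cockpit-project/cockpit` with ref `rhel-8-0`.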

Retrying a failed test

If you want to run the "fedora-coreos" testsuite again for pull request #1234 of cockpit-project/cockpit, run tests-trigger like so:

./tests-trigger --repo cockpit-project/cockpit 1234 fedora-coreos

You can also invoke bots/tests-trigger from any project checkout, in which case you don't need the explicit --repo; it defaults to the GitHub origin of the current directory's project.

Testing a pull request by a non-allowed user

If you want to run all tests on pull request #1234, which was opened by someone who neither has push access to the repository nor is in the allowlist, run tests-trigger with --allow:

./tests-trigger --allow [...]

Of course, you should make sure that the pull request is proper and doesn't execute evil code during tests.

tests-trigger with a different origin

If you need to specify --repo in tests-trigger as your remote is different from cockpit-project/cockpit, you can set a git configuration option from which tests-trigger reads the repo. This has to be set per cockpit project.

git config cockpit.bots.github-repo cockpit-project/cockpit

Refreshing a test image

Test images are refreshed automatically once per week, and even if the last refresh has failed, the machines wait one week before trying again.

If you want the machines to refresh the fedora-coreos image immediately, run image-trigger like so:

./image-trigger fedora-coreos

Creating new images for a pull request

If as part of some new feature you need to change the content of some or all images, you can ask the machines to create those images.

If you want to have a new fedora-coreos image for pull request #1234, add a bullet point to that pull request's description like so, and add the "bot" label to the pull request.

* [ ] image-refresh fedora-coreos
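For illustration, unchecked checklist items of this shape can be picked out of a pull request description with a regular expression. This is a sketch only; `pending_tasks` is a hypothetical helper, and the parsing that the bots actually perform may differ.

```python
import re

# Matches unchecked items such as "* [ ] image-refresh fedora-coreos".
TASK_RE = re.compile(
    r"^[ \t]*[*-] \[ \] (?P<task>[\w-]+)(?P<args>(?:[ \t]+\S+)*)[ \t]*$",
    re.MULTILINE,
)


def pending_tasks(description):
    """Return (task, [args]) for each unchecked checklist item."""
    return [(m.group("task"), m.group("args").split())
            for m in TASK_RE.finditer(description)]
```

Items that are already checked off (`* [x] ...`) are skipped by this pattern.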

The machines will post comments to the pull request about their progress and at the end there will be links to commits with the new images. You can then include these commits into the pull request in any way you like.

Updating CI to a new Fedora release

TEST_OS_DEFAULT is usually set to the latest stable Fedora release, which is used as the default OS for test VMs.

  1. If this is a new image, add _manual test contexts for the new image to lib/testmap.py, and land that into main.
  2. Create a PR that updates TEST_OS_DEFAULT in lib/constants.py, and trigger all tests for that image there.

Fedora CoreOS

The Fedora CoreOS image is updated to a new Fedora release on a schedule outside our control. When this occurs:

  1. Update the naughty symlink naughty/fedora-coreos to the release CoreOS uses.
  2. Update OSTREE_BUILD_IMAGE to point to the Fedora release CoreOS uses.

Pixel tests

The pixel tests used in Cockpit projects use test/reference-image to determine what image to run the pixel tests on.

  1. Create a PR which updates test/reference-image.
  2. Update the pixel tests if required.

bots's People

Contributors

allisonkarlitskaya, atodorov, bcl, cockpituous, croissanne, henrywang, jelly, jikortus, jkonecny12, jkozol, jrusz, jscotka, kkoukiou, larskarlitski, lgtm-migrator, lunarequest, m4rtink, martinpitt, marusak, mvollmer, nykseli, ptoscano, skobyda, subhoghoshx, tomasmatus, velezd, vladimirslavik

bots's Issues

Update stale known issues

Check for any stale known issues and open a pull request to close them

  • naughty-prune

edit to test

RHEL 8.2 regression: setting time fails: avc: denied { sys_time } for comm="timedatex"

This breaks three tests. Symptom:

# timedatectl set-time '2020-01-01 15:30:00'
audit: type=1400 audit(1574167631.911:5): avc:  denied  { sys_time } for  pid=1501 comm="timedatex" capability=25  scontext=system_u:system_r:timedatex_t:s0 tcontext=system_u:system_r:timedatex_t:s0 tclass=capability permissive=0

Failed to set time: Failed to set system clock: Operation not permitted

Downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=177401

Follow-up report: https://bugzilla.redhat.com/show_bug.cgi?id=1779098

image-customize --install is not idempotent

It uses dnf install by default, which doesn't reinstall the package if it's already installed.

This is not easily solved by using --install-command, because one needs to remove the package first and then install it (dnf reinstall fails in the other case: when the package is not yet installed or is at an older version).

Traceback with tests-trigger without GITHUB_BASE

$ ./bots/tests-trigger --repo weldr/lorax 840 fedora-30/vmware
error: unknown option `default='
usage: git config [options]

Config file location
    --global              use global config file
    --system              use system config file
    --local               use repository config file
    -f, --file <file>     use given config file
    --blob <blob-id>      read config from given blob object

Action
    --get                 get value: name [value-regex]
    --get-all             get all values: key [value-regex]
    --get-regexp          get values for regexp: name-regex [value-regex]
    --replace-all         replace all matching variables: name value [value_regex]
    --add                 add a new variable: name value
    --unset               remove a variable: name [value-regex]
    --unset-all           remove all matches: name [value-regex]
    --rename-section      rename section: old-name new-name
    --remove-section      remove a section: name
    -l, --list            list all
    -e, --edit            open an editor
    --get-color <slot>    find the color configured: [default]
    --get-colorbool <slot>
                          find the color setting: [stdout-is-tty]

Type
    --bool                value is "true" or "false"
    --int                 value is decimal number
    --bool-or-int         value is --bool or --int
    --path                value is a path (file or directory name)

Other
    -z, --null            terminate values with NUL byte
    --includes            respect include directives on lookup

Traceback (most recent call last):
  File "./bots/tests-trigger", line 8, in <module>
    from task import github
  File "/home/atodorov/repos/git/weldr/lorax/bots/task/__init__.py", line 60, in <module>
    BASE = BOTS if api.repo == "cockpit-project/bots" else os.path.normpath(os.path.join(BOTS, ".."))
  File "/home/atodorov/repos/git/weldr/lorax/bots/task/github.py", line 152, in repo
    self._repo = os.environ.get("GITHUB_BASE", None) or get_repo() or get_origin_repo()
  File "/home/atodorov/repos/git/weldr/lorax/bots/task/github.py", line 105, in get_repo
    res = subprocess.check_output(['git', 'config', '--default=', 'cockpit.bots.github-repo'])
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'config', '--default=', 'cockpit.bots.github-repo']' returned non-zero exit status 129.

My git version is git-1.8.3.1-20.el7.x86_64 (RHEL 7 latest) which doesn't support the --default option for git config. However the same command in the same form was working for me before (I haven't tried it in the past few weeks I think, not before bots moved to this repo).

Workaround:

$ GITHUB_BASE=weldr/lorax ./bots/tests-trigger --repo weldr/lorax 840 fedora-30/vmware
fedora-30/vmware: triggering on pull request 840
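One possible way to avoid depending on --default is to use --get, which appears in the usage output of this older git, and to handle the unset case in Python. This is a sketch rather than the fix that was applied; `git_config_get` and its `run` parameter are hypothetical.

```python
import subprocess


def git_config_get(key, default="", run=subprocess.run):
    """Read a git config value without passing --default to git.

    `git config --get` exits with status 1 when the key is unset. Any
    non-zero exit status is mapped to `default` here.
    """
    result = run(["git", "config", "--get", key],
                 stdout=subprocess.PIPE, universal_newlines=True)
    if result.returncode != 0:
        return default
    return result.stdout.strip()
```

The `stdout=subprocess.PIPE` and `universal_newlines=True` arguments are used so that the call also works on Python 3.6, the version shown in the traceback.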

cc @martinpitt

rhsmd gets lots of write SELinux violations on Atomic

See downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=1556763

Spotted in cockpit-project/cockpit#8822, example failure:

Traceback (most recent call last):
  File "/build/cockpit/test/common/testlib.py", line 664, in tearDown
    self.check_journal_messages()
  File "/build/cockpit/test/common/testlib.py", line 802, in check_journal_messages
    raise Error(first)
Error: type=1400 audit(1521047735.710:5): avc:  denied  { write } for  pid=1506 comm="rhsmd" name="encodings" dev="dm-0" ino=6446470 scontext=system_u:system_r:rhsmcertd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:lib_t:s0 tclass=dir

not ok 19 testExternalPage (check_multi_machine.TestMultiMachine) # duration: 142s

bots: Move test maps out of tests-scan into the projects

Now that we have other repositories using bots/ and depending on Cockpit CI infra, namely lorax, I think it makes sense to try and read the context from a file inside the current repository, not hard-code it in bots/tests-scan.

Shouldn't be too hard to implement, let me know what you think ?

Podman cannot load stats from rootless containers

Reported: containers/podman#4268
Logs in console: > warning: Failed to update container stats: {"error":"io.podman.ErrorOccurred","parameters":{"reason":"Link not found"}}
Fails with:

Traceback (most recent call last):
  File "test/check-application", line 441, in testLifecycleOperationsUser
    self._testLifecycleOperations(False)
  File "test/check-application", line 477, in _testLifecycleOperations
    self.assertIn('%', cpu)
AssertionError: '%' not found in ''

Selinux prevents setting timezone via timedatectl, prevents write to /etc/localtime

OS: fedora-31

Selinux prevents setting timezone via timedatectl.

timedatectl set-timezone Europe/Helsinki

> Failed to set time zone: Failed to update /etc/localtime

audit: type=1400 audit(1567689788.617:317): avc:  denied  { write } for  pid=7096 comm="timedatex" name="etc" dev="dm-0" ino=130051 scontext=system_u:system_r:timedatex_t:s0 tcontext=system_u:object_r:etc_t:s0 tclass=dir permissive=0

https://bugzilla.redhat.com/show_bug.cgi?id=1749375

Debian: parted clobbers in-memory state of extended partitions

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=842269

See also https://bugzilla.redhat.com/show_bug.cgi?id=1135493 for the original.

This has been fixed with a patch in Fedora only; it wasn't fixed upstream. Thus, debian-unstable suffers from it.

Basically, parted tells the kernel the wrong size for extended partitions.

Apparently, the bug was masked on debian-unstable by systemd-udevd independently re-reading the partition table. However, recent versions of Storaged block systemd-udevd from re-reading since that causes other bugs. Thus, this bug is now visible.

udisksd crash in check-storage-mdraid

Refreshing the debian-testing image (PR cockpit-project/cockpit#7812) updates udisks from 2.6.5 to 2.7.3 (glib and polkit stay the same), which triggers a new crash:

# testNotRemovingDisks (check_storage_mdraid.TestStorage)
#
Warning: Permanently added '[127.0.0.2]:2301' (ECDSA) to the list of known hosts.
Traceback (most recent call last):
  File "/build/cockpit/bots/../test/verify/check-storage-mdraid", line 284, in testNotRemovingDisks
    b.wait_in_text(info_field("State"), "Running")
  File "/build/cockpit/test/common/testlib.py", line 257, in wait_in_text
    return self.wait_js_func('ph_in_text', selector, text)
  File "/build/cockpit/test/common/testlib.py", line 224, in wait_js_func
    return self.phantom.wait("%s(%s)" % (func, ','.join(map(jsquote, args))))
  File "/build/cockpit/test/common/testlib.py", line 825, in <lambda>
    return lambda *args: self._invoke(name, *args)
  File "/build/cockpit/test/common/testlib.py", line 851, in _invoke
    raise Error(res['error'])
Error: timeout

This is due to udisksd crashing. I filed an upstream report at storaged-project/udisks#422 and use this one as naughty override.

RHEL, CentOS, Fedora, Debian, Ubuntu: PCP libraries crash in __pmFindProfile()

PR cockpit-project/cockpit#6102 apparently aggravates (due to changed timing) this flaky test failure:

not ok 3 testFrameNavigation (__main__.TestMultiMachine) duration: 27s
Traceback (most recent call last):
  File "test/verify/check-multi-machine", line 202, in tearDown
    MachineCase.tearDown(self)
  File "/home/martin/upstream/cockpit/test/common/testlib.py", line 533, in tearDown
    self.check_journal_messages()
  File "/home/martin/upstream/cockpit/test/common/testlib.py", line 689, in check_journal_messages
    raise Error(first)
Error: /usr/libexec/cockpit-pcp: bridge was killed: 11

This was reported a while ago as https://bugzilla.redhat.com/show_bug.cgi?id=1235962 and I confirmed it with pcp 3.11.8-2.fc25.x86_64. That downstream bug has the stack trace and some initial analysis. Filing this one to use it as "known issue" naughty quirk.
