cockpit-project / bots

*beep* *boop* *beep*

License: GNU Lesser General Public License v2.1

bots's Introduction

Cockpit Bots

These are automated bots and tools that work on Cockpit. This includes updating operating system images, updating translations or NPM modules, testing PRs, and more.

Images

In order to test Cockpit-related projects, they are staged into an operating system image. These images are tracked in the images/ directory. For example, you might want to test a scenario where Cockpit on one machine talks to FreeIPA on another, and you want those two machines to use different images.

This is handled by passing a specific image to image-create and other scripts that work with test machine images. Available images include:

fedora-*, rhel-*, debian-*, etc: Various operating systems for testing Cockpit related projects
services: Auxiliary network services for tests which are independent from the OS where Cockpit runs: FreeIPA, Samba AD, candlepin, Grafana

These well known image names are expected to contain no . characters and have no file name extension.
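
The naming rule above can be checked mechanically. The following is a minimal sketch; the function name is hypothetical and not part of bots, and rejecting empty names and path separators is an additional assumption:

```python
def is_well_known_image_name(name: str) -> bool:
    """Return True if `name` looks like a well-known image name.

    Rule from the text: no "." characters, hence no file name extension.
    Rejecting empty names and "/" is an extra assumption of this sketch.
    """
    return bool(name) and "." not in name and "/" not in name


print(is_well_known_image_name("fedora-coreos"))   # True
print(is_well_known_image_name("my-image.qcow2"))  # False
```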

Individual projects are expected to build their code into packages locally and install them as an overlay on top of these pristine images, with image-customize or using the machine Python API.

For managing these images:

image-download: Download selected or all test images
image-create: Create test machine images from scratch (usually through downloading a cloud image), with common build and test dependencies for Cockpit projects preinstalled
image-upload: Upload a locally built test image to the official image servers

For running and debugging the images:

image-customize: Install packages, upload files, or run commands in a test machine image; this keeps the original image intact and puts the changes into an image overlay in test/images/.
vm-run: Run a test machine image; by default this happens in an ephemeral overlay. You can use the --maintain option to write into the persistent overlay in test/images/ instead.
vm-reset: Remove all overlays from test/images/

Image location

Downloaded images are stored in ~/.cache/cockpit-images/ by default. To change that, set $COCKPIT_IMAGES_DATA_DIR, or set the cockpit.bots.images-data-dir variable with git config, to the directory in which to store the pristine virtual machine images. For example:

git config cockpit.bots.images-data-dir /srv/cockpit/images
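
As a rough illustration of how such a lookup could work, here is a sketch in Python. The function names are hypothetical, and the order of precedence (environment variable first, then git config, then the default) is an assumption; the text above does not state which setting wins.

```python
import os
import subprocess


def git_config_value(key):
    """Return the value of a git config key, or None if unset or git fails."""
    try:
        result = subprocess.run(["git", "config", key], capture_output=True, text=True)
    except OSError:
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def images_data_dir(environ=None, config_lookup=git_config_value):
    """Pick the image directory: env var, then git config, then the default.

    The precedence order is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    return (
        environ.get("COCKPIT_IMAGES_DATA_DIR")
        or config_lookup("cockpit.bots.images-data-dir")
        or os.path.expanduser("~/.cache/cockpit-images")
    )


print(images_data_dir())
```

The environment and the config lookup are parameters only so that the logic can be exercised without touching the real git configuration.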

Tests

The bots automatically run the tests as needed on pull requests and branches. To check when and where tests will be run, use the tests-scan tool:

./tests-scan -vd

Note on eslintrc interaction

As eslint looks for additional configurations, eslintrc.(json|yaml) files, in parent directories, it is recommended to have "root": true in the eslint configuration of any project which is using eslint and is tested through cockpit-bots.

Integration with GitHub

A number of machines are watching our GitHub repositories and are executing tests for pull requests as well as making new images.

Most of this happens automatically, but you can influence their actions with the tests-trigger utility in this directory.

Setup

You need a GitHub token in ~/.config/cockpit-dev/github-token or from the GitHub CLI configuration in ~/.config/gh/config.yml. You can create one for your account at Developer Settings → Personal access tokens.

When generating a new personal access token, the scopes should contain repo:status and read:org. Note in particular that the repo and public_repo scopes each grant full push access and should not be used.

You need at least "Write" access to the project for triggering statuses, either individually per repo (e.g. cockpit) or for all cockpit-project repos.

If you'd like to download Red Hat-only internal images from S3, you'll need to create a key file in ~/.config/cockpit-dev/s3-keys/[domain]. The [domain] can be any non-toplevel domain which contains the S3 URL in question. The contents of this file should be a single line containing the "access key" and the "secret key" separated by whitespace.

For the currently configured mirrors this means that you'd likely have the following file:

~/.config/cockpit-dev/s3-keys/linodeobjects.com

For more control, you could also use the following:

~/.config/cockpit-dev/s3-keys/cockpit-images.eu-central-1.linodeobjects.com
~/.config/cockpit-dev/s3-keys/eu-central-1.linodeobjects.com
Either of the above, with us-east instead of eu-central

Each file would be a single line which looks like:

EEVIDIDFSOQ0ABJ2LGTT    009rKOypIoqO44Q3VQGRyYPfugi84zANHF0pOW9f

The "access key" and "secret key" are unique per developer and can be obtained by talking to Allison.
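
The domain matching described above can be pictured as walking from the full S3 hostname up through its parent domains, stopping before the top-level domain. This is a sketch under assumptions: the function names are hypothetical, and trying the most specific file first is a guess at the lookup order rather than documented behaviour.

```python
import os


def candidate_domains(hostname):
    """Yield `hostname` and each parent domain, stopping before the top-level domain."""
    labels = hostname.split(".")
    for start in range(len(labels) - 1):
        yield ".".join(labels[start:])


def find_s3_keys(hostname, keys_dir):
    """Return (access_key, secret_key) from the most specific key file, or None."""
    for domain in candidate_domains(hostname):
        path = os.path.join(keys_dir, domain)
        if os.path.isfile(path):
            with open(path) as f:
                # The file holds one line: access key and secret key,
                # separated by whitespace.
                access_key, secret_key = f.read().split()
            return access_key, secret_key
    return None


keys_dir = os.path.expanduser("~/.config/cockpit-dev/s3-keys")
print(list(candidate_domains("cockpit-images.eu-central-1.linodeobjects.com")))
print(find_s3_keys("cockpit-images.eu-central-1.linodeobjects.com", keys_dir) is not None)
```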

Test contexts

For describing tests which we want to run we use contexts. A context has the form:

image[/scenario][@bots#bots_pr][@owner/project/ref]

where items have the following meaning:

image: Name of the image on which tests should run (e.g. 'fedora-coreos').
scenario: Name of a specific test. This is specific for each separate project and is passed verbatim to 'test/run' in $TEST_SCENARIO.
bots_pr: Number of a pull request in the bots repository. When specified, the bots from this PR are used instead of main.
owner/project: Name of github project (e.g. 'cockpit-project/cockpit'). This part can be omitted when testing in the same project and no 'ref' is needed.
ref: Reference in the project (usually branch) (e.g. 'rhel-8.2'). Default is the project's primary branch.

For example, context for scenario 'firefox' on 'fedora-coreos' is:

fedora-coreos/firefox

If we want to trigger it on 'cockpit-project/cockpit':

fedora-coreos/firefox@cockpit-project/cockpit

If we want to run it on the 'rhel-8-0' branch instead of the primary branch:

fedora-coreos/firefox@cockpit-project/cockpit/rhel-8-0

If we want to run tests on 'fedora-coreos' but with bots from pull request '169':

fedora-coreos@bots#169
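
To make the grammar concrete, here is a small parser for the context form described above. It is a sketch only: the Context type and parse_context function are hypothetical names and this is not the parser bots itself uses. Malformed input (for example a project suffix without a "/") raises ValueError.

```python
from typing import NamedTuple, Optional


class Context(NamedTuple):
    image: str
    scenario: Optional[str] = None
    bots_pr: Optional[int] = None
    repo: Optional[str] = None
    ref: Optional[str] = None


def parse_context(context: str) -> Context:
    """Split image[/scenario][@bots#bots_pr][@owner/project/ref] into its parts."""
    target, *suffixes = context.split("@")
    image, _, scenario = target.partition("/")
    bots_pr = repo = ref = None
    for suffix in suffixes:
        if suffix.startswith("bots#"):
            bots_pr = int(suffix[len("bots#"):])
        else:
            # owner/project, optionally followed by /ref (ref may contain "/")
            owner, project, *rest = suffix.split("/", 2)
            repo = f"{owner}/{project}"
            ref = rest[0] if rest else None
    return Context(image, scenario or None, bots_pr, repo, ref)


print(parse_context("fedora-coreos/firefox@cockpit-project/cockpit/rhel-8-0"))
print(parse_context("fedora-coreos@bots#169"))
```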

Retrying a failed test

If you want to run the "fedora-coreos" testsuite again for pull request #1234 of cockpit-project/cockpit, run tests-trigger like so:

./tests-trigger --repo cockpit-project/cockpit 1234 fedora-coreos

You can also invoke bots/tests-trigger from any project checkout, in which case you don't need the explicit --repo; it will default to the GitHub origin of the current directory's project.

Testing a pull request by a non-allowed user

If you want to run all tests on pull request #1234, which has been opened by someone who neither has push access to the repository nor is in the allowlist, run tests-trigger with --allow:

./tests-trigger --allow [...]

Of course, you should make sure that the pull request is proper and doesn't execute evil code during tests.

tests-trigger with a different origin

If you need to specify --repo in tests-trigger as your remote is different from cockpit-project/cockpit, you can set a git configuration option from which tests-trigger reads the repo. This has to be set per cockpit project.

git config cockpit.bots.github-repo cockpit-project/cockpit

Refreshing a test image

Test images are refreshed automatically once per week, and even if the last refresh has failed, the machines wait one week before trying again.

If you want the machines to refresh the fedora-coreos image immediately, run image-trigger like so:

./image-trigger fedora-coreos

Creating new images for a pull request

If as part of some new feature you need to change the content of some or all images, you can ask the machines to create those images.

If you want to have a new fedora-coreos image for pull request #1234, add a bullet point to that pull request's description like so, and add the "bot" label to the pull request.

* [ ] image-refresh fedora-coreos

The machines will post comments to the pull request about their progress and at the end there will be links to commits with the new images. You can then include these commits into the pull request in any way you like.
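
For illustration, unchecked image-refresh items like the one above could be picked out of a pull request description as follows. The regular expression and function name are assumptions made for this sketch; how the bots actually scan descriptions is not specified here.

```python
import re

# Matches an unchecked task item such as "* [ ] image-refresh fedora-coreos".
TASK_RE = re.compile(r"^\s*[*-] \[ \] image-refresh (\S+)\s*$", re.MULTILINE)


def requested_image_refreshes(description: str) -> list:
    """Return image names from unchecked image-refresh items in a PR description."""
    return TASK_RE.findall(description)


body = "Needs a newer package.\n\n* [ ] image-refresh fedora-coreos\n"
print(requested_image_refreshes(body))
```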

Updating CI to a new Fedora release

TEST_OS_DEFAULT is usually set to the latest stable Fedora release, which is used as the default OS for test VMs.

If this is a new image, add _manual test contexts for the new image to lib/testmap.py, and land that into main.
Create a PR that updates TEST_OS_DEFAULT in lib/constants.py, and trigger all tests for that image there.

Fedora CoreOS

The Fedora CoreOS image moves to a new Fedora release outside of our control. When this occurs:

Update the naughty symlink naughty/fedora-coreos to the release CoreOS uses.
Update OSTREE_BUILD_IMAGE to point to the Fedora release CoreOS uses.

Pixel tests

The pixel tests used in Cockpit projects use test/reference-image to determine what image to run the pixel tests on.

Create a PR which updates test/reference-image.
Update the pixel tests if required.

bots's Issues

Update naughties for `testlib.Error`, or fix the format

PR cockpit-project/cockpit#9571 moves verify tests to Python 3, which changes the error output from Error: ... to testlib.Error:. tests-policy normalizes that to Error:. Either find a clever way to make the exception print out the old format, or change the naughties and normalize into the other direction for Python 2 test runs.
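
The normalization mentioned above amounts to rewriting the Python 3 prefix back to the old one. A minimal sketch, with a hypothetical function name and not taken from tests-policy:

```python
import re


def normalize_error_prefix(line: str) -> str:
    """Rewrite a leading "testlib.Error:" to the older "Error:" form."""
    return re.sub(r"^testlib\.Error:", "Error:", line)


print(normalize_error_prefix("testlib.Error: timeout"))
```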

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

RHEL 8.2 regression: setting time fails: avc: denied { sys_time } for comm="timedatex"

This breaks three tests. Symptom:

# timedatectl set-time '2020-01-01 15:30:00'
audit: type=1400 audit(1574167631.911:5): avc:  denied  { sys_time } for  pid=1501 comm="timedatex" capability=25  scontext=system_u:system_r:timedatex_t:s0 tcontext=system_u:system_r:timedatex_t:s0 tclass=capability permissive=0

Failed to set time: Failed to set system clock: Operation not permitted

Downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=177401

Follow-up report: https://bugzilla.redhat.com/show_bug.cgi?id=1779098

NetworkManager: regression with autoconnect-slaves at creation time

https://bugzilla.redhat.com/show_bug.cgi?id=1548265

debian-stable: UDisks2 fails to create snapshots of thin volumes

Calling CreateSnapshot for thin volumes ends up using uninitialized memory, which sends UDisks2 on the wrong code path: storaged-project/udisks#617

The symptom is this error message:

Error creating snapshot: Process reported exit code 3:   --size may not be zero.  Run `lvcreate --help' for more information.

SELinux denies map/search/write to dnsmasq for NetworkManager

Downstream bug: ~~https://bugzilla.redhat.com/show_bug.cgi?id=1598506~~ (fixed now), and https://bugzilla.redhat.com/show_bug.cgi?id=1697227

type=1400 audit(1530804601.908:4): avc:  denied  { map } for  pid=3052 comm="dnsmasq" path="/usr/sbin/dnsmasq" dev="dm-0" ino=714988 scontext=system_u:system_r:NetworkManager_t:s0 tcontext=system_u:object_r:dnsmasq_exec_t:s0 tclass=file permissive=0

Image refresh for debian-testing

FAIL: image-refresh debian-testing

rpm-ostreed: DownloadUpdateRpmDiff leaves CachedUpdate property empty

coreos/rpm-ostree#1300

rhel-atomic: SELinux is preventing chronyd from sendto access on the unix_dgram_socket

SELinux violations like:
type=1400 audit(1556197202.146:5): avc: denied { sendto } for pid=2758 comm="chronyd" path="/host/run/chrony/chronyc.5500.sock" scontext=system_u:system_r:chronyd_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=unix_dgram_socket permissive=0

Downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=1703192

Image refresh for fedora-29

FAIL: image-refresh fedora-29

image-customize --install is not idempotent

It uses dnf install by default, which doesn't reinstall the package if it's already installed.

This is not easily solved by using --install-command, because one needs to remove the package first and then install it (dnf reinstall fails in the other case: when the package is not yet installed or at an older version).

rhel-7-9, centos-7: SELinux: avc: denied { getattr } comm="iscsid"

Downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=1698702

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

Denied NetworkManager write to /var/tmp/dracut.*/systemd-cat while generating initrd image

https://bugzilla.redhat.com/show_bug.cgi?id=1750428

rhel-7-9, centos-7: SELinux is preventing /usr/lib/systemd/systemd-machined from 'search' accesses

caused by restarting libvirt (from PR cockpit-project/cockpit#8493)

SELinux

Error: type=1400 audit(1521740509.455:5): avc:  denied  { search } for  pid=1514 comm="systemd-machine" name="1513" dev="proc" ino=27512 scontext=system_u:system_r:systemd_machined_t:s0 tcontext=system_u:system_r:svirt_tcg_t:s0:c222,c949 tclass=dir

Downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=1444754

Image refresh for debian-testing

FAIL: image-refresh debian-testing

Image refresh for rhel-7-8

FAIL: image-refresh rhel-7-8

abrtd doesn't report crash function on fedora 29 (with debuginfo installed)

Fails in check-journal, python3-debuginfo is installed, systemd-coredump reports accurate backtrace, yet abrtd-notification reports ??().

Traceback with tests-trigger without GITHUB_BASE

$ ./bots/tests-trigger --repo weldr/lorax 840 fedora-30/vmware
error: unknown option `default='
usage: git config [options]

Config file location
    --global              use global config file
    --system              use system config file
    --local               use repository config file
    -f, --file <file>     use given config file
    --blob <blob-id>      read config from given blob object

Action
    --get                 get value: name [value-regex]
    --get-all             get all values: key [value-regex]
    --get-regexp          get values for regexp: name-regex [value-regex]
    --replace-all         replace all matching variables: name value [value_regex]
    --add                 add a new variable: name value
    --unset               remove a variable: name [value-regex]
    --unset-all           remove all matches: name [value-regex]
    --rename-section      rename section: old-name new-name
    --remove-section      remove a section: name
    -l, --list            list all
    -e, --edit            open an editor
    --get-color <slot>    find the color configured: [default]
    --get-colorbool <slot>
                          find the color setting: [stdout-is-tty]

Type
    --bool                value is "true" or "false"
    --int                 value is decimal number
    --bool-or-int         value is --bool or --int
    --path                value is a path (file or directory name)

Other
    -z, --null            terminate values with NUL byte
    --includes            respect include directives on lookup

Traceback (most recent call last):
  File "./bots/tests-trigger", line 8, in <module>
    from task import github
  File "/home/atodorov/repos/git/weldr/lorax/bots/task/__init__.py", line 60, in <module>
    BASE = BOTS if api.repo == "cockpit-project/bots" else os.path.normpath(os.path.join(BOTS, ".."))
  File "/home/atodorov/repos/git/weldr/lorax/bots/task/github.py", line 152, in repo
    self._repo = os.environ.get("GITHUB_BASE", None) or get_repo() or get_origin_repo()
  File "/home/atodorov/repos/git/weldr/lorax/bots/task/github.py", line 105, in get_repo
    res = subprocess.check_output(['git', 'config', '--default=', 'cockpit.bots.github-repo'])
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'config', '--default=', 'cockpit.bots.github-repo']' returned non-zero exit status 129.

My git version is git-1.8.3.1-20.el7.x86_64 (RHEL 7 latest) which doesn't support the --default option for git config. However the same command in the same form was working for me before (I haven't tried it in the past few weeks I think, not before bots moved to this repo).

Workaround:

$ GITHUB_BASE=weldr/lorax ./bots/tests-trigger --repo weldr/lorax 840 fedora-30/vmware
fedora-30/vmware: triggering on pull request 840
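
One way to avoid depending on --default is to call plain git config and treat a non-zero exit status as "not set". This is a sketch under that assumption, not necessarily the fix that was applied; the injectable run parameter exists only to make the function easy to exercise.

```python
import subprocess


def get_repo(run=subprocess.run):
    """Return cockpit.bots.github-repo from git config, or None if it is unset.

    Avoids `git config --default`, which git 1.8.3.1 rejects (exit status 129).
    """
    result = run(
        ["git", "config", "cockpit.bots.github-repo"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # git config exits non-zero when the key is not set; treat that as "no value".
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

The caller can then fall back to the origin remote when this returns None, as the `get_repo() or get_origin_repo()` expression in the traceback already does.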

cc @martinpitt

libvirtd sometimes crashes on rhel-8-1 when destroying guests

Downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1728530

rhsmd gets lots of write SELinux violations on Atomic

See downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=1556763

Spotted in cockpit-project/cockpit#8822, example failure:

Traceback (most recent call last):
  File "/build/cockpit/test/common/testlib.py", line 664, in tearDown
    self.check_journal_messages()
  File "/build/cockpit/test/common/testlib.py", line 802, in check_journal_messages
    raise Error(first)
Error: type=1400 audit(1521047735.710:5): avc:  denied  { write } for  pid=1506 comm="rhsmd" name="encodings" dev="dm-0" ino=6446470 scontext=system_u:system_r:rhsmcertd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:lib_t:s0 tclass=dir

not ok 19 testExternalPage (check_multi_machine.TestMultiMachine) # duration: 142s

bots: Move test maps out of tests-scan into the projects

Now that we have other repositories using bots/ and depending on Cockpit CI infra, namely lorax, I think it makes sense to try and read the context from a file inside the current repository, not hard-code it in bots/tests-scan.

This shouldn't be too hard to implement. Let me know what you think.

Image refresh for fedora-i386

FAIL: image-refresh fedora-i386

libvirtd crash on rhel-8-1 when destroying/deleting guests

Downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1739564

Image refresh for fedora-31

FAIL: image-refresh fedora-31

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

SELinux is preventing rpcbind from 'name_bind' accesses

Downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1758147
Example failure: https://209.132.184.41:8493/logs/pull-12971-20191106-084747-86f484cb-cockpit-project-cockpit--fedora-30-firefox/log.html#185

podman crashes on fedora-30 when doing bulk deletion of images and calling ListImages in parallel

Issue in libpod repository containers/podman#3316

Image refresh for continuous-atomic

FAIL: image-refresh continuous-atomic

centos-8-stream selinux failure starting a custom service denied

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

Podman cannot load stats from rootless containers

Reported: containers/podman#4268
Logs in console: > warning: Failed to update container stats: {"error":"io.podman.ErrorOccurred","parameters":{"reason":"Link not found"}}
Fails with:

Traceback (most recent call last):
  File "test/check-application", line 441, in testLifecycleOperationsUser
    self._testLifecycleOperations(False)
  File "test/check-application", line 477, in _testLifecycleOperations
    self.assertIn('%', cpu)
AssertionError: '%' not found in ''

krb5 regression: SSH GSS authentication fails with "Ticket expired"

Downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=1757299 , duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1757224

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

Update stale known issues

Check for any stale known issues and open a pull request to close them

naughty-prune

centos-8-stream unprivileged selinux user login fails

Naughty tracking https://bugzilla.redhat.com/show_bug.cgi?id=1727382 trickling into centos-8

SELinux is preventing rpcbind from 'name_bind' accesses on the udp_socket port 64657

Downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1758147

SELinux prevents setting timezone via timedatectl, prevents write to /etc/localtime

OS: fedora-31

SELinux prevents setting the timezone via timedatectl.

timedatectl set-timezone Europe/Helsinki

> Failed to set time zone: Failed to update /etc/localtime

audit: type=1400 audit(1567689788.617:317): avc:  denied  { write } for  pid=7096 comm="timedatex" name="etc" dev="dm-0" ino=130051 scontext=system_u:system_r:timedatex_t:s0 tcontext=system_u:object_r:etc_t:s0 tclass=dir permissive=0

https://bugzilla.redhat.com/show_bug.cgi?id=1749375

SELinux: avc: denied { dac_override } comm="firewalld"

Downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=1713561 (RHEL 8) and https://bugzilla.redhat.com/show_bug.cgi?id=1723923 (Fedora)

Image refresh for fedora-30

FAIL: image-refresh fedora-30

Debian: parted clobbers in-memory state of extended partitions

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=842269

See also https://bugzilla.redhat.com/show_bug.cgi?id=1135493 for the original.

This has been fixed with a patch in Fedora only; it wasn't fixed upstream. Thus, debian-unstable suffers from it.

Basically, parted tells the kernel the wrong size for extended partitions.

Apparently, the bug was masked on debian-unstable by systemd-udevd independently re-reading the partition table. However, recent versions of Storaged block systemd-udevd from re-reading since that causes other bugs. Thus, this bug is now visible.

bots: "import task" crashes when not in a recognized GitHub repo

Since 718be03, import task will crash when the current directory is not in a git repo with an origin that points to GitHub.

This happens because import task will unconditionally run api = github.GitHub() and this will now raise an exception when it can't figure out the GitHub repo from the origin remote.

See cockpit-project/cockpit#10554 for a concrete case.

udisksd crash in check-storage-mdraid

Refreshing the debian-testing image (PR cockpit-project/cockpit#7812) updates udisks from 2.6.5 to 2.7.3 (glib and polkit stay the same), which triggers a new crash:

# testNotRemovingDisks (check_storage_mdraid.TestStorage)
#
Warning: Permanently added '[127.0.0.2]:2301' (ECDSA) to the list of known hosts.
Traceback (most recent call last):
  File "/build/cockpit/bots/../test/verify/check-storage-mdraid", line 284, in testNotRemovingDisks
    b.wait_in_text(info_field("State"), "Running")
  File "/build/cockpit/test/common/testlib.py", line 257, in wait_in_text
    return self.wait_js_func('ph_in_text', selector, text)
  File "/build/cockpit/test/common/testlib.py", line 224, in wait_js_func
    return self.phantom.wait("%s(%s)" % (func, ','.join(map(jsquote, args))))
  File "/build/cockpit/test/common/testlib.py", line 825, in <lambda>
    return lambda *args: self._invoke(name, *args)
  File "/build/cockpit/test/common/testlib.py", line 851, in _invoke
    raise Error(res['error'])
Error: timeout

This is due to udisksd crashing. I filed an upstream report at storaged-project/udisks#422 and use this one as naughty override.

podman Commit Container is extremely slow

Issue in libpod repository: containers/podman#3526

Can't get PXE boot logs from libvirt guest on a serial console

Downstream debian report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=924423
Downstream rhel report: https://bugzilla.redhat.com/show_bug.cgi?id=2007257

Image refresh for rhel-atomic

FAIL: image-refresh rhel-atomic

RHEL, CentOS, Fedora, Debian, Ubuntu: PCP libraries crash in __pmFindProfile()

PR cockpit-project/cockpit#6102 apparently aggravates (due to changed timing) this flaky test failure:

not ok 3 testFrameNavigation (__main__.TestMultiMachine) duration: 27s
Traceback (most recent call last):
  File "test/verify/check-multi-machine", line 202, in tearDown
    MachineCase.tearDown(self)
  File "/home/martin/upstream/cockpit/test/common/testlib.py", line 533, in tearDown
    self.check_journal_messages()
  File "/home/martin/upstream/cockpit/test/common/testlib.py", line 689, in check_journal_messages
    raise Error(first)
Error: /usr/libexec/cockpit-pcp: bridge was killed: 11

This was reported a while ago as https://bugzilla.redhat.com/show_bug.cgi?id=1235962 and I confirmed it with pcp 3.11.8-2.fc25.x86_64. That downstream bug has the stack trace and some initial analysis. Filing this one to use it as a "known issue" naughty quirk.
