
test-infra


GitHub Workflow & Testing Infrastructure

DBG

DBG stands for Drivers Build Grid.

It's a tool we created to prebuild a set of Falco drivers (both the kernel module and the eBPF probe) for various target distros and kernel releases, using driverkit.

You can find more about it here.

Contribute

You can contribute to distributing prebuilt Falco drivers for new Linux kernel releases by following this guide.

Prow

Prow is a CI/CD system running on Kubernetes.

This directory contains the resources composing Falco's workflow & testing infrastructure.

Are you looking for Deck to check the merge queue and Prow jobs?

Adding a Job on Prow

Falco runs the first public Prow instance hosted 100% on AWS infrastructure. This means there are slight differences when it comes to adding jobs to Falco's Prow.

Job Types

There are three types of Prow jobs:

  • Presubmits run against code in PRs

  • Postsubmits run after merging code

  • Periodics run on a periodic basis

Create a presubmit job that runs tests on PRs.

  1. We add a file at config/jobs/build-drivers/build-drivers.yaml

presubmits:
  falcosecurity/test-infra: # name of the org/repo
  - name: build-drivers-amazonlinux-presubmit
    decorate: true
    skip_report: false
    agent: kubernetes
    branches:
      - ^master$
    spec:
      containers:
      - command:
        - /workspace/build-drivers.sh
        - amazonlinux
        env:
        - name: AWS_REGION
          value: eu-west-1
        image: 292999226676.dkr.ecr.eu-west-1.amazonaws.com/test-infra/build-drivers:latest
        imagePullPolicy: Always
        securityContext:
          privileged: true

A few things to call out:

  • branches: ^master$ tells Prow to run this job only against PRs targeting the master branch
  • command: /workspace/build-drivers.sh tells the container which script to run as the test. See the script
  • privileged: true is required when using Docker-in-Docker or running Docker builds
  • decorate: true adds the Prow pod utilities to the job as init containers; they pull the pull request's source code into the job so it can leverage its scripts and files
  2. Once we add this job, we create our PR and test it via GitHub slash commands, as shown below.
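
For example, commenting on the PR with Prow's standard trigger commands runs or re-runs the job (the job name matches the presubmit defined above):

/test build-drivers-amazonlinux-presubmit
/retest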


test-infra's Issues

New repository for advocating for Falco

What would you like to be added:

A new repository called falcosecurity/advocacy that has @poiana installed.

Why is this needed:

So we can begin tracking advocacy requests and events, and automating them as a community.

This would be issues like:

  • Request to speak
  • Request for publication
  • Request for Media

Declaratively set up branch protections

I think the issue's title is self-explanatory.

For each repository of the organisation we'd like to choose a git workflow model (hopefully the same for every repository in the org).

And, according to this, we'd like to set up one (or more) branches as protected.
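
Prow's branch-protector component can drive this declaratively from the Prow config; a minimal sketch (the repo entry and required check name are assumptions, not actual settings):

branch-protection:
  orgs:
    falcosecurity:
      repos:
        falco: # hypothetical repo entry
          branches:
            master:
              protect: true
              required_status_checks:
                contexts:
                - build-falco-presubmit # hypothetical required check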

Set up poiana for various new repositories

Motivation

Soon we are going to work on a set of new repositories.
Namely:

  • libscap
  • libsinsp
  • ebpf-probe
  • kernel-module
  • build-service -> driverkit

Feature

I'd like to have Prow set up for these; a sketch of what that entails follows.
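
In practice that likely means registering each new repo in the Prow plugins config; a minimal sketch, assuming the classic plugins.yaml layout and an illustrative plugin set:

plugins:
  falcosecurity/libscap: # repeat this entry for each new repo
  - approve
  - lgtm
  - size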

Alternatives

None.

Additional context

Maybe the repo names will change.

pre-built eBPF probe seems to be halfway missing for amazonlinux

Describe the bug

When I try to run Falco on an EKS cluster, it tries to download https://dl.bintray.com/falcosecurity/driver/96bd9bc560f67742738eb7255aeb4d03046b8045/falco_amazonlinux2_4.14.177-139.254.amzn2.x86_64_1.o, which does not exist. However, https://dl.bintray.com/falcosecurity/driver/96bd9bc560f67742738eb7255aeb4d03046b8045/falco_amazonlinux2_4.14.177-139.254.amzn2.x86_64_1.ko does exist, so maybe the build process needs to be fixed?

How to reproduce it

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
kubectl config set-context --current --namespace=kube-system
cat <<EOF > falco-values.yml
ebpf:
  enabled: true
  settings:
    mountEtcVolume: false
EOF
helm template --name-template=falco falcosecurity/falco -f falco-values.yml > falco.yml
kubectl apply -f falco.yml
kubectl logs daemonset.apps/falco -n kube-system

Expected behaviour

I expect the Falco image to successfully download the appropriate amazonlinux .o file so that it can load it and run.

Screenshots

laptop:~$ kubectl logs daemonset.apps/falco -n kube-system 
Found 4 pods, using pod/falco-8xb4p
* Setting up /usr/src links from host
* Running falco-driver-loader with: driver=bpf, compile=yes, download=yes
* Mounting debugfs
* Found kernel config at /host/boot/config-4.14.177-139.254.amzn2.x86_64
* Trying to compile the eBPF probe (falco_amazonlinux2_4.14.177-139.254.amzn2.x86_64_1.o)
make[1]: *** /lib/modules/4.14.177-139.254.amzn2.x86_64/build: No such file or directory.  Stop.
make: *** [Makefile:18: all] Error 2
mv: cannot stat '/usr/src/falco-96bd9bc560f67742738eb7255aeb4d03046b8045/bpf/probe.o': No such file or directory
* Trying to download a prebuilt eBPF probe from https://dl.bintray.com/falcosecurity/driver/96bd9bc560f67742738eb7255aeb4d03046b8045/falco_amazonlinux2_4.14.177-139.254.amzn2.x86_64_1.o
curl: (22) The requested URL returned error: 404 Not Found
Download failed
Wed Jun 10 20:34:47 2020: Falco initialized with configuration file /etc/falco/falco.yaml
Wed Jun 10 20:34:47 2020: Loading rules from file /etc/falco/falco_rules.yaml:
Wed Jun 10 20:34:48 2020: Loading rules from file /etc/falco/falco_rules.local.yaml:
Wed Jun 10 20:34:49 2020: Unable to load the driver. Exiting.
Wed Jun 10 20:34:49 2020: Runtime error: can't open BPF probe '/root/.falco/falco-bpf.o': Errno 2. Exiting.
laptop:~$ 

Environment

AWS EKS 1.16

  • Falco version: image: docker.io/falcosecurity/falco:0.23.0
  • System info:
  • Cloud provider or hardware configuration:
  • OS:
  • Kernel: falco_amazonlinux2_4.14.177-139.254.amzn2.x86_64_1
  • Installation method: Kubernetes, from rendered helm template (documented above)

Additional context

None that I can think of! Thanks for your help!

Document the driver build grid

What to document

Document the contents and the behavior of the driverkit folder, i.e., the driver build grid.

I also propose to shorten it to DBG, which stands for Dark Builder Gang. 🤣

Error applying workspace during drivers publish (DBG)

Which jobs are failing:

driverkit/publish

Which test(s) are failing:

Attaching Workspace

Since when has it been failing:

Since commit d3f5365

Test link:

https://app.circleci.com/pipelines/github/falcosecurity/test-infra/113/workflows/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/jobs/156/steps

Reason for failure:

Downloading workspace layers
  workflows/workspaces/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/0/60a068d4-61aa-44c2-b4c6-040f5de56421/0/107.tar.gz - 89 MB
  workflows/workspaces/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/0/1acc1536-40c5-42c3-a5fb-bed7eb0b69e4/0/107.tar.gz - 108 MB
  workflows/workspaces/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/0/435171e9-f950-49e0-a68f-a0c148176f6c/0/107.tar.gz - 98 MB
  workflows/workspaces/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/0/c487cc12-4767-4120-aa6b-855fb08f4736/0/107.tar.gz - 88 MB
  workflows/workspaces/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/0/bbda5c6b-4fe0-452e-8911-a10e47fe7b9b/0/107.tar.gz - 122 MB
  workflows/workspaces/550f3e3a-409c-49a4-a71b-d8566a6cbe0e/0/bd6b346c-c5b5-49a8-8807-fc330bfa3d1f/0/107.tar.gz - 15 MB
Applying workspace layers
  60a068d4-61aa-44c2-b4c6-040f5de56421
Concurrent upstream jobs persisted the same file(s) into the workspace:
  - output/failing.log
  - output/.gitignore

Error applying workspace layer for job 60a068d4-61aa-44c2-b4c6-040f5de56421: Concurrent upstream jobs persisted the same file(s)

Anything else we need to know:

Prow job for formatters

Lately we have been introducing formatters for C++ code and CMake files into Falco.

We would like to have a Prow job that checks that the code changes in a PR are formatted according to the established coding conventions.

Alternatively, it could also automatically format the code.

Which do we prefer?
Do we want the ProwJob to create a status check (green/red) that blocks PRs with unformatted code?
Or do we prefer a ProwJob that automatically formats the code if it isn't already, and pushes it (as user poiana)? A sketch of the first option follows.
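
A minimal sketch of the status-check variant, assuming a hypothetical job name, image, and check script:

presubmits:
  falcosecurity/falco:
  - name: check-format-presubmit # hypothetical job name
    decorate: true
    agent: kubernetes
    spec:
      containers:
      - image: example.org/falco/formatters:latest # hypothetical formatter image
        command:
        - /workspace/check-format.sh # hypothetical script; exits non-zero on unformatted code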

Express your thoughts @mstemm @fntlnz !

Configure tide merge and squash labels

Investigate the use of Tide's merge_label config in order to mark PRs that have to be merged with a merge commit.

Same for PRs to be merged by squashing commits; a sketch follows.
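
A minimal sketch of what this could look like in the Tide config (the label names are assumptions; merge_label, squash_label, and rebase_label override the repo's default merge_method on a per-PR basis):

tide:
  merge_method:
    falcosecurity/test-infra: squash # default method for the repo
  merge_label: tide/merge-method-merge # assumed label names
  squash_label: tide/merge-method-squash
  rebase_label: tide/merge-method-rebase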

Create separate GCP account for Falco

Currently I can only add users that have a valid Sysdig email. This issue tracks separating that out so we can grant maintainers access where needed.

Fail to download falco-probe module for Falco v0.22.1

When I install Falco using a DaemonSet (DS) on an OCP cluster (platform RHCOS), the DS creates pods on each node, but the pods end up in a CrashLoopBackOff state.
When I check the logs for a pod, I see the following:

Your kernel headers for kernel 3.10.0-1127.el7.x86_64 cannot be found at
/lib/modules/3.10.0-1127.el7.x86_64/build or /lib/modules/3.10.0-1127.el7.x86_64/source.

  • Running dkms build failed, couldn't find /var/lib/dkms/falco/a259b4bf49c3330d9ad6c3eed9eb1a31954259a6/build/make.log
  • Trying to load a system falco-probe, if present
  • Trying to find precompiled falco-probe for 3.10.0-1127.el7.x86_64
    Found kernel config at /host/boot/config-3.10.0-1127.el7.x86_64
  • Trying to download precompiled module from https://s3.amazonaws.com/download.draios.com/stable/sysdig-probe-binaries/falco-probe-a259b4bf49c3330d9ad6c3eed9eb1a31954259a6-x86_64-3.10.0-1127.el7.x86_64-5223157266b9a46702625b834cadc6ab.ko
    curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number
    Download failed, consider compiling your own falco-probe and loading it or getting in touch with the Falco community
    Wed Apr 29 06:47:52 2020: Falco initialized with configuration file /etc/falco/falco.yaml
    Wed Apr 29 06:47:52 2020: Loading rules from file /etc/falco/falco_rules.yaml:
    Wed Apr 29 06:47:52 2020: Loading rules from file /etc/falco/falco_rules.local.yaml:
    Wed Apr 29 06:47:52 2020: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
    Wed Apr 29 06:47:52 2020: Unable to load the driver. Exiting.

Falco version: 0.22.1
Driver version: a259b4bf49c3330d9ad6c3eed9eb1a31954259a6

uname -r

4.18.0-147.5.1.el8_1.x86_64

The module is not present at the URL specified in the log above.

I followed the installation steps given at: https://falco.org/docs/installation/

Declarative org settings, teams, and memberships

What would you like to be added:

Following up from falcosecurity/falco#621 (i.e., prow), I think we should also implement a declarative way to handle organisation settings, teams, and memberships.

Why is this needed:

This approach is particularly useful to scale the operations around managing the open-source projects.

It is also a way to transparently and automatically document how the open-source organization is structured.

This way we'll remove the manual duties in the GitHub UI, which are difficult to document and maintain over time for growing projects. A sketch of such a declarative config follows.
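
Prow's peribolos component supports exactly this kind of declarative org config; a minimal sketch, with placeholder usernames:

orgs:
  falcosecurity:
    default_repository_permission: read
    admins:
    - some-admin # placeholder usernames
    members:
    - some-member
    teams:
      maintainers:
        privacy: closed
        members:
        - some-maintainer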

Add poiana to cloud native security hub repos

We need to set up the Prow workflow for these repositories:

Subtasks

  • OWNERS files (approvers and reviewers); see the sketch after this list
  • kind labels
  • protect master branches
  • enforce rebase merging strategy
  • release notes (changelog) mechanism from PRs
  • do not merge strategies (WIP, needs rebase, missing release notes, invalid owners)
  • detect missing kind/* labels (issues, PR)
  • plugins
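
For the OWNERS subtask, a minimal per-repo OWNERS file might look like this (usernames and the label are placeholders):

approvers:
- some-maintainer
reviewers:
- some-reviewer
labels:
- kind/feature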

cc @nestorsalceda @bencer

falco_amazonlinux_4.9.75-25.55.amzn1.x86_64_1.ko not downloading

When trying to set up Falco, it is not able to download
https://dl.bintray.com/falcosecurity/driver/96bd9bc560f67742738eb7255aeb4d03046b8045/falco_amazonlinux_4.9.75-25.55.amzn1.x86_64_1.ko

To create the EC2 instance:
aws ec2 run-instances --image-id ami-ba722dc0 --key-name my-key --instance-type t2.micro --region us-east-1 --subnet-id mysubnet --count 1 --profile dev --associate-public-ip-address

[ec2-user@ip-10-0-0-120 ~]$ cat /etc/os-release
NAME="Amazon Linux AMI"
VERSION="2017.09"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2017.09"
PRETTY_NAME="Amazon Linux AMI 2017.09"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2017.09:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"

Can this be fixed?

[ec2-user@ip-10-0-0-120 ~]$ docker run --rm -i -t \
>     --privileged \
>     -v /dev:/host/dev \
>     -v /proc:/host/proc:ro \
>     -v /boot:/host/boot:ro \
>     -v /lib/modules:/host/lib/modules:ro \
>     -v /usr:/host/usr:ro \
>     -v /etc:/host/etc:ro \
>     falcosecurity/falco:latest
* Setting up /usr/src links from host
* Running falco-driver-loader with: driver=module, compile=yes, download=yes
* Unloading falco module, if present
* Trying to dkms install falco module
* Running dkms build failed, couldn't find /var/lib/dkms/falco/96bd9bc560f67742738eb7255aeb4d03046b8045/build/make.log
* Trying to load a system falco driver, if present
* Trying to find locally a prebuilt falco module for kernel 4.9.75-25.55.amzn1.x86_64, if present
* Trying to download prebuilt module from https://dl.bintray.com/falcosecurity/driver/96bd9bc560f67742738eb7255aeb4d03046b8045/falco_amazonlinux_4.9.75-25.55.amzn1.x86_64_1.ko
curl: (22) The requested URL returned error: 404 Not Found
Download failed, consider compiling your own falco module and loading it or getting in touch with the Falco community
2020-06-15T14:53:21+0000: Falco initialized with configuration file /etc/falco/falco.yaml
2020-06-15T14:53:21+0000: Loading rules from file /etc/falco/falco_rules.yaml:
2020-06-15T14:53:21+0000: Loading rules from file /etc/falco/falco_rules.local.yaml:
2020-06-15T14:53:21+0000: Loading rules from file /etc/falco/k8s_audit_rules.yaml:
2020-06-15T14:53:21+0000: Unable to load the driver. Exiting.
2020-06-15T14:53:21+0000: Runtime error: error opening device /host/dev/falco0. Make sure you have root credentials and that the falco module is loaded.. Exiting.

Kernel module and eBPF probe test and build grid

I'm not sure whether we already had an issue for this; I wasn't able to find one.

One of the most important components of Falco itself is the mechanism that collects syscalls from the kernel. Such a mechanism is based on a kernel module or on an eBPF probe.

In both cases, the artifact needs to be built specifically for the kernel in use (its sources and config flags).

What we want is a CI mechanism that can run a grid of jobs to build all those different combinations and push the final results to S3 or Docker Hub.

I'm saying S3 because that's how we've been distributing those until now; we have a similar thing hosted by Sysdig, and we want to replace it with a community-owned one.

I'm saying Docker Hub because the Docker registry is a very good way to distribute artifacts, and I know that @mfdii was doing some work on this in his PR about slim images: falcosecurity/falco#776

@markyjackson-taulia is working on a new CI mechanism for Falco based on JenkinsX #48 - we might want to integrate this there.

If integrating this into JenkinsX is not an option (we might not have enough firepower to do it, or it may be difficult to run a grid there), another option since this week is to just continue using Travis for this part.

@AkihiroSuda, on the rootless containers project, recently discovered that full VMs can be run on Travis; we could build a kernel vmlinuz for each config and kernel version plus a rootfs, build the probes on each one of them, and then just run it on KVM on Travis.

(opening this w/ @leodido)

Re-enable approving review

In #37 we disabled the strict approving-reviewers requirement because the bot wasn't behaving the way we expected it to.

It was requesting a review from every reviewer in order to merge.

As configured, it should require at least one approving reviewer and then allow the merge; a sketch of the relevant settings follows.
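
The behavior is likely governed by the approve plugin settings in plugins.yaml; a minimal sketch, assuming these are the relevant knobs (values are illustrative, not a confirmed fix):

approve:
- repos:
  - falcosecurity
  require_self_approval: false # authors implicitly approve their own PRs
  lgtm_acts_as_approve: false # keep /lgtm and /approve as separate signals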

Check driver configs validity (formally + kernel headers) in advance

Motivation

Currently, the DBG (driver build grid) starts only when a PR (editing files under the driverkit folder) gets merged into master.

Feature

In order to avoid having to merge just to see whether a set of prebuilt drivers builds, I'd like the build step of the DBG to run with the --dryrun option on PR branches.

This will, at the very least, enforce in advance that the DBG config files are formally correct and that the driverkit tool can find the kernel headers for every config. A sketch of such a presubmit follows.
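
A minimal sketch of what this presubmit could look like (the job name is hypothetical, and forwarding --dryrun through the build script is an assumption):

presubmits:
  falcosecurity/test-infra:
  - name: validate-drivers-presubmit # hypothetical job name
    decorate: true
    agent: kubernetes
    run_if_changed: ^driverkit/ # trigger only when driver configs change
    spec:
      containers:
      - image: 292999226676.dkr.ecr.eu-west-1.amazonaws.com/test-infra/build-drivers:latest
        command:
        - /workspace/build-drivers.sh # assumption: the script forwards --dryrun to driverkit
        - --dryrun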

Additional context

Do this as soon as falcosecurity/driverkit#52 is done.

Lifecycle bot

Create a stale "bot" using a set of ProwJobs in order to automatically tag the lifecycle phase of issues.

Currently, we have a stale bot (a GitHub action) that checks issues and pings authors if they haven't been updated in a while.
The way it's configured and the way it works is neither clear nor transparent to our users.

In fact, we have received some complaints about it in the past.

We should remove that bot and implement a better one to automate the lifecycle of issues (and maybe pull requests too) across the whole falcosecurity organization.

Lifecycle

  • frozen: the issue or PR should not be auto-closed due to rottenness
  • stale: the issue or PR has seen no activity for 90 days and has become stale
  • rotten: 30 days have passed since the issue or PR was marked as stale; it will be automatically closed soon
  • close: after another 30 days, the rotten issue is automatically closed

Thus (90 + 30 + 30), an issue with no activity would be closed after 150 days.

Using /lifecycle frozen, an issue can be excluded from this process. A sketch of the stale-marking job follows.
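
The Kubernetes community implements this with periodic ProwJobs running the commenter tool from kubernetes/test-infra; a minimal sketch of the stale-marking job (image path, token mount, and query are assumptions):

periodics:
- name: periodic-mark-stale # hypothetical job name
  interval: 1h
  decorate: true
  spec:
    containers:
    - image: gcr.io/k8s-prow/commenter:latest # assumption: upstream commenter image
      command:
      - commenter
      args:
      - --query=org:falcosecurity -label:lifecycle/frozen -label:lifecycle/stale
      - --updated=2160h # 90 days with no activity
      - --token=/etc/github/oauth # assumption: token secret mount path
      - --comment=/lifecycle stale
      - --confirm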

/cc @falcosecurity/test-infra-maintainers

Check config

Integrate the checkconfig tool into the workflow of this repository (as a presubmit job) in order to avoid deploying a broken configuration. A sketch follows.
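
A minimal sketch of such a presubmit, assuming the upstream checkconfig image and this repo's config paths:

presubmits:
  falcosecurity/test-infra:
  - name: check-prow-config-presubmit # hypothetical job name
    decorate: true
    agent: kubernetes
    run_if_changed: ^config/ # assumption: config lives under config/
    spec:
      containers:
      - image: gcr.io/k8s-prow/checkconfig:latest # upstream Prow image
        command:
        - checkconfig
        args:
        - --config-path=config/config.yaml # assumption: paths within this repo
        - --plugin-config=config/plugins.yaml
        - --job-config-path=config/jobs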

Let tide (prow component) respect the OWNERS file w.r.t. subdirectories

Describe the bug

It seems that poiana (prow) is not fully respecting the OWNERS file.

If an OWNER of a subdirectory approves a PR that touches files in another responsibility area (directory) of the project, poiana does not detect this correctly and counts that approval toward auto-merging (Tide's responsibility).

How to reproduce it

In this PR we have an example of what I'm describing above.

falcosecurity/falco#1292

There have been various other cases in the past. Not reporting them for brevity.

Expected behavior

Poiana should take into consideration that PR approvals from a GitHub user count toward merging only if that user is in the OWNERS file of the directory (or directories) the PR touches.

Additional context

This is a very misleading (and potentially dangerous) misbehavior. We should investigate further whether Prow lacks this ability altogether or whether it's our fault (in how we configured it).

Implement lifecycle stale and rotten in Prow

Motivation

At the moment we use a GitHub bot to manage stale PRs and issues; everyone is annoyed by it because it's not very configurable and it's difficult to manage from the issue/PR itself.

We need something more flexible. Using a ProwJob seems a good idea, because we would manage everything from a single point, and because with ProwJobs you can essentially execute whatever you like while using a framework that has interaction points with issues and PRs.

Feature

To achieve that we need to:

  • Enable Crier and have it report back to GitHub; this is needed for ProwJobs to work correctly.
  • Create a ProwJob to change the status of issues. A very good example is how the Kubernetes community does this; see here.

We want to obtain something like this

Alternatives

Continue using the current stale bot as a GitHub app.

Additional context

The current deployments live in a Kubernetes cluster managed by the maintainers' team; whoever works on this will need to replicate the deployment on their own development machine and then send the PR. We don't do development directly on the prod infrastructure.

New Kubernetes cluster for Prow on the CNCF cluster

Motivation

The Kubernetes cluster we currently use for Prow (poiana) is having issues.

There's no better moment than now to switch to the CNCF cluster for this resource; this way, 100% of the development process lives under the CNCF hat.

Feature

  • Create a Kubernetes cluster in the CNCF packet.net account
  • It will need an ingress to configure GitHub webhooks
  • No need for persistent volumes at the moment; Prow is used in a stateless way for now
  • Configure the needed resources for Prow to work; we have a handy Makefile to accomplish that

Alternatives

Fix the cluster we have now.

Additional context

CNCF cluster request issue: cncf/cluster#123

Rename test-infra

Motivation

I believe the name test-infra is misleading.

This repository seems to hold configuration for Falco's infrastructure. I don't even see anything related to testing in here.

Can we rename this to infrastructure or something a little more relevant? There are issues like https://github.com/falcosecurity/falco/issues/1262 that I believe could be mitigated if folks had a better understanding of this repository and what it's used for.

