libseccomp's Introduction

Enhanced Seccomp Helper Library

https://github.com/seccomp/libseccomp

The libseccomp library provides an easy to use, platform independent, interface to the Linux Kernel's syscall filtering mechanism. The libseccomp API is designed to abstract away the underlying BPF based syscall filter language and present a more conventional function-call based filtering interface that should be familiar to, and easily adopted by, application developers.

Online Resources

The library source repository currently lives on GitHub at the following URL: https://github.com/seccomp/libseccomp

The Go language bindings repository currently lives on GitHub at the following URL:

Supported Architectures

The libseccomp library currently supports the architectures listed below:

  • 32-bit x86 (x86)
  • 64-bit x86 (x86_64)
  • 64-bit x86 x32 ABI (x32)
  • 32-bit ARM EABI (arm)
  • 64-bit ARM (aarch64)
  • 64-bit LoongArch (loongarch64)
  • 32-bit Motorola 68000 (m68k)
  • 32-bit MIPS (mips)
  • 32-bit MIPS little endian (mipsel)
  • 64-bit MIPS (mips64)
  • 64-bit MIPS little endian (mipsel64)
  • 64-bit MIPS n32 ABI (mips64n32)
  • 64-bit MIPS n32 ABI little endian (mipsel64n32)
  • 32-bit PA-RISC (parisc)
  • 64-bit PA-RISC (parisc64)
  • 32-bit PowerPC (ppc)
  • 64-bit PowerPC (ppc64)
  • 64-bit PowerPC little endian (ppc64le)
  • 32-bit s390 (s390)
  • 64-bit s390x (s390x)
  • 64-bit RISC-V (riscv64)
  • 32-bit SuperH big endian (sheb)
  • 32-bit SuperH (sh)

Documentation

The "doc/" directory contains all of the currently available documentation, mostly in the form of manpages. The top level directory also contains a README file (this file) as well as the LICENSE, CREDITS, CONTRIBUTING, and CHANGELOG files.

Those who are interested in contributing to the project are encouraged to read the CONTRIBUTING file in the top level directory.

Verifying Release Tarballs

Before use you should verify the downloaded release tarballs and checksums using the detached signatures supplied as part of the release; the detached signature files are the "*.asc" files. If you have GnuPG installed you can verify detached signatures using the following command:

# gpg --verify file.asc file

At present, only the following keys, specified via the fingerprints below, are authorized to sign official libseccomp releases:

Paul Moore <[email protected]>
7100 AADF AE6E 6E94 0D2E  0AD6 55E4 5A5A E8CA 7C8A

Tom Hromatka <[email protected]>
47A6 8FCE 37C7 D702 4FD6  5E11 356C E62C 2B52 4099

More information on GnuPG can be found at their website, https://gnupg.org.

Building and Installing the Library

If you are building the libseccomp library from an official release tarball, you should follow the familiar three step process used by most autotools based applications:

# ./configure
# make [V=0|1]
# make install

However, if you are building the library from sources retrieved from the source repository you may need to run the autogen.sh script before running configure. In both cases, running "./configure -h" will display a list of build-time configuration options.

Testing the Library

There are a number of tests located in the "tests/" directory and a make target which can be used to help automate their execution. If you want to run the standard regression tests you can execute the following after building the library:

# make check

These tests can be safely run on any Linux system, even those where the kernel does not support seccomp-bpf (seccomp mode 2). However, be warned that the test run can take a while and produces a lot of output.

The generated seccomp-bpf filters can be tested on a live system using the "live" tests; they can be executed using the following commands:

# make check-build
# (cd tests; ./regression -T live)

These tests will fail if the running Linux Kernel does not provide the necessary support.

Developer Tools

The "tools/" directory includes a number of tools which may be helpful in the development of the library, or applications using the library. Not all of these tools are installed by default.

Bug and Vulnerability Reporting

Problems with the libseccomp library can be reported using the GitHub issue tracking system. Those who wish to privately report potential vulnerabilities should follow the directions in SECURITY.md.

libseccomp's Issues

BUG: __NR_SYSCALL_BASE conflicts between arm/mips

If you build libseccomp on an ARM system, its kernel headers define __NR_SYSCALL_BASE. This conflicts with the defines libseccomp uses, as in the MIPS code:

libtool: compile:  armv7a-hardfloat-linux-gnueabi-gcc -DHAVE_CONFIG_H \
   -I. -I.../libseccomp-2.2.3/src -I.. -I.../libseccomp-2.2.3/include -I../include \
   -Wall -O2 -pipe -march=armv7-a -mfpu=neon -fPIC -DPIC -fvisibility=hidden \
   -O2 -pipe -march=armv7-a -mfpu=neon -c .../libseccomp-2.2.3/src/arch-mips-syscalls.c \
   -fPIC -DPIC -o .libs/libseccomp_la-arch-mips-syscalls.o
.../libseccomp-2.2.3/src/arch-mips-syscalls.c:31:0: warning: "__NR_SYSCALL_BASE" redefined [enabled by default]
 #define __NR_SYSCALL_BASE 4000
 ^
In file included from .../libseccomp-2.2.3/include/seccomp.h:27:0,
                 from .../libseccomp-2.2.3/src/arch-mips-syscalls.c:25:
/usr/include/asm/unistd.h:19:0: note: this is the location of the previous definition
 #define __NR_SYSCALL_BASE 0
 ^

BUG: tests fail on x86 due to direct wired socket syscalls on Linux 4.3

test mode: c
test type: bpf-valgrind
batch name: 15-basic-resolver
test mode: c
test type: basic
Test 15-basic-resolver%%001-00001 result: FAILURE 15-basic-resolver rc=1
batch name: 16-sim-arch_basic
test mode: c
test type: bpf-sim
test arch: x86
Test 16-sim-arch_basic%%001-00001 result: ERROR 16-sim-arch_basic rc=14
test arch: x86
Test 16-sim-arch_basic%%002-00001 result: ERROR 16-sim-arch_basic rc=14
test arch: x86
Test 16-sim-arch_basic%%003-00001 result: ERROR 16-sim-arch_basic rc=14
test arch: x86
Test 16-sim-arch_basic%%004-00001 result: ERROR 16-sim-arch_basic rc=14
test arch: x86
Test 16-sim-arch_basic%%005-00001 result: ERROR 16-sim-arch_basic rc=14
test arch: x86
Test 16-sim-arch_basic%%006-00001 result: ERROR 16-sim-arch_basic rc=14
Test 16-sim-arch_basic%%007-00001 result: ERROR 16-sim-arch_basic rc=14
Test 16-sim-arch_basic%%008-00001 result: ERROR 16-sim-arch_basic rc=14
Test 16-sim-arch_basic%%009-00001 result: ERROR 16-sim-arch_basic rc=14
Test 16-sim-arch_basic%%010-00001 result: ERROR 16-sim-arch_basic rc=14
Test 16-sim-arch_basic%%011-00001 result: ERROR 16-sim-arch_basic rc=14
Test 16-sim-arch_basic%%012-00001 result: ERROR 16-sim-arch_basic rc=14
test mode: c
test type: bpf-valgrind
batch name: 17-sim-arch_merge
test mode: c
test type: bpf-sim
Test 17-sim-arch_merge%%001-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%002-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%003-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%004-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%005-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%006-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%007-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%008-00001 result: ERROR 17-sim-arch_merge rc=14
Test 17-sim-arch_merge%%009-00001 result: ERROR 17-sim-arch_merge rc=14

BUG: building for mips produces many __NR_cacheflush redefined warnings

Building in a mips/n32 userland with current git (7f3ae6e):

In file included from system.c:26:0:
../include/seccomp.h:1249:0: warning: "__NR_cacheflush" redefined [enabled by default]
 #define __NR_cacheflush  __PNR_cacheflush
 ^
In file included from ../include/seccomp.h:27:0,
                 from system.c:26:
/usr/include/asm/unistd.h:928:0: note: this is the location of the previous definition
 #define __NR_cacheflush   (__NR_Linux + 197)
 ^

BUG: problems using the Python bindings

This is what I have so far:

$ sudo make install
  PYTHON   build
make[1]: Entering directory '/home/gene/libseccomp/src/python'
VERSION_RELEASE="0.0.0" CPPFLAGS="-I\../../include -I../../include " CFLAGS="-Wall -g -O2" LDFLAGS="-Wl,-z -Wl,relro " /usr/bin/env python ./setup.py install --prefix=//usr/local
running install
running build
running build_ext
skipping 'seccomp.c' Cython extension (up-to-date)
running install_lib
running install_egg_info
Removing //usr/local/lib64/python2.7/site-packages/seccomp-0.0.0-py2.7.egg-info
Writing //usr/local/lib64/python2.7/site-packages/seccomp-0.0.0-py2.7.egg-info
make[1]: Nothing to be done for 'install-data-am'.
make[1]: Leaving directory '/home/gene/libseccomp/src/python'

I can't seem to import it though:

$ python
Python 2.7.9 (default, Feb  6 2015, 14:42:41) 
[GCC 5.0.0 20150205 (Red Hat 5.0.0-0.7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from seccomp import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named seccomp
>>> 

Probably I am missing something.

Q: syscall using vDSO

Hey!

I am trying to intercept some syscalls with libseccomp while they are implemented through vDSO.

seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(clock_gettime), 1,
                         SCMP_A0(SCMP_CMP_EQ, CLOCK_REALTIME))

If my clock source is tsc, this doesn't kill anything. If this is acpi_pm, the process will get killed. Kernel documentation says:

New code will use the vDSO, and vDSO-issued system calls are indistinguishable from normal system calls.

There is also some stuff about vsyscalls. The documentation seems to say that catching vDSO syscalls should be the same as catching non-vDSO syscalls. Is it a limitation of libseccomp (in the way the BPF filter is built) or a limitation in the kernel or my own fault that I am unable to catch a vDSO-syscall?
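One likely explanation, offered here as an assumption rather than something the quoted documentation states: with the tsc clock source the C library's clock_gettime() is usually answered by the vDSO entirely in user space, so no system call is entered and a seccomp filter has nothing to match. A call made through syscall(2) always enters the kernel. The sketch below contrasts the two paths; the helper names are illustrative.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Goes through the C library, which may answer from the vDSO without
 * entering the kernel, so a seccomp filter may never see the call. */
static int gettime_libc(struct timespec *ts)
{
	return clock_gettime(CLOCK_REALTIME, ts);
}

/* Forces a real system call, which a seccomp filter can match.
 * Assumes a 64-bit ABI where SYS_clock_gettime takes struct timespec. */
static int gettime_raw(struct timespec *ts)
{
	return (int)syscall(SYS_clock_gettime, CLOCK_REALTIME, ts);
}
```

Running both helpers under the filter from the question should show only gettime_raw() being killed when the vDSO path is active.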

RFE: support "maximum kernel version"

As system calls are added to the kernel, I feel there is not enough discussion by default of the wide variety of applications that will suddenly gain access to a new attack surface.

The canonical example here is perf_event_open(), the source of numerous CVEs. While perf is awesome, my (e.g.) web server should not (by default) be able to use it.

It's possible to use seccomp today to blacklist. Whitelists can get very difficult to manage.

One thing that might be useful is a filter for any system calls newer than a particular kernel version, say 3.10. That way, each new system call would have to be verified for use in e.g. containers before it's added. Upgrading the kernel wouldn't suddenly expose containers to new attack surface.

In a discussion with @pcmoore he indicated this could be another annotation in the struct in e.g. arch-x86-syscalls.c.
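A minimal sketch of what such an annotation could look like. The struct, the KVER() macro, and newer_than() are hypothetical and are not part of libseccomp; the table lists only a few syscalls with their introducing releases.

```c
#include <stddef.h>
#include <string.h>

/* Packs a kernel version into one comparable integer. */
#define KVER(major, minor) (((major) << 16) | (minor))

/* Hypothetical per-syscall annotation: the kernel release that added it. */
struct syscall_since {
	const char *name;
	unsigned int since;
};

static const struct syscall_since since_table[] = {
	{ "getrandom",    KVER(3, 17) },
	{ "memfd_create", KVER(3, 17) },
	{ "userfaultfd",  KVER(4, 3) },
};

/* Returns 1 if the syscall is newer than max_kver, 0 if it is not,
 * and -1 if the name is not in the table. */
static int newer_than(const char *name, unsigned int max_kver)
{
	size_t i;

	for (i = 0; i < sizeof(since_table) / sizeof(since_table[0]); i++) {
		if (strcmp(since_table[i].name, name) == 0)
			return since_table[i].since > max_kver;
	}
	return -1;
}
```

A filter builder could call newer_than(name, KVER(3, 10)) for each syscall and deny those that return 1.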

BUG: seccomp_rule_add man page is incorrect

As reported by @stevegrubb, the seccomp_rule_add man page incorrectly states that the individual rule actions are triggered when a syscall does not match the filter rules:

   SCMP_ACT_KILL
          The  thread will be killed by the kernel when it calls a syscall
          that does not match any of the configured seccomp filter rules.

All of the actions in the seccomp_rule_add man page need to be checked and it would probably be a good idea to check the other related man pages as well.

RFE: consider removing negative (non-canonical) __NR_xyz defines

IMO __NR_xyz for syscall xyz should either match the actual kernel-assigned number or should not be defined. This code, for example, can cause confusion:

#define __PNR_userfaultfd   -10200
#ifndef __NR_userfaultfd
#define __NR_userfaultfd    __PNR_userfaultfd
#endif /* __NR_userfaultfd */

libseccomp is not the sole user of __NR_xyz defines.

Using LIBSECCOMP_xyz or sticking with __PNR_xyz would be fine.
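To illustrate the confusion: once a placeholder is defined, "#ifdef __NR_xyz" no longer tells other code whether the kernel headers know the syscall, so callers have to fall back to a sign check. The sketch below assumes only that kernel-assigned numbers are non-negative; the macro and function names are illustrative.

```c
#include <sys/syscall.h>

/* Illustrative stand-in for a pseudo number (value from the snippet above). */
#define EXAMPLE_PNR_userfaultfd (-10200)

/* Kernel-assigned syscall numbers are non-negative, so a negative value
 * can only be a placeholder and must never be passed to syscall(2). */
static int is_kernel_syscall_nr(long nr)
{
	return nr >= 0;
}
```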

Q: clarify the seccomp_syscall_resolve_name_rewrite() behavior

I was reading the doc and the source in parallel, and I'm still not sure about the behavior and recommended usage of seccomp_syscall_resolve_name_rewrite().

According to the manpage:

[seccomp_syscall_resolve_name*()] resolve the commonly used syscall
name to the syscall number used by the kernel and the rest of the libseccomp API,
with seccomp_syscall_resolve_name_rewrite() rewriting the syscall number for
architectures that modify the syscall.

Some doubts I have:

  • How does re-writing interact with pseudo-syscalls? Perl bindings are a bit more explicit here, but I'm not sure if correct and cannot reproduce that behavior on x86_64. I'll be happy to get some hints/pointer/examples here.
  • Which name resolving function should be recommended and used by consumers? "_rewrite" seems to be more precise in some cases (which?), but is the non-rewrite behavior a concern people should be aware of? Does it have security implications?
  • The default/simpler seccomp_syscall_resolve_name has a non-rewrite behavior. Is that on purpose and what are the implications? Would it make sense to switch it to the rewrite behavior (and if not, why)?
  • When resolving with a non-rewrite behavior, will other libseccomp functions perform the rewriting internally or not?

Sorry for the long list of questions, but I didn't get a clear answer for this when navigating through the code. I think I got the general idea but I fear that some details in it may be wrong, so I'd prefer if they could be clarified here from a knowledgeable source. I'll be happy to expand the docs (if needed) with those inputs.

RFE: seccomp() syscall and TSYNC flag support never configured

Currently the tsync support is wrapped with HAVE_SECCOMP but there is actually no configure test, perhaps as no existing libc has the seccomp call available.

However it would be perfectly normal to just call seccomp(2) with syscall() as glibc policy seems to be mostly not to expose specialist syscalls at all.

In which case the library could attempt to use syscall(SYS_seccomp, ...) if tsync was set, and fail the filter load if not, rather than just having a tsync option which is not usable even if supported by the kernel.

This behaviour would seem more useful as tsync support is quite widespread, and it is important for many applications. Otherwise the Go libseccomp wrapper should have a warning saying always use runtime.LockOSThread() if you use this library or filtering may not be applied to the calls you expect.
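A rough sketch of the proposed approach, calling seccomp(2) through syscall() with the TSYNC flag. It assumes kernel headers that define SYS_seccomp and SECCOMP_FILTER_FLAG_TSYNC; the function names are illustrative and this is not how libseccomp itself is structured.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper, since the C library may not export seccomp(). */
static int raw_seccomp(unsigned int op, unsigned int flags, void *args)
{
	return (int)syscall(SYS_seccomp, op, flags, args);
}

/* Loads an allow-everything filter into every thread of the process.
 * Returns 0 on success and -1 with errno set on error; with TSYNC the
 * kernel can also return the ID of a thread it could not synchronize.
 * Any non-zero result should fail the load instead of silently
 * filtering a single thread. */
static int load_allow_all_tsync(void)
{
	struct sock_filter insns[] = {
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
		.filter = insns,
	};

	/* Required for unprivileged callers before installing a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	return raw_seccomp(SECCOMP_SET_MODE_FILTER,
			   SECCOMP_FILTER_FLAG_TSYNC, &prog);
}
```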

RFE: provide detached signatures of source tarballs

Hello,
First of all thank you for taking the time to sign your hashes! Unfortunately, there is a bit of a problem downstream. Most packaging systems require a GPG signature and the original source file, otherwise they fail. Hence in-line GPG causes a packaging dilemma for maintainers.
As seen in the following output from building a PKGBUILD:

| ==> Making package: libseccomp 2.3.1-1 (Sat Nov 5 09:49:11 EDT 2016)
| ==> Retrieving sources...
| -> Found libseccomp-2.3.1.tar.gz
| -> Found libseccomp-2.3.1.tar.gz.SHA256SUM.asc
| ==> Validating source files with sha256sums...
| libseccomp-2.3.1.tar.gz ... Passed
| libseccomp-2.3.1.tar.gz.SHA256SUM.asc ... Passed
| ==> Verifying source file signatures with gpg...
| libseccomp-2.3.1.tar.gz.SHA256SUM ... SOURCE FILE NOT FOUND
| ==> ERROR: One or more PGP signatures could not be verified!

The problem is there is no libseccomp-2.3.1.tar.gz.SHA256SUM for the packager to download.

I see two possible ways of fixing this issue, the second being the most proper.

Solution 1) Provide an unsigned libseccomp-2.3.1.tar.gz.SHA256SUM, although this still does not allow one to run validpgpkeys script to verify the source tarball, it does verify the sha256 is correct which could be manually compared with the PKGBUILD creators. (Not optimal)

Solution 2) Consider creating actual detached signatures for your source tarballs as well. Debian is pushing for this good standard and a tutorial for it is available here: https://wiki.debian.org/Creating%20signed%20GitHub%20releases

Thank you for your time.

RFE: sys/prctl.h should not be included when using musl

make -j5 --no-print-directory
make --quiet --no-print-directory all-recursive
Making all in include
Making all in src
Making all in .
In file included from /var/tmp/tmpfs/portage/sys-libs/libseccomp-2.2.3/work/libseccomp-2.2.3/src/system.h:26:0,
from /var/tmp/tmpfs/portage/sys-libs/libseccomp-2.2.3/work/libseccomp-2.2.3/src/arch.h:31,
from /var/tmp/tmpfs/portage/sys-libs/libseccomp-2.2.3/work/libseccomp-2.2.3/src/db.h:30,
from /var/tmp/tmpfs/portage/sys-libs/libseccomp-2.2.3/work/libseccomp-2.2.3/src/system.c:28:
/usr/include/linux/prctl.h:133:8: error: redefinition of ‘struct prctl_mm_map’
struct prctl_mm_map {
^
In file included from /var/tmp/tmpfs/portage/sys-libs/libseccomp-2.2.3/work/libseccomp-2.2.3/src/system.c:24:0:
/usr/include/sys/prctl.h:88:8: note: originally defined here
struct prctl_mm_map {
^
Makefile:642: recipe for target 'libseccomp_la-system.lo' failed

BUG: EFAULT returned on 32-bit x86 for some tests

Using v2.3.0, some tests fail when seccomp is built with -m32 running on amd64.

[  819s]  batch name: 15-basic-resolver
[  819s]  test mode:  c
[  819s]  test type:  basic
[  819s] Test 15-basic-resolver%%001-00001 result:   FAILURE 15-basic-resolver rc=1
[  819s]  batch name: 16-sim-arch_basic
[  819s]  test mode:  c
[  819s]  test type:  bpf-sim
[  819s]  test arch:  x86
[  820s] Test 16-sim-arch_basic%%001-00001 result:   ERROR 16-sim-arch_basic rc=14
[  820s]  test arch:  x86
[  820s] Test 16-sim-arch_basic%%002-00001 result:   ERROR 16-sim-arch_basic rc=14
[  820s]  test arch:  x86
[  820s] Test 16-sim-arch_basic%%003-00001 result:   ERROR 16-sim-arch_basic rc=14
[...]
[  824s] Test 16-sim-arch_basic%%012-00001 result:   ERROR 16-sim-arch_basic rc=14
[  824s]  test mode:  c
[  824s]  test type:  bpf-valgrind
[  829s] Test 16-sim-arch_basic%%013-00001 result:   FAILURE 16-sim-arch_basic rc=14
[  829s]  batch name: 17-sim-arch_merge
[  829s]  test mode:  c
[  829s]  test type:  bpf-sim
[  829s] Test 17-sim-arch_merge%%001-00001 result:   ERROR 17-sim-arch_merge rc=14
[  830s] Test 17-sim-arch_merge%%002-00001 result:   ERROR 17-sim-arch_merge rc=14
[  830s] Test 17-sim-arch_merge%%003-00001 result:   ERROR 17-sim-arch_merge rc=14
[  830s] Test 17-sim-arch_merge%%004-00001 result:   ERROR 17-sim-arch_merge rc=14
[ 1159s] Test 30-sim-socket_syscalls%%015-00001 result:   ERROR 30-sim-socket_syscalls rc=14
[ 1159s]  test mode:  c
[ 1159s]  test type:  bpf-valgrind
[ 1163s] Test 30-sim-socket_syscalls%%016-00001 result:   FAILURE 30-sim-socket_syscalls rc=14
abuild@ares40:/home/abuild/rpmbuild/BUILD/libseccomp-2.3.0/tests> ./30-sim-socket_syscalls; echo $?
14
abuild@ares40:/home/abuild/rpmbuild/BUILD/libseccomp-2.3.0/tests> ./29-sim-pseudo_syscall ; echo $?
#
# pseudo filter code start
#
# filter for arch x86 (1073741827)
if ($arch == 1073741827)
  # default action
  action ALLOW;
# invalid architecture action
action KILL;
#
# pseudo filter code end
#
0

So it does not output anything beyond 14 (EFAULT).

BUG: SCMP_CMP_GT/GE/LT/LE not working as expected for negative syscall arguments

Hi!

I'm not sure if the current behavior of SCMP_CMP_GT/GE/LT/LE is working as intended or if there is a bug in its implementation. The man page for seccomp_rule_add has only this to say about SCMP_CMP_GT:

SCMP_CMP_GT:
        Matches when the argument value is greater than the datum value,
        example:

        SCMP_CMP( arg , SCMP_CMP_GT , datum )

The man page does not specify the type for datum and has examples for various (implied) types (and one cast to scmp_datum_t).

Based on the man page, I expected something like this to work for any value given to setpriority's 3rd argument (assume default policy of SCMP_ACT_ALLOW for this):

rc = seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM),
		SCMP_SYS(setpriority),
		3,
		SCMP_A0(SCMP_CMP_EQ, PRIO_PROCESS),
		SCMP_A1(SCMP_CMP_EQ, 0),
		SCMP_A2(SCMP_CMP_GT, 0));

Instead, setpriority(PRIO_PROCESS, 0, -1) results in the syscall being blocked when '-1' is obviously less than '0'. setpriority(PRIO_PROCESS, 0, 0) and setpriority(PRIO_PROCESS, 0, 1) work as expected. What is happening is that '-1' is being converted to scmp_datum_t (uint64_t from seccomp.h.in) which of course makes it positive, but SCMP_CMP_GT and friends aren't handling this conversion. SCMP_CMP_EQ works just fine with a negative datum (guessing datum is still positive (I didn't verify), but the comparison is between converted scmp_datum_t).
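The conversion described above can be reproduced in plain C without libseccomp. This is only a model of the explanation in this report (both sides treated as 64-bit unsigned values), not the library's or the kernel's actual comparison code; the function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compares the way the report describes: both values are converted to
 * a 64-bit unsigned type first, so -1 becomes the largest value. */
static bool gt_as_unsigned_datum(int arg, int datum)
{
	return (uint64_t)arg > (uint64_t)datum;
}

/* The comparison the reporter expected: ordinary signed arithmetic. */
static bool gt_signed(int arg, int datum)
{
	return arg > datum;
}
```

With arg = -1 and datum = 0 the unsigned variant reports "greater than" while the signed one does not, which matches the observed blocking of setpriority(PRIO_PROCESS, 0, -1).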

This behavior was confirmed with 2.1.0+dfsg-1 (Ubuntu 14.04 LTS, 3.13 kernel), 2.2.3-3ubuntu3 (Ubuntu 16.04 LTS, 4.9 kernel), 2.3.1-2ubuntu2 (Ubuntu 17.04 dev release, 4.9 kernel) and master from a few moments ago (on Ubuntu 17.04 dev release, 4.9 kernel), all on amd64.

AFAICT, there are no tests for SCMP_CMP_GT and SCMP_CMP_LE. The few tests for SCMP_CMP_LT don't seem to account for negative values and neither does the one for SCMP_CMP_GE (please correct me if I'm wrong).

The question is then: is this behavior intentional? If so, while I admit that it could be argued that the man page is accurate since these are working perfectly correctly when understanding scmp_datum_t is the data type, this situation is not immediately clear and the man page should probably say that applications need to account for this. Otherwise, this appears to be a bug in the implementation for SCMP_CMP_GT/GE/LT/LE.

Here is a small program that demonstrates this issue with SCMP_CMP_GT, though GE, LT and LE can all be observed to have the same behavior:

/*
 * gcc -o test-nice test-nice.c -lseccomp
 * sudo ./test-nice 0 1  # should be denied
 * sudo ./test-nice 0 0  # should be allowed
 * sudo ./test-nice 0 -1 # should be allowed?
 */
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <ctype.h>
#include <string.h>
#include <fcntl.h>
#include <stdarg.h>
#include <seccomp.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
	if (argc < 3) {
		fprintf(stderr, "test-nice N N\n");
		return 1;
	}

	int rc = 0;
	scmp_filter_ctx ctx = NULL;
	int filter_n = atoi(argv[1]);
	int n = atoi(argv[2]);

	// Allow everything by default for this test
	ctx = seccomp_init(SCMP_ACT_ALLOW);
	if (ctx == NULL)
		return ENOMEM;

	printf("set EPERM for nice(>%d)\n", filter_n);
	rc = seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM),
			SCMP_SYS(setpriority),
			3,
			SCMP_A0(SCMP_CMP_EQ, PRIO_PROCESS),
			SCMP_A1(SCMP_CMP_EQ, 0),
			SCMP_A2(SCMP_CMP_GT, filter_n));

	if (rc != 0) {
		perror("seccomp_rule_add failed");
		goto out;
	}

	rc = seccomp_load(ctx);
	if (rc != 0) {
		perror("seccomp_load failed");
		goto out;
	}

	// try to use the filtered syscall
	errno = 0;
	printf("Attempting nice(%d)\n", n);
	nice(n);
	if (errno != 0) {
		perror("could not nice");
		if (filter_n > n)
			fprintf(stderr, "nice(%d) unsuccessful. bug?\n", n);
		rc = 1;
		goto out;
	} else
		printf("nice(%d) successful\n", n);

out:
	seccomp_release(ctx);

	return rc;
}

BUG: errors in seccomp_rule_add(3) manpage example code

Using the example code in the manpage I get several errors:

error: ‘NULL’ undeclared (first use in this function) if (ctx == NULL)

solution: #include <stddef.h>

error: Bad system call

solution: exit_group() and exit() are needed. whitelisting will help

working example code

compiled with "gcc libeccompExample.c -lseccomp"

Now it seems to work but:
running "strace -xfc a.out" results in:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         2           read
  0.00    0.000000           0         5           open
  0.00    0.000000           0         4           close
  0.00    0.000000           0         4           fstat
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         6           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         4           brk
  0.00    0.000000           0         1           access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           prctl
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         2         1 seccomp
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    40         1 total

So there is an error with the seccomp syscall?
(seccomp still seems to work)

running "strace a.out" again:

execve("./a.out", ["./a.out"], [/* 28 vars */]) = 0
brk(NULL) = 0x1941990
access("/etc/ld.so.preload", R_OK) = 0
open("/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3) = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=118589, ...}) = 0
mmap(NULL, 118589, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3ceadf98000
close(3) = 0
open("/usr/lib/libseccomp.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\361\1\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=264456, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3ceadf96000
mmap(NULL, 2359552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3ceadb56000
mprotect(0x3ceadb82000, 2093056, PROT_NONE) = 0
mmap(0x3ceadd81000, 90112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2b000) = 0x3ceadd81000
close(3) = 0
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\3\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1951744, ...}) = 0
mmap(NULL, 3791152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3cead7b8000
mprotect(0x3cead94d000, 2093056, PROT_NONE) = 0
mmap(0x3ceadb4c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x194000) = 0x3ceadb4c000
mmap(0x3ceadb52000, 14640, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3ceadb52000
close(3) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3ceadf94000
arch_prctl(ARCH_SET_FS, 0x3ceadf94700) = 0
mprotect(0x3ceadb4c000, 16384, PROT_READ) = 0
mprotect(0x3ceadd81000, 86016, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ) = 0
mprotect(0x3ceadfb9000, 4096, PROT_READ) = 0
munmap(0x3ceadf98000, 118589) = 0
brk(NULL) = 0x1941990
brk(0x1962990) = 0x1962990
brk(0x1963000) = 0x1963000
open("file.txt", O_RDONLY) = 3
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) = 0
seccomp(SECCOMP_SET_MODE_STRICT, 1, NULL) = -1 EINVAL (Invalid argument)
seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=28, filter=0x1944010}) = 0
exit_group(0) = ?
+++ exited with 0 +++

we have two seccomp calls:

seccomp(SECCOMP_SET_MODE_STRICT, 1, NULL) = -1 EINVAL (Invalid argument)
seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=28, filter=0x1944010}) = 0

It seems the first one fails. Why is that?
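One plausible reading, stated here as an assumption that this report does not confirm: the first call is a deliberate probe for whether the kernel implements seccomp(2) at all. Strict mode requires flags == 0, so passing 1 can never change the process state; EINVAL shows the syscall exists, while ENOSYS would show it does not. A sketch of such a probe (the function name is illustrative):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Returns 1 if the kernel implements seccomp(2), 0 otherwise.
 * The arguments are intentionally invalid, so the call always fails:
 * EINVAL means the syscall exists and rejected them, anything else
 * (typically ENOSYS) is treated as "not available". */
static int have_seccomp_syscall(void)
{
	long rc = syscall(SYS_seccomp, SECCOMP_SET_MODE_STRICT, 1, NULL);

	return rc == -1 && errno == EINVAL;
}
```

Under that reading the EINVAL in the trace is expected and harmless, and the second call is the one that actually installs the filter.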

Tested on linux 3.8.11 with libseccomp 2.3.1

BUG: cannot add blacklist items to a whitelist

For docker we ship a default seccomp profile that is a whitelist using libseccomp-golang see https://github.com/docker/docker/tree/master/profiles/seccomp

However it seems to be impossible to have a default action of ERRNO and then add an ERRNO rule to blacklist a particular pattern as we get a "requested action matches default action of filter" error message passed through from libseccomp.

We want to block some particular argument values of a particular syscall (see moby/moby#23893 ), so eg setsockopt(x, 0, 41, x) and setsockopt(x, 0, 96, x) and setsockopt(x, 41, 41, x) should be denied, while allowing any other values, but although this can easily be written with a socket filter directly, it does not seem to be possible to write with libseccomp.

I was wondering if perhaps it could accept rules with the same action as the default and construct the appropriate bpf.

This is also mentioned in this issue comment, although there is a workaround in that case: #27 (comment)

RFE: add runtime autodetection

This should be useful eg, for systemd/systemd#3882 .

I did a tentative patch for systemd that reads if Seccomp is mentioned in /proc/self/status:

bool is_seccomp_enabled() {
        _cleanup_free_ char* field = NULL;
        return get_proc_field("/proc/self/status", "Seccomp", "\n", &field) == 0;
}

But I think it is best if libseccomp provides a canonical way to test for the availability of seccomp.
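For comparison, the same check as the systemd patch written in plain C. This is only a sketch of the /proc-based approach, not an existing libseccomp function; the function name and the path parameter are illustrative (the path is a parameter so the parser can be exercised against a sample file).

```c
#include <stdio.h>

/* Returns the value of the "Seccomp:" field in a /proc/<pid>/status
 * style file (0 = disabled, 1 = strict, 2 = filter), or -1 if the file
 * cannot be read or the field is missing, which suggests a kernel
 * built without seccomp support. */
static int proc_seccomp_mode(const char *path)
{
	FILE *f = fopen(path, "r");
	char line[256];
	int mode = -1;

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		/* "Seccomp_filters:" does not match, the ':' must follow directly. */
		if (sscanf(line, "Seccomp: %d", &mode) == 1)
			break;
	}
	fclose(f);
	return mode;
}
```

A caller would typically use proc_seccomp_mode("/proc/self/status") >= 0 as the availability test.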

strange behaviour on v4.4 kernels with libseccomp v2.3.1

Hello,

I'm in an i686 chroot, and using scmp_sys_resolver I see multiple regressions with syscall look-ups. The test I run checks that number->name->number lookups are round-trip safe, and they appear not to be, as shown below:

# ./scmp_sys_resolver 373
shutdown
# ./scmp_sys_resolver shutdown
-113
# ./scmp_sys_resolver '\-113'
-1
# ./scmp_sys_resolver -t shutdown
102
# ./scmp_sys_resolver 102
socketcall

My expectation was for the output to just be:

# ./scmp_sys_resolver 373
shutdown
# ./scmp_sys_resolver shutdown
373

I see this regression for other syscalls too, on i686 and on s390x architectures.
amd64 and ppc64el pass correctly.

s390x failures:

FAIL: 357 (recvmmsg) != recvmmsg (-119)
FAIL: 358 (sendmmsg) != sendmmsg (-120)
FAIL: 359 (socket) != socket (-101)
FAIL: 360 (socketpair) != socketpair (-108)
FAIL: 361 (bind) != bind (-102)
FAIL: 362 (connect) != connect (-103)
FAIL: 363 (listen) != listen (-104)
FAIL: 364 (accept4) != accept4 (-118)
FAIL: 365 (getsockopt) != getsockopt (-115)
FAIL: 366 (setsockopt) != setsockopt (-114)
FAIL: 367 (getsockname) != getsockname (-106)
FAIL: 368 (getpeername) != getpeername (-107)
FAIL: 369 (sendto) != sendto (-111)
FAIL: 370 (sendmsg) != sendmsg (-116)
FAIL: 371 (recvfrom) != recvfrom (-112)
FAIL: 372 (recvmsg) != recvmsg (-117)
FAIL: 373 (shutdown) != shutdown (-113)

i686 failures:

FAIL: 337 (recvmmsg) != recvmmsg (-119)
FAIL: 345 (sendmmsg) != sendmmsg (-120)
FAIL: 359 (socket) != socket (-101)
FAIL: 360 (socketpair) != socketpair (-108)
FAIL: 361 (bind) != bind (-102)
FAIL: 362 (connect) != connect (-103)
FAIL: 363 (listen) != listen (-104)
FAIL: 364 (accept4) != accept4 (-118)
FAIL: 365 (getsockopt) != getsockopt (-115)
FAIL: 366 (setsockopt) != setsockopt (-114)
FAIL: 367 (getsockname) != getsockname (-106)
FAIL: 368 (getpeername) != getpeername (-107)
FAIL: 369 (sendto) != sendto (-111)
FAIL: 370 (sendmsg) != sendmsg (-116)
FAIL: 371 (recvfrom) != recvfrom (-112)
FAIL: 372 (recvmsg) != recvmsg (-117)
FAIL: 373 (shutdown) != shutdown (-113)

Are my expectations somehow wrong? Is a v4.5 kernel required for the v2.3.1 release? Is there a compatibility bug between kernel versions?

RFE: a logo would be nice

It would be really nice to have a logo for libseccomp; something we could use for project pages and presentations.

RFE: add "last known good" test BPF output to the tree for regression testing

We have some basic tooling to generate (tests/testgen) and compare (tests/testdiff) BPF output from multiple test runs; we should investigate adding known good BPF output from the tests to the tree to help catch problems with BPF generation.

These "known good" outputs will need to be maintained/updated as improvements are made to libseccomp, but the overhead shouldn't be prohibitive.

BUG: misleading name of test 18-sim-basic_whitelist

Hi,

I was reading tests/18-sim-basic_whitelist.c.

If I understand it correctly, it does the following:

  • Disallow some read, write, close, and rt_sigreturn syscalls (only if they act on stdin, stdout, stderr).
  • Allow everything else (in particular, reading/writing to any other file descriptor is allowed)

This is not whitelisting; it is blacklisting.

Should the file be renamed? Should all KILLs and ACCEPTs be swapped to achieve whitelisting?

It would be nice to have a true whitelisting example, since this is the strongly recommended use of seccomp.

RFE: support inclusion of hand crafted BPF

For those scenarios where the built-in filters are not sufficient, provide support for libseccomp users to insert their own hand-crafted BPF filter code. We should support inserting BPF code at the following points:

  • At the very start of the filter, before the arch/ABI checks take place.
  • After the arch/ABI checks, but before the application specified filter rules.
  • After the application specified filter rules and before the default action.

Socket filtering broken on x86

I'm trying to filter socket families, but it seems to be broken on x86. Filtering any family will cause AF_UNIX (all?) to be blocked. Here's my simple program trying to open the X socket.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <seccomp.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <X11/Xlib.h>

/* Build with:
 * $CC -o seccomp seccomp.c `pkg-config --cflags --libs libseccomp x11`
 */

int main(int argc, char *argv[])
{
  scmp_filter_ctx seccomp;
  int family = AF_VSOCK; /* any non AF_UNIX family */
  int ret;
  Display *dpy;

  seccomp = seccomp_init(SCMP_ACT_ALLOW);
  if (!seccomp)
    {
      fprintf(stderr, "Could not initialize seccomp\n");
      exit(1);
    }

  ret = seccomp_arch_add(seccomp, SCMP_ARCH_X86);
  if (ret < 0 && ret != -EEXIST)
    {
      fprintf(stderr, "Failed to add x86 seccomp arch: %s\n",
              strerror(-ret));
    }

  ret = seccomp_rule_add(seccomp, SCMP_ACT_ERRNO(EAFNOSUPPORT),
                         SCMP_SYS(socket), 1,
                         SCMP_A0(SCMP_CMP_EQ, family));
  if (ret < 0)
    {
      fprintf(stderr, "Failed to block socket family %d: %s\n", family,
              strerror(-ret));
      exit(1);
    }

  ret = seccomp_load(seccomp);
  if (ret < 0)
    {
      fprintf(stderr, "Failed to load seccomp filter: %s\n", strerror(-ret));
      exit(1);
    }

  dpy = XOpenDisplay(NULL);
  if (!dpy)
    {
      fprintf(stderr, "Failed to open X display: %s\n", strerror(errno));
      exit(1);
    }

  printf("X display opened!\n");
  return 0;
}

I'm using libseccomp master. Built as a 64-bit program, everything works fine. Built as a 32-bit program, it fails with "Failed to open X display: Address family not supported by protocol". I've tried this on two systems: Fedora 20 running a 64-bit kernel with both 64-bit and 32-bit userspace, and a Debian-style system with 32-bit userspace. In both cases the kernel is a bit old, but I'm not sure that's the issue since the 64-bit program works. On Fedora 20 the kernel is 3.19.3. On the Debian-style system the kernel is based on Ubuntu's 3.16 series, but I've also run it in a chroot on the Fedora 20 system.

Any ideas?

BUG: s390/s390x need to support both multiplexed and direct wired socket syscalls

Linux kernel commit:

commit 977108f89c989b1eeb5c8d938e1e71913391eb5f
Author: Heiko Carstens <[email protected]>
Date:   Thu Sep 17 18:30:36 2015 +0200

s390: wire up separate socketcalls system calls

As discussed on linux-arch all architectures should wire up the separate
system calls that are hidden behind the socketcall multiplexer system call.

It's just a couple more system calls and gives us a very small performance
improvement.

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>

RFE: export filter in a C header file format

Add functionality to export the seccomp BPF filter in a C header file format suitable for including in a project. Ideally the generated C header would include the filter loading function, which would handle the prctl()/seccomp() call as well as any NNP settings as defined by the filter.

Proposed new libseccomp API:

API int seccomp_export_chdr(const scmp_filter_ctx ctx, int fd);

Q: verify libseccomp testsuite runs correctly with FORTIFY_SOURCE=2

While most distributions likely support gcc's FORTIFY_SOURCE feature, we don't want to enable it by default in the build (that is best left to the distribution packagers). This issue is to enable FORTIFY_SOURCE=2 and run our automated tests to ensure that we don't have any obvious problems.

RFE: static analysis using scmp_app_inspector

It would be great if the scmp_app_inspector tool could run a static analysis scan on a compiled binary to retrieve all of the system calls compiled in, since a dynamic analysis will most likely not catch all of the system calls a program could call (i.e. programmed to call). If gcc debug symbols were required to make the solution simpler that would work as well.

Q: dereferencing syscall arguments

Many introductions to seccomp mention that seccomp-bpf programs may not dereference pointers, which constrains filters to evaluating the system call arguments directly.

However, if I am not mistaken, the libseccomp man page has an example showing just that:

	fd = open("file.txt", 0);

	rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 3,
			      SCMP_A0(SCMP_CMP_EQ, fd),
			      SCMP_A1(SCMP_CMP_EQ, (scmp_datum_t)buf),
			      SCMP_A2(SCMP_CMP_LE, BUF_SIZE));
	if (rc < 0)
		goto out;

	rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
			      SCMP_CMP(0, SCMP_CMP_EQ, fd));

Does this mean that with libseccomp it is possible to filter for specific files and thereby create white and blacklists? And is this safe to use for such purposes?
