syscalls's Introduction

syscalls

This is a low-level library for listing and invoking raw Linux system calls.

Features

  • Provides a syscall enum for multiple architectures (see table below).
  • Provides inlinable syscall functions for multiple architectures (see table below).
  • Provides an Errno type for Rustic error handling.
  • Provides O(1) array-backed SysnoSet and SysnoMap types.

Feature Flags

The features that are enabled by default include std and serde.

std

By default, std support is enabled. If you wish to compile in a no_std environment, use:

syscalls = { version = "0.6", default-features = false }

serde

Various types can be serialized with Serde. This can be enabled with:

syscalls = { version = "0.6", features = ["serde"] }

full

Enables all extra features.

all

Enables syscall tables for all architectures. If you don't need all architectures, you can enable them individually with features like arm, x86, powerpc, etc. See the Architecture Support table below for a full list of available architectures.

Architecture Support

The Enum column means that a Sysno enum is implemented for this architecture.

The Invoke column means that syscalls can be invoked for this architecture.

The Stable Rust? column means that syscall invocation only requires stable Rust. Some architectures require nightly Rust because inline assembly is not yet stabilized for all architectures.

Arch         Enum   Invoke   Stable Rust?
arm*                         Yes ✅
aarch64                      Yes ✅
mips                         No ❌
mips64                       No ❌
powerpc                      No ❌
powerpc64                    No ❌
riscv32             ❌†      No ❌
riscv64                      Yes ✅
s390x                        No ❌
sparc                        N/A
sparc64                      N/A
x86                          Yes ✅
x86_64                       Yes ✅

* Includes ARM thumb mode support.

† Rust does not support riscv32 Linux targets, but syscall functions are implemented if you're feeling adventurous.

Updating the syscall list

Updates are pulled from the .tbl files in the Linux source tree.

  1. Change the Linux version in syscalls-gen/src/main.rs to the latest version. Only update to the latest stable version (not release candidates).
  2. Run cd syscalls-gen && cargo run. This will regenerate the syscall tables in src/arch/.
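
For orientation, rows in a kernel table such as arch/x86/entry/syscalls/syscall_64.tbl are whitespace-separated columns: number, ABI, name, and an optional entry point. The following is only a minimal sketch of parsing one such row. `TblEntry` and `parse_tbl_line` are hypothetical names for illustration, not the actual syscalls-gen code.

```rust
/// One row of a kernel syscall `.tbl` file (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub struct TblEntry {
    pub number: u32,
    pub abi: String,
    pub name: String,
}

/// Parses a row such as `0  common  read  sys_read`.
/// Returns `None` for blank lines, comments, and malformed rows.
pub fn parse_tbl_line(line: &str) -> Option<TblEntry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let mut cols = line.split_whitespace();
    let number = cols.next()?.parse::<u32>().ok()?;
    let abi = cols.next()?.to_string();
    let name = cols.next()?.to_string();
    // Remaining columns (entry points) are ignored in this sketch.
    Some(TblEntry { number, abi, name })
}
```

A generator could feed every line of the file through such a function and keep the `Some` results.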

syscalls's People

Contributors

dvdhrm, hack3ric, jasonwhite, kpp, loganwendholt, nyurik, reedobrien, shurizzle, wangbj

syscalls's Issues

Support aarch64

aarch64 is kind of a big deal and should definitely be supported. However, the kernel implements the syscall list for this architecture differently from the other architectures: there is no .tbl file containing all of the syscalls that we can parse. Instead, it uses a newer scheme where the syscalls are #defined in a header file. Thus, we'd need to evaluate this header file (plus any header files it #includes), either with something as simple as a regex or with a real preprocessor.
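
As a rough illustration of the "simple" end of that spectrum, the sketch below extracts a name and number from a single `#define __NR_<name> <number>` line using only string splitting. `parse_define` is a hypothetical helper, and it deliberately gives up on lines whose value is another macro, which is the case where a real preprocessor would be needed.

```rust
/// Extracts `(name, number)` from a line like `#define __NR_read 63`.
/// Lines whose value is not a plain integer (for example, aliases to other
/// macros) return `None`; resolving those would need real preprocessing.
pub fn parse_define(line: &str) -> Option<(&str, u32)> {
    let mut tokens = line.split_whitespace();
    if tokens.next()? != "#define" {
        return None;
    }
    let name = tokens.next()?.strip_prefix("__NR_")?;
    let number = tokens.next()?.parse::<u32>().ok()?;
    Some((name, number))
}
```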

`syscall0()` should maybe not be marked `readonly`

The syscall0() helper currently uses a readonly annotation for the inline-asm. However, this does not necessarily apply to all system-calls, even though they might take no arguments. I would suggest dropping this annotation.

One particular example is the RESTART_SYSCALL system call, which takes no arguments and restarts a syscall that was interrupted. You are unlikely to invoke it directly, but you will see it in stack-frames during signal-handling. This system-call can definitely write to memory, but takes no arguments.

Also note that many Linux system calls operate on the task_struct of the calling process. Things like prctl() allow saving custom user-space pointers per task for various reasons (e.g., automatic lock-unfreezing on crash). Hence, there might be syscalls that take no arguments but inevitably end up writing to memory. exit, for instance, writes to memory, although it arguably does not matter here since it never returns.

I have not dug too far yet, and this might be a rather theoretical issue. However, the speedup from using readonly seems negligible, given that the kernel writes the register set to the stack anyway. And for fast alternatives, the vDSO entries are better.

Anyway, this is mostly meant as a hint and a suggestion to drop the readonly annotation. Feel free to close this if you think the annotation is worth keeping.

Trouble with FP (r7) on ARM

Hello,
The project mentions that ARM is supported; however, I get the following error when trying to compile a simple call to the syscall wrapper:

$ cargo build --target=armv7-linux-androideabi
...
error: cannot use register `r7`: the frame pointer (r7) cannot be used as an operand for inline asm
  --> $HOME/.cargo/registry/src/github.com-1ecc6299db9ec823/syscalls-0.6.12/src/syscall/arm.rs:29:9
   |
29 |         in("r7") n as usize,
   |         ^^^^^^^^^^^^^^^^^^^

I am using the following version of Rust:

cargo --version
cargo 1.70.0 (ec8a8a0ca 2023-04-25)

Note that I get no error when compiling for aarch64.

Let me know if you need more details.

Out of bounds array access in SyscallNo::name

This code can cause an out of bounds access on higher-numbered syscalls:
https://github.com/wangbj/syscalls/blob/fa436e7733a6510ccfda1158bd93a5f69bffc9bf/src/nr.rs#L709-L711

Since the list of syscall numbers is not contiguous, calling SyscallNo::name on the following syscalls should fail (since their number is larger than the last index in the SYSCALL_NAMES array):

    SYS_pidfd_send_signal = 424,
    SYS_io_uring_setup = 425,
    SYS_io_uring_enter = 426,
    SYS_io_uring_register = 427,
    SYS_open_tree = 428,
    SYS_move_mount = 429,
    SYS_fsopen = 430,
    SYS_fsconfig = 431,
    SYS_fsmount = 432,
    SYS_fspick = 433,

There are a couple of ways to solve this:

  1. Make SyscallNo::name aware of the gaps.
  2. Include the gaps in SYSCALL_NAMES so it has the right length.

I vote for option (2) since that ought to be the most efficient way.
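
A minimal sketch of option (2), assuming a name table whose length covers the highest syscall number and whose unassigned slots hold `None`. The `ENTRIES` list is heavily abbreviated, and `build_table` and `name` are hypothetical names, not the crate's code. Using `get` keeps even out-of-range numbers from panicking.

```rust
/// Abbreviated (number, name) pairs; a real list would be generated.
const ENTRIES: &[(usize, &str)] = &[
    (0, "read"),
    (1, "write"),
    (424, "pidfd_send_signal"),
    (433, "fspick"),
];

/// Highest syscall number in `ENTRIES`, plus one.
const TABLE_LEN: usize = 434;

/// Fills a table that includes the gaps, so every number below
/// `TABLE_LEN` has a slot (either `Some(name)` or `None`).
const fn build_table() -> [Option<&'static str>; TABLE_LEN] {
    let mut table = [None; TABLE_LEN];
    let mut i = 0;
    while i < ENTRIES.len() {
        table[ENTRIES[i].0] = Some(ENTRIES[i].1);
        i += 1;
    }
    table
}

static SYSCALL_NAMES: [Option<&'static str>; TABLE_LEN] = build_table();

/// Constant-time, bounds-checked lookup.
pub fn name(nr: usize) -> Option<&'static str> {
    SYSCALL_NAMES.get(nr).copied().flatten()
}
```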

Architecture-independent `syscall_map` macro

Among the syscall tables for each architecture, there is quite a lot of overlap. That is, many of the syscalls are used by all architectures. Relatively few syscalls only exist in one or several architectures.

There is a common pattern of wanting to create a mapping of all possible syscalls in all architectures to some type T, but we don't want to have to figure out which architectures a particular syscall is valid on.

syscalls-gen can programmatically figure out which syscalls are common among all architectures (even if they have different numbers) and which ones are architecture-specific.

syscall_map macro

Using this information, we should be able to create a macro that is used like this:

static SYSCALL_DESCRIPTIONS: SysnoMap<&'static str> = syscall_map! {
    pipe => "ceci n'est pas une pipe",
    pipe2 => "ceci est une pipe",
};

In this example, pipe is not available on aarch64, but pipe2 is available on all architectures, so the following code should be generated:

static SYSCALL_DESCRIPTIONS: SysnoMap<&'static str> = SysnoMap::from_slice(&[
    #[cfg(not(any(target_arch = "aarch64")))]
    (Sysno::pipe, "ceci n'est pas une pipe"),
    (Sysno::pipe2, "ceci est une pipe"),
]);

syscall_map_for_arch macro

To support architectures other than the current target architecture, the following macro could also be created:

static SYSCALL_DESCRIPTIONS: SysnoMap<&'static str> = syscall_map_for_arch!(aarch64, {
    pipe => "ceci n'est pas une pipe",
    pipe2 => "ceci est une pipe",
});

Which would generate the following code:

static SYSCALL_DESCRIPTIONS: ::syscalls::aarch64::SysnoMap<&'static str> = ::syscalls::aarch64::SysnoMap::from_slice(&[
    (::syscalls::aarch64::Sysno::pipe2, "ceci est une pipe"),
]);

Of course, this will require SysnoMap to be supported on all architectures simultaneously. See #21 for why this is tricky without const fn in traits.

Implementation Details

These macros should probably be procedural macros because it will make the implementation easier.

The architecture metadata can be generated one time by syscalls-gen. The result could simply be a mapping like:

[
    ("pipe", ["x86", "x86_64",  /* ... */]),
    ("pipe2", ["aarch64", "x86", "x86_64", /* ... */]),
    // ...
]

The proc macro can then use this info to generate the SysnoMap that we want.
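
To make the last step concrete, here is a small sketch of how the guard attribute could be derived from such metadata. `cfg_guard` is a hypothetical helper that only produces the attribute text. A real proc macro would emit tokens rather than strings.

```rust
/// Builds the `#[cfg(...)]` attribute text for one syscall, given the
/// architectures that provide it and the full list of known architectures.
/// Returns `None` when the syscall exists everywhere and needs no guard.
pub fn cfg_guard(supported: &[&str], all_archs: &[&str]) -> Option<String> {
    let missing: Vec<String> = all_archs
        .iter()
        .copied()
        .filter(|arch| !supported.contains(arch))
        .map(|arch| format!("target_arch = \"{}\"", arch))
        .collect();
    if missing.is_empty() {
        None
    } else {
        Some(format!("#[cfg(not(any({})))]", missing.join(", ")))
    }
}
```

With the architecture list reduced to aarch64, x86, and x86_64, a syscall available only on x86 and x86_64 gets the same guard as `pipe` in the generated code shown earlier.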

Add SysnoMap

We have SysnoSet for creating sets of syscalls, but no way to create a static mapping of syscalls to arbitrary types.

Usage would look something like this:

use syscalls::{Sysno, SysnoMap};

fn read_callback() {
    println!("read called");
}

fn write_callback() {
    println!("write called");
}

// A static mapping of sysno -> function callback.
static SYSCALL_FUNCTIONS: SysnoMap<fn()> = SysnoMap::new(&[
    (Sysno::read, read_callback),
    (Sysno::write, write_callback),
]);

fn main() {
    // Do a constant-time lookup to call `read_callback`.
    SYSCALL_FUNCTIONS[Sysno::read]();
}

This could also be done with a large match statement, but that is cumbersome and the compiler may not optimize it very well if the match is sparse.

The data type would look something like this:

pub struct SysnoMap<T> {
    table: [Option<T>; Sysno::table_size()],

    // ...or alternatively, use SysnoSet + MaybeUninit to avoid the Option:
    set: SysnoSet, // For knowing which items are initialized
    table: [MaybeUninit<T>; Sysno::table_size()],
}

The API should try to mimic the HashMap API as much as possible and have as many methods be const fn as possible.
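
A self-contained sketch of the `Option`-based variant follows. The three-variant `Sysno` enum and `TABLE_SIZE` are stand-ins for the generated ones, and the `T: Copy` bound is an assumption made here to keep the `const fn` simple. It is not a requirement stated above.

```rust
/// Tiny stand-in for the generated `Sysno` enum (x86_64 numbers).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Sysno {
    Read = 0,
    Write = 1,
    Open = 2,
}

/// Stand-in for `Sysno::table_size()`.
const TABLE_SIZE: usize = 3;

/// Array-backed map from `Sysno` to `T` with constant-time lookup.
pub struct SysnoMap<T> {
    table: [Option<T>; TABLE_SIZE],
}

impl<T: Copy> SysnoMap<T> {
    /// Builds the map from a slice of pairs; usable in `static` initializers.
    pub const fn from_slice(entries: &[(Sysno, T)]) -> Self {
        let mut table = [None; TABLE_SIZE];
        let mut i = 0;
        while i < entries.len() {
            table[entries[i].0 as usize] = Some(entries[i].1);
            i += 1;
        }
        SysnoMap { table }
    }

    /// Returns the value stored for `sysno`, if any.
    pub const fn get(&self, sysno: Sysno) -> Option<T> {
        self.table[sysno as usize]
    }
}
```

The `MaybeUninit` variant would avoid the `Option` overhead per slot at the cost of unsafe code.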

Add predefined sets of syscalls

With SysnoSet, we can create a bitset of syscalls at compile time. Having predefined sets for various groups of related syscalls is useful for constructing seccomp filters. strace, for example, categorizes syscalls in its syscall tables.

We could have SysnoSets for:

  • All syscalls that create file descriptors.
  • All syscalls that take file descriptors as parameters.
  • All network related syscalls.
  • All memory related syscalls.
  • All state-related syscalls.
  • All signal-related syscalls.
  • All syscalls that never fail (e.g., getpid, gettid).
  • All process-related syscalls.

The great thing about using a bitset for these is that they can be easily manipulated with set operations (e.g., union, intersection, difference).
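
The following sketch shows why a bitset makes this cheap. It is a simplified stand-in, not the crate's `SysnoSet`: it uses a single `u64` word, so it only covers syscall numbers below 64, and the two predefined sets are invented for illustration.

```rust
/// Minimal bitset over syscall numbers 0..64 (a real set needs more words).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SysnoSet(u64);

impl SysnoSet {
    /// Builds a set at compile time. Numbers >= 64 are rejected by the
    /// overflow check on the shift.
    pub const fn new(numbers: &[u32]) -> Self {
        let mut bits = 0u64;
        let mut i = 0;
        while i < numbers.len() {
            bits |= 1u64 << numbers[i];
            i += 1;
        }
        SysnoSet(bits)
    }

    pub const fn contains(&self, nr: u32) -> bool {
        nr < 64 && (self.0 & (1u64 << nr)) != 0
    }

    pub const fn union(self, other: Self) -> Self {
        SysnoSet(self.0 | other.0)
    }

    pub const fn intersection(self, other: Self) -> Self {
        SysnoSet(self.0 & other.0)
    }

    pub const fn difference(self, other: Self) -> Self {
        SysnoSet(self.0 & !other.0)
    }
}

// x86_64 numbers: read = 0, write = 1, getpid = 39.
pub const IO: SysnoSet = SysnoSet::new(&[0, 1]);
pub const NEVER_FAIL: SysnoSet = SysnoSet::new(&[39]);
```

Each set operation is a single bitwise instruction per word, and all of them can run in a `const` context.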

These syscall sets should probably be behind a feature flag to avoid slowing down compilation for crates that only need basic functionality.


Support for multiple architectures

I'd like to use this module for parsing syscall numbers returned by ptrace. This is great work. However, under x86_64 it is possible to make 64-bit style syscalls using the syscall instruction as well as 32-bit style syscalls using int 0x80 on many platforms, so the ability to decode syscalls of other platforms would be good too.

The simplest way would be to expose the platform-specific enums separately on every platform; however, a helper From(u64, arch) function would probably be cool too.
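
A sketch of what such a helper might look like, written as a plain function rather than a `From` impl. `Arch` and `syscall_name` are hypothetical names, and only a handful of well-known numbers are listed. A real table would be generated.

```rust
/// Architectures whose syscall numbers a tracer may need to decode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Arch {
    X86,
    X86_64,
}

/// Looks up a syscall name for the given architecture.
/// The same number means different things on different architectures.
pub fn syscall_name(nr: u64, arch: Arch) -> Option<&'static str> {
    match (arch, nr) {
        (Arch::X86_64, 0) => Some("read"),
        (Arch::X86_64, 1) => Some("write"),
        (Arch::X86_64, 60) => Some("exit"),
        (Arch::X86, 1) => Some("exit"),
        (Arch::X86, 3) => Some("read"),
        (Arch::X86, 4) => Some("write"),
        _ => None,
    }
}
```

For example, number 1 decodes to write when the tracee used the syscall instruction and to exit when it used int 0x80.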

Portability beyond x86_64

Currently, syscalls targets only x86_64. While this architecture is very common, other architectures (e.g. i386, arm64) may also be desirable targets. I believe enabling portability to other architectures would require the following main changes:

  • Move nr.rs and syscall.c to a directory such as arch/x86_64.
  • Modify syscalls-gen to write nr.rs to the appropriate directory for the architecture it is being run on.
  • Add a wrapper module that reexports everything in the right nr.rs for the target.
  • Build in the right syscall.c for the target.

I may get around to implementing this myself some day in the distant future, in which case I'll put in a PR.

Crate does not build on Ubuntu 18.04

Running cargo build on my Ubuntu 18.04 box results in the following error:

   Compiling syscalls v0.1.0 (/home/logan/cloudseal/syscalls)
error: failed to run custom build command for `syscalls v0.1.0 (/home/logan/cloudseal/syscalls)`

Caused by:
  process didn't exit successfully: `/home/logan/cloudseal/syscalls/target/debug/build/syscalls-0e6f2ed3f9466713/build-script-build` (exit code: 101)
--- stderr
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `335`,
 right: `424`: generated syscalls are not sequential', build.rs:101:9
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.35/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.35/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:47
   3: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:36
   4: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:200
   5: std::panicking::default_hook
             at src/libstd/panicking.rs:214
   6: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:477
   7: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:384
   8: std::panicking::begin_panic_fmt
             at src/libstd/panicking.rs:339
   9: build_script_build::gen_syscall_nrs
             at ./build.rs:101
  10: build_script_build::main
             at ./build.rs:155
  11: std::rt::lang_start::{{closure}}
             at /rustc/9b91b9c10e3c87ed333a1e34c4f46ed68f1eee06/src/libstd/rt.rs:64
  12: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:49
  13: std::panicking::try::do_call
             at src/libstd/panicking.rs:296
  14: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:80
  15: std::panicking::try
             at src/libstd/panicking.rs:275
  16: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  17: std::rt::lang_start_internal
             at src/libstd/rt.rs:48
  18: std::rt::lang_start
             at /rustc/9b91b9c10e3c87ed333a1e34c4f46ed68f1eee06/src/libstd/rt.rs:64
  19: main
  20: __libc_start_main
  21: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Support for syscall metadata such as the argument map

I am working on improving lurk -- an strace Rust rewrite. At the moment, lurk's author @JakWai01 has created a big manual map for the x86_64 syscalls, but the approach is very hard to maintain and does not support any other architectures.

The required metadata would be:

  • how many arguments each syscall has
  • the types of each argument: str, int, address, ...?
  • syscall return type: Error(i32), Success(i32), Address(usize), ...?
  • possibly some "group" that describe what each syscall belongs to: File, IPC, Memory, Creds, ...

The groups are a bit tricky as I am not certain if Linux officially describes each syscall in terms of which group(s) it belongs to - so this might be omitted.

We can use macro_rules! (or even a proc macro) to improve syscall_enum! (adapting from a table):

syscall_enum! {
    pub enum Sysno {
        /// See [read(2)](https://man7.org/linux/man-pages/man2/read.2.html) for more info on this syscall.
        #[params = "(fd: i32, buf: str, count: usize) -> isize", categories = ["desc"]]
        read = 0,
        /// See [write(2)](https://man7.org/linux/man-pages/man2/write.2.html) for more info on this syscall.
        #[params = "(fd: i32, buf: str, count: usize) -> isize", categories = ["desc"]]
        write = 1,
        /// See [open(2)](https://man7.org/linux/man-pages/man2/open.2.html) for more info on this syscall.
        #[params = "(pathname: str, flags: i32, mode: u32) -> isize", categories = ["desc", "file"]]
        open = 2,
        // ...
    }
}

The actual syntax for the arguments could be different to simplify macro_rules processing (proc_macros are much harder to implement and maintain).

A macro could parse these params into a few extra functions:

// sequentially store all arguments for all syscalls
static ALL_ARGS: [ArgType; 2500] = [
    ArgType {name: "fd", typ: i32},
    ArgType {name: "buf", typ: str},
    ArgType {name: "count", typ: usize},
    ArgType {name: "fd", typ: i32},
    ArgType {name: "buf", typ: str},
    ArgType {name: "count", typ: usize},
    ArgType {name: "pathname", typ: str},
    ArgType {name: "flags", typ: i32},
    ArgType {name: "mode", typ: u32},
  ...
];

lazy_static! {
    // One slice of ALL_ARGS per syscall number
    static ref ARG_MAP: [&'static [ArgType]; 450] = [
        &ALL_ARGS[0..3], // read
        &ALL_ARGS[3..6], // write
        &ALL_ARGS[6..9], // open
        ...
    ];
}

/// Auto-generated function
pub fn get_arguments(syscall: usize) -> &'static [ArgType] {
    ARG_MAP[syscall]
}

We would also generate lists for categories and return types; the exact format is TBD.
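
For reference, here is a compilable variant of the lookup idea that needs neither lazy_static nor index arithmetic. It is only a sketch under assumptions: `ArgKind` is a hypothetical enum standing in for the `typ` field, the `buf: str` typing is carried over from the proposal above, and only the first three x86_64 syscalls are listed.

```rust
/// Hypothetical classification of argument types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgKind {
    Int,
    Str,
    Size,
    Mode,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ArgType {
    pub name: &'static str,
    pub kind: ArgKind,
}

const fn arg(name: &'static str, kind: ArgKind) -> ArgType {
    ArgType { name, kind }
}

/// Argument lists indexed by syscall number (x86_64: read, write, open).
static ARG_MAP: [&'static [ArgType]; 3] = [
    &[arg("fd", ArgKind::Int), arg("buf", ArgKind::Str), arg("count", ArgKind::Size)],
    &[arg("fd", ArgKind::Int), arg("buf", ArgKind::Str), arg("count", ArgKind::Size)],
    &[arg("pathname", ArgKind::Str), arg("flags", ArgKind::Int), arg("mode", ArgKind::Mode)],
];

/// Returns the argument list, or `None` for numbers outside the table.
pub fn get_arguments(syscall: usize) -> Option<&'static [ArgType]> {
    ARG_MAP.get(syscall).copied()
}
```

Returning an `Option` avoids a panic when a tracer reports a number the table does not know about.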

Support ARMv4T Thumb mode (ARM7TDMI)

Currently, if you try to compile for the ARMv4T architecture with Thumb mode enabled, you get errors like this:

error: instruction variant requires ARMv6 or later
  |
note: instantiated into assembly here
 --> <inline asm>:1:2
  |
1 |     mov r2, r7
  |     ^

Implement SysnoSet for each architecture

Currently, SysnoSet is only implemented for the current architecture. It would be nice to round out the API and include it in all of the architectures exposed via syscalls::{x86,arm,mips,...}::*.

However, to do this, we'll have to generate separate implementations for each architecture. I looked into making a trait to reduce code duplication, but since most of the SysnoSet API is const, there is no way to call trait methods because they can't be const. Thus, I think the whole SysnoSet implementation will need to be wrapped in a macro to stamp out the implementation for each architecture.

Or, we can just wait for const in traits to be stabilized.
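
A rough sketch of the macro approach, assuming a drastically simplified set type (one `u64` word, so numbers below 64 only). `arch_module!` is a hypothetical name. The point is only that each expansion gets its own `const fn` methods without needing a trait.

```rust
/// Stamps out a module with its own `Sysno` enum and `SysnoSet` type.
macro_rules! arch_module {
    ($arch:ident { $($name:ident = $nr:literal),* $(,)? }) => {
        pub mod $arch {
            #[allow(non_camel_case_types)]
            #[derive(Clone, Copy, Debug, PartialEq, Eq)]
            pub enum Sysno { $($name = $nr),* }

            /// Bitset of syscalls for this architecture only.
            #[derive(Clone, Copy, Debug, PartialEq, Eq)]
            pub struct SysnoSet(u64);

            impl SysnoSet {
                pub const fn empty() -> Self {
                    SysnoSet(0)
                }
                pub const fn insert(self, s: Sysno) -> Self {
                    SysnoSet(self.0 | (1u64 << (s as u32)))
                }
                pub const fn contains(&self, s: Sysno) -> bool {
                    (self.0 & (1u64 << (s as u32))) != 0
                }
            }
        }
    };
}

// Abbreviated tables: x86_64 has read = 0, write = 1; x86 has read = 3, write = 4.
arch_module!(x86_64 { read = 0, write = 1 });
arch_module!(x86 { read = 3, write = 4 });
```

Because the two `SysnoSet` types are distinct, mixing an x86 `Sysno` into an x86_64 set is a compile error.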

Implementing `sparc` and `sparc64` support

Hi, it seems like there's no support for sparc or sparc64 architectures currently. What would be the steps to implement support for making syscalls on them? Would it be possible to incorporate the required assembly code from similar Rust libraries?

Serde should not be an unconditional dependency

Serde is a large library that takes some time to build and adds the better part of a megabyte to the binary size. It also uses std by default, though this can be changed. It should either be only an optional dependency (with its uses made conditional on a particular feature) or be removed entirely.

How to get the raw syscall number from the enum?

Hi, I want to get the number and the name (as &str or String) from the syscall enum.

I found that a name method is implemented for SyscallNo and that each syscall has a number in the enum. Is there a way to get the number (e.g. as i32) from a SyscallNo value?

Thank you!
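
As general Rust background, a fieldless enum with explicit discriminants can be converted to its number with an `as` cast. The sketch below uses a three-variant stand-in enum, and the `id` and `name` methods shown are illustrative rather than the crate's actual API.

```rust
/// Stand-in for the crate's enum, using x86_64 numbers.
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SyscallNo {
    SYS_read = 0,
    SYS_write = 1,
    SYS_open = 2,
}

impl SyscallNo {
    /// Raw syscall number, obtained by casting the discriminant.
    pub fn id(self) -> i32 {
        self as i32
    }

    /// Syscall name without the `SYS_` prefix.
    pub fn name(self) -> &'static str {
        match self {
            SyscallNo::SYS_read => "read",
            SyscallNo::SYS_write => "write",
            SyscallNo::SYS_open => "open",
        }
    }
}
```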

Don't panic on From i32

Unless I misunderstood, panicking seems rather extreme when converting from an i32 fails. Why not return a Result, or add a try_from option?
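
A minimal sketch of the fallible conversion, using the standard `TryFrom` trait on a three-variant stand-in enum. `UnknownSyscall` is a hypothetical error type invented for this example.

```rust
use std::convert::TryFrom;

/// Stand-in for the crate's enum, using x86_64 numbers.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Sysno {
    Read = 0,
    Write = 1,
    Open = 2,
}

/// Error carrying the number that did not match any known syscall.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownSyscall(pub i32);

impl TryFrom<i32> for Sysno {
    type Error = UnknownSyscall;

    fn try_from(nr: i32) -> Result<Self, Self::Error> {
        match nr {
            0 => Ok(Sysno::Read),
            1 => Ok(Sysno::Write),
            2 => Ok(Sysno::Open),
            other => Err(UnknownSyscall(other)),
        }
    }
}
```

Callers that still want the panicking behavior can opt in explicitly with `unwrap()`.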

Fix errno enum

Some parts of Errno are arch-specific, so Errno is currently broken for architectures besides x86-64 and aarch64. Instead, we should have a per-architecture definition of Errno.

Support FreeBSD

Currently, most architectures on Linux are supported, but other platforms such as FreeBSD are not yet supported.

The master list of FreeBSD syscalls is at https://cgit.freebsd.org/src/tree/sys/kern/syscalls.master. It should be possible to parse this and generate the necessary syscall tables. Unlike Linux, I believe this list is the same for all architectures under FreeBSD.

For the macOS kernel, the list is similar: https://github.com/apple-oss-distributions/xnu/blob/main/bsd/kern/syscalls.master (But it looks harder to parse because of #ifdef usage.)

To support the addition of another platform besides Linux, I think the following structure should be used:

  • syscalls::Sysno - Still has the syscall list for the current target architecture and target platform.
  • syscalls::{target_arch}::Sysno will be the syscall list for the architecture target_arch and the current platform.
  • syscalls::{target_os}::{target_arch}::Sysno will be the syscall list for target_os and target_arch.

This should be a backwards compatible change while still allowing the syscall table of other platforms to be available. That is, it might be useful to be able to access the FreeBSD syscall table even if the target OS is Linux.

Thus, the source tree should look like this:

  • src/linux/{target_arch}.rs
  • src/freebsd.rs - Might not need individual Sysno tables for each architecture.
  • src/macos.rs
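
The layout above could be sketched as nested modules plus a conditional re-export. This is an illustration only: the tables are cut down to two syscalls each, and only one target combination is re-exported.

```rust
pub mod linux {
    pub mod x86_64 {
        #[allow(non_camel_case_types)]
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        pub enum Sysno {
            read = 0,
            write = 1,
        }
    }

    pub mod aarch64 {
        #[allow(non_camel_case_types)]
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        pub enum Sysno {
            read = 63,
            write = 64,
        }
    }
}

pub mod freebsd {
    /// A single table, assuming the list is shared by all architectures.
    #[allow(non_camel_case_types)]
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Sysno {
        read = 3,
        write = 4,
    }
}

// The top-level `Sysno` stays the table of the compilation target,
// which keeps existing users source-compatible.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub use linux::x86_64::Sysno;
```

Since every table is always compiled in, a tool built for Linux can still name `freebsd::Sysno::write` when decoding foreign traces.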

Support armv7 thumb-mode

We're working on upgrading youki (which depends on syscalls) on Alpine Linux. Everything went smoothly except for the armv7 arch with thumb-mode.

We're running into this error when building (for every occurrence of this line in the file):

error: cannot use register `r7`: the frame pointer (r7) cannot be used as an operand for inline asm
  --> /home/buildozer/.cargo/registry/src/index.crates.io-1cd66030c949c28d/syscalls-0.6.9/src/syscall/arm.rs:29:9
   |
29 |         in("r7") n as usize,
   |         ^^^^^^^^^^^^^^^^^^^

It looks like we can do something like this to add support:
bytecodealliance/rustix@ebf562f

Add a way to call syscall by number (as opposed to Sysno)

Please add a variant of syscalls::raw_syscall that allows calling a syscall by raw number as opposed to Sysno.

Why is this needed? Suppose that at some point in the future this crate stops being updated, and I want to call a syscall that has not been added to Sysno. Attempting to craft a fake Sysno instance (using transmute or something similar) is undefined behavior, so I need a way to call a syscall by number.
