Comments (16)
"If we assume the host's /tmp is tmpfs"
As a data point, in Debian this is easy to make true, but has never been the default. Because some software packages and some users rely on an on-disk /tmp being larger than RAM, changing that default is sufficiently problematic that the Debian systemd maintainers have specifically disabled tmpfs.mount.
from bubblewrap.
Well, if you have a full chroot you can just do "--mount-bind /some/chroot /" and you don't have to create many files. However, for xdg-app I want to have an internal tmpfs, because it is guaranteed to be isolated from the host, and cleaned up by the kernel when it's not in use anymore.
It may make sense to expose some container /tmp to systemd-tmpfiles, but I don't think that is always what you want.
If the host /tmp is not a tmpfs you can't pivot_root at all from it, so I don't think relying on that is a good idea. What is the problem with using a tmpfs?
from bubblewrap.
This is just about trying to reduce the amount of setuid code - if we can figure out how to have the caller create the files, a lot of things drop out.
Maybe we have --rootfs-prepare /usr/bin/mysandbox-rootfs-prep which gets called with its cwd pointing to the tmpfs, but using the host's mount namespace?
from bubblewrap.
The way I'm thinking of this is - the tool shouldn't have any more than we would want in the kernel. We wouldn't put generating /etc/passwd files into the kernel, so this tool shouldn't do it either.
from bubblewrap.
Can you bind-mount a /proc/self/fd file? If so you could pass in e.g. the passwd files as a fd and bind mount that.
I would very much like to keep everything related to a container owned by the container so that it's auto-cleaned-up by the kernel. This is very important in the desktop case where things are started ad-hoc and there is no tracking of "all running containers" by some management thing.
I thought initially about allowing the caller to shell out before the final unmount of the host fs, but it's tricky. You have to create things in the right order, and you likely will need a mix of file creation and bind mounts, so you'll need to allow shelling out at arbitrary points in the construction of the root.
In the end I was not able to make that work though, because for it to work it has to happen before the pivot_root so that the host side binaries work. But in that case --bind-mount / / does not work, because that would try to bind-mount / onto something inside /, which fails the mount. For that to work I had to make the tmpfs root the real root, and then pivot_root so that both the new root and the old root are subdirs of the in-between root.
from bubblewrap.
What we could do is traditional privilege separation though. I.e. fork and drop caps for all but the operations that need caps.
from bubblewrap.
This branch is an experiment with privilege separation:
https://github.com/alexlarsson/bubblewrap/tree/privilege-separation
Opinions?
from bubblewrap.
It still seems like there's a lot of code for e.g. copying files in and making etc/passwd? Also you didn't seem to update demos/ for it? I added the demo-shell one with this intent - we can hoist things like generating /etc/passwd into that, right?
from bubblewrap.
Yeah, I want to move that out to a generic "pull data from fd", but it still seems like a worthwhile thing.
from bubblewrap.
The reason I'm concerned about this is: because we're using the user's uid as the fsuid, most filesystem operations should be pretty safe, but I have a lingering concern about a stray portion of a synthetic fs (cgroup, proc, sysfs) that has DAC permissions allowing a user to read/write to it but gates on CAP_SYS_ADMIN in the code.
I briefly looked at:
for x in /proc /sys; do
    find "$x" -type f -uid 0 -perm -o+w 2>/dev/null
done
There is one in the cgroup code that looks safe, some selinux things that also look safe (they require the requisite selinux perms) but /proc/$pid/attr jumped out as writable. Luckily, writing to these requires CAP_SYS_PTRACE, not CAP_SYS_ADMIN. But still...
from bubblewrap.
Yeah, but with the privilege separation in the above branch we would not touch these files as CAP_SYS_ADMIN. Only the mounts happen with privileges.
from bubblewrap.
OK, right. I'd still feel more comfortable if we moved as much as possible out of the binary. We could have bwrap-core which is setuid, and install a bwrap-cli which is a friendlier (unprivileged) wrapper?
from bubblewrap.
Yeah, I'll remove some stuff.
from bubblewrap.
I added --make-bind-file which takes input from an fd, writes it to a file and then bind-mounts that over some other file. With this I was able to simplify things quite a bit.
The one thing I dislike is that the flags to --mount-bind (readonly/allow device) are separate arguments, rather than some modifier for --mount-bind. I don't see a nice way to handle that without having to parse and possibly unescape the filename args though.
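For illustration only, the "take input from an fd, write it to a file" half of that option could look roughly like the Python sketch below. The helper names `copy_fd_to_file` and `demo` and the passwd line are made up for the example; bwrap itself is written in C, and the privileged bind-mount step is deliberately left out.

```python
import os
import tempfile


def copy_fd_to_file(fd: int, dest_dir: str) -> str:
    """Drain `fd` until EOF into a new file under `dest_dir`; return its path.

    Hypothetical helper: the real tool would then bind-mount the result
    over the target path, which needs privileges and is not shown here.
    """
    out_fd, path = tempfile.mkstemp(dir=dest_dir)
    with os.fdopen(fd, "rb") as src, os.fdopen(out_fd, "wb") as dst:
        while True:
            chunk = src.read(65536)
            if not chunk:
                break
            dst.write(chunk)
    return path


def demo() -> bytes:
    # Caller side: generate the content (an invented passwd entry) and hand
    # it over through a pipe instead of a path on the host filesystem.
    read_end, write_end = os.pipe()
    os.write(write_end, b"user:x:1000:1000::/home/user:/bin/sh\n")
    os.close(write_end)
    with tempfile.TemporaryDirectory() as scratch:
        path = copy_fd_to_file(read_end, scratch)
        with open(path, "rb") as f:
            return f.read()
```

The point of the fd is that the unprivileged caller decides what the file contains, while the privileged side only copies bytes and mounts.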
from bubblewrap.
We discussed possibly taking arguments NUL-separated from stdin.
from bubblewrap.
I added --args to master which takes a NUL-separated list of extra args.
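As a rough sketch of why NUL separation sidesteps the escaping problem mentioned earlier: a NUL byte cannot appear in a path or in an argv entry, so splitting on it needs no quoting rules at all. The function name `parse_nul_args` is hypothetical, and treating a trailing NUL as a terminator is an assumption; how bwrap actually parses --args is not shown in this thread.

```python
import os
from typing import List


def parse_nul_args(data: bytes) -> List[str]:
    """Split a NUL-separated byte string into a list of arguments."""
    if not data:
        return []
    # Assumption: a trailing NUL terminates the last argument rather than
    # introducing an extra empty one.
    if data.endswith(b"\0"):
        data = data[:-1]
    # Spaces, quotes and newlines inside an argument pass through untouched.
    return [os.fsdecode(arg) for arg in data.split(b"\0")]
```

For example, `parse_nul_args(b"--mount-bind\0/some dir/with 'quotes'\0/mnt\0")` yields three arguments, with the middle path kept intact.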
from bubblewrap.