Code Monkey home page Code Monkey logo

Comments (19)

alexlarsson avatar alexlarsson commented on June 13, 2024 1

So, here is the issue. When we're using user namespaces we need to have the privileged parent set the uid mapping in the child user namespace, because there is not necessarily a 1-1 mapping. However, the child, as well as the parent are non-dumpable due to the initial setuid bit.

Non-dumpable processes have uid 0 owning /proc/pid/' files though, so the parent (running as the user uid) isn't allowed to set the uid map unless the child is made dumpable.

However, this is only needed in the case when we need user namespaces. If we're not, then being non-dumpable is fine. So, my solution is to only make the child non-dumpable in the case of using user namespaces.

But, i hear you, then we'll be able to ptrace the child in the case of using user namespaces, causing this bug to re-appear! That is not true though, because in a user namespace we lose all capabilities in the parent namespace, so we will not be able to do something affecting the host namespace anyway (i.e. sethostname gets EPERM unless you unshared UTS ns). In fact, there is not much we can do here, because any process in the parent user namespace (i.e. the regular host namespace) have all caps in the user namespace, including CAP_SYS_PTRACE, so we override the value of dumpable anyway.

from bubblewrap.

smcv avatar smcv commented on June 13, 2024

As a short-term fix for Debian, I've reverted #94 so there is no way to induce the privileged process to issue the sethostname() syscall, but I suspect a better long-term answer is to isolate the use of PR_SET_DUMPABLE to precisely where it is needed.

See also http://www.openwall.com/lists/oss-security/2016/10/13/4

For some reason it sets the PR_SET_DUMPABLE flag, as seen below. The comment about
it looks strange to me. If thats really true, suid programs shouldn't
be forced to play with the dumpable flag to achieve their goal.

I assume the developers of Bubblewrap wouldn't have done this if the
kernel (or at least a kernel they care about) didn't require it.
But hopefully becoming dumpable can be restricted to smaller sections
of the code.

If it's only for write_uid_gid_map(), then one way would be for that
function to fork(), with the child making itself dumpable and writing
the map files, and the parent just waiting for the child?
...
I believe the intention is that none of the operations that can pass over
the privilege separation socket are problematic, because they only affect
the sandboxed processes, and if you can ptrace bwrap then you can certainly
ptrace and subvert the sandboxed processes too. SETUP_SET_HOSTNAME clearly
doesn't match that intention, if Bubblewrap didn't unshare the UTS namespace
beforehand; Bubblewrap does insist that --hostname can't be used without
--unshare-uts, but you're right that a user outside the sandbox can ptrace
it and bypass that check by making it behave as though --hostname had been
used.

As a quick temporary fix for Debian, I'm reverting the addition of
SETUP_SET_HOSTNAME.

Am I right in thinking that this pseudocode would fix it while reinstating
the feature?

case PRIV_SEP_OP_SET_HOSTNAME:
  if (!opt_unshare_uts)
    die_with_error ("Refusing to set hostname in original namespace");
  else if (!sethostname (... as before))
    die_with_error (... as before);

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

The check described above should be fine for that exact exploit, but we should also try to figure out the minimal part that needs DUMPABLE.

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

So, for the dumpable, if you remove the code that sets this, then you get:

 bwrap --unshare-user  --bind / / sh
setting up uid map: Permission denied

Because its trying to write to /proc/self/uid_map, which is root owned for a non-dumpable app.

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

See #109

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

Here is my approach instead:
#110

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

Let's call #109 the "fsuid" patch and #110 the "move-dumpable" patch.

I'd argue that the fsuid patch is a lot shorter, and more obviously correct than move-dumpable. Why is the latter better?

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

Because the fsuid patch gives somewhat increases privileges (fsuid 0) which are not needed.

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

In particular it is doing file writes as uid 0.

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

So the risk with fsuid is things like - what if somehow the pid got reused and we happened to write to the uid map for another pid? Another aspect is that suddenly we're trusting /proc to be the real /proc. But the latter is true of all setuid binaries really.

The risk with dumpable is - now all of a sudden we have to be concerned about the fact that someone could ptrace us. And yes, we are in a user namespace...but in fact, your argument:

In fact any process from the parent user
namespace have ptrace access anyway, due to parent ns having
CAP_SYS_PTRACE (and all other caps) in the child ns.

appears to me to be wrong, because the dumpable flag isn't set, and we need that and the capability to ptrace. If I add in to your patch:

diff --git a/bubblewrap.c b/bubblewrap.c
index 5c7999b..06bc029 100644
--- a/bubblewrap.c
+++ b/bubblewrap.c
@@ -1652,11 +1652,14 @@ main (int    argc,

   if (is_privileged && opt_unshare_user)
     {
+      fprintf (stderr, "** BEFORE SETTING DUMPABLE; pid=%u **\n", getpid());
+      sleep (15);
       /* We have to be dumpable for the parent to be able to set the
          uid map for us. This enables ptracing for the child, but that
          is not really a major*/
       if (prctl (PR_SET_DUMPABLE, 1, 0, 0, 0) < 0)
         die_with_error ("prctl(PR_SET_DUMPABLE) failed");
+      fprintf (stderr, "** AFTER SETTING DUMPABLE **\n");

       /* Let parent continue to set the uid map */
       val = 1;

And try to strace before setting dumpable, I get EPERM as expected.

So the aspect of being in a user namespace obviously mitigates things a lot...but we need to consider risks such as having an open file descriptor outside of the namespace. I doubt we have anything open that's writable, but we'd have to be careful in the future.

If you're arguing that we can enable dumpable, it follows that everything else after that could be arbitrary user code, and we could do:

--- a/bubblewrap.c
+++ b/bubblewrap.c
@@ -1728,7 +1728,7 @@ main (int    argc,
   if (chdir ("/") != 0)
     die_with_error ("chdir / (base path)");

-  if (is_privileged)
+  if (is_privileged && !opt_unshare_user)
     {
       pid_t child;
       int privsep_sockets[2];

Or actually, hypothetically:

@@ -1807,7 +1807,8 @@ main (int    argc,
     die_with_error ("chdir /");

   /* Now we have everything we need CAP_SYS_ADMIN for, so drop it */
-  drop_caps ();
+  if (opt_unshare_user && !opt_retain_caps)
+    drop_caps ();

   if (opt_block_fd != -1)
     {

Which I think is where some of the patches in #101 are going. Basically, in the userns path (whether privileged or not) we rely solely on the kernel for security post-CLONE_NEWUSER.

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

Here's a good way to compare the patches - on kernels vulnerable to https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-3135 (the one we cite in README.md as an example userns vulnerability) - I'm pretty sure your patch exposes the vulnerability, since a caller could inject code to do the iptables syscalls, even though we don't expose that in the binary. But mine does not, because we drop CAP_NET_ADMIN (really all caps) in the namespace before executing the user code.

Basically, your patch makes bubblewrap's security equivalent to that of the full kernel user namespace feature set, whereas mine keeps things restricted to what we expose on the bwrap command line.

(But on the other hand, exposing the full userns feature set is where #101 is going)

from bubblewrap.

smcv avatar smcv commented on June 13, 2024

Basically, your patch makes bubblewrap's security equivalent to that of the full kernel user namespace feature set, whereas mine keeps things restricted to what we expose on the bwrap command line.
...
(But on the other hand, exposing the full userns feature set is where #101 is going)

This sounds like it's heading towards a sysadmin configuration mechanism for "how much exposure to kernel userns bugs am I willing to tolerate?"...

on kernels vulnerable to https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-3135 (the one we cite in README.md as an example userns vulnerability) - I'm pretty sure your patch exposes the vulnerability, since a caller could inject code to do the iptables syscalls, even though we don't expose that in the binary

This sounds to me like a good argument for taking the fsuid approach, at least short-term.

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

@cgwalters Your test case is wrong, because getpid() is cached so returns the old value
Try it again with this instead:
fprintf (stderr, "** BEFORE SETTING DUMPABLE; pid=%u real pid=%ld **\n", getpid(), syscall(SYS_getpid));
sleep (15);

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

So, yeah, i get your point, but even with the fsuid approach we are ptraceable.

However, i mentioned this to @eparis and eric biederman, and he said:

ebiederm: alex, eparis That corner case where we let user namespace caps override dumpable looks buggy. I am not certain how that got in there.

So, maybe this will change over time.

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

Looks like it changed in http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8409cca7056113bee3236cb6a8e4d8d4d1eef102

from bubblewrap.

alexlarsson avatar alexlarsson commented on June 13, 2024

Maybe the end result is that we just can't expose user namespaces securely (above full kernel access to user namespaces) at all?

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

OK, let's just paper over this rabbit hole for now and try to replace it after the release with something stronger. I'm fine going with #110 - I'll close #109 .

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

Closed via #110

from bubblewrap.

cgwalters avatar cgwalters commented on June 13, 2024

https://github.com/projectatomic/bubblewrap/releases/tag/v0.1.3

from bubblewrap.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.