lind-project / lind_project


Lind: Secure Lightweight Adaptive Isolation

Home Page: https://hub.docker.com/r/securesystemslab/lind

License: Apache License 2.0

lind_project's Issues

Comprehend nap syscall table

The NaCl app struct, or nap, contains the field syscall_table which is an array of NaClSyscallTableEntry structs. These entries contain syscall handlers.

Backtraces of syscalls show that this is used to trampoline the program to the appropriate syscall implementation. We'll need to make sense of the addresses and implementation here to enhance our caging system to properly allow/disallow or switch between syscall options.

NaCl Fork Overhead

Currently, the fork system call takes a huge chunk of time, overshadowing execution time for both small programs and large pipelines. Over 90% of fork's time is spent in the NaClCopyExecutionContext function, which copies context between processes, and roughly two-thirds of that function's time is spent in memmove calls.

This seems to be because we're losing the copy-on-write semantics that a native fork would have. Can we re-implement this to use copy-on-write?

Implement mmap

We'll need to implement this in SafePOSIX to get to the goal of a fully userspace implementation. This may be done alongside #18 but I believe that issue isn't blocking.

There is a Python version of mmap here that should be useful.

Proposed steps

Get memory location in NaCl and check bounds
Pass the pointer to Repy and use Python's mmap to map the fd to the memory location
We'll need to get the actual fd (not the SafePOSIX one) using the fileobj
Need to find a way to set the memory location with Python's mmap (perhaps using ctypes?)
For now, we'll let flags fall through to the kernel.

Update Wikis/README

We need better instructions on building, installing, and running Lind, as well as how the infrastructure works.

This will also serve as a good onboarding process.

As a first task, let's update the build, install, and run pages.

Increase robustness of test suite

Our current test suite, run through Travis, is minimal and tells us little beyond whether the project builds. We should integrate more robust testing into Travis.

NaCl FD efficiency

The latest updates, which make NaCl's FD allocation more POSIX-y, are sadly also O(N).

This includes:

Allocating the cagetable to all -1's instead of zeros
Searching for a new fd for allocation
Copying the fd table when creating a new cage

As Justin says, "the biggest efficiency gain we're going to get is when this finally works", but after that point this may be a place where we can make Lind more efficient.

Execve

Execve doesn't seem to work properly: it does not execute the requested binary.

Note: execv also seems to be broken here, because the default environment it's being sent is NULL, when it should be an array containing only NULL. Should be a quick fix.
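
A minimal sketch of the environment fix, assuming the caller may hand us a NULL envp. The names empty_env and normalize_envp are hypothetical and not taken from the Lind or NaCl sources; the point is only that an empty environment is a one-element array holding the NULL terminator, not a NULL pointer:

```c
#include <stddef.h>

/* An empty environment: an array whose only element is the terminator. */
static char *const empty_env[] = { NULL };

/* Hypothetical helper: substitute the empty environment when the caller
 * passed no environment at all, otherwise pass the caller's through. */
static char *const *normalize_envp(char *const *envp)
{
    return envp != NULL ? envp : empty_env;
}
```

An execv wrapper could then forward normalize_envp(envp) instead of a bare NULL.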

Exit Syscall

We're not actually intercepting exit() in SafePOSIX. This was causing some bugs in my pipe tests, because exit should close fds, including pipe-ends.

It's also adding overhead by not cleaning up the now dead data structures.

We should add interception to NaClSysExit, add RPC, and create a function in SafePOSIX which calls close on all open fd's and cleans up the data structures.
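
As a rough illustration of the cleanup step, here is a sketch in C rather than in SafePOSIX's own code. The struct cage_fds layout, CAGE_FD_MAX, and FD_UNUSED are all assumptions made for the example; the real bookkeeping may look quite different:

```c
#include <unistd.h>

#define CAGE_FD_MAX 16      /* hypothetical table size */
#define FD_UNUSED   (-1)    /* marks a free slot */

/* Hypothetical per-cage table mapping cage fds to host fds. */
struct cage_fds {
    int host_fd[CAGE_FD_MAX];
};

/* Close every descriptor the exiting cage still holds (pipe ends
 * included) and reset the table. Returns the number closed. */
static int cage_exit_cleanup(struct cage_fds *cage)
{
    int closed = 0;
    for (int fd = 0; fd < CAGE_FD_MAX; fd++) {
        if (cage->host_fd[fd] != FD_UNUSED) {
            close(cage->host_fd[fd]);
            cage->host_fd[fd] = FD_UNUSED;
            closed++;
        }
    }
    return closed;
}
```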

Recording Lind Metrics

We'll need some metrics comparing running a LAMP stack through Lind versus natively.

This issue has been created to initially brainstorm what data we want to collect.

Lind Input

There's no easy way to configure Lind to receive external input. By default, SafePOSIX has stdin set to return empty strings on read. We can replace the default keyboard input using the dup family, but right now that only works from inside the program.

For example, if we wanted to run bash and supply it with instructions via commands.txt we would presumably run lind /bash.nexe < commands.txt. This doesn't actually supply the program with any input.

It redirects commands.txt to the stdin of NaCl's sel_ldr executable, which is running the program. sel_ldr doesn't read from stdin in the first place, so nothing happens. SafePOSIX still has stdin set to keyboard input and will return empty strings for any read of stdin in the program.

The two fixes here are:

require lind programs to have the user configure stdin in program. (how it is now, but bad)
make NaCl take arguments that will set SafePOSIX stdin from a supplied file (seems pretty good, shouldn't be too hard to implement)

Update README

Update the README to better reflect how to build and run Lind. Use the wiki where useful.

Sort

The sort program from coreutils isn't working when run through bash:

Lind v1.0-rc2
Opening file system...
done.
[14789,1890141248:11:23:06.982167] BYPASSING ALL ACL CHECKS
/bin/sort: couldn't create temporary file: /tmp/sortEW0M8V: Invalid argument

It seems to have problems making a temp file. It would be worth looking at other programs that might try to do this.

ls output formatting

Currently running ls through Lind gives us something like

�.
�..
�bin
�dev
�lib
dc.nexe
forkexecarg.nexe
forkexecls.nexe
forkexecv.nexe
hello.nexe
Persisting metadata: ...
Done persisting metadata.
Terminated

The formatting runs into even more problems when additional flags are introduced. I think some of the flags may be due to unimplemented syscalls. OTOH the general formatting (rogue nonsense characters and extra newlines) is almost certainly due to the weird stdout buffering discussed here.

Testing Bash Pipelines + Metrics

Now that bash is running reasonably we'd like to test it on a number of pipelines and compare our metrics to native bash runs.

To accomplish this we should:

Create a list of typical bash pipeline commands, e.g. ls | cat or cat file | grep -v a | sort -r.
Create a script that runs these commands natively and with Lind, and stores runtime metrics
Use this script to collect data we can analyze
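
A rough sketch of the timing part of such a script, written in C. time_command is a hypothetical helper; it runs a command line through the shell with system() and reports wall-clock time, so the same function could be pointed at a native pipeline or at whatever command line launches that pipeline under Lind. It uses the C11 timespec_get call, which is wall-clock rather than monotonic, so treat the numbers as approximate:

```c
#include <stdlib.h>
#include <time.h>

/* Run cmd through the shell and return its wall-clock time in seconds.
 * Returns -1.0 if the clock is unavailable or the command did not
 * exit with status 0. */
static double time_command(const char *cmd)
{
    struct timespec start, end;

    if (timespec_get(&start, TIME_UTC) != TIME_UTC)
        return -1.0;
    int status = system(cmd);
    if (timespec_get(&end, TIME_UTC) != TIME_UTC)
        return -1.0;
    if (status != 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_command("ls | cat") several times and recording each result would give a first data set to compare against the Lind runs.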

@JustinCappos How many pipeline commands do you think we need to collect a reasonable data set?

Eliminating NaCl FD's

This was suggested a few years ago here nacl_repy:5

We talked in the 8/29/19 meeting about removing NaCl FD's entirely. This could be helpful when we're implementing things such as mmap.

I need to do some research about NaCl's dependency on its FD system.

Review/Test fork()

Fork was implemented and merged last year, but we're not sure about how robust it is.

PRs:
Lind-Project/native_client#5
Lind-Project/native_client#6

Review the code changes and test the implementation to ensure that all of this is correct.

FD Allocation

Right now, we have a very simple way of allocating FD's in NaCl, where we just purely increment the Max FD.

However, when programs like Bash request a larger FD (e.g. 255), the max jumps considerably. POSIX requires that we use the lowest available FD when no specific FD is requested, so we need to create some ticketing system.

Loosely related: We should check that open doesn't create an FD over our FD_MAX number.
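
One possible shape for the ticketing system, as a sketch only. The fd_table struct, the FD_MAX value of 1024, and the function names are assumptions for the example, not existing NaCl code. Allocation scans for the lowest free slot, so an explicit request for a high FD does not move where the next ordinary allocation lands, and nothing at or above FD_MAX can be handed out:

```c
#include <errno.h>
#include <stdbool.h>

#define FD_MAX 1024   /* hypothetical limit */

struct fd_table {
    bool in_use[FD_MAX];
};

/* Return the lowest free descriptor >= min_fd, or a negative errno. */
static int fd_alloc_lowest(struct fd_table *t, int min_fd)
{
    if (min_fd < 0 || min_fd >= FD_MAX)
        return -EINVAL;
    for (int fd = min_fd; fd < FD_MAX; fd++) {
        if (!t->in_use[fd]) {
            t->in_use[fd] = true;
            return fd;
        }
    }
    return -EMFILE;
}

/* Claim one specific descriptor, e.g. when a program asks for 255. */
static int fd_claim(struct fd_table *t, int fd)
{
    if (fd < 0 || fd >= FD_MAX)
        return -EBADF;
    t->in_use[fd] = true;
    return fd;
}

/* Mark a descriptor as free again so it can be reused. */
static void fd_release(struct fd_table *t, int fd)
{
    if (fd >= 0 && fd < FD_MAX)
        t->in_use[fd] = false;
}
```

The scan is O(N) in the table size; a cached "lowest possibly free" hint would be one way to cut that down later.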

Update syscall mappings in wiki

Per talk with Brendan today.

It would be nice to profile Lind to trace what parts of Lind each syscall is touching. A program that runs through all of the syscalls, combined with appropriate logging verbosity (such as in this issue, but also in NaCl) would be helpful.

For each syscall it would be good to know which of the following scenarios it's taking:

Routing from the IRT directly to SafePOSIX
Routing from the IRT, through some NaCl functions, then to SafePOSIX
Routing from the IRT, into NaCl, then to the Kernel
Routing from the IRT, into NaCl (no Kernel)

STD Streams in SafePOSIX

Our implementation of std streams in SafePOSIX currently relies on them having their own inodes. This isn't really the case in POSIX and could also give us problems down the line.

Opening up this thread for discussion.

Bash

We want to get Bash up and running in Lind.

Yiwen showed me how it was being used before, and I can compile and run bash (trivially I can have bash --version run and show the version info). He also showed me how to supply a script as arguments.

These are good first steps, but running scripts is failing. Some minor debugging is showing it failing in dup2 which is being supplied with invalid file descriptors.

Implement pipe

Setting up a user-space pipe implementation will give us robustness and, hopefully, better efficiency by eliminating the need for context switches.

Cross reference to Lind-Project/nacl_repy#11
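
To make the idea concrete, here is a sketch of the core of a user-space pipe as a fixed-size ring buffer. It is written in C for illustration and is not the nacl_repy implementation; struct upipe, PIPE_CAPACITY, and both function names are hypothetical. It is single-threaded and never blocks: a full pipe yields a short write and an empty pipe yields a zero-byte read, so locking and blocking semantics would still need to be layered on top.

```c
#include <stddef.h>

#define PIPE_CAPACITY 4096   /* hypothetical buffer size */

struct upipe {
    unsigned char data[PIPE_CAPACITY];
    size_t head;    /* index of the next byte to read */
    size_t count;   /* number of bytes currently buffered */
};

/* Copy up to len bytes into the pipe; returns bytes actually written. */
static size_t upipe_write(struct upipe *p, const void *buf, size_t len)
{
    const unsigned char *src = buf;
    size_t written = 0;
    while (written < len && p->count < PIPE_CAPACITY) {
        size_t tail = (p->head + p->count) % PIPE_CAPACITY;
        p->data[tail] = src[written++];
        p->count++;
    }
    return written;
}

/* Copy up to len bytes out of the pipe; returns bytes actually read. */
static size_t upipe_read(struct upipe *p, void *buf, size_t len)
{
    unsigned char *dst = buf;
    size_t nread = 0;
    while (nread < len && p->count > 0) {
        dst[nread++] = p->data[p->head];
        p->head = (p->head + 1) % PIPE_CAPACITY;
        p->count--;
    }
    return nread;
}
```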

Create Low/Mid-Level Architectural Diagram

Considering the vast code base, it would be incredibly helpful to have a lower level map of how the components interact with each other as programs are executed. There are some higher level diagrams in the papers etc, but having explicit references to files/functions would serve as a handy reference. It would also significantly reduce the on-boarding curve for future collaborators.

Debug nginx

We need to be able to easily build nginx like we can for bash and coreutils.

I've transferred the bootstrap_nacl script over to the nginx source code but I'm getting some compilation issues.

Unify All System Calls in SafePOSIX

Currently we have some system calls in SafePOSIX, but others have end points in NaCl.

We'd like to have all system calls pass through to SafePOSIX regardless.

A good starting point is to find which syscalls aren't currently implemented in SafePOSIX.

Add test cases

The test suite has now been merged into master, and runs through make as well as being handled by Travis.

Right now only three tests are running. Let's add in the rest of the tests from the tests/test_cases folder.

Re-organize Lind Project Structure

Error building on Lind Server

ENVIRONMENT USAGE REPORT

4 dbg-linux-x86-64
4 nacl-x86-64-pic-glibc
Done building NaCl
Building Repy in to /home/lind/lind_project/lind/repy
'./seattlelib/xmlrpc_client.repy' -> '/home/lind/lind_project/lind/repy/repy/xmlrpc_client.repy'
'./seattlelib/xmlrpc_common.repy' -> '/home/lind/lind_project/lind/repy/repy/xmlrpc_common.repy'
'./seattlelib/xmlrpc_server.repy' -> '/home/lind/lind_project/lind/repy/repy/xmlrpc_server.repy'
Done building Repy in "/home/lind/lind_project/lind/repy/repy"
removed './lind_server.mix.new'

It's the same old story; boy meets beer, boy drinks beer... boy gets
another beer.
-- Cheers

Copying component.h header to glibc:
Building glibc
make[1]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
make[1]: Entering directory '/home/lind/lind_project/src/native_client/tools'
Makefile:75: *** No suitable make binary found.. Stop.
make[1]: Leaving directory '/home/lind/lind_project/src/native_client/tools'

[./src/mklind:551] Error: [function: build_glibc()] [arguments: (nil)].

It took 55 seconds

All done.
make: *** [Makefile:8: all] Error 1

ERROR: Service 'prebuiltsdk' failed to build: The command '/bin/sh -c ./src/mklind -q glibc' returned a non-zero code: 1

If I try to build Lind with docker-compose build, I get this error:

In file included from /home/lind/lind_project/src/native_client/tools/SRC/gcc/gcc/tree-dump.h:26,
                 from /home/lind/lind_project/src/native_client/tools/SRC/gcc/gcc/tree-mudflap.c:40:
/home/lind/lind_project/src/native_client/tools/SRC/gcc/gcc/tree-pass.h:102:5: note: enum constant defined here
  102 |     GIMPLE_PASS,
      |     ^~~~~~~~~~~
make[3]: *** [Makefile:949: tree-mudflap.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [Makefile:4858: all-gcc] Error 2
make[1]: *** [Makefile:565: BUILD/stamp-x86_64-nacl-pregcc-standalone] Error 2
make: *** [Makefile:957: build-with-glibc] Error 2

[./src/mklind:552] Error: [function: build_glibc()] [arguments: (nil)].


It took 274 seconds

All done.
ERROR: Service 'prebuiltsdk' failed to build: The command '/bin/sh -c ./src/mklind -q glibc' returned a non-zero code: 1

I will get the whole log of the error. I had this error on Docker on both Ubuntu and Fedora.

Also, is docker-compose the right way to build Lind? Do I need to run ./mklind first?

Dependency on old make and texinfo versions

Currently we depend on older versions of make and texinfo, which we have had to vendor. This is due to build errors in the glibc part of our toolchain when using more recent versions. At some point it would be helpful to patch glibc or find some other, better solution.

mmap bug

There seems to be a bug when mmap returns with an error from SafePOSIX. This stems from NaCl erroneously bitmasking -1 (0xffffffff), which creates an invalid NaCl address.

This is only affecting some programs, but sadly that includes most of the large pipeline bash runs, so we'll need to prioritize this fix.
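
The failure mode can be sketched in isolation. The functions below are hypothetical and not the NaCl code; they assume the error comes back as a small negative value (Linux-style, in the range -4095 to -1) and that a successful result is a 32-bit offset into the sandbox. The unchecked version turns -1 into an offset of 0xffffffff, while the checked version tests for an error before doing any address arithmetic:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed convention: errors are returned as values in [-4095, -1]. */
static bool is_error_result(int32_t ret)
{
    return ret < 0 && ret >= -4095;
}

/* Buggy pattern: every result is treated as a 32-bit sandbox offset,
 * so -1 becomes base + 0xffffffff. */
static uintptr_t translate_unchecked(uintptr_t base, int32_t ret)
{
    return base + (uint32_t)ret;
}

/* Checked pattern: report failure and leave *sys_addr untouched when
 * the result is an error code rather than an offset. */
static bool translate_checked(uintptr_t base, int32_t ret,
                              uintptr_t *sys_addr)
{
    if (is_error_result(ret))
        return false;
    *sys_addr = base + (uint32_t)ret;
    return true;
}
```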

Streamline build process

When we separated make download from make all we removed some lines that set all of our repos to the develop branch.

It would be nice to have all the repos set to develop at the end of make download again so we don't have to revert them manually.

Run LAMP stack via Lind

Our goal is to run a LAMP (Linux/Apache/MySQL/Python) web-service stack through Lind to demonstrate that Lind can operate programs using its user-space implementations correctly.

We need to:

Run these applications without unexpected errors.
Inspect operation to make sure system calls are being properly routed.

Error with docker-compose on Ubuntu

If I try to run docker-compose up on Docker/Ubuntu 18.04 (with the devicemapper storage driver), I get this error:

Error: Command /home/lind/lind_project/venv/bin/python2 native_client/build/download_toolchains.py --keep --arm-untrusted native_client/TOOL_REVISIONS returned non-zero exit status 1 in /home/lind/lind_project/src

I'm running Ubuntu in a VM on my FreeBSD server.

Inside the container, I tried to run /home/lind/lind_project/venv/bin/python2 native_client/build/download_toolchains.py --keep --arm-untrusted native_client/TOOL_REVISIONS and I get an out of space error.

The workaround is to increase dm.basesize to larger than 10GB (source).

Repy Memory Checking

At this point there isn't really a good way to check if the memory being allocated in SafePOSIX is contained in the NaCl bounded memory address space. It would help us prove correctness if we could manage this.
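
The check itself is simple once the cage's base address and size are known; getting those values to the Repy side is the open question. As a sketch, with a hypothetical struct region standing in for however the bounds end up being described, an overflow-safe containment test looks like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one cage's address space. */
struct region {
    uintptr_t base;
    size_t size;
};

/* True if [addr, addr + len) lies entirely inside the region.
 * Written with subtractions only, so addr + len can never wrap. */
static bool range_in_region(const struct region *r, uintptr_t addr,
                            size_t len)
{
    if (addr < r->base)
        return false;
    uintptr_t offset = addr - r->base;
    if (offset > r->size)
        return false;
    return len <= r->size - offset;
}
```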

Debug Postgres

SQLite is lightweight and easy to configure. I think this is a good choice for the "M" portion of our source code.

We need to add the source code to our applications folder and try to bootstrap it.

Update containers

Containers need to be updated to have default paths set for lind and lindsh, including having rlwrap installed.

These should be put up on dockerhub.

unlink error on fork

The sample program below fails when run through Lind, and it shouldn't. This seems to be due to an error in unlink.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
int main(){
  int fd = open("test.txt", O_RDWR | O_CREAT, 0777);
  char buf[] = "This is a test of the wonderful fork call in lind.";
  write(fd, buf, sizeof(buf));
  close(fd);
  fd = open("test.txt", O_RDWR);
  switch(fork()){
    case -1:
      puts("fork failed");
      break;
    case 0:
      close(fd);
      break;
    default:
      sleep(1);
      char newbuf[100];
      int numread = read(fd, newbuf, 100);
      printf("%s\n",newbuf);
      printf("read %d chars\n",numread);
      unlink("test.txt");
      close(fd);
      break;
  }
  return 0;
}

libc.so versioning discrepancy on build

Currently on a clean build, trying to run a nacl-gcc compiled nexe file will yield the following error:
error while loading shared libraries: /lib/glibc/tls/libc.so.b7b14f88: cannot read file data: Error 9

The LindFS is populated with /lib/glibc/libc.so.990e7c45 and /lib/glibc/32/libc.so.990e7c45 which lines up with the commit version in the up to date version at: https://github.com/Lind-Project/native_client/blob/lind/tools/REVISIONS

However, the compiled nexe's are requesting a libc.so commit that is seemingly random, from a short period in 2013: Lind-Project/native_client@2fb46f0#diff-f55637873e2d1ef5a5cee5a266a7d956

This needs to be traced to determine why they are requesting that version. I can't find any hard-coded references to that commit anywhere in the organization.

Pipe "Edge" Cases

Now that we have pipe related fd's, we need to make sure syscalls that take an fd int do the right thing with them.

These are handled by the basic pipe protocol
int lind_pread(int fd, void* buf, int count, off_t offset, int cageid);
int lind_pwrite(int fd, const void *buf, int count, off_t offset, int cageid);
int lind_close (int fd, int cageid);
int lind_read (int fd, int size, void *buf, int cageid);
int lind_write (int fd, size_t count, const void *buf, int cageid);

I believe these will need to be specifically handled
int lind_lseek (off_t offset, int fd, int whence, int cageid);
int lind_fxstat (int fd, int version, struct lind_stat *buf, int cageid);
int lind_fstatfs (int fd, struct lind_statfs *buf);
int lind_dup (int oldfd, int cageid);
int lind_dup2 (int oldfd, int newfd);
int lind_getdents (int fd, size_t nbytes, char *buf, int cageid);
int lind_fcntl_get (int fd, int cmd);
int lind_fcntl_set (int fd, int cmd, long set_op);
int lind_flock (int fd, int operation);

These socket related ones I believe already are handled by checking IS_SOCK
int lind_bind (int sockfd, socklen_t addrlen, const struct sockaddr *addr);
int lind_send (int sockfd, size_t len, int flags, const void *buf);
int lind_recv (int sockfd, size_t len, int flags, void *buf);
int lind_connect (int sockfd, socklen_t addrlen, const struct sockaddr *src_addr);
int lind_listen (int sockfd, int backlog);
int lind_sendto (int sockfd, size_t len, int flags, socklen_t addrlen, const struct sockaddr_in *dest_addr, const void *buf);
int lind_accept (int sockfd, socklen_t addrlen);
int lind_getpeername (int sockfd, socklen_t addrlen_in, __SOCKADDR_ARG addr, socklen_t * addrlen_out);
int lind_setsockopt (int sockfd, int level, int optname, socklen_t optlen, const void *optval);
int lind_getsockopt (int sockfd, int level, int optname, socklen_t optlen, void *optval);
int lind_shutdown (int sockfd, int how);
int lind_select (int nfds, fd_set * readfds, fd_set * writefds, fd_set * exceptfds, struct timeval *timeout, struct select_results *result);
int lind_recvfrom (int sockfd, size_t len, int flags, socklen_t addrlen, socklen_t * addrlen_out, void *buf, struct sockaddr *src_addr);
int lind_poll (int nfds, int timeout, struct pollfd *fds_in, struct pollfd *fds_out);

build_glibc() has nil argument

I'm trying to compile your project on Ubuntu 16.04 (I tried 18.04 as well). I followed exactly the instructions in your README file; however, it gives the following errors

config.status: creating po/Makefile.in
config.status: executing depfiles commands
config.status: executing libtool commands
config.status: executing default-1 commands
config.status: executing bfd_stdint.h commands
config.status: executing default commands
make[6]: Nothing to be done for 'info'.
make[6]: Leaving directory '/home/fma/lind_project/native_client/tools/BUILD/build-binutils-x86_64-nacl/bfd/po'
make[6]: Entering directory '/home/fma/lind_project/native_client/tools/BUILD/build-binutils-x86_64-nacl/bfd'
make[6]: Nothing to be done for 'info-am'.
make[6]: Leaving directory '/home/fma/lind_project/native_client/tools/BUILD/build-binutils-x86_64-nacl/bfd'
Makefile:1544: recipe for target 'info-recursive' failed
make[5]: *** [info-recursive] Error 1
make[5]: Leaving directory '/home/fma/lind_project/native_client/tools/BUILD/build-binutils-x86_64-nacl/bfd'
Makefile:3278: recipe for target 'all-bfd' failed
make[4]: *** [all-bfd] Error 2
make[4]: Leaving directory '/home/fma/lind_project/native_client/tools/BUILD/build-binutils-x86_64-nacl'
Makefile:828: recipe for target 'all' failed
make[3]: *** [all] Error 2
make[3]: Leaving directory '/home/fma/lind_project/native_client/tools/BUILD/build-binutils-x86_64-nacl'
Makefile:372: recipe for target 'BUILD/stamp-x86_64-nacl-binutils' failed
make[2]: *** [BUILD/stamp-x86_64-nacl-binutils] Error 2
make[2]: Leaving directory '/home/fma/lind_project/native_client/tools'
Makefile:954: recipe for target 'build-with-glibc' failed
make[1]: *** [build-with-glibc] Error 2
make[1]: Leaving directory '/home/fma/lind_project/native_client/tools'

[./mklind:552] Error: [function: build_glibc()] [arguments: (nil)].


It took 141 seconds

All done.
Makefile:8: recipe for target 'all' failed
make: *** [all] Error 1

It seems to be compiling for all-bsd? I don't see how I can specify the architecture. I already have libglib2.0-dev installed, version 2.48.2-0ubuntu4.1.
Here are the exact commands I ran (I already installed virtualenv and virtualenvwrapper)

git clone https://github.com/Lind-Project/lind_project.git
mv lind_project ~/
cd ~/lind_project
make  # choose 1 for all

Then I get the error after a while. Did I miss anything? Thanks.

Fork implementation not working

It seems that fork() was added as of native_client's PR#6, but it isn't currently working.

Compiled programs that contain fork() yield the error:
warning: warning: fork is not implemented and will always fail

Threads/Exec/Wait Refactor

I've started a document detailing the overall plan for the refactor. So far I've written about the thread infrastructure.

This document can be found here.

Cage Information into SafePOSIX

This may be a longer term goal, and for now is set for after our current LAMP goals.

We currently keep most of our bookkeeping in each cage's nap structure. This includes references and bookkeeping variables for the cage's parent and children.

One hope is to eventually move all this information to SafePOSIX. Doing this could enable us to implement wait() in SafePOSIX, as well as move much of fork() and exec() besides the thread spawning necessities.

This pairs with the goal of moving system-level IDs to SafePOSIX (pid, uid, gid, etc.)

dup

We'll need the dup family of syscalls to work for the Bash/LAMP milestone. There are some implementations in place, but they need to be further investigated. These all need to move to the SafePOSIX side and work with all fds.

Debug Python

I'd like to build python so we can use flask for a web app. I think this is a good choice for the "P" portion of our source code.

We need to add the source code to our applications folder and try to bootstrap it.

Reorganize 'tests' folder

The tests folder has been a mess forever and isn't intuitive. I think it's a good idea to clean this up and merge it to master with the working bash build.

Make sense of submodules

This will need to be solved for the reorg branch PR. Currently the submodules usually point to develop, which is usually behind or ahead of master. Making sense of what should be used will be necessary to complete updates.

log() adds extra space

While researching an issue for pipe, I found an error that seems to be pervasive in read. It seems like an extra character is always being copied over into the buffer, which usually appears as a space.

For example reading the string "EXAMPLE" byte by byte, and printing those bytes will end up being formatted as:
E X A M P L E

It seems like this error is most likely due to the COPY_DATA macro in lind_platform.c which handles the RPC reception for read(), however I'm having trouble tracking down the root cause.

Update Process Bookkeeping

Per #10 and #8 and several discussions today.

We need to update Lind so the Repy side can keep track of separate processes. This will allow us to safely handle files between processes and implement pipes.

To do this we need to:

Update the RPC calls in NaCl so they transmit the Cage-ID. These will need to be traced back to where we have nap access in nacl_syscall_common.c, and the RPC needs to be modified in lind_platform.c
Modify the fork implementation in nacl_syscall_common.c so that it calls a new fork RPC in lind_platform.c which transmits parent and child cage ID's.
Parse the RPC on the Repy side and handle that properly. The best way would be to change the flow so we can set our process "context" by cage ID. Forking, and copying FD's will have to be handled somewhat independently.
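
The receiving side would live in Repy, but the bookkeeping idea can be sketched in C. Everything here is hypothetical (the cage_ctx struct, the table sizes, the function names): each cage ID selects its own context, and the fork RPC handler copies the parent's fd table into the child's slot. Note this is a shallow copy of fd numbers only; sharing file offsets between parent and child the way POSIX does would need more than this.

```c
#define MAX_CAGES    8    /* hypothetical number of cage slots */
#define CTX_FD_SLOTS 16   /* hypothetical per-cage fd table size */
#define CTX_FD_FREE  (-1)

struct cage_ctx {
    int active;
    int parent_id;
    int fds[CTX_FD_SLOTS];
};

static struct cage_ctx cages[MAX_CAGES];

/* Select the context for a cage ID, or NULL if it is not in use. */
static struct cage_ctx *cage_lookup(int cage_id)
{
    if (cage_id < 0 || cage_id >= MAX_CAGES || !cages[cage_id].active)
        return 0;
    return &cages[cage_id];
}

/* Create a fresh context with an empty fd table. */
static int cage_init(int cage_id)
{
    if (cage_id < 0 || cage_id >= MAX_CAGES || cages[cage_id].active)
        return -1;
    cages[cage_id].active = 1;
    cages[cage_id].parent_id = -1;
    for (int i = 0; i < CTX_FD_SLOTS; i++)
        cages[cage_id].fds[i] = CTX_FD_FREE;
    return 0;
}

/* Handle a fork RPC carrying the parent and child cage IDs. */
static int cage_fork(int parent_id, int child_id)
{
    struct cage_ctx *parent = cage_lookup(parent_id);
    if (!parent || child_id < 0 || child_id >= MAX_CAGES ||
        cages[child_id].active)
        return -1;
    cages[child_id] = *parent;           /* copies the fd table */
    cages[child_id].parent_id = parent_id;
    return 0;
}
```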

Python virtualenv failing during make process

I'm getting an error now while trying to make Lind in a new container via dockerhub

Installing collected packages: virtualenv
  Attempting uninstall: virtualenv
    Found existing installation: virtualenv 20.0.17
    Uninstalling virtualenv-20.0.17:
      Successfully uninstalled virtualenv-20.0.17
Successfully installed virtualenv-20.0.25
ImportError: No module named via_app_data.via_app_data
Command `python2 -m virtualenv ./venv` failed

This is happening reproducibly from multiple container downloads, and also is what probably caused Travis to fail earlier this week unexpectedly.

Something is going on with python/pip/virtualenv.

Luckily this was solved by uninstalling and reinstalling python2, but this needs to be propagated to the images somehow so it doesn't continue to happen.