reberhardt7 / cplayground

License: GNU General Public License v3.0

JavaScript 5.29% HTML 0.40% Dockerfile 0.30% C++ 0.06% TypeScript 79.46% Python 1.70% Makefile 0.08% C 4.71% Shell 1.06% SCSS 6.94%


cplayground's Issues

Prototype some visualizations given dummy data

Given some dummy data (a list of processes, a list of open file descriptors for each process, and info about what they point to), we need to build a simple single-page visualization as a proof of concept.

Some pieces to this issue:

Figure out the structure of the dummy data
Figure out exactly what the visualization should look like
Decide what libraries to use
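As a starting point for the first item, here is one possible shape for the dummy data, sketched in TypeScript. Every name here (`OpenFile`, `FileDescriptor`, `Process`, `DummyData`, `referencesTo`) is made up for illustration and not taken from the cplayground codebase.

```typescript
// Hypothetical shapes for the dummy data; nothing here mirrors real cplayground types.
interface OpenFile {
  id: string; // identifies one open file table entry
  target: string; // what the entry points to, e.g. a path or "pipe"
  mode: "r" | "w" | "rw";
}

interface FileDescriptor {
  fd: number;
  openFileId: string; // refers to OpenFile.id
}

interface Process {
  pid: number;
  command: string;
  fds: FileDescriptor[];
}

interface DummyData {
  processes: Process[];
  openFiles: OpenFile[];
}

const dummyData: DummyData = {
  processes: [
    { pid: 10, command: "./parent", fds: [{ fd: 0, openFileId: "tty" }, { fd: 3, openFileId: "pipe-w" }] },
    { pid: 11, command: "./child", fds: [{ fd: 0, openFileId: "pipe-r" }, { fd: 1, openFileId: "tty" }] },
  ],
  openFiles: [
    { id: "tty", target: "/dev/pts/0", mode: "rw" },
    { id: "pipe-r", target: "pipe", mode: "r" },
    { id: "pipe-w", target: "pipe", mode: "w" },
  ],
};

// Lists the "pid:fd" pairs pointing at one open file, i.e. the arrows a diagram would draw.
function referencesTo(data: DummyData, openFileId: string): string[] {
  const refs: string[] = [];
  for (const proc of data.processes) {
    for (const desc of proc.fds) {
      if (desc.openFileId === openFileId) {
        refs.push(`${proc.pid}:${desc.fd}`);
      }
    }
  }
  return refs;
}
```

Keeping open files in their own list (rather than nesting them under each process) makes it possible to represent two descriptors that share one entry, which the visualization will probably need.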

Add license to repository

This repository doesn't have a license file, so it's unclear what terms others can use this project under or what terms contributors are contributing their work under.

Per discussions in the development Slack, not having a license is simply an oversight; according to Choose A License, not having one is probably not what we want.

I'm creating this issue to track the to-do so it doesn't get lost.

As for the actual license, I have a slight preference for copyleft for non-libraries, especially something educational like this, but I'm fine with anything FOSS. This is just my 2 cents; @reberhardt7 should make this call.

Cplayground needs tests!!

Cplayground is badly in need of automated tests. I've gotten by so far because this has been a one-person operation of sufficiently small scope, but it's becoming too complex to manage, and lack of tests makes it hard for other people to contribute.

I have put this off for so long mainly because I'm not sure how to test such a project with so many moving pieces. #21 will help a lot; at least then it will be easy to spin up a testing environment that we can send requests to or run end-to-end tests on.

However, any kind of tests will help. Even simple frontend or node unit tests will be very useful; I just haven't had time to write them. If you're interested, I have a list of functionality that would be good to test.

cplayground.com is down

Hi,
When would it be possible to get it back up?

Add type safety for database queries

We're currently making database queries using the mysql library in db.ts. However, we've had bugs in the past because of simple type issues, since no type checking is done for passing variables as query parameters, or for processing data returned from the database.

To help cut down on these simple mistakes, it would be really helpful to use a typed query builder for db queries. This library seems pretty good from a quick glance: https://github.com/ggmod/type-sql

Support for direct ad-hoc links that embed code

The Rust playground provides a feature where you can link to their playground with code directly embedded in the URL. Inspecting and visiting this link should provide a basic demonstration of how these links work.

These ad-hoc links are immensely useful when learning and teaching Rust and I think I and other cplayground users would benefit from similar functionality. Would there be any issue in adding support for direct code linking to this project?
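To make the request concrete, here is a rough sketch of how code could be packed into and read back out of a URL. The `code` query parameter and both function names are assumptions of mine; cplayground does not define them, and the Rust playground's actual link format may differ.

```typescript
// Hypothetical query parameter name; not something cplayground currently supports.
const CODE_PARAM = "code";

// Builds a link whose query string carries the program source.
function buildCodeLink(baseUrl: string, code: string): string {
  return `${baseUrl}?${CODE_PARAM}=${encodeURIComponent(code)}`;
}

// Recovers the program source from such a link, or returns null if it is absent or malformed.
function extractCode(url: string): string | null {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) return null;
  const query = url.slice(queryStart + 1).split("#")[0];
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    if (key !== CODE_PARAM) continue;
    try {
      return decodeURIComponent(eq === -1 ? "" : pair.slice(eq + 1));
    } catch {
      return null; // malformed percent-encoding
    }
  }
  return null;
}
```

One thing to keep in mind is that URLs have practical length limits, so this would only suit short programs.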

Change background on hover for debug control buttons

There are two sets of debugger controls: the controls in the Processes tab, and the inline controls:

When you mouse over a button in these controls, the cursor changes to the pointing hand to indicate that it's a button, but that's the only indication that these controls are clickable. It may help usability to also change the button background on hover. (We do that for basically every other clickable element in cplayground right now.)

Creating online working area

Can we add a way to save code? That way we could reach our code from anywhere, and it would be an online working environment :)

Store program output as a blob/buffer instead of a varchar/string

Program output is being logged to the database as a varchar. However, this may cause problems since programs can output arbitrary byte sequences. We should change the output type in the server code to be a Buffer (instead of a string), and we should add a database migration to store this output as a blob.
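A small Node sketch of why a string is lossy here: bytes that are not valid UTF-8 get replaced during decoding, so the original output cannot be recovered. The helper name `survivesUtf8RoundTrip` is made up for this illustration.

```typescript
import { Buffer } from "buffer";

// Returns true only if decoding to a UTF-8 string and re-encoding gives back the same bytes.
function survivesUtf8RoundTrip(output: Buffer): boolean {
  const asString = output.toString("utf8");
  return Buffer.from(asString, "utf8").equals(output);
}

// Plain text survives the trip through a string.
const textOutput = Buffer.from("hello\n", "utf8");

// 0xff and 0xfe are never valid in UTF-8, so decoding replaces them with U+FFFD.
const binaryOutput = Buffer.from([0x68, 0x69, 0xff, 0xfe]);

const textIsSafe = survivesUtf8RoundTrip(textOutput);
const binaryIsSafe = survivesUtf8RoundTrip(binaryOutput);
```

Keeping the output as a Buffer end to end avoids this, as long as the database column is also a binary type.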

Tech debt: replace synchronous file operations with async versions

There were a few cases where I was in a hurry to get things working and called functions like fs.readFileSync. We should replace those with their asynchronous counterparts to support more clients using cplayground concurrently.
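For reference, the replacement is usually mechanical. This sketch uses Node's promise-based `fs` API; `readTextFile` is a hypothetical helper, not a function in the cplayground server.

```typescript
import { promises as fs } from "fs";

// Hypothetical helper: reads a text file without blocking the event loop,
// so other clients' requests can be served while the read is in flight.
async function readTextFile(path: string): Promise<string> {
  return fs.readFile(path, "utf8");
}
```

Callers then need to `await` the result (or chain `.then`), which means the change tends to ripple upward: any function that called the synchronous version has to become `async` too.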

Figure out how to determine when two fds reference the same open file table entry

Right now, we can get a fair amount of useful information from /proc/*/fd/ and from lsof. However, if process A and process B have a file descriptor pointing to file X, we can't tell if that's because they are sharing a file table entry (e.g. process B forked from process A after A opened the file) or because they opened the same file independently. We can't reconstruct the open file table unless we can figure out when two fds are aliases of each other.

Show zombie processes in debugger

The debugger has a pane that lists running processes:

Say the user forks off a child process; the child process exits, but the parent process is still running and has not yet called waitpid() on the child. The child process is then a zombie process; it still exists, even though it has terminated. As such, we should list that zombie process in the process listing. (This is important because cplayground is used in an educational setting where we help students new to multiprocessing learn how to reap processes and avoid resource leaks.)

The data for the process listing comes from two places: /proc/cplayground (a file generated by our kernel module) and gdb. I am pretty sure /proc/cplayground does not currently include zombie processes, so we'll have to change that. (I believe this skips over zombies so that we can get the process namespace. If true, we need some other way to get the process's PID namespace. But this code is rusty in my head.) Then, we'll need to change the code that integrates the procfile data with gdb data. Right now, it omits processes that don't show up in gdb, but we wouldn't want to do that because we want to keep the zombie processes (even though they're no longer gdb inferiors).
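Here is a rough sketch of what the changed merge step might look like. It assumes the procfile has already been extended to report zombies; the types and the `mergeProcessData` name are invented for illustration and do not reflect the real data structures.

```typescript
// Hypothetical shapes; the real procfile and gdb data look different.
interface ProcfileEntry {
  pid: number;
  command: string;
  isZombie: boolean; // assumes the kernel module has been changed to report this
}

interface GdbInferior {
  pid: number;
  state: "running" | "stopped";
}

interface ListedProcess {
  pid: number;
  command: string;
  status: "running" | "stopped" | "zombie";
}

function mergeProcessData(procfile: ProcfileEntry[], inferiors: GdbInferior[]): ListedProcess[] {
  const inferiorsByPid = new Map<number, GdbInferior>();
  for (const inferior of inferiors) {
    inferiorsByPid.set(inferior.pid, inferior);
  }

  const listed: ListedProcess[] = [];
  for (const entry of procfile) {
    const inferior = inferiorsByPid.get(entry.pid);
    if (entry.isZombie) {
      // Zombies are kept even though gdb no longer tracks them as inferiors.
      listed.push({ pid: entry.pid, command: entry.command, status: "zombie" });
    } else if (inferior !== undefined) {
      listed.push({ pid: entry.pid, command: entry.command, status: inferior.state });
    }
    // Live processes unknown to gdb are still skipped, as in the current behavior.
  }
  return listed;
}
```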

Ctrl+c doesn't kill user's program when run in debug mode

Normally, if you click the terminal and press ctrl+c, that will result in SIGINT being delivered to the user's process (killing it unless they've installed a signal handler). However, if the program is being run in debug mode, the program isn't killed. I think this is because run.py doesn't actually start the user's program directly (it starts gdb, which starts the user's program) and gdb is ignoring SIGINT. If that's true, it means that other keyboard-generated signals also aren't being delivered to the program. This is a problem, since cplayground is used to teach about signal handling, and I don't want students to have a confusing experience where things work one way normally and then work a different way when they start the debugger.

The fix will probably involve getting gdb to create a process group, placing the user's program in the process group, then putting that process group in the foreground using tcsetpgrp. I'm not sure -- this will take some investigation.

Create VM image for reproducible development/testing/deployment

It's a pretty big hassle to set up a development environment for Cplayground, and an even bigger hassle to set up a server for deployment. Normally, the best solution for this would be to use Docker, but we can't do that because Cplayground uses Docker, and also because it depends on a modified Linux kernel.

I haven't looked at solutions, so I don't know the best way to do this, but I know that Vagrant has Vagrantfiles (like Dockerfiles) that can be used to reproducibly set up a VM for development or configure a server for deployment. If anyone has time to look into writing one (and making sure it can support custom kernels), that would be wonderful!

Add animation to transition between run/debug button

When a breakpoint is set, the "Run" button becomes a "Debug" button. It would be nice to have a CSS animation to transition between the two. A fade would be simple enough, but maybe we can have something even more fun?

Write simple gdb script to control inferior

We should write a simple script that launches a program under gdb, and uses gdb to do something really simple (e.g. set some breakpoints and print a message when those breakpoints are hit). This is to get a better sense of how to use gdb's programmatic control interface.

Don't hardcode uid/gid in kernel module

The cplayground debugger reads data from the /proc/cplayground procfile, which is generated by our kernel module. In order to make this file readable by the (unprivileged) node server process, we set the procfile to be owned by the user running the server process:

cplayground/src/server/kernel-mod/cplayground.c

Line 305 in 6366368

kuid_t uid = { .val = 1000 };

Right now, that's hardcoded, but that obviously causes problems when we try to run/test this on other people's machines, or when we go to deploy this in production.

We should have some way to adapt this based on the environment that is being used, e.g. use a compile-time environment variable via make, or something like that. This will be less important when #21 is addressed, but it's still important.

Ban ptrace syscall from user programs

Ptrace is a really complex syscall with a very large attack surface and a history of vulnerabilities. Also, I don't think there's much reason that user programs on cplayground should need it.

The container still needs to be able to invoke ptrace in order to run the cplayground debugger (we run gdb inside of the container), but ideally, we would prevent the user program from calling ptrace. We can accomplish this by tightening the seccomp profile used to execute the user program (or there may be some other simpler way to do it).

Add "stop" button for running programs

Right now, you can stop a running program by refreshing the page, or by clicking in the terminal and pressing ctrl+c. However, there should be a more obvious way to do it.

We'll need to think about UI -- should the "run" button become a "stop" button when the program is running, or should there be a separate "stop" button that appears once it starts running? I like the minimalism of having fewer buttons, but I think it's cool how the "run" button goes from outline to solid when the program starts running, and it may be clearer to users that it's possible to stop the program if there's an obvious "stop" button showing.

The implementation of this is quite easy. To stop a program, just close the websocket. No backend changes needed for this.

Show threads in debugger

Currently, the debugger only lists processes. We should list threads if a process has multiple threads, and threads should be independently controllable in the inline debugging controls that appear in the editor.

We can get thread info from gdb; there is already some code tracking threads here. We need to pass this data to debugging.getContainerInfo, embed it in the ContainerInfo object, return it to the client, and update the display to show this info.

Do type checking on websocket messages

In the server code that handles websocket messages (mainly socket-connection.ts), there is a lot of annoying code to make sure that the expected fields are present and that they're of the expected type. More importantly, there are cases where this validation is missing, and our code is susceptible to Denial of Service if someone sends us malformed messages.

Since we have types for most websocket messages (see https://github.com/reberhardt7/cplayground/blob/master/src/server/socket-connection.ts), it shouldn't be hard to automate some of this type checking. typed-socket.io seems like a promising library; in particular, we would want to use TypedServer and TypedClient to get runtime validation.
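Independent of which library we pick, the kind of check being automated looks roughly like this hand-written type guard. The `ResizeMessage` shape and the function names are hypothetical and only stand in for whatever message types the server actually defines.

```typescript
// Hypothetical message shape, used only to illustrate runtime validation.
interface ResizeMessage {
  rows: number;
  cols: number;
}

function isPositiveInteger(value: unknown): value is number {
  return typeof value === "number" && isFinite(value) && Math.floor(value) === value && value > 0;
}

// Type guard: narrows an untrusted value to ResizeMessage only if every field checks out.
function isResizeMessage(value: unknown): value is ResizeMessage {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return isPositiveInteger(candidate.rows) && isPositiveInteger(candidate.cols);
}

// Parses raw websocket text; malformed input yields null instead of throwing.
function parseResizeMessage(raw: string): ResizeMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  return isResizeMessage(parsed) ? { rows: parsed.rows, cols: parsed.cols } : null;
}
```

Returning `null` for anything unexpected means a malformed message gets dropped at the boundary instead of crashing a handler deeper in the server.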

Drop ptrace capabilities before executing program in debug mode

When running in debug mode, we need to grant CAP_SYS_PTRACE to the docker container so that gdb can run ptrace on the user's program. However, I don't want the user's program to have that capability. ptrace is a really complicated syscall with a large attack surface, and I won't be surprised if another serious ptrace vuln shows up in the future. I don't want to make it possible for someone to exploit such a vuln to escape the container.

You can see that CAP_SYS_PTRACE is currently not being dropped: https://cplayground.com/?p=mallard-coyote-wasp&breakpoints=%5B9%5D

$ capsh --decode=0000000000080000
0x0000000000080000=cap_sys_ptrace

We need to come up with some clever way of doing this. Currently, in debug mode, run.py launches gdb, and gdb launches the user's program, so we don't have control over how the user's program gets forked. However, maybe we can do something simple, like writing a silly program that does this:

drop CAP_SYS_PTRACE
exec(user program, args)

Then, we can run gdb on that program. Not sure if that will break gdb functionality, but it may be worth a shot.

Make banner width adapt to terminal width

"Banners" are used to show info during execution. However, they are fixed width and do not adapt to the terminal size:

Banners are generated here and here. We should try to size them to the terminal width instead of hardcoding the width.
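As a sketch of the idea, a banner could be built from the terminal's column count like this. `makeBanner` and its fill character are assumptions for illustration; the existing banner code may be structured quite differently.

```typescript
// Hypothetical helper: centers `text` in a line that is exactly `width` columns wide
// whenever the text fits, and truncates the text otherwise.
function makeBanner(text: string, width: number, fill: string = "="): string {
  const label = ` ${text} `;
  if (label.length >= width) {
    // Not enough room for padding: fall back to the bare text, cut to the available width.
    return text.slice(0, Math.max(width, 0));
  }
  const padding = width - label.length;
  const left = Math.floor(padding / 2);
  return fill.repeat(left) + label + fill.repeat(padding - left);
}
```

The width would need to come from the client's terminal dimensions, so the server would have to know the current column count whenever it prints a banner.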

Cannot listen on a Unix socket in /cplayground, which is mapped as a shared folder in VirtualBox

I found some information in another repo where someone said that a Unix socket file cannot be created in a shared folder, and I verified that this is true.

Any plan on this issue?

Research: Running GDB in docker container

What's the best way? Let's find out!

Implement seq_file iterator to improve performance under load

Cplayground gets info from the kernel by reading /proc/cplayground. That file is generated by the cplayground kernel module, which iterates over the processes and "prints" to a buffer managed by the seq_file interface. The way that works is seq_file allocates a buffer, and each "print" appends to the buffer. If the buffer runs out of space, then subsequent prints are discarded until our code finishes running; then, seq_file discards the too-small buffer, allocates a new one double the size, and reruns our code, restarting the process of printing process info.

When there are a lot of processes running or a lot of open files, this creates severe performance issues. It's not uncommon to see seq_file run our code 5+ times before it finally succeeds because it finally got the buffer size right.

A better fix is to implement the seq_file iterator interface. Instead of having one function to generate the entire file (ct_seq_show in our existing code), we should do the following:

When the cplayground file gets opened, we should call our get_containerized_processes function to populate a list of processes. We can store this in the file private data. This example from the seq_file documentation looks promising:

  static int ct_open(struct inode *inode, struct file *file)
  {
  	struct mystruct *p =
  		__seq_open_private(file, &ct_seq_ops, sizeof(*p));

  	if (!p)
  		return -ENOMEM;

  	p->foo = bar; /* initialize my stuff */
  		...
  	p->baz = true;

  	return 0;
  }

We should use the iterator interface to iterate over the processes in the process list. That way, if the buffer is overflowed, we don't need to restart writing the entire file from scratch; seq_file can copy the output that was successfully generated up until this point, then restart only from the process where the output overflowed.

We'll need to free the list of processes when the file is closed.

We should also confirm that performance improves. I'm pretty sure it should based on my understanding of seq_file, but if I am wrong and seq_file still restarts file generation from scratch on overflow, then this is not helpful and only adds more complexity.

Debugger: show more detailed process status

Right now, cplayground shows process statuses based on gdb:

However, the displayed process status is the status from gdb's perspective, but not from the OS scheduler's perspective. In that screenshot, pid 14 is said to be running, but is actually blocked due to a sleep() call. It would be really neat to show more details, such as the scheduler status (is the process actually runnable, or is it blocked?) and maybe even the reason for the status (if it is blocked, why? is it stuck waiting for a mutex?)

Showing the scheduler status should be easy. We can get it by modifying the kernel module to print each thread's status (it's stored in the struct task_struct), and we might even be able to get it by running "ps" (although this may have issues with being in sync with the rest of our data -- there are already occasionally discrepancies with the data sourced from our kernel module vs sourced from gdb). I don't know if it's possible to determine why blocked threads are blocked in a general way, but some research would be necessary here. UI changes should be easy.
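If we try the "ps" route first, the parsing side could look something like this. It assumes output produced by something like `ps -o pid=,stat=` (one "pid state" pair per line) and uses the state letters documented in the procps `ps` man page; the type and function names are made up.

```typescript
type SchedulerState = "running" | "sleeping" | "uninterruptible" | "stopped" | "traced" | "zombie" | "other";

// First letter of the ps STAT column, per the procps man page.
const STATE_BY_CODE: { [code: string]: SchedulerState | undefined } = {
  R: "running", // running or runnable
  S: "sleeping", // interruptible sleep, e.g. blocked in sleep()
  D: "uninterruptible", // uninterruptible sleep, usually I/O
  T: "stopped", // stopped by a job control signal
  t: "traced", // stopped by a debugger during tracing
  Z: "zombie", // terminated but not yet reaped
};

// Hypothetical parser for lines of the form "<pid> <stat>".
function parsePsStates(psOutput: string): Map<number, SchedulerState> {
  const states = new Map<number, SchedulerState>();
  for (const line of psOutput.split("\n")) {
    const fields = line.trim().split(/\s+/);
    if (fields.length < 2) continue;
    const pid = Number(fields[0]);
    if (!Number.isInteger(pid)) continue;
    // Only the first character is the state; the rest are modifier flags such as "+".
    states.set(pid, STATE_BY_CODE[fields[1].charAt(0)] ?? "other");
  }
  return states;
}
```

In the sleep() scenario above, pid 14 would show up as "sleeping" here even while gdb reports it as running. Note that a process halted at a gdb breakpoint would report "traced", so the UI would still need gdb's view to tell the two apart.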

Add ability to drag the divider between editor/terminal

Right now, the editor-terminal split is fixed at 50%. We should allow users to drag the divider to show more of the editor or show more of the terminal.
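The core of this is just turning the pointer position into a clamped ratio. A minimal sketch follows; `splitRatio` and the minimum pane size are assumptions of mine, and wiring it up to mouse events and the layout is left out.

```typescript
// Hypothetical helper: maps the pointer's x position to the fraction of the container
// given to the editor, clamped so neither pane can be dragged below `minRatio`.
function splitRatio(
  pointerX: number,
  containerLeft: number,
  containerWidth: number,
  minRatio: number = 0.2,
): number {
  if (containerWidth <= 0) return 0.5; // degenerate layout: fall back to an even split
  const raw = (pointerX - containerLeft) / containerWidth;
  return Math.min(1 - minRatio, Math.max(minRatio, raw));
}
```

The result could be applied as a percentage width on the editor pane, with the terminal taking the remainder.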

Figure out how to build kernel with custom version string

build-kernel.sh applies this patch to the kernel code in order to make get_files_struct and put_files_struct available to kernel modules. (I have no idea why these are not exported in the first place; they're pretty important.) However, the compiled kernel ends up being built with the normal version string; if I run uname -r, it returns 5.3.0-42-generic. It would be nice to have a version suffix added, so that we can see in production that we're running the cplayground kernel (e.g. 5.3.0-42+cplayground-generic).

The Ubuntu build instructions say to do this by modifying debian.master/changelog, but this didn't work for me. I did a lot of Googling and saw some other possible methods (e.g. editing the Makefile) but I haven't had time to get anything to work.

In Open Files debugger tab, make the diagram container fit the contents

The Open Files debugger tab shows a diagram of the file descriptor / open file / vnode tables:

The size of the diagram container is currently hardcoded:

cplayground/src/client/components/Diagram.tsx

Line 82 in 6366368

// TODO: set a more reasonable width/height

That means that if there are not many processes running, there's weird horizontal scrolling behavior (you can scroll really far to the right, where there's nothing showing on the diagram), and if there are a lot of processes running, then they potentially get clipped on the diagram. We should dynamically compute the width/height needed to show the diagram, and use that instead of hardcoding.
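One way to compute the size is to take the bounding box of everything drawn. This sketch assumes each diagram element can report its position and size; the `Box` type, `boundingSize`, and the padding value are invented for illustration and are not taken from Diagram.tsx.

```typescript
// Hypothetical description of one drawn element (a process column, a table cell, etc.).
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns the smallest container size that shows every box, plus some padding.
function boundingSize(boxes: Box[], padding: number = 20): { width: number; height: number } {
  let maxRight = 0;
  let maxBottom = 0;
  for (const box of boxes) {
    maxRight = Math.max(maxRight, box.x + box.width);
    maxBottom = Math.max(maxBottom, box.y + box.height);
  }
  return { width: maxRight + padding, height: maxBottom + padding };
}
```

With few processes this shrinks the container so there is nothing empty to scroll into, and with many processes it grows so nothing is clipped.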