
sledge-serverless-framework's People

Contributors

bushidocodes, emil916, gparmer, lyuxiaosu, meilier, mzpqnxow, phanikishoreg


sledge-serverless-framework's Issues

Ensure that changes to x64 are also completed on ARM

Relevant PR Comment:

runtime/include/arch/aarch64/context.h lines 43-46

	/*
	 * if ctx->regs[0] is set, this was last in a user-level context switch state!
	 * else restore mcontext..
	 */
	if (ctx->regs[0]) {

phanikishoreg
You've not incorporated your changes to aarch64. I think you should at least create an issue for this and look into it when feasible.

Clang-Format Alignment Issue

In runtime/include/sandbox_run_queue.h, the delete function pointer is not aligned as expected after auto-formatting with clang-format.

typedef struct sandbox_run_queue_config_t {
	sandbox_run_queue_add_t      add;
	sandbox_run_queue_is_empty_t is_empty;
	sandbox_run_queue_delete_t delete;

@gparmer
gparmer 23 hours ago
alignment

@bushidocodes
bushidocodes 19 hours ago Author
I have no explanation for this one, but it seems like there might be a clang-format bug. If I rename delete to anything else, it aligns properly. 🤔

Revisit Semantics

Once the code is closer to final, revisit naming semantics:

  • local - Phani suggested perhaps per_core might be better
  • others?

Relevant Comments
Phani:
I feel like "local" is not conveying what we want. It should either be "per_core" or "per_thread" or "worker_runqueue" sort...

Discussion: Revisit HTTP preprocessor paths (USE_HTTP_UVIO, etc).

There are different preprocessor configs that I haven't been testing that Phani used in the past. Should these be retained, refactored, or eliminated?

runtime/src/sandbox.c: 70-72

	assert(sandbox != NULL);

	sandbox->request_response_data_length = 0;
#ifndef USE_HTTP_UVIO

@gparmer
gparmer 6 days ago
Are we ever going to test the other paths? If "no", I wonder if we need the #defines for this. @phanikishoreg Have these outlasted their purpose?

@phanikishoreg
phanikishoreg 6 days ago
For the paper, we changed the HTTP request/response paths to use system calls for recv/send because we had some latency issues with long-running applications. Hence this #define. I'm not sure whether to keep or remove it; it might be worth keeping if we come to a point where we debug performance.

Feature: Deferrable Signals while software interrupts are disabled

Signals that occur while software interrupts (softints) are disabled are currently ignored, potentially causing the timer ticks that define scheduling quanta to be missed. The runtime could be modified to track whether a signal was received while softints were disabled, such that it could be deferred until softints are re-enabled.

Considerations:

  • Under what situations are softints disabled? Is this only when the scheduler is executing or when other things are happening as well?
  • How would this change impact system overhead?
  • How would this change impact latency?

Relevant Slack Chat

phanikishoreg 30 minutes ago
Kk, it's not the configuration feature we're talking about! So it's the software interrupt handling: the signals that occur when softints are disabled are currently ignored. @sean is talking about handling those after we re-enable them, I see.

phanikishoreg 20 minutes ago
I think this really means tracking whether there was a timer tick while softints were disabled, and calling schedule again right after we enable interrupts. In most cases we have interrupts disabled only when we're actually scheduling. But I think it's useful in case the current interrupt handler took too long and it's actually time to schedule a new sandbox. (edited)

gparmer 19 minutes ago
....and if we disable interrupts while locks are held
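The deferral scheme discussed above can be sketched in C as follows. This is a minimal sketch, assuming per-thread softint state; the names softint_enabled, deferred_sigalrm, and the function names are illustrative, not the runtime's actual identifiers:

```c
#include <assert.h>
#include <stdbool.h>

/* Per-worker softint state (illustrative names) */
static __thread volatile bool softint_enabled  = true;
static __thread volatile bool deferred_sigalrm = false;

/* Timer signal handler path: instead of silently dropping a tick that
 * arrives while softints are disabled, record it for later. */
static void
softint_handle_timer(void)
{
	if (!softint_enabled) {
		deferred_sigalrm = true; /* was: ignored entirely */
		return;
	}
	/* ... preempt the current sandbox and invoke the scheduler ... */
}

/* Re-enabling softints reports whether a tick was deferred, so the
 * caller knows to invoke the scheduler immediately. */
static bool
softint_enable(void)
{
	softint_enabled = true;
	if (deferred_sigalrm) {
		deferred_sigalrm = false;
		return true; /* caller should schedule now */
	}
	return false;
}
```

This matches Phani's framing: the common case (softints disabled only while scheduling) costs one flag check, while a long-running interrupt handler no longer loses its quantum boundary.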

feat: WASI Networking

WASI provides a handful of sock_* functions:
sock_accept
sock_recv
sock_send
sock_shutdown

This suggests that a sandbox is currently limited to polling and accepting client connections from a preopened socket bound to a particular port by the host embedding environment prior to sandbox initialization.

This particular subset of functionality does not really seem well aligned to a serverless runtime. We associate sockets with modules, which upon a client request:

  1. Parse query params as arguments.
  2. Read the entire HTTP body into a buffer.
  3. Initialize a sandbox with the query parameters as arguments in the WASI context.
  4. Redirect WASI reads from STDIN to the buffer containing the HTTP body, and WASI writes to STDOUT to an outbound buffer.
  5. When a sandbox completes, write the outbound buffer back to the client socket and immediately call shutdown on the socket.

Given the short lifetime of a WebAssembly sandbox, the accept function is likely not useful for us.
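The STDIN/STDOUT redirection in the steps above can be sketched as follows. The struct layout and function names are illustrative, not the actual SLEdge API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-sandbox I/O state (illustrative, not SLEdge's fields) */
struct sandbox_io {
	const char *request_body;   /* filled from the HTTP body */
	size_t      request_length;
	size_t      request_cursor; /* how much the guest has read so far */
	char        response[1024]; /* flushed to the client socket on completion */
	size_t      response_length;
};

/* WASI-style read redirected: STDIN (fd 0) drains the HTTP body buffer */
static size_t
sandbox_fd_read(struct sandbox_io *io, int fd, char *buf, size_t len)
{
	if (fd != 0) return 0; /* only STDIN is redirected in this sketch */
	size_t remaining = io->request_length - io->request_cursor;
	size_t to_copy   = len < remaining ? len : remaining;
	memcpy(buf, io->request_body + io->request_cursor, to_copy);
	io->request_cursor += to_copy;
	return to_copy;
}

/* WASI-style write redirected: STDOUT (fd 1) appends to the response buffer */
static size_t
sandbox_fd_write(struct sandbox_io *io, int fd, const char *buf, size_t len)
{
	if (fd != 1) return 0;
	size_t space   = sizeof(io->response) - io->response_length;
	size_t to_copy = len < space ? len : space;
	memcpy(io->response + io->response_length, buf, to_copy);
	io->response_length += to_copy;
	return to_copy;
}
```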

Refactor global_request_scheduler_add to propagate return codes to caller

Current behavior panics if the backing data structure is full.

/**
 * Pushes a sandbox request to the global deque
 * @param sandbox_request
 * @returns pointer to request if added. NULL otherwise
 */
static struct sandbox_request *
global_request_scheduler_minheap_add(void *sandbox_request)
{
	assert(sandbox_request);
	assert(global_request_scheduler_minheap);
	if (unlikely(!listener_thread_is_running())) panic("%s is only callable by the listener thread\n", __func__);

	int return_code = priority_queue_enqueue(global_request_scheduler_minheap, sandbox_request);
	/* TODO: Propagate -1 to caller. Issue #91 */
	if (return_code == -ENOSPC) panic("Request Queue is full\n");
	return sandbox_request;
}
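A minimal sketch of the proposed refactor: return NULL and an error code on a full queue instead of panicking, so the listener thread can decide how to respond (e.g. reply 503 to the client). The stub below stands in for priority_queue_enqueue; only the control flow mirrors the real function:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QUEUE_CAPACITY 2 /* small for illustration */

static void  *queue[QUEUE_CAPACITY];
static size_t queue_length = 0;

/* Stand-in for priority_queue_enqueue(): -ENOSPC when full */
static int
priority_queue_enqueue_stub(void *item)
{
	if (queue_length == QUEUE_CAPACITY) return -ENOSPC;
	queue[queue_length++] = item;
	return 0;
}

/* Returns the request on success, NULL (with *error set) otherwise */
static void *
global_request_scheduler_minheap_add(void *sandbox_request, int *error)
{
	int rc = priority_queue_enqueue_stub(sandbox_request);
	if (rc == -ENOSPC) {
		*error = -ENOSPC; /* was: panic("Request Queue is full\n") */
		return NULL;
	}
	*error = 0;
	return sandbox_request;
}
```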

Wasmception inflates Docker build context

The wasmception directory in the silverfish submodule seems to be ~2GB. We should investigate adding some of these things to a .dockerignore file to improve build times.

Blocked by aWsm work implementing WASI-SDK and follow-up work integrating the new release of aWsm into sledge.
No issue in aWsm for this, but PR is at gwsystems/aWsm#19

Refactor Code to require minimum 2 cores

In runtime/src/sandbox_request_scheduler_fifo.c,

// TODO: Running the runtime and listener cores on a single shared core is untested
// We are unsure if the locking behavior is correct, so there may be deadlocks
#if NCORES == 1

@gparmer
gparmer 5 days ago
@phanikishoreg Have these settings outlived their utility? Seems like we only test on NCORES > 1 now.

@phanikishoreg
phanikishoreg 5 days ago
I think so. It would probably be better to make sure NCORES > 1 in runtime startup code.
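Phani's suggestion could look something like this in runtime startup code. The helper name is hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical startup check: fail fast instead of carrying an untested
 * single-core code path behind #if NCORES == 1. */
static int
runtime_validate_core_count(int ncores)
{
	if (ncores < 2) {
		fprintf(stderr,
		        "sledge requires at least 2 cores: one listener core "
		        "plus one or more worker cores (got %d)\n",
		        ncores);
		return -1;
	}
	return 0;
}
```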

test_modules.json not valid JSON

This file should be wrapped in a top-level array to be valid JSON. It is unclear whether jsmn has been gracefully handling this or not.
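For illustration, a top-level array wrapper might look like the following. The field names here are examples only, not necessarily the actual module schema:

```json
[
  { "name": "fibonacci", "path": "fibonacci_wasm.so", "port": 10000 },
  { "name": "empty",     "path": "empty_wasm.so",     "port": 10001 }
]
```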

Document the contracts of the polymorphic interfaces

Relevant Gabe comment:

In [the .c] file, or in the .h file, you should document the purpose of this API. As it is polymorphic, it is important that clients understand how to use it, and implementors understand how to implement a new variant.

Context-switch Assembly Reference

https://bitbucket.org/pypy/pypy/src/default/rpython/translator/c/src/stacklet/

This pypy package has context-switch reference code for different architectures.
Pasting the code here in case the link goes dead; it is also cloned at https://bitbucket.org/pkgadepalli/pypy/src/default/rpython/translator/c/src/stacklet/

ARM AARCH64

static void *slp_switch(void *(*save_state)(void*, void*),
                        void *(*restore_state)(void*, void*),
                        void *extra) __attribute__((noinline));

static void *slp_switch(void *(*save_state)(void*, void*),
                        void *(*restore_state)(void*, void*),
                        void *extra)
{
  void *result;
  /*
      registers to preserve: x18-x28, x29(fp), and v8-v15
      registers marked as clobbered: x0-x18, x30

      Note that x18 appears in both lists; see below.  We also save
      x30 although it's also marked as clobbered, which might not
      be necessary but doesn't hurt.

      Don't assume gcc saves any register for us when generating
      code for slp_switch().

      The values 'save_state', 'restore_state' and 'extra' are first moved
      by gcc to some registers that are not marked as clobbered, so between
      x19 and x29.  Similarly, gcc expects 'result' to be in a register
      between x19 and x29.  We don't want x18 to be used here, because of
      some special meaning it might have.  We don't want x30 to be used
      here, because it is clobbered by the first "blr".

      This means that three of the values we happen to save and restore
      will, in fact, contain the three arguments, and one of these values
      will, in fact, not be restored at all but receive 'result'.
  */

  __asm__ volatile (

    /* The stack is supposed to be aligned as necessary already.
       Save 12 registers from x18 to x29, plus 8 from v8 to v15 */

    "stp x18, x19, [sp, -160]!\n"
    "stp x20, x21, [sp, 16]\n"
    "stp x22, x23, [sp, 32]\n"
    "stp x24, x25, [sp, 48]\n"
    "stp x26, x27, [sp, 64]\n"
    "stp x28, x29, [sp, 80]\n"
    "str d8,  [sp, 96]\n"
    "str d9,  [sp, 104]\n"
    "str d10, [sp, 112]\n"
    "str d11, [sp, 120]\n"
    "str d12, [sp, 128]\n"
    "str d13, [sp, 136]\n"
    "str d14, [sp, 144]\n"
    "str d15, [sp, 152]\n"

    "mov x0, sp\n"        	/* arg 1: current (old) stack pointer */
    "mov x1, %[extra]\n"   	/* arg 2: extra, from x19-x28         */
    "blr %[save_state]\n"	/* call save_state(), from x19-x28    */

    /* skip the rest if the return value is null */
    "cbz x0, zero\n"

    "mov sp, x0\n"			/* change the stack pointer */

	/* From now on, the stack pointer is modified, but the content of the
	stack is not restored yet.  It contains only garbage here. */
    "mov x1, %[extra]\n"	/* arg 2: extra, still from x19-x28   */
                /* arg 1: current (new) stack pointer is already in x0*/
    "blr %[restore_state]\n"/* call restore_state()               */

    /* The stack's content is now restored. */
    "zero:\n"

    /* Restore all saved registers */
    "ldp x20, x21, [sp, 16]\n"
    "ldp x22, x23, [sp, 32]\n"
    "ldp x24, x25, [sp, 48]\n"
    "ldp x26, x27, [sp, 64]\n"
    "ldp x28, x29, [sp, 80]\n"
    "ldr d8,  [sp, 96]\n"
    "ldr d9,  [sp, 104]\n"
    "ldr d10, [sp, 112]\n"
    "ldr d11, [sp, 120]\n"
    "ldr d12, [sp, 128]\n"
    "ldr d13, [sp, 136]\n"
    "ldr d14, [sp, 144]\n"
    "ldr d15, [sp, 152]\n"
    "ldp x18, x19, [sp], 160\n"

    /* Move x0 into the final location of 'result' */
    "mov %[result], x0\n"

    : [result]"=r"(result)	/* output variables */
	/* input variables  */
    : [restore_state]"r"(restore_state),
      [save_state]"r"(save_state),
      [extra]"r"(extra)
    : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9",
      "x10", "x11", "x12", "x13", "x14", "x15", "x16", "x17", "x18",
      "memory", "cc", "x30"  // x30==lr
  );
  return result;
}

ARM

#if defined(__ARM_ARCH_4__) || defined (__ARM_ARCH_4T__)
# define call_reg(x) "mov lr, pc ; bx " #x "\n"
#else
/* ARM >= 5 */
# define call_reg(x) "blx " #x "\n"
#endif

static void *slp_switch(void *(*save_state)(void*, void*),
                        void *(*restore_state)(void*, void*),
                        void *extra) __attribute__((noinline));

static void *slp_switch(void *(*save_state)(void*, void*),
                        void *(*restore_state)(void*, void*),
                        void *extra)
{
  void *result;
  /*
      seven registers to preserve: r2, r3, r7, r8, r9, r10, r11
      registers marked as clobbered: r0, r1, r4, r5, r6, r12, lr
      others: r13 is sp; r14 is lr; r15 is pc
  */

  __asm__ volatile (

    /* align the stack and save 7 more registers explicitly */
    "mov r0, sp\n"
    "and r1, r0, #-16\n"
    "mov sp, r1\n"
    "push {r0, r2, r3, r7, r8, r9, r10, r11}\n"   /* total 8, still aligned */
#ifndef __SOFTFP__
    /* We also push d8-d15 to preserve them explicitly.  This assumes
     * that this code is in a function that doesn't use floating-point
     * at all, and so don't touch the "d" registers (that's why we mark
     * it as non-inlinable).  So here by pushing/poping d8-d15 we are
     * saving precisely the callee-saved registers in all cases.  We
     * could also try to list all "d" registers as clobbered, but it
     * doesn't work: there is no way I could find to know if we have 16
     * or 32 "d" registers (depends on the exact -mcpu=... and we don't
     * know it from the C code).  If we have 32, then gcc would "save"
     * d8-d15 by copying them into d16-d23 for example, and it doesn't
     * work. */
    "vpush {d8, d9, d10, d11, d12, d13, d14, d15}\n"  /* 16 words, still aligned */
#endif

    /* save values in callee saved registers for later */
    "mov r4, %[restore_state]\n"  /* can't be r0 or r1: marked clobbered */
    "mov r5, %[extra]\n"          /* can't be r0 or r1 or r4: marked clob. */
    "mov r3, %[save_state]\n"     /* can't be r0, r1, r4, r5: marked clob. */
    "mov r0, sp\n"        	/* arg 1: current (old) stack pointer */
    "mov r1, r5\n"        	/* arg 2: extra                       */
    call_reg(r3)		/* call save_state()                  */

    /* skip the rest if the return value is null */
    "cmp r0, #0\n"
    "beq zero\n"

    "mov sp, r0\n"			/* change the stack pointer */

	/* From now on, the stack pointer is modified, but the content of the
	stack is not restored yet.  It contains only garbage here. */
    "mov r1, r5\n"       	/* arg 2: extra                       */
                /* arg 1: current (new) stack pointer is already in r0*/
    call_reg(r4)		/* call restore_state()               */

    /* The stack's content is now restored. */
    "zero:\n"

#ifndef __SOFTFP__
    "vpop {d8, d9, d10, d11, d12, d13, d14, d15}\n"
#endif
    "pop {r1, r2, r3, r7, r8, r9, r10, r11}\n"
    "mov sp, r1\n"
    "mov %[result], r0\n"

    : [result]"=r"(result)	/* output variables */
	/* input variables  */
    : [restore_state]"r"(restore_state),
      [save_state]"r"(save_state),
      [extra]"r"(extra)
    : "r0", "r1", "r4", "r5", "r6", "r12", "lr",
      "memory", "cc"
  );
  return result;
}

MIPS64

static void *slp_switch(void *(*save_state)(void*, void*),
                        void *(*restore_state)(void*, void*),
                        void *extra)
{
  void *result;
  __asm__ volatile (
     "daddiu $sp, $sp, -0x50\n"
     "sd $s0, 0x0($sp)\n" /* push the registers specified as caller-save */
     "sd $s1, 0x8($sp)\n"
     "sd $s2, 0x10($sp)\n"
     "sd $s3, 0x18($sp)\n"
     "sd $s4, 0x20($sp)\n"
     "sd $s5, 0x28($sp)\n"
     "sd $s6, 0x30($sp)\n"
     "sd $s7, 0x38($sp)\n"
     "sd $fp, 0x40($sp)\n"
     "sd $ra, 0x48($sp)\n"

     "move $s0, %[rstate]\n" /* save 'restore_state' for later */
     "move $s1, %[extra]\n" /* save 'extra' for later */

     "move $a1, %[extra]\n"/* arg 2: extra */
     "move $a0, $sp\n" /* arg 1: current (old) stack pointer */
                           
     "move $t9, %[sstate]\n"
     "jalr $t9\n" /* call save_state() */

     "beqz $v0, 0f\n" /* skip the rest if the return value is null */

     "move $sp, $v0\n" /* change the stack pointer */

     /* From now on, the stack pointer is modified, but the content of the
        stack is not restored yet.  It contains only garbage here. */

     "move $a1, $s1\n" /* arg 2: extra */
     "move $a0, $v0\n" /* arg 1: current (new) stack pointer */
     "move $t9, $s0\n"
     "jalr $t9\n" /* call restore_state() */

     /* The stack's content is now restored. */

     "0:\n"
     "move %[result], $v0\n"
     "ld $s0, 0x0($sp)\n"
     "ld $s1, 0x8($sp)\n"
     "ld $s2, 0x10($sp)\n"
     "ld $s3, 0x18($sp)\n"
     "ld $s4, 0x20($sp)\n"
     "ld $s5, 0x28($sp)\n"
     "ld $s6, 0x30($sp)\n"
     "ld $s7, 0x38($sp)\n"
     "ld $fp, 0x40($sp)\n"
     "ld $ra, 0x48($sp)\n"
     "daddiu $sp, $sp, 0x50\n"

     : [result]"=&r"(result)
     : [sstate]"r"(save_state),
       [rstate]"r"(restore_state),
       [extra]"r"(extra)
     : "memory", "v0", "a0", "a1", "t9"
     );
  return result;
}

PowerPC64

#if !(defined(__LITTLE_ENDIAN__) ^ defined(__BIG_ENDIAN__))
# error "cannot determine if it is ppc64 or ppc64le"
#endif

#ifdef __BIG_ENDIAN__
# define TOC_AREA   "40"
#else
# define TOC_AREA   "24"
#endif


/* This depends on these attributes so that gcc generates a function
   with no code before the asm, and only "blr" after. */
static __attribute__((noinline, optimize("O2")))
void *slp_switch(void *(*save_state)(void*, void*),
                 void *(*restore_state)(void*, void*),
                 void *extra)
{
  void *result;
  __asm__ volatile (
     /* By Vaibhav Sood & Armin Rigo, with some copying from
        the Stackless version by Kristjan Valur Jonsson */

     /* Save all 18 volatile GP registers, 18 volatile FP regs, and 12
        volatile vector regs.  We need a stack frame of 144 bytes for FPR,
        144 bytes for GPR, 192 bytes for VR plus 48 bytes for the standard
        stackframe = 528 bytes (a multiple of 16). */

     "mflr  0\n"               /* Save LR into 16(r1) */
     "std  0, 16(1)\n"

     "std  14,-288(1)\n"      /* the GPR save area is between -288(r1) */
     "std  15,-280(1)\n"      /*        included and -144(r1) excluded */
     "std  16,-272(1)\n"
     "std  17,-264(1)\n"
     "std  18,-256(1)\n"
     "std  19,-248(1)\n"
     "std  20,-240(1)\n"
     "std  21,-232(1)\n"
     "std  22,-224(1)\n"
     "std  23,-216(1)\n"
     "std  24,-208(1)\n"
     "std  25,-200(1)\n"
     "std  26,-192(1)\n"
     "std  27,-184(1)\n"
     "std  28,-176(1)\n"
     "std  29,-168(1)\n"
     "std  30,-160(1)\n"
     "std  31,-152(1)\n"

     "stfd 14,-144(1)\n"      /* the FPR save area is between -144(r1) */
     "stfd 15,-136(1)\n"      /*           included and 0(r1) excluded */
     "stfd 16,-128(1)\n"
     "stfd 17,-120(1)\n"
     "stfd 18,-112(1)\n"
     "stfd 19,-104(1)\n"
     "stfd 20,-96(1)\n"
     "stfd 21,-88(1)\n"
     "stfd 22,-80(1)\n"
     "stfd 23,-72(1)\n"
     "stfd 24,-64(1)\n"
     "stfd 25,-56(1)\n"
     "stfd 26,-48(1)\n"
     "stfd 27,-40(1)\n"
     "stfd 28,-32(1)\n"
     "stfd 29,-24(1)\n"
     "stfd 30,-16(1)\n"
     "stfd 31,-8(1)\n"

     "li 12,-480\n"           /* the VR save area is between -480(r1) */
     "stvx 20,12,1\n"         /*       included and -288(r1) excluded */
     "li 12,-464\n"
     "stvx 21,12,1\n"
     "li 12,-448\n"
     "stvx 22,12,1\n"
     "li 12,-432\n"
     "stvx 23,12,1\n"
     "li 12,-416\n"
     "stvx 24,12,1\n"
     "li 12,-400\n"
     "stvx 25,12,1\n"
     "li 12,-384\n"
     "stvx 26,12,1\n"
     "li 12,-368\n"
     "stvx 27,12,1\n"
     "li 12,-352\n"
     "stvx 28,12,1\n"
     "li 12,-336\n"
     "stvx 29,12,1\n"
     "li 12,-320\n"
     "stvx 30,12,1\n"
     "li 12,-304\n"
     "stvx 31,12,1\n"

     "stdu  1,-528(1)\n"         /* Create stack frame             */

     "std   2, "TOC_AREA"(1)\n"  /* Save TOC in the "TOC save area"*/
     "mfcr  12\n"                /* Save CR in the "CR save area"  */
     "std   12, 8(1)\n"

     "mr 14, %[restore_state]\n" /* save 'restore_state' for later */
     "mr 15, %[extra]\n"         /* save 'extra' for later */
     "mr 12, %[save_state]\n"    /* move 'save_state' into r12 for branching */
     "mr 3, 1\n"                 /* arg 1: current (old) stack pointer */
     "mr 4, 15\n"                /* arg 2: extra                       */

     "stdu 1, -48(1)\n"       /* create temp stack space (see below) */
#ifdef __BIG_ENDIAN__
     "ld 0, 0(12)\n"
     "ld 11, 16(12)\n"
     "mtctr 0\n"
     "ld 2, 8(12)\n"
#else
     "mtctr 12\n"             /* r12 is fixed by this ABI           */
#endif
     "bctrl\n"                /* call save_state()                  */
     "addi 1, 1, 48\n"        /* destroy temp stack space           */

     "cmpdi 3, 0\n"     /* skip the rest if the return value is null */
     "bt eq, zero\n"

     "mr 1, 3\n"              /* change the stack pointer */
       /* From now on, the stack pointer is modified, but the content of the
        stack is not restored yet.  It contains only garbage here. */

     "mr 4, 15\n"             /* arg 2: extra                       */
                              /* arg 1: current (new) stack pointer
                                 is already in r3                   */

     "stdu 1, -48(1)\n"       /* create temp stack space for callee to use  */
     /* ^^^ we have to be careful. The function call will store the link
        register in the current frame (as the ABI) dictates. But it will
        then trample it with the restore! We fix this by creating a fake
        stack frame */

#ifdef __BIG_ENDIAN__
     "ld 0, 0(14)\n"          /* 'restore_state' is in r14          */
     "ld 11, 16(14)\n"
     "mtctr 0\n"
     "ld 2, 8(14)\n"
#endif
#ifdef __LITTLE_ENDIAN__
     "mr 12, 14\n"            /* copy 'restore_state'               */
     "mtctr 12\n"             /* r12 is fixed by this ABI           */
#endif

     "bctrl\n"                /* call restore_state()               */
     "addi 1, 1, 48\n"        /* destroy temp stack space           */

     /* The stack's content is now restored. */

     "zero:\n"

     /* Epilogue */

     "ld 2, "TOC_AREA"(1)\n"  /* restore the TOC */
     "ld 12,8(1)\n"           /* restore the condition register */
     "mtcrf 0xff, 12\n"

     "addi 1,1,528\n"         /* restore stack pointer */

     "li 12,-480\n"           /* restore vector registers */
     "lvx 20,12,1\n"
     "li 12,-464\n"
     "lvx 21,12,1\n"
     "li 12,-448\n"
     "lvx 22,12,1\n"
     "li 12,-432\n"
     "lvx 23,12,1\n"
     "li 12,-416\n"
     "lvx 24,12,1\n"
     "li 12,-400\n"
     "lvx 25,12,1\n"
     "li 12,-384\n"
     "lvx 26,12,1\n"
     "li 12,-368\n"
     "lvx 27,12,1\n"
     "li 12,-352\n"
     "lvx 28,12,1\n"
     "li 12,-336\n"
     "lvx 29,12,1\n"
     "li 12,-320\n"
     "lvx 30,12,1\n"
     "li 12,-304\n"
     "lvx 31,12,1\n"

     "ld  14,-288(1)\n"     /* restore general purpose registers */
     "ld  15,-280(1)\n"
     "ld  16,-272(1)\n"
     "ld  17,-264(1)\n"
     "ld  18,-256(1)\n"
     "ld  19,-248(1)\n"
     "ld  20,-240(1)\n"
     "ld  21,-232(1)\n"
     "ld  22,-224(1)\n"
     "ld  23,-216(1)\n"
     "ld  24,-208(1)\n"
     "ld  25,-200(1)\n"
     "ld  26,-192(1)\n"
     "ld  27,-184(1)\n"
     "ld  28,-176(1)\n"
     "ld  29,-168(1)\n"
     "ld  30,-160(1)\n"
     "ld  31,-152(1)\n"

     "lfd 14,-144(1)\n"     /* restore floating point registers */
     "lfd 15,-136(1)\n"
     "lfd 16,-128(1)\n"
     "lfd 17,-120(1)\n"
     "lfd 18,-112(1)\n"
     "lfd 19,-104(1)\n"
     "lfd 20,-96(1)\n"
     "lfd 21,-88(1)\n"
     "lfd 22,-80(1)\n"
     "lfd 23,-72(1)\n"
     "lfd 24,-64(1)\n"
     "lfd 25,-56(1)\n"
     "lfd 26,-48(1)\n"
     "lfd 27,-40(1)\n"
     "lfd 28,-32(1)\n"
     "ld 0, 16(1)\n"
     "lfd 29,-24(1)\n"
     "mtlr 0\n"
     "lfd 30,-16(1)\n"
     "lfd 31,-8(1)\n"

     : "=r"(result)         /* output variable: expected to be r3 */
     : [restore_state]"r"(restore_state),       /* input variables */
       [save_state]"r"(save_state),
       [extra]"r"(extra)
  );
  return result;
}

S390x

/* This depends on these attributes so that gcc generates a function
   with no code before the asm, and only "blr" after. */
static __attribute__((noinline, optimize("O2")))
void *slp_switch(void *(*save_state)(void*, void*),
                 void *(*restore_state)(void*, void*),
                 void *extra)
{
  void *result;
  __asm__ volatile (
     /* The Stackless version by Kristjan Valur Jonsson,
        ported to s390x by Richard Plangger */

     "stmg 6,15,48(15)\n"

     // store f8 - f15 into the stack frame that is not used!
     "std 8,128(15)\n"
     "std 9,136(15)\n"
     "std 10,144(15)\n"
     "std 11,152(15)\n"

     "std 12,16(15)\n"
     "std 13,24(15)\n"
     "std 14,32(15)\n"
     "std 15,40(15)\n"

     "lgr 10, %[restore_state]\n" /* save 'restore_state' for later */
     "lgr 11, %[extra]\n"         /* save 'extra' for later */
     "lgr 14, %[save_state]\n"    /* move 'save_state' into r14 for branching */
     "lgr 2, 15\n"                /* arg 1: current (old) stack pointer */
     "lgr 3, 11\n"                /* arg 2: extra                       */

     "lay 15,-160(15)\n"          /* create stack frame                 */
     "basr 14, 14\n"              /* call save_state()                  */
     "lay 15,160(15)\n"

     "cgij 2, 0, 8, zero\n"       /* skip the rest if the return value is null */

     "lgr 15, 2\n"                /* change the stack pointer */

     /* From now on, the stack pointer is modified, but the content of the
        stack is not restored yet.  It contains only garbage here. */
                               /* arg 1: current (new) stack pointer
                                 is already in r2                    */
     "lgr 3, 11\n"             /* arg 2: extra                       */

     "lay 15,-160(15)\n"       /* create stack frame                 */
     "basr 14, 10\n"           /* call restore_state()               */
     "lay 15,160(15)\n"

     /* The stack's content is now restored. */

     "zero:\n"

     /* Epilogue */
     "ld 8,128(15)\n"
     "ld 9,136(15)\n"
     "ld 10,144(15)\n"
     "ld 11,152(15)\n"

     "ld 12,16(15)\n"
     "ld 13,24(15)\n"
     "ld 14,32(15)\n"
     "ld 15,40(15)\n"

     "lmg 6,15,48(15)\n"

     : "=r"(result)         /* output variable: expected to be r2 */
     : [restore_state]"r"(restore_state),       /* input variables */
       [save_state]"r"(save_state),
       [extra]"r"(extra)
  );
  return result;
}

Discussion: current_sandbox_main - arch_context_init

	assert(!software_interrupt_is_enabled());
	arch_context_init(&sandbox->ctxt, 0, 0);

@gparmer
gparmer 6 days ago
Is this necessary? Haven't we already switched to the sandbox by the nature of running in this function?

@phanikishoreg
phanikishoreg 6 days ago
This must be my code. It took me a while to figure out why I added it in the first place. I remember having some comments about it somewhere, but I forget why I don't see them now.

From https://github.com/phanikishoreg/awsm-Serverless-Framework/blob/master/runtime/src/sandbox.c#L179-L181

void
current_sandbox_main(void)
{
	struct sandbox *current_sandbox = current_sandbox_get();
	// FIXME: is this right? this is the first time this sandbox is running.. so it wont
	//        return to worker_thread_switch_to_sandbox() api..
	//        we'd potentially do what we'd in worker_thread_switch_to_sandbox() api here for cleanup..
	if (!software_interrupt_is_enabled()) {
		arch_context_init(&current_sandbox->ctxt, 0, 0);
		worker_thread_next_context = NULL;
		software_interrupt_enable();
	}
<snip>

So, when setting up a sandbox, we manually set up its IP and SP so that it uses the user-level switch, as mcontext_t is obviously not applicable. The IP we set is actually the address of "current_sandbox_main()", so the SP needs to be reset explicitly after the first context switch, because the IP does not point at label 2 in the context-switch assembly; instead it points at this "main" function. So we do this.

Create predefined sandbox memory pointer functions for various types

Status Quo
uint64_t *p = worker_thread__get_memory_ptr_void(p_off, sizeof(uint64_t));

@phanikishoreg
Just a thought, instead of this function, we should have: xxxx_get_memory_ptr_uint64() which does not take size as an argument, so we don't need to worry about users passing wrong size argument here? Similarly, for other primitive data-types.

Improve Logging Infrastructure

A debug log should be optionally generated based on a configuration flag. When the flag is not set, printfs and logging should be compiled out because of their impact on performance.

This likely should use the existing debuglog infrastructure.
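A compile-time-gated macro in the spirit of the existing debuglog infrastructure might look like this; the flag name and macro shape are illustrative:

```c
#include <assert.h>
#include <stdio.h>

/* When DEBUG is not defined, the call expands to a no-op, so there is zero
 * runtime cost in release builds. ##__VA_ARGS__ is a GCC/Clang extension. */
#ifdef DEBUG
#define debuglog(fmt, ...) \
	fprintf(stderr, "%s:%d " fmt, __func__, __LINE__, ##__VA_ARGS__)
#else
#define debuglog(fmt, ...) ((void)0) /* compiles away entirely */
#endif
```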

Error building tests

/tmp/lto-llvm-f73ecb.o: In function `wasmf_SolveCubic':
ld-temp.o:(.text.wasmf_SolveCubic+0x49f): undefined reference to `cos'
ld-temp.o:(.text.wasmf_SolveCubic+0x4e7): undefined reference to `cos'
ld-temp.o:(.text.wasmf_SolveCubic+0x51f): undefined reference to `cos'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Makefile:36: recipe for target 'basic_math_sf' failed
make: *** [basic_math_sf] Error 1

Inline sandbox_context_cache into struct sandbox

runtime/include/types.h

struct sandbox_context_cache {
	struct indirect_table_entry *module_indirect_table;
	void *                       linear_memory_start;
	u32                          linear_memory_size;

@phanikishoreg
phanikishoreg 16 hours ago
Why not inline this in the struct sandbox, so all you have to do for context switch is,

worker_context_cache = sandbox->context_cache; 

The struct assignment simplicity advantage I was talking about. I think it is cleaner, because we have 2 out of 3 of these variables in the sandbox struct anyway.

runtime/src/current_sandbox.c

		};
		worker_thread_current_sandbox = NULL;
	} else {
		local_sandbox_member_cache = (struct sandbox_member_cache){

@phanikishoreg
phanikishoreg 16 hours ago
This is where I'm talking about really,

local_sandbox_member_cache = sandbox->member_cache
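The single-assignment pattern Phani describes can be sketched as follows. The layout and names follow the snippets above but are otherwise illustrative:

```c
#include <assert.h>
#include <stdint.h>

struct sandbox_context_cache {
	void    *module_indirect_table;
	void    *linear_memory_start;
	uint32_t linear_memory_size;
};

struct sandbox {
	struct sandbox_context_cache context_cache; /* inlined, per the proposal */
	/* ... other sandbox state ... */
};

/* Per-worker cache of the currently running sandbox's hot fields */
static __thread struct sandbox_context_cache worker_context_cache;

static void
worker_switch_to(struct sandbox *sandbox)
{
	/* one struct assignment replaces three field-by-field copies */
	worker_context_cache = sandbox->context_cache;
}
```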

Investigate removing the following from types.h

typedef enum
{
	MOD_ARG_MODPATH = 0,
	MOD_ARG_MODPORT,
	MOD_ARG_MODNAME,
	MOD_ARG_MODNARGS,
	MOD_ARG_MAX,
} mod_argindex_t;

#define MOD_MAX_ARGS 16 // Max number of arguments

#define GLB_STDOUT "/dev/null"
#define GLB_STDERR "/dev/null"
#define GLB_STDIN "/dev/zero"

#define SBOX_RESP_STRSZ 32

#ifndef CACHELINE_SIZE
#define CACHELINE_SIZE 32
#endif
#define CACHE_ALIGNED __attribute__((aligned(CACHELINE_SIZE)))

Feature: Ability to load/free/start/stop a module at runtime

Note: This issue requires further decomposition into distinct feature issues.

Currently, sledge instantiates serverless modules once on startup, and changes to what modules run require the sledge runtime to be taken down and restarted after the modules JSON file has been modified.

Realistically, the runtime should have a set of RESTful CRUD and start/stop APIs to control modules dynamically at runtime:

  • create - passing a JSON file to add one or more modules
  • retrieve - returns a JSON of active modules
  • update - This could be limited to certain fields, such as "active"
  • delete - Remove module from system and free resources
  • start/stop - Related to update. Allows a module to be loaded, but inactive

Currently, there is unused and untested functionality intended to allow for dynamic add / removal of modules:

  • module_database_find_by_name
  • module_database_find_by_socket_descriptor
  • module_free
  • and the associated backing state for the module_database
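For illustration, module_database_find_by_name presumably performs a linear lookup along these lines; the database layout below is a stand-in, not the actual backing state:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODULE_MAX_NAME          32
#define MODULE_DATABASE_CAPACITY 128

struct module {
	char name[MODULE_MAX_NAME];
	int  socket_descriptor;
};

/* Stand-in for the module_database backing state */
static struct module *module_database[MODULE_DATABASE_CAPACITY];
static size_t         module_database_count = 0;

static struct module *
module_database_find_by_name(const char *name)
{
	for (size_t i = 0; i < module_database_count; i++) {
		if (strcmp(module_database[i]->name, name) == 0)
			return module_database[i];
	}
	return NULL; /* not found */
}
```

A RESTful delete would use such a lookup to resolve a name to a module before calling module_free and removing the entry.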

Discussion: Should Linear Memory Initialization be shifted from sandbox_allocate?

Gabe Comment:
I don't have reverse lookup handy right now, so I'll make this comment assuming that this is called in a single location that is in a somewhat "random" context. Perhaps the signal handler, another thread, etc...:

I think that we want to decouple stack and register allocation/initialization from sandbox initialization at large. The rest of the sandbox (mainly linear memory) can be allocated in the context of the sandbox thread before it executes the sandbox. The goal here is that all sandbox-specific costs can be attributed to that sandbox, and scheduled as that sandbox, rather than "stealing" time in another random context that happens to try and allocate it.

A possible place this could be moved is current_sandbox_main, which has this logic commented out.

runtime/src/sandbox.c
	// Initialize the module
	struct module *current_module = sandbox_get_module(sandbox);
	int            argument_count = module_get_argument_count(current_module);
	// alloc_linear_memory();

Gabe and Phani seemed to disagree about this.

 @gparmer
gparmer 5 days ago
I think that allocating linear memory in this thread is great.

We likely need to allocate the stack elsewhere (as we cannot execute here without the stack), but the linear memory allocation and initialization can be here, I believe.

 @phanikishoreg
phanikishoreg 5 days ago
Allocation of linear memory and the stack both happen outside of this "sandbox_main" function; they're done when the "worker_thread" processes a request.
We should just remove the commented lines, my bad.
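The decoupling Gabe describes can be sketched as follows. The struct layout and function names here are stand-ins (the real types live in sandbox.h and are more involved), so treat this as an illustration of where the allocation cost lands, not the actual implementation:

```c
#include <stdlib.h>

/* Minimal stand-in for the real sandbox type */
struct sandbox {
	void  *linear_memory;
	size_t linear_memory_size;
};

/* Faulting in linear memory here means its cost is charged to this
 * sandbox's own thread, not to whichever context handled the request */
void
sandbox_allocate_linear_memory(struct sandbox *sandbox)
{
	sandbox->linear_memory = calloc(1, sandbox->linear_memory_size);
}

void
current_sandbox_main(struct sandbox *sandbox)
{
	sandbox_allocate_linear_memory(sandbox); /* hypothetically moved out of sandbox_allocate */
	/* ... set up runtime state, then call the module entrypoint ... */
}
```

The stack and register context would still have to be set up by the allocating context, since the sandbox thread cannot run without them.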

Unit and system test regime

We need a test regime, at the very least unit tests and system tests, so that when we open PRs we both extend the test cases and re-run all the prior ones that are still applicable.

How to? I think we have a few test programs in runtime/test, like FILE I/O, SOCKET I/O, FIBONACCI, etc.; perhaps we start from there?
We might also need to test a couple of things here,

  1. WASM features that we add or modify, we'd need to test those.
  2. SERVERLESS functionality
  3. Unit testing could be around: http request/response, different wasm modules with different memory/stack/other requirements, scheduling, timeouts, preemptions, JSON configurations, etc.

@bushidocodes You have changed a lot; though I think these are renames, removal of unused code, documentation updates, etc. rather than functionality changes, it would be good to write in the PRs what you tested!

Make new, initialize, allocate semantics more meaningful and consistent.

Phani: I noticed somewhere, you had a sandbox__new(). It would be good to be consistent in the verb we choose for allocation/deallocation everywhere. If I'm wrong, ignore this!

Me: Good point! I think I should probably go back and review initialize, allocate, and new. I seem to recall that the new function was something like allocate + other constructor-style logic (maybe like push to runqueue or something). I'll open this as a follow-up issue.

Phani: I'm not confident you're being consistent with the "verb" or "action" as a suffix. Like softint__initialize(), current_sandbox__set are inconsistent with this convention. Something to revisit for cleanup, perhaps.

Refactor single-core operation to use pop, not steal

Blocked by #11.

Notice the mutex_lock for NCORES == 1 in both the push and pop functions, and that steal calls pop when NCORES == 1. Ideally you don't call steal for same-core consumption, but I just made the steal API wrap that logic, which is perhaps not good. We might just add the #if in the scheduling code, which should explicitly call "pop" for a single core, and add an assert in the "steal" function for NCORES == 1.
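A sketch of that shape, with a toy array standing in for the real work-stealing deque (all names here are illustrative):

```c
#include <assert.h>
#include <stddef.h>

#define NCORES 1

static int    queue[16];
static size_t queue_length = 0;

void
sandbox_run_queue_push(int sandbox_id)
{
	queue[queue_length++] = sandbox_id;
}

int
sandbox_run_queue_pop(void)
{
	if (queue_length == 0) return -1;
	return queue[--queue_length];
}

int
sandbox_run_queue_steal(void)
{
	/* Per the issue: stealing only makes sense across cores */
	assert(NCORES > 1);
	return -1;
}

int
worker_thread_get_next(void)
{
#if NCORES == 1
	/* Single core: consume locally, never steal */
	return sandbox_run_queue_pop();
#else
	int next = sandbox_run_queue_pop();
	if (next == -1) next = sandbox_run_queue_steal();
	return next;
#endif
}
```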

Shift Sandbox Alignment check to just after JSON parsing.

In runtime/src/sandbox.c, we are asserting that the sandbox size aligns to a page. This check should instead be performed when a module's JSON specification is loaded.

	if (linear_memory_size + sandbox_size > memory_size) return NULL;
	// Control information should be page-aligned
	// TODO: Should I use round_up_to_page when setting sandbox_page?
	assert(round_up_to_page(sandbox_size) == sandbox_size);
 @gparmer
gparmer 5 days ago
Need to validate that this is an internal invariant. If it can be impacted by the client (or the JSON spec), then this should be an externally visible error. I'd imagine this code is fine as is, but a small check to ensure that sandbox_size can only be wrong due to an internal bug is healthy.
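A sketch of the load-time check Gabe suggests: the JSON-derived size is validated once, as an externally visible error, so the later assert can only fire on an internal bug. The function name and macro definitions here are illustrative:

```c
#include <stdint.h>

#define PAGE_SIZE (1 << 12)
#define round_up_to_page(x) (((uintptr_t)(x) + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1))

/* Hypothetical validation at module JSON load time: reject a specification
 * whose sandbox size is not page-aligned instead of asserting later */
int
module_validate_sandbox_size(uint64_t sandbox_size)
{
	if (round_up_to_page(sandbox_size) != sandbox_size) return -1; /* externally visible error */
	return 0;
}
```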

Discussion: Review Interrupt logic

runtime/src/sandbox_run_queue_fifo.c: 50

		assert(sandbox);
		free(sandbox_request);
		sandbox->state = RUNNABLE;
		sandbox_run_queue_add(sandbox);

@gparmer
I hope that preemptions are disabled for this function (in the calling context). A preemption in between sandbox_runqueue_is_empty and here or the lines after the if could cause strange things to happen.

runtime/src/sandbox_run_queue_ps.c: 103

void
sandbox_run_queue_ps_preempt(ucontext_t *user_context)
{
	software_interrupt_disable(); // no nesting!

@gparmer
A little surprised to see this here as I figured surrounding abstractions disabled interrupts for much of this file's contents.

I don't think that much of the previous code in this file is preemption-safe (if a signal occurred and changed the runqueue between spans of the previous logic, would it work?), so perhaps this should be revisited?

runtime/src/sandbox_run_queue_ps.c: 152

		// And load the context of this new sandbox
		// RC of 1 indicates that sandbox was last in a user-level context switch state,
		// so do not enable software interrupts.
		if (arch_mcontext_restore(&user_context->uc_mcontext, &next_sandbox->ctxt) == 1)

@gparmer
As we've discussed, this code needs a careful audit. We cannot do a direct switch in the signal handler. We likely want assertions to check that. We should update the signal stack registers with those of the context we want to switch to. That might be what is happening here.

@phanikishoreg
phanikishoreg 6 days ago
Yes, that is what is happening. We cannot do a direct switch. We just update the input/output argument in signal handler and when that handler returns, that mcontext is restored and the thread continues from that execution.

runtime/src/sandbox_run_queue_ps.c: 155

		if (arch_mcontext_restore(&user_context->uc_mcontext, &next_sandbox->ctxt) == 1)
			should_enable_software_interrupt = false;
	}
	if (should_enable_software_interrupt) software_interrupt_enable();

@gparmer
If this can be called in the signal handler, we should absolutely not be enabling interrupts.

I'm likely wrong that this can be called from the signal handler as that would make a lot of this logic strange. Invariants in my head include:

If we're in the signal handler:

  • we cannot switch directly to a thread, and instead rely on the sigreturn to do the context switch.
  • we cannot even update the signal registers to return to another thread if the preemption disable flag is set.
  • we cannot enable or disable preemptions as that must be done in the surrounding context that was interrupted by the signal. Nested signals should likely be avoided using the signal masks at system initialization time.

When we enable preemptions

  • we should check if a signal happened, and if so, we should
    • check if we should switch to another thread, and do so, and
    • unset the record that the signal happened. We need to be careful of the race with the signal handler setting that value.
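The "check on enable" invariant above can be sketched with a pending flag. The names are illustrative, and the flag-clearing race Gabe mentions is deliberately left visible in a comment:

```c
#include <signal.h>
#include <stdbool.h>

static volatile sig_atomic_t preemption_pending  = 0;
static volatile sig_atomic_t preemption_disabled = 1;

/* The handler only records that a preemption arrived; it never
 * context-switches directly, relying on sigreturn / the enable path */
void
software_interrupt_handler(int signal_number)
{
	(void)signal_number;
	preemption_pending = 1;
}

/* Re-enabling preemptions checks whether a signal arrived while they were
 * off; returns true if the caller should ask the scheduler to switch */
bool
software_interrupt_enable(void)
{
	preemption_disabled = 0;
	if (preemption_pending) {
		preemption_pending = 0; /* races with the handler; needs care (e.g. atomics or masking) */
		return true;
	}
	return false;
}
```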

Refactor worker_thread inter-sandbox function names

The first function seems to do two distinct things:
worker_thread_execute_runtime_maintenance_and_get_next_sandbox

The second function has a potential name conflict with what we'd refactor out of the first function
worker_thread_get_next_sandbox
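One possible decomposition, with hypothetical names and stubbed bodies, just to show the shape of the split:

```c
#include <stddef.h>

struct sandbox;

/* Runtime housekeeping, pulled out of the combined function */
static void
worker_thread_execute_runtime_maintenance(void)
{
	/* e.g. drain completion queue, free dead sandboxes */
}

/* Pure scheduling decision, keeping the existing name */
static struct sandbox *
worker_thread_get_next_sandbox(void)
{
	/* pop from the local run queue; stub returns NULL here */
	return NULL;
}

/* The old combined function becomes a thin composition of the two */
struct sandbox *
worker_thread_schedule(void)
{
	worker_thread_execute_runtime_maintenance();
	return worker_thread_get_next_sandbox();
}
```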

Refactor expand_memory to return RCs

instruction_memory_grow should trigger a WebAssembly trap if linear memory is exhausted.

From WebAssembly Spec:
"The memory.grow instruction grows memory by a given delta and returns the previous size, or -1 if enough memory cannot be allocated. Both instructions operate in units of page size."

Might just be OK and ERR (-1), unless there is something actionable for OOM, in which case this would be OOM, ERR, and OK.
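A minimal sketch of the spec's contract, with a simplified in-process counter standing in for expand_memory's real bookkeeping (the limit and page math here are illustrative):

```c
#include <stdint.h>

#define WASM_PAGE_SIZE  (64 * 1024)
#define WASM_MEMORY_MAX (256 * (uint64_t)WASM_PAGE_SIZE) /* illustrative cap */

static uint64_t memory_size = WASM_PAGE_SIZE; /* start with one page */

/* Grow by `delta_pages` and return the previous size in pages,
 * or -1 if the memory cannot be allocated, per the spec quote above */
int32_t
instruction_memory_grow(uint32_t delta_pages)
{
	uint64_t previous_pages = memory_size / WASM_PAGE_SIZE;
	uint64_t new_size       = memory_size + (uint64_t)delta_pages * WASM_PAGE_SIZE;
	if (new_size > WASM_MEMORY_MAX) return -1; /* caller traps on -1 */
	memory_size = new_size;
	return (int32_t)previous_pages;
}
```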

In types.h, document the use of these macros

#define EXPORT __attribute__((visibility("default")))
#define IMPORT __attribute__((visibility("default")))

#define INLINE __attribute__((always_inline))
#define WEAK __attribute__((weak))

#ifndef CACHELINE_SIZE
#define CACHELINE_SIZE 32
#endif

#ifndef PAGE_SIZE
#define PAGE_SIZE (1 << 12)
#endif

#define CACHE_ALIGNED __attribute__((aligned(CACHELINE_SIZE)))
#define PAGE_ALIGNED __attribute__((aligned(PAGE_SIZE)))

// TODO: Better document the use of these macros

/* For this family of macros, do NOT pass zero as the pow2 */
#define round_to_pow2(x, pow2) (((unsigned long)(x)) & (~((pow2)-1)))
#define round_up_to_pow2(x, pow2) (round_to_pow2(((unsigned long)x) + (pow2)-1, (pow2)))

#define round_to_page(x) round_to_pow2(x, PAGE_SIZE)
#define round_up_to_page(x) round_up_to_pow2(x, PAGE_SIZE)
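A usage sketch documenting the rounding macros (values assume the 4 KiB PAGE_SIZE above; macros copied verbatim so the examples are self-contained):

```c
#ifndef PAGE_SIZE
#define PAGE_SIZE (1 << 12)
#endif

/* Round down / up to a power-of-two boundary; do NOT pass zero as the pow2 */
#define round_to_pow2(x, pow2) (((unsigned long)(x)) & (~((pow2)-1)))
#define round_up_to_pow2(x, pow2) (round_to_pow2(((unsigned long)x) + (pow2)-1, (pow2)))

#define round_to_page(x) round_to_pow2(x, PAGE_SIZE)
#define round_up_to_page(x) round_up_to_pow2(x, PAGE_SIZE)
```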

In format.sh script, blacklist third party code and troubleshoot `.clang-format` config usage

The format.sh script currently formats the files in our submodules; we should blacklist anything in runtime/thirdparty. Additionally, it's unclear to me whether the .clang-format file is actually being ingested. I believe that I've successfully been able to modify settings here and impact the formatting logic, but while investigating this issue, this should be confirmed and then the indentation keys should be looked at a bit closer.

There might be a conflict here. How do TabWidth and IndentWidth interact?
IndentWidth: 8
TabWidth: 8
UseTab: ForIndentation
