Comments (23)

der-flo commented on June 17, 2024

@jscheid: Did you make progress?

Doing serious testing with Lambda is all but impossible without a fast development environment. Does anybody know of a working solution?

jscheid commented on June 17, 2024

Sorry, I couldn't see the forest for the trees. Thanks for working through this with me; now I can make the code much less of a hack.

mhart commented on June 17, 2024

@jscheid have you seen AWS SAM Local? It uses docker-lambda under the hood and achieves everything you'd need (I think) https://github.com/awslabs/aws-sam-local

jscheid commented on June 17, 2024

Thanks for explaining this, it's what I expected but I'm still somewhat surprised. Do you happen to know how to reconcile these two observations:

  1. AWS Lambda clearly performs multiple invocations per process when warm:

    • http://docs.aws.amazon.com/lambda/latest/dg/best-practices.html advocates the use of global/static variables (see the sketch after this list), and of course these don't normally survive after the process exits,

    • it's hard to imagine how it could achieve such decent response times (<5 ms I believe) if it had to load a new node.js runtime, and load and parse what can potentially be hundreds or thousands of JS files in npm packages, for every invocation.

  2. Yet the code in awslambda appears to assume only one invocation per process, for example:

    • the module executes the invocation at the top level (on load), rather than in response to some message passed in, and rather than exporting a method to be invoked by some other module,

    • the way the code deals with global event handlers and environment variables looks very much like it's relying on process exit for cleanup.
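
To make observation 1 concrete, here is a minimal sketch of the caching pattern that best-practices page advocates, assuming a callback-style Node.js handler (connectToDatabase and queryDatabase are hypothetical helpers):

// Anything stored in module scope survives across warm invocations,
// because the same process (and the same require() cache) is reused.
let cachedConnection = null; // persists while the process stays warm

exports.handler = (event, context, callback) => {
  if (!cachedConnection) {
    // Cold start: pay the setup cost once per process, not per invocation.
    cachedConnection = connectToDatabase(); // hypothetical helper
  }
  queryDatabase(cachedConnection, event, callback); // hypothetical helper
};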

Could it be that, while awslambda/index.js can be found in their file system, it's not actually what they use for real?

I suppose there is a chance that they do in fact use this code but with some kind of wrapper around require(), process etc., or with a special Node runtime. I can't see how they could solve it purely on the OS level though.

I'll run a few experiments on Lambda to get a better idea of how this all fits together, but I'd still be curious if you had any insights to share.

Would you agree that it would be useful if docker-lambda, in addition to what you list in the README (installed software, file structure, etc.), also replicated this aspect accurately? It would probably speed up CI tests considerably, and allow testing of any in-app caching methods.

Your point about other runtimes is well taken, but first I'd like to figure out how to solve it for just a single runtime robustly and accurately.

mhart commented on June 17, 2024

You'd need to create a whole communication layer to achieve this, instead of the existing invocation model – happy to accept a PR on it though

ibratoev commented on June 17, 2024

Closed/opened by mistake ;)
For Node.js, the main process can listen on a Unix socket and process events on request. It shouldn't be too hard.
I looked at the AWS code; there it is implemented with a native module, but I don't think that level of efficiency is important for local usage.
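
A minimal sketch of that idea, assuming a one-JSON-event-per-line protocol over a Unix socket (the socket path, the framing, and the handler path are illustrative, not from the repo):

const net = require('net');

// Load the user's handler once, so module-level state stays warm across events.
const handler = require('/var/task/index').handler; // path illustrative

const server = net.createServer((conn) => {
  let buffer = '';
  conn.on('data', (chunk) => {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const event = JSON.parse(buffer.slice(0, newline));
      buffer = buffer.slice(newline + 1);
      handler(event, {}, (err, result) => {
        conn.write(JSON.stringify(err ? { error: String(err) } : result) + '\n');
      });
    }
  });
});

server.listen('/tmp/lambda.sock');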

A simpler alternative would be to allow invoking the container with multiple events in one call. It would be less flexible but simpler to implement.

What do you think?

jscheid commented on June 17, 2024

Hi, here is my attempt at a solution: https://github.com/jscheid/lambda-server

There's a lot to say about this... for starters, merging this into docker-lambda, rather than keeping it as a separate image, would get rid of 80% of the FIXMEs in the code. Before starting that discussion I wanted to check that there is still interest in adding this feature, and that you agree with the general thrust of the solution?

jscheid commented on June 17, 2024

@mhart yes I've seen it (used it actually) but no, it doesn't: aws/aws-sam-cli#227

As far as I know they're just using vanilla (as in, one-container-per-invocation) lambci/lambda under the hood... https://github.com/awslabs/aws-sam-local#a-special-thank-you

jscheid commented on June 17, 2024

@mhart am I missing something?

mhart commented on June 17, 2024

No, you're not missing anything – if you need the container kept warm, as opposed to just making multiple invocations, then SAM Local doesn't do this.

jscheid commented on June 17, 2024

Ah, good to know, thanks.

Are you interested at all in adding this feature to this repository, and if not, would you help me with a few questions I have (like this one)?

mhart commented on June 17, 2024

That code comes from Lambda (you can find it on the filesystem) – which is true for all the other files in the container – the only file I add is the mock for the native library.

I think if a solution were to be added to this repo, it would need to be runtime-independent – or at least, the same interface for each runtime.

mhart commented on June 17, 2024

I'm not sure what makes you think their code only assumes one invocation per process? It doesn't look that way to me (or didn't last time I checked).

Certainly anything set globally will only disappear on a process exit – although a bunch of env variables are set per invocation.

The awslambda code handles pretty much everything needed for an invocation – it just relies on the native module to communicate back to the AWS Lambda supervisor (or whatever AWS calls it) and to listen for new invocations.

jscheid commented on June 17, 2024

Yes, it seems like I'm missing something.

It looks to me like it's assuming one invocation per process because the code is executed at the top level and uses environment variables for arguments.

For multiple invocations in the same process, reusing the same Node.js runtime (for global variables etc.), wouldn't you usually just model this as a JavaScript function with arguments and a return value?

And also, just to pick a random bit of code out of many similar ones, process.on('beforeExit', () => invokeManager.finish(null, null, false)); looks a lot like the invocation is tied to the process lifetime.

Again, I might be missing something, but so far every way of invoking this code multiple times inside the same Node runtime cleanly and robustly involves all sorts of hacks, like using eval (or the vm module) to run the code, removing all event handlers for uncaughtException and beforeExit after each invocation, etc.

mhart commented on June 17, 2024

Specifically: do a search for awslambda.waitForInvoke – you can see it's called at the end of each invocation (in the finish function) – so it just loops
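
In other words, the control flow looks roughly like this (a loose paraphrase; everything except the name awslambda.waitForInvoke is stand-in code):

// Stand-in queue so the sketch terminates; the real native module blocks
// until the supervisor delivers the next event.
const pending = [{ event: { n: 1 } }, { event: { n: 2 } }];
const awslambda = {
  waitForInvoke(onInvoke) {
    const next = pending.shift();
    if (next) setImmediate(() => onInvoke(next));
  },
};

const userHandler = (event, context, callback) => callback(null, 'ok'); // stand-in

function finish(err, result) {
  // ...report the result or error back to the supervisor...
  awslambda.waitForInvoke(start); // loop: wait for the next invocation
}

function start(invokeData) {
  // ...set per-invocation env vars, build the context object...
  userHandler(invokeData.event, {}, finish);
}

awslambda.waitForInvoke(start);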

mhart commented on June 17, 2024

Also, I'm not sure what you mean by "executed at the top level" – each event is passed to the user's specified event handler by the start function (which, again, is a callback to waitForInvoke).

The user's code is only required once (in _getHandler) – if that's what you mean by the top level – but that's expected – it's only supposed to be a one-off execution.

jscheid commented on June 17, 2024

Ahh... right, of course, I was thrown off by the fact that the docker image's entry point is simply invoking the code, rather than passing a message of some sort.

In this case, you wouldn't happen to know how to inject multiple invocations into awslambda? The supervisor communication you were referring to isn't documented anywhere?

jscheid commented on June 17, 2024

Never mind, I see now, it comes from your mock.

jscheid commented on June 17, 2024

One more question, are you opposed to introducing a build step, for Babel or such?

binarymist commented on June 17, 2024

Seems like a very common case:
aws/aws-sam-cli#227
aws/aws-sam-cli#239

devoto13 commented on June 17, 2024

One idea for a simple protocol is to use stdin for sending events and stdout for returning responses. The format is \n-separated JSON strings. This provides the same protocol for all runtimes; however, the communication layer needs to be implemented for each runtime separately.

E.g.

// container's stdin
{"name":"event1"}
{"name":"event2"}
// container's stdout
{"status":200,"body":"Processed event1"}
{"status":200,"body":"Processed event2"}

So the communication layer will read events from stdin line by line and call the lambda without shutting down the container. It will then write the result of the lambda execution to stdout as a JSON string followed by \n, so it can either be seen by the developer or picked up by a wrapper (like SAM CLI).

This should also be pretty easy to integrate into the existing NPM module.
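
A minimal sketch of that layer for the Node.js runtime, under the assumptions above (the handler path is illustrative):

const readline = require('readline');

// Load the user's handler once; it stays warm for all subsequent events.
const handler = require('/var/task/index').handler; // path illustrative

const rl = readline.createInterface({ input: process.stdin });

rl.on('line', (line) => {
  const event = JSON.parse(line); // one JSON event per line
  handler(event, {}, (err, result) => {
    // One JSON result per line, so a wrapper can pair responses with events.
    process.stdout.write(JSON.stringify(err ? { errorMessage: String(err) } : result) + '\n');
  });
});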

Some other questions:

  • Should it be a new container type in addition to run and build, or an updated run container? The former sounds more reasonable, to avoid breaking backwards compatibility, but it brings some maintenance overhead.
  • How should reporting be handled? One option would be to wrap the lambda result in another JSON object and add time/memory data as keys in the wrapper object. Is that even relevant here?

Limitations:

  • Restarting the server when code changes is out of scope. The primary use case for this feature is running tests faster, and reloading is not very relevant there.

Does this sound reasonable?
I plan to work on the proof of concept in the coming days.

mhart commented on June 17, 2024

Unfortunately that scheme is very unlikely to work in practice @devoto13.

While I make every effort to redirect the Lambda function's normal stdout output to stderr, it doesn't work in all runtimes, and it doesn't work if people write to stdout directly (i.e. using a custom logger instead of console.log).

The only thing I come close to guaranteeing is that the last line of the docker container's stdout is going to be the result of the lambda handler – because there should be no more output from the handler after that.

However, guaranteeing that every line of stdout is a protocol message is unlikely to hold for many lambdas.

devoto13 commented on June 17, 2024

Thanks for your feedback, @mhart! I see how this protocol may be problematic in practice.

Another approach may be to expose an HTTP API from the container, but it would involve much more effort and make usage more complex compared to the stdin/stdout approach. I would prefer not to go down this road.

What if we prefix handler result lines with some Unicode character to make them distinguishable from lines output by the handler itself? While it is impossible to guarantee that this character won't be printed by the user's code, I think it is possible to pick one that minimizes the risk. We could also let users customize this control character via an environment variable, to resolve any collisions.

The control character could be either invisible (like a zero-width space) or meaningful (like an arrow). That way direct usage won't print anything weird, and wrappers will still be able to use it.
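
A sketch of what that could look like; the marker choice and the LAMBDA_RESULT_MARKER variable name are purely illustrative:

// U+2063 INVISIBLE SEPARATOR as the default marker, overridable by the user.
const MARKER = process.env.LAMBDA_RESULT_MARKER || '\u2063';

// The runtime side prefixes each handler-result line:
function writeResult(result) {
  process.stdout.write(MARKER + JSON.stringify(result) + '\n');
}

// A wrapper (like SAM CLI) then keeps only the marked lines:
function isResultLine(line) {
  return line.startsWith(MARKER);
}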
