peaclab / microfaas-worker
FaaS on small, embedded-system-like compute nodes
License: MIT License
Currently, when the orchestrator receives a worker request from a worker that's not registered in WORKERS (even if it is registered in AVAILABLE_WORKERS), its default behavior is to just drop the connection:
try:
    w = WORKERS[str(self.worker_id)]
except KeyError:
    log.error("Worker with unknown ID %s attempted to connect", self.worker_id)
    return
We'd be much better served by telling such a worker to power down, either by changing the default behavior shown above, by updating the Worker state machine to tell inactive workers to power off, or both. This would minimize the power draw of inactive workers.
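A minimal sketch of the proposed default, assuming the orchestrator can answer a request with a command message. The reply format (POWER_DOWN_REPLY), the helper name reply_for_unknown_worker, and the contents of the WORKERS dict here are all hypothetical and would need to match whatever the worker-side code actually understands:

```python
import json
import logging

log = logging.getLogger(__name__)

# Hypothetical registry standing in for the orchestrator's WORKERS dict.
WORKERS = {"103": "VMWorker103"}

# Hypothetical wire format; the real worker protocol may differ.
POWER_DOWN_REPLY = json.dumps({"cmd": "power_down"}).encode() + b"\n"


def reply_for_unknown_worker(worker_id, workers=WORKERS):
    """Return a power-down reply for unregistered workers, else None.

    None means the worker is known and normal handling should continue.
    """
    if str(worker_id) not in workers:
        log.error("Worker with unknown ID %s attempted to connect", worker_id)
        return POWER_DOWN_REPLY
    return None
```

In a request handler, a non-None result would be sent back to the worker before returning, instead of silently dropping the connection.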
(Post-fix bug report; documenting here for posterity)
Running a VM cluster with the latest code on the refactor branch was frequently producing warnings about unhandled requests, e.g.
Sep 03 12:02:53 beaglebone python3[1762]: INFO:root:Processed results of invocation csmZbi from worker 103
Sep 03 12:02:53 beaglebone python3[1762]: INFO:root:Processed results of invocation ixjaZm from worker 104
Sep 03 12:02:53 beaglebone python3[1762]: INFO:root:Sending pkill to VMWorker104
Sep 03 12:02:53 beaglebone python3[1762]: ERROR:root:VMWorker104 made request but no output events set
Sep 03 12:02:53 beaglebone python3[1762]: WARNING:root:Telling VMWorker104 to reboot due to unhandled request
Sep 03 12:02:53 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker105
Sep 03 12:02:54 beaglebone python3[1762]: INFO:root:Processed results of invocation FlS9lG from worker 105
Sep 03 12:02:54 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker106
Sep 03 12:02:54 beaglebone python3[1762]: INFO:root:Processed results of invocation b1oC9x from worker 106
Sep 03 12:02:54 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker103
Sep 03 12:02:55 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker105
Sep 03 12:02:55 beaglebone python3[1762]: INFO:root:Processed results of invocation 1JIlkt from worker 105
Sep 03 12:02:55 beaglebone python3[1762]: INFO:root:Processed results of invocation uf84SY from worker 103
Sep 03 12:02:55 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker106
Sep 03 12:02:55 beaglebone python3[1762]: INFO:root:Processed results of invocation Hh1bw7 from worker 106
Sep 03 12:02:55 beaglebone python3[1762]: INFO:root:Sending pkill to VMWorker106
Sep 03 12:02:55 beaglebone python3[1762]: ERROR:root:VMWorker106 made request but no output events set
Sep 03 12:02:55 beaglebone python3[1762]: WARNING:root:Telling VMWorker106 to reboot due to unhandled request
Sep 03 12:02:56 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker105
Sep 03 12:02:56 beaglebone python3[1762]: INFO:root:Transmitted work to VMWorker103
Sep 03 12:02:56 beaglebone python3[1762]: INFO:root:Processed results of invocation 1S6Zsc from worker 103
Sep 03 12:02:56 beaglebone python3[1762]: INFO:root:Sending pkill to VMWorker103
Sep 03 12:02:56 beaglebone python3[1762]: ERROR:root:VMWorker103 made request but no output events set
Sep 03 12:02:56 beaglebone python3[1762]: WARNING:root:Telling VMWorker103 to reboot due to unhandled request
Sep 03 12:02:56 beaglebone python3[1762]: INFO:root:Attempting to power up VMWorker106
Sep 03 12:02:57 beaglebone python3[1762]: INFO:root:Processed results of invocation iKHAFI from worker 105
Sep 03 12:02:57 beaglebone python3[1762]: INFO:root:Attempting to power up VMWorker104
This was concerning because it seemed that VMWorkers were being told to reboot unnecessarily right after being (appropriately) pkilled. The most likely explanation, however, was that the VMWorker was able to squeeze in one more worker request after the pkill job was executed but before the QEMU process had actually terminated, and the orchestrator wasn't expecting that request. In other words, this is a harmless (though annoying) bug, as the VM is getting killed no matter what the orchestrator tells it to do in its final moments.
This bug should be fixed by 4d3b7d3. Please close this after confirmation.
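For illustration only, one way to tolerate such a late request is to remember when each worker was last pkilled and ignore anything it sends within a short grace window. This is a sketch, not necessarily what the commit above does; the class name KillGraceTracker and the 2-second default are assumptions:

```python
import time


class KillGraceTracker:
    """Remembers recent pkill times so late requests can be ignored."""

    def __init__(self, grace_seconds=2.0, clock=time.monotonic):
        self.grace_seconds = grace_seconds  # assumed window, tune as needed
        self._clock = clock
        self._killed_at = {}

    def mark_killed(self, worker_id):
        """Record that pkill was just issued for this worker."""
        self._killed_at[str(worker_id)] = self._clock()

    def is_dying(self, worker_id):
        """True if pkill was issued within the grace window."""
        killed_at = self._killed_at.get(str(worker_id))
        if killed_at is None:
            return False
        return (self._clock() - killed_at) <= self.grace_seconds
```

A handler could check is_dying(worker_id) before logging the "no output events set" error, and skip the reboot command when it returns True.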
This repo is getting a little messy as we try to split the code up into proper modules. It would make sense to move the worker-side code (e.g., worker.py, micropg.py) to a different repo: probably best to create a "worker-filesystem" repo that just has our initramfs file structure and then worker code all-in-one.
We're observing behavior where the VM server starts OOM-killing near the end of our experiments, apparently due to unbounded creation of VMs. The log tail looks like:
Aug 20 11:08:22 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker103
Aug 20 11:08:26 beaglebone python3[490]: INFO:root:Transmitted work to VMWorker105
Aug 20 11:08:26 beaglebone python3[490]: INFO:root:Processed results of invocation drZG71 from worker 105
Aug 20 11:08:28 beaglebone python3[490]: INFO:root:Transmitted work to VMWorker105
Aug 20 11:08:28 beaglebone python3[490]: INFO:root:Processed results of invocation pNUgnz from worker 105
Aug 20 11:08:29 beaglebone python3[490]: INFO:root:Transmitted work to VMWorker105
Aug 20 11:08:29 beaglebone python3[490]: INFO:root:Processed results of invocation LXgBkQ from worker 105
Aug 20 11:08:29 beaglebone python3[490]: INFO:root:Transmitted work to VMWorker108
Aug 20 11:08:29 beaglebone python3[490]: INFO:root:Processed results of invocation 6yEaaU from worker 108
Aug 20 11:08:34 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker112
Aug 20 11:08:40 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker109
Aug 20 11:08:41 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker106
Aug 20 11:08:44 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker115
Aug 20 11:08:45 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker114
Aug 20 11:08:47 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker104
Aug 20 11:08:49 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker116
Aug 20 11:08:50 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker110
Aug 20 11:08:51 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker111
Aug 20 11:08:53 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker113
Aug 20 11:09:09 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker107
Aug 20 11:09:15 beaglebone python3[490]: INFO:root:Transmitted work to VMWorker106
Aug 20 11:09:22 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker103
Aug 20 11:09:34 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker112
Aug 20 11:09:40 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker109
Aug 20 11:09:44 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker115
Aug 20 11:09:45 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker114
Aug 20 11:09:47 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker104
Aug 20 11:09:49 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker116
Aug 20 11:09:50 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker110
Aug 20 11:09:51 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker111
Aug 20 11:09:53 beaglebone python3[490]: INFO:root:Attempting to power up VMWorker113
The orchestrator labels all result log files with the -vm postfix, even when working with BBB clusters. BBB clusters should be labeled with a -bbb postfix, and "mixed" clusters can be labeled with something like -mixed.
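A small sketch of how the postfix could be chosen from the cluster's composition. The function name result_log_postfix and the "vm" / "bbb" kind strings are hypothetical; the orchestrator may represent worker types differently (e.g., by class):

```python
def result_log_postfix(worker_kinds):
    """Pick a log-file postfix from an iterable of worker kind strings.

    Assumes each worker is described as either "vm" or "bbb".
    """
    kinds = set(worker_kinds)
    if not kinds:
        raise ValueError("cluster has no workers")
    if kinds == {"vm"}:
        return "-vm"
    if kinds == {"bbb"}:
        return "-bbb"
    # Anything else is treated as a mixed cluster.
    return "-mixed"
```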
When a VMWorker transitions into the OFF state, sometimes the pkill command fails to halt the corresponding VM before it makes another worker request. This triggers the following warning/exception:
Aug 19 13:36:57 beaglebone python3[20799]: WARNING:root:VMWorker104 made request but no output events set
and/or
Aug 19 13:36:57 beaglebone python3[20799]: ----------------------------------------
Aug 19 13:36:57 beaglebone python3[20799]: Exception happened during processing of request from ('192.168.1.104', 42042)
Aug 19 13:36:57 beaglebone python3[20799]: Traceback (most recent call last):
Aug 19 13:36:57 beaglebone python3[20799]: File "/usr/lib/python3.7/socketserver.py", line 650, in process_request_thread
Aug 19 13:36:57 beaglebone python3[20799]: self.finish_request(request, client_address)
Aug 19 13:36:57 beaglebone python3[20799]: File "/usr/lib/python3.7/socketserver.py", line 360, in finish_request
Aug 19 13:36:57 beaglebone python3[20799]: self.RequestHandlerClass(request, client_address, self)
Aug 19 13:36:57 beaglebone python3[20799]: File "/usr/lib/python3.7/socketserver.py", line 720, in __init__
Aug 19 13:36:57 beaglebone python3[20799]: self.handle()
Aug 19 13:36:57 beaglebone python3[20799]: File "/home/debian/MicroFaaS/orchestrator.py", line 103, in handle
Aug 19 13:36:57 beaglebone python3[20799]: self.data = self.request.recv(12288).strip()
Aug 19 13:36:57 beaglebone python3[20799]: ConnectionResetError: [Errno 104] Connection reset by peer
Aug 19 13:36:57 beaglebone python3[20799]: ----------------------------------------
These warnings/exceptions can be safely caught and ignored, as pkill ensures the VM is properly shut down, even if it lets a few erroneous worker requests slip out during the second or two it needs to kill QEMU.
(Note that this is in reference to code that's currently on the refactor branch.)
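A minimal sketch of catching the reset, assuming the read happens through a socket-like object as in the traceback above (recv(12288).strip()). The helper name recv_request is hypothetical; in the real handler the same try/except could simply wrap the existing recv call:

```python
import logging

log = logging.getLogger(__name__)


def recv_request(sock, bufsize=12288):
    """Read one request; return None if the peer reset the connection."""
    try:
        return sock.recv(bufsize).strip()
    except ConnectionResetError:
        # Expected when QEMU is killed mid-request; nothing to handle.
        log.debug("Connection reset by peer; assuming the VM was just killed")
        return None
```

The caller would return early when the result is None, so no traceback is printed for a VM that is already on its way down.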