openfaas / of-watchdog
Reverse proxy for STDIO and HTTP microservices
License: MIT License
It should be possible to post a multipart request to the watchdog and work with the parts in such a way that Content-Disposition header information (such as the filename) is available to the function.
If there is only one part, feed it to stdin.
If possible, also allow accessing multiple parts by name, as streams.
Make Content-Disposition information available to the function in the same way as the usual request headers.
(Content-Disposition is defined as a response header, but it may occur inside multipart/form-data request bodies.)
There is no special support for multipart requests.
The related issue openfaas/faas#344 asked for multipart support by means of a JSON object containing all parts base64-encoded; that might not be ideal in terms of memory requirements for big multipart requests.
In openfaas/faas#345 it was concluded that support for multipart should go into of-watchdog.
To match the classic watchdog 1-for-1, the serializing mode should (optionally) capture stderr.
When combine_output is set to false, stderr goes to the container logs; when true, it comes back in the function response.
Example for testing: port=8081 mode=serializing fprocess="stat x" ./of-watchdog
When calling localhost:8081 you should see the output of stderr either in the container logs or in the response - currently we see it in neither place.
I built an image based on openresty (build script at https://github.com/feifeiiiiiiiiiii/faas_template), and a fatal error occurs.
I think of-watchdog should support running fprocess as a daemon process that serves requests.
Forking - openresty []
2018/09/12 19:53:51 Started logging stdout from function.
2018/09/12 19:53:51 Started logging stderr from function.
2018/09/12 19:53:51 OperationalMode: http
2018/09/12 19:53:51 Writing lock file at: /var/folders/90/txbmh4fj7qb8p9_7kpzwnx4s43bnfk/T/.lock
2018/09/12 19:53:51 Error reading stdout: EOF
I think the watchdog should inspect the error: if it is EOF, it should be ignored.
feifeiiiiiiiiiii/openresty-openfaas
Docker version (e.g. Docker 17.0.05): 18.05.0-ce
Are you using Docker Swarm or Kubernetes (FaaS-netes)? Kubernetes
Operating System and version (e.g. Linux, Windows, MacOS): MacOS
Link to your project or a code example to reproduce issue:
https://github.com/feifeiiiiiiiiiii/faas_template
Where possible let's add unit test coverage starting with config parsing/reading and then moving on to test the various handlers using https://golang.org/pkg/net/http/httptest/.
I'd like to see a series of small and well-defined PRs. Please prioritize the HTTP mode.
Alex
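As a sketch of the suggested httptest pattern: build a request, record the response, and assert on status and body without opening a real socket. echoHandler here is a hypothetical stand-in, not a real of-watchdog handler:

```go
package main

import (
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// echoHandler stands in for one of the watchdog's HTTP handlers.
func echoHandler(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.Write(body)
}

// testEcho drives the handler entirely in-memory via httptest.
func testEcho() bool {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader("hello"))
	echoHandler(rec, req)
	return rec.Code == http.StatusOK && rec.Body.String() == "hello"
}
```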
In order to integrate with my org's log aggregator I need to provide pure JSON logs from all deployed apps. That means the logs need to be parseable as JSON, like this:
{"level":"info","ts":"2019-09-05T15:33:26.302Z","caller":"function/file.go:64","msg":"Parsing the payload from the handler","taskId":14}
...
{"ts":"2019-09-05T15:35:35.212Z","method":"POST","path":"/","status":"200 OK","ContentLength": 121}
2019/09/05 15:33:26 stdout: {"level":"info","ts":"2019-09-05T15:33:26.302Z","caller":"function/file.go:64","msg":"Parsing the payload from the handler","taskId":14}
2019/09/05 15:33:26 POST / - 200 OK - ContentLength: 121
I'm raising two issues here and I don't think they should have the same solution. So let's tackle one issue at a time:
This could be achieved either by simply using fmt.Println instead of log, or by setting the flags on the current logger with log.SetFlags(0).
This could be made optional by setting an environment variable such as disable_logger_prefix.
We need an option to print the response status as a structured log. Adding a structured JSON log could be as simple as formatting the JSON with Printf (which keeps it performant), or importing one of the available logging libraries that can emit JSON.
Again, this should be optional and controlled by an environment variable such as logger_format.
We are aggregating all the logs across all of our products into a log aggregator to be indexed and made queryable. At the moment, watchdog uses Go's default logger, which doesn't support a JSON structure and adds a timestamp prefix. That means we aren't able to use the watchdog as it is now, and we would need to fork the project and make the necessary adjustments.
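A small sketch of both options under discussion, assuming the stdlib logger; the function names are illustrative:

```go
package main

import (
	"encoding/json"
	"log"
)

// configurePlainLogger strips the default timestamp prefix from the stdlib
// logger (option 1 in the issue).
func configurePlainLogger() {
	log.SetFlags(0)
}

// jsonLine renders a structured access-log entry as one JSON line (option 2:
// format the JSON directly, which stays performant without a logging library).
func jsonLine(method, path string, status int) string {
	b, _ := json.Marshal(map[string]interface{}{
		"method": method,
		"path":   path,
		"status": status,
	})
	return string(b)
}
```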
In the cases where invocation is not utilizing the stdout/err pipes for communication with the watchdog the watchdog should not modify or impose structure on the logs passing through from these event streams. This allows me as an application author to manage how my logs are presented in order to be processed downstream, for example if I wanted to use structured logging such as with JSON or CSV. Ideally the Watchdog should not mangle the output of my application except to make it available to the platform log driver.
There are some logs that need to be emitted from the watchdog itself:
• Errors parsing config
• Forked process termination
• Issues reading from the pipe files
• Errors communicating with metrics services
These of-watchdog logs should be easy to optionally skip or select in log parsers/shippers, so that I can have an unobscured event trail of logs as emitted by my application.
There are several undesirable behaviors that exist in the current implementation that highlight the mangling that is currently being done by the Watchdog.
There is a hard limit on the number of bytes that can be in a single line from the wrapped function process. For instance, a message of 300 bytes emitted from the underlying function would be split into a line of 256 bytes and a second line with the remaining 44 bytes. Even worse, it can be interleaved with output from other goroutines (such as the watchdog itself).
Example:
2019/08/09 00:01:42 stderr: 2019/08/09 00:01:42 {"abexi":"yxgdndacmbhgwhadjnba","aclra":"dbgmdiytcwloabveajbe","aczai":"nlqlibaatlyhdaopovfo","aereu":"nuzjqxmzotarlutmygms","afgie":"hlgizmhgzptxtfrgkaqq","ahugb":"qxifgcyvgcazaefizhgw","ajxwk":"afikzruuywsuwkobbuor","akgza":"altlhtuzh
2019/08/09 00:01:42 GET / - 200 OK - ContentLength: 2
2019/08/09 00:01:42 stderr: oidz","zxbqo":"tukdfklasvyafstrdpoj","zynxg":"bfrwcxacobgabksdrjdi","zzawj":"lvncjvkyrwrlixxqcahq","zzhms":"ouyykvnikbudiryewvos","zzttv":"wvtuerakfsxlplgaftry"}
HttpRunner
These files are being piped into the same output file, Stderr, with a prefix. This is somewhat surprising, since for the HttpRunner both of these files are written to, yet Stdout does not contain any output from my wrapped application. The prefix itself isn't that bad, as it allows me to easily determine whether a log line is from the wrapped process or from of-watchdog itself. However, combined with the hard 256-byte limit splitting lines up, a structured log line becomes extremely difficult to parse consistently.
Example:
stdout: My function's original log message to stdout
stderr: My function's original log message to stderr
The current implementation calls Println with a buffer of length 256. This results in all 256 bytes being written out, causing double newlines (one newline is copied from the original buffer and another is appended to the byte buffer), and if you're using a viewer that doesn't strip null bytes (such as a tool that writes to a file), those null bytes clutter up the output.
Example log file:
2019/08/26 23:11:21 stderr: Starting application server
������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������
I originally proposed a solution involving a line-based tokenizer from bufio as a way to align the mirrored logs with those of the wrapped function application. I believe this addresses the problems best, in that log lines from the underlying wrapped application are preserved and re-emitted with the watchdog's common prefix exactly as they were originally written.
Another possible solution is to increase the read buffer to be larger than expected output via making it a tunable parameter to the of-watchdog. So for applications that are doing verbose structured logging a much larger buffer can be provided versus something closer to the default provided now for a terse application.
I tested this using a contrived http-mode Go function named chatty-fn that just logs the requests it receives plus a bunch of extra information. I built the chatty-fn Docker container and included different watchdogs to wrap it. I then boot the container directly and issue network requests to it.
# Within chatty-fn
docker build -t chatty-fn:latest .
docker run -p 8080:8080 chatty-fn:latest > stdout.log 2> stderr.log
curl -X POST localhost:8080 -H 'Content-Type: application/json' --data '{"hello": "world"}'
I have been attempting to add structured logging to an application running in a container. Using a tool like filebeat to parse, ship, and then extract those structured logs into a searchable system. Due to the small buffer size currently implemented it makes it difficult to handle these json log lines that extend beyond 256 bytes.
Docker version (e.g. Docker 17.0.05):
Docker 19.03.1
Are you using Docker Swarm or Kubernetes (FaaS-netes)?
N/A
Operating System and version (e.g. Linux, Windows, MacOS):
Linux and MacOS
Link to your project or a code example to reproduce issue:
https://github.com/cconger/chatty-fn
When the forked process exits, the watchdog could log the exit code and release the lock file to trigger a container restart.
If the function process exits, the watchdog should log the exit code/stdio and release the lock file.
Currently the watchdog only logs that it couldn't read the stdio.
Hi OpenFaas, thanks a lot for this amazing project.
May I ask whether you have any plan about when to merge of-watchdog into faas?
Give information about where the HTTP server listens
It would be nice to see which addr/port the HTTP server is listening on.
You need to know that port 8080 is the default, and that it binds to 0.0.0.0.
Log this information before binding the HTTP server, or once it is actually bound.
From my experimenting with of-watchdog and looking at the code, it seems the write_debug env variable is currently not used.
Either write_debug should be removed from the docs, or it should be implemented as specified by the README.md.
Temporarily remove write_debug from the docs until it is implemented.
Is it possible to use of-watchdog in HTTP mode to support websockets?
I created a function from the go-http template running under of-watchdog.
I set max_inflight: 5 and sent 80 concurrent requests; there was still only one pod instance on k8s.
Structured Logging messages should not be split between lines. I want my message to appear:
stdout: {"level":"info", "msg":"No module named my-module", "pipe":"stderr", "time":"2019-08-08T01:47:55Z", "context": "this is a contrived long message of more than 256 bytes", "invoker": "cconger", "extra bits": ["there", "are", "so", "many", "things", "on", "this", "error"]}
Currently the behavior is to slurp 256 bytes per log line. This can cause long messages like the one above to be split:
stdout: {"level":"info", "msg":"No module named my-module", "pipe":"stderr","time":"2019-08-08T01:47:55Z", "context": "this is a contrived long message of more than 256 bytes", "invoker": "cconger", "extra bits": ["there", "are", "so", "many", "things", "on
stdout: ", "this", "error"]}
Use the newline token scanner from bufio instead of a fixed buffer size.
My functions use a structured logging library to produce rich structured logs, but because the log lines are split by the watchdog it becomes tricky to parse them properly when they exceed 256 bytes.
When the forked process doesn't respond to HTTP calls, the watchdog could release the lock file to trigger a container restart.
If the function's HTTP server times out, the watchdog should log the timeout exception and release the lock file.
Maybe implement a retry mechanism and exit after several timeouts?
Currently the watchdog only logs the HTTP error without detecting a timeout.
I am using OpenFaaS for a long running process. (Currently 3 minute jobs but plan to scale this up to 2 hour jobs).
I am using async-function to queue these jobs with a callback url.
I am also using the golang-http template.
2019/02/20 01:04:16 stderr: 2019/02/20 01:04:16 Starting job
2019/02/20 01:06:55 Upstream HTTP request error: Post http://127.0.0.1:8081/: EOF
It is not clear to me why this is set to 127.0.0.1:8081 as no docker containers are running on this port. I assume that this is the port the of-watchdog go process is running within the container, hence I am opening the issue here.
The callback url is receiving an empty POST with 503 as the function status code.
This does not occur when the handler returns a response earlier (e.g. if there is a 4xx error before processing).
Callback URL receives the data the handler returns.
It appears the Upstream HTTP request causes an error, which then causes the POST to the callback url to be empty.
docker version (e.g. Docker 17.0.05):
Client:
Version: 18.09.2
API version: 1.39
Go version: go1.10.6
Git commit: 6247962
Built: Sun Feb 10 04:13:47 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.2
API version: 1.39 (minimum version 1.12)
Go version: go1.10.6
Git commit: 6247962
Built: Sun Feb 10 03:42:13 2019
OS/Arch: linux/amd64
Experimental: false
Are you using Docker Swarm or Kubernetes (FaaS-netes)?
Docker Swarm
Operating System and version (e.g. Linux, Windows, MacOS):
Ubuntu 18.04
Similar behaviour to the classic watchdog:
HTTP/1.1 200 OK
Server: nginx/1.13.12
Date: Wed, 07 Nov 2018 14:37:08 GMT
Content-Type: text/html
Content-Length: 176
Connection: keep-alive
X-Call-Id: 6afdc9b1-ac15-47fa-ba83-3a546ab9193d
X-Duration-Seconds: 0.047277
X-Start-Time: 1541601428214993189
Strict-Transport-Security: max-age=15724800; includeSubDomains
HTTP/2 200
server: nginx/1.13.12
date: Wed, 07 Nov 2018 14:36:22 GMT
content-type: application/cloudevents+json
content-length: 267
x-call-id: 0e8128ca-d51f-4aa1-b2f2-2fdc1c04f615
x-start-time: 1541601382939613945
strict-transport-security: max-age=15724800; includeSubDomains
Consistent experience across classic and of-watchdog. Greater compatibility for users once they migrate to of-watchdog
When reviewing a support issue related to the Gateway, I ended up reviewing the HTTP executor to find any context timeouts. It does have a timeout during the proxy to the function implementation, which is fine, but the processing of the error case from the HTTP client does not need the select statement it currently has.
The client error check can be simplified: if err is already not nil, then either the context has already failed or some other error has occurred, and we do not need to wait for the timeout. It should be equivalent to use this:
if err != nil {
log.Printf("Upstream HTTP request error: %s\n", err.Error())
if reqCtx.Err() == context.DeadlineExceeded {
log.Printf("Upstream HTTP killed due to exec_timeout: %s\n", f.ExecTimeout)
w.Header().Set("X-Duration-Seconds", fmt.Sprintf("%f", time.Since(startedTime).Seconds()))
w.WriteHeader(http.StatusGatewayTimeout)
return nil
}
// Error unrelated to context / deadline
w.Header().Set("X-Duration-Seconds", fmt.Sprintf("%f", time.Since(startedTime).Seconds()))
w.WriteHeader(http.StatusInternalServerError)
return nil
}
This is, in fact, more explicit and accurate, because it only checks for timeouts and ignores cancels, so the HTTP response code is more accurate. According to the context docs, the context error can be DeadlineExceeded or Canceled, per https://golang.org/pkg/context/#pkg-variables. We either want to handle Canceled separately or treat it as a generic error.
When the client errors, we potentially wait for the context timeout, like this:
if err != nil {
log.Printf("Upstream HTTP request error: %s\n", err.Error())
// Error unrelated to context / deadline
if reqCtx.Err() == nil {
w.Header().Set("X-Duration-Seconds", fmt.Sprintf("%f", time.Since(startedTime).Seconds()))
w.WriteHeader(http.StatusInternalServerError)
return nil
}
select {
case <-reqCtx.Done():
{
if reqCtx.Err() != nil {
// Error due to timeout / deadline
log.Printf("Upstream HTTP killed due to exec_timeout: %s\n", f.ExecTimeout)
w.Header().Set("X-Duration-Seconds", fmt.Sprintf("%f", time.Since(startedTime).Seconds()))
w.WriteHeader(http.StatusGatewayTimeout)
return nil
}
}
}
We should have Travis / Go & OpenFaaS GitHub badges in our README.md file just like openfaas/faas openfaas/faas-netes openfaas/faas-cli and etc.
Not shown; currently you have to manually discover whether the build is passing/failing, or the status of a PR, etc.
See the repos mentioned above and retrofit for this project.
Add Lambda Custom Runtime mode
The of-watchdog has several modes including http, streaming and a classic mode which simulates the classic watchdog.
I carried out a PoC that showed we can support functions directly from AWS Lambda using the new custom runtime and published the first solution of its kind on GitHub. The code was just a PoC and needs some refactoring to be hardened.
https://github.com/alexellis/lambda-on-openfaas-poc
I was able to show that given a file-system created by the docker-lambda project we can take code written for a custom runtime on AWS Lambda and run it on OpenFaaS with Kubernetes or any of the other back-ends.
Rather than using channels on their own, for synchronization we should also use a SyncMap or a Map with a RWMutex as demonstrated in my inlets project: https://github.com/alexellis/inlets/compare/sync
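A minimal sketch of the mutex-guarded map idea; the types and names are illustrative, not taken from inlets or of-watchdog:

```go
package main

import "sync"

// pending is an illustrative request/response registry guarded by a RWMutex.
type pending struct {
	mu sync.RWMutex
	m  map[string]chan []byte
}

func newPending() *pending {
	return &pending{m: make(map[string]chan []byte)}
}

// add registers a response channel for a request id under the write lock.
func (p *pending) add(id string) chan []byte {
	p.mu.Lock()
	defer p.mu.Unlock()
	ch := make(chan []byte, 1)
	p.m[id] = ch
	return ch
}

// get looks up a channel under the read lock, so many readers don't contend.
func (p *pending) get(id string) (chan []byte, bool) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	ch, ok := p.m[id]
	return ch, ok
}
```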
I'm looking for a volunteer to add this mode to the of-watchdog and port across the example shown in the repo alexellis/lambda-on-openfaas-poc. This will then allow us to create a template and base Docker images to allow people to run their AWS Lambda Node/Python/Go etc projects on OpenFaaS.
Alex
We should publish a Docker image for each binary we produce and then create a multi-arch manifest to collect them all together.
This gives a speed boost, but also makes the Dockerfiles easier to manage and update.
Download via curl in each Dockerfile
See how this was done in the openfaas/faas project by myself and @rgee0 and how it was subsequently applied to the templates in openfaas/templates.
Update CI:
Update the of-watchdog templates to curl from the new base Docker image.
I noticed that the architecture docs say that:
This version of the of-watchdog brings new features for high-throughput and enables re-use of expensive resources such as database connection pools or machine-learning models.
as well as the golang-http-template mentioned that.
How is this feature implemented?
I would expect the JVM to receive SIGTERM and then terminate.
I am running a HTTP4S scala webserver in this project:
https://github.com/hejfelix/fp-exercises-and-grading/blob/master/http4s_faas/openfaas/Dockerfile
JVM never shuts down before docker container is killed
Not sure; it seems like the watchdog is not forwarding the shutdown hook?
Add a shutdown hook to any function running in http mode.
I want to be able to clean up resources, e.g. database connections, unfinished operations, etc.
Running on docker swarm on my macbook pro
Client: Docker Engine - Community
Version: 18.09.0-ce-beta1
API version: 1.39
Go version: go1.10.4
Git commit: 78a6bdb
Built: Thu Sep 6 22:41:53 2018
OS/Arch: darwin/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.0-ce-beta1
API version: 1.39 (minimum version 1.12)
Go version: go1.10.3
Git commit: 78a6bdb
Built: Thu Sep 6 22:49:35 2018
OS/Arch: linux/amd64
Experimental: true
This is more of a meta-issue. I'm curious as to whether there's any interest in adding a linter to the project. I tested golangci-lint, and it runs in 1.685 seconds. I used the following config:
linters:
enable:
- golint
- gosec
- interfacer
- unconvert
- dupl
- goconst
- gocyclo
- goimports
- misspell
- scopelint
- gofmt
It showed the following output. Although most of these lint identifications aren't super important, some of them might catch places where the code could be simpler, or where the code might be ambiguous.
config/config_test.go:164:32: Using the variable on range scope `testCase` in function literal (scopelint)
actual, err := New([]string{testCase.env})
^
config/config_test.go:170:18: Using the variable on range scope `testCase` in function literal (scopelint)
if process != testCase.wantProcess {
^
config/config_test.go:171:42: Using the variable on range scope `testCase` in function literal (scopelint)
t.Errorf("Want process %v, got: %v", testCase.wantProcess, process)
^
executor/afterburn_runner.go:96:10: Error return value of `w.Write` is not checked (errcheck)
w.Write(bodyBytes)
^
executor/http_runner.go:94:21: Error return value of `cmd.Process.Signal` is not checked (errcheck)
cmd.Process.Signal(syscall.SIGTERM)
^
executor/http_runner.go:187:10: Error return value of `w.Write` is not checked (errcheck)
w.Write(bodyBytes)
^
executor/serializing_fork_runner.go:26:10: Error return value of `w.Write` is not checked (errcheck)
w.Write([]byte(err.Error()))
^
main.go:168:23: Error return value of `functionInvoker.Start` is not checked (errcheck)
functionInvoker.Start()
^
main.go:298:23: Error return value of `functionInvoker.Start` is not checked (errcheck)
functionInvoker.Start()
^
config/config.go:130:2: should use 'return <expr>' instead of 'if <expr> { return <bool> }; return <bool>' (gosimple)
if env[key] == "true" {
^
executor/http_runner.go:155:3: should use a simple channel send/receive instead of select with a single case (gosimple)
select {
^
main.go:31:26: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
fmt.Fprintf(os.Stderr, configErr.Error())
^
main.go:108:5: should omit comparison to bool constant, can be simplified to !suppressLock (gosimple)
if suppressLock == false {
^
main.go:129:3: redundant break statement (gosimple)
break
^
main.go:132:3: redundant break statement (gosimple)
break
^
main.go:135:3: redundant break statement (gosimple)
break
^
main.go:258:2: should merge variable declaration with assignment on next line (gosimple)
var envs []string
^
main.go:335:55: should omit comparison to bool constant, can be simplified to !lockFilePresent() (gosimple)
if atomic.LoadInt32(&acceptingConnections) == 0 || lockFilePresent() == false {
In the Dockerfile where we run the tests, we would have to install the linter and run it.
It would require taking the current repo, fixing or ignoring the existing linter issues, then adding a linter configuration, as well as configuration to invoke the linter at CI time.
Ongoing maintenance is basically keeping the linter and its configuration up to date.
See the effort required up front.
The classic watchdog supports parsing integer values, i.e. 60, and interpreting them as 60s whenever a non-Golang duration is given. of-watchdog should do that too, but @bmcstdio mentioned he didn't see that.
Only Golang durations are currently supported:
https://github.com/openfaas-incubator/of-watchdog/blob/master/config/config.go#L118-L127
Update the parsing code to use the following:
https://github.com/openfaas/faas/blob/master/watchdog/readconfig.go#L31
node10-express templates, i.e. a value of 5 may be interpreted as 5ms when the user may expect it to be 5s.
This change is to the streaming mode of of-watchdog: add a named pipe with which the running subprograms can send control messages back to of-watchdog to specify the HTTP response code and headers. The design will be done in such a way as to not impact the behavior of any called functions which do not make use of the provided pipe. The motivation behind this change is that I need functions to be able to return specific response codes other than 200.
The of-watchdog process will create a new named pipe when a function is called, then set the environment variable CONTROL_PIPE before calling the sub-process. of-watchdog will then listen on the control pipe file; if it receives a JSON blob of the following format, it will use it to set the response code and headers before writing the output of the sub-command. If no JSON blob is received before the subprocess begins to write to its standard out, an HTTP response code of 200 will be sent with the first message.
Pros: Functions will now be able to return custom response codes and headers per request.
Cons: None.
A few days of coding time.
Minimal, if any.
This change will be 100% backwards compatible with the existing program.
N/A
Paths should be forwarded to the upstream function.
I tried to run a SimpleHTTPServer in Python and noticed the file-browser wasn't passing through the path.
Update the proxying code to pass the requestURI
port=8081 mode=http fprocess="python -m SimpleHTTPServer" upstream_url="http://127.0.0.1:8000" ./of-watchdog
Browse into a directory and see it doesn't work
Running as a Go binary outside of Docker.
We should be careful to test this change, since the templates only bind to a single path anyway, /
Implement graceful shutdown from classic watchdog
See changes and notes on this commit:
This should be largely a copy/paste exercise but will require testing.
QueryString should be available in the function when using HTTP mode
The QueryString is not forwarded or consumed by templates
diff --git a/executor/http_runner.go b/executor/http_runner.go
index c73d5cf..fa15cf4 100644
--- a/executor/http_runner.go
+++ b/executor/http_runner.go
@@ -96,7 +96,13 @@ func (f *HTTPFunctionRunner) Start() error {
 // Run a function with a long-running process with a HTTP protocol for communication
 func (f *HTTPFunctionRunner) Run(req FunctionRequest, contentLength int64, r *http.Request, w http.ResponseWriter) error {
-	request, _ := http.NewRequest(r.Method, f.UpstreamURL.String(), r.Body)
+	upstreamURL := f.UpstreamURL.String()
+
+	if len(r.URL.RawQuery) > 0 {
+		upstreamURL += "?" + r.URL.RawQuery
+	}
+
+	request, _ := http.NewRequest(r.Method, upstreamURL, r.Body)
 	for h := range r.Header {
 		request.Header.Set(h, r.Header.Get(h))
 	}
Hey all,
In the documentation, read_timeout, write_timeout, and exec_timeout are explained, but their defaults are not mentioned. This can be a huge headache while troubleshooting, as the error raised by a timeout doesn't always specify that a timeout is to blame.
Mention that the default timeouts are "10s"
Pretty self-explanatory.
In my case, I had a faas function that took ~10s to execute. I had set the exec_timeout but not the read_timeout or write_timeout. My function was attempting to return, but the caller was receiving an unexplained 502 error (honestly still not sure why I was getting a 502 instead of a 408; that one might be on faas, not watchdog). It took a while to realize that a timeout might be to blame.
I've been reading through the documentation of OpenFaas, watchdog and of-watchdog, and I was wondering something. I know that watchdog is a lightweight HTTP server that is used to route requests to the function it was deployed with, but I'm wondering why you need it exactly? Can't the function just expose an HTTP server of its own and deploy it by itself?
Please correct me if I'm wrong, but Knative, another K8S serverless framework, doesn't seem to provide any solution similar to watchdog. Why not?
Don't take my question the wrong way, I am just interested in understanding the idea behind this project.
Dear maintainers,
I wrote my template that is similar to https://github.com/openfaas-incubator/python-flask-template/tree/master/template/python3-flask.
The difference from the original is that my template uses a streaming (chunked) response, as follows.
@app.route("/", defaults={"path": ""}, methods=["POST", "GET"])
@app.route("/<path:path>", methods=["POST", "GET"])
def main_route(path):
    ...
    def gen():
        yield "1"
        time.sleep(1)
        yield "2"
        time.sleep(1)
        yield "3"
    return Response(gen())
When I invoke a function built from this template, the function does not return its response chunk by chunk.
It blocks for 3 seconds, and then returns the whole response (1, 2 and 3) all at once.
The function should return its response chunk by chunk.
The function blocks for 3 seconds, and then returns the whole response all at once.
The following code blocks until the whole response has been returned.
The corresponding code in openfaas/faas probably also blocks.
Docker version 18.09.1, build 4c52b90
Docker swarm
Linux (vagrant, vm.box = "bento/centos-7.4")
Thank you.
bindLoggingPipe prints an empty line (even without a timestamp) when printing stdout/stderr from the function.
I would expect empty lines to be ignored, or properly timestamped to keep the line structure (for log collectors).
With this in a function handler:
fmt.Println("hello")
fmt.Print(" there")
fmt.Println(", world")
the function container output is
Forking - ./handler []
2019/09/21 16:10:53 OperationalMode: http
2019/09/21 16:10:53 Started logging stderr from function.
2019/09/21 16:10:53 Started logging stdout from function.
2019/09/21 16:10:53 Writing lock-file to: /tmp/.lock
2019/09/21 16:10:53 Metrics server. Port: 8081
2019/09/21 16:12:01 stdout: hello
2019/09/21 16:12:01 stdout: there
2019/09/21 16:12:01 stdout: , world
(2 blank lines due to the \r\n of the Println)
Right-strip scanner.Text()?
Provided in Current behaviour
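A sketch of the proposed right-strip, assuming the line splitter exposes the raw line text:

```go
package main

import "strings"

// cleanLine right-strips a scanned line so that a trailing \r (left over
// from \r\n line endings) does not surface as an apparently blank log line.
func cleanLine(line string) string {
	return strings.TrimRight(line, "\r")
}
```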
Add Http_Transfer_Encoding env-var to all modes but http. It was added to the classic watchdog and should be added here for the streaming and forking modes.
Copy the design from openfaas/faas#1423
Serializing & streaming mode should return 500 when exit code is non-zero
Unlike the classic watchdog it is returning 200 despite a non-zero exit code.
Example:
port=8081 mode=serializing fprocess="stat x" ./of-watchdog
curl localhost:8081 -i
HTTP/1.1 200 OK
stat x should clearly return a non-zero exit code and a message to stderr.
When in HTTP mode, redirects seem to be proxied, not actually redirected
When I call res.redirect('http://google.com') I expect that visiting my cloud function in the browser redirects me to google.com.
Instead, when I visit the function (http://127.0.0.13112/function/myfunction) I get the content of google.com, but the URL bar doesn't show google. From my intuition, it looks like a proxy is happening here.
Is there a way to create a flag, something like redirects or allow_redirects?
It would be ideal to be able to use cloud functions for things like OAuth or other web-based services.
0.4.0/of-watchdog
Setting fprocess to a command containing a '=' character causes the command to be cut off after the '='.
When setting fprocess="node --max_old_space_size=4096 index.js", it should execute the index.js script with the --max_old_space_size option set to the given value (4096 in this example).
When setting fprocess="node --max_old_space_size=4096 index.js", it fails to execute the command and gives the following output when a request is sent to the function:
$ fprocess="node --max_old_space_size=4096 index.js" ./of-watchdog
2018/12/28 15:29:40 OperationalMode: streaming
2018/12/28 15:29:40 Writing lock-file to: /tmp/.lock
2018/12/28 15:29:45 Running node
2018/12/28 15:29:45 Started logging stderr from function.
2018/12/28 15:29:45 stderr: Error: missing value for flag --max_old_space_size of type int
Try --help for options
node: bad option: --max_old_space_size
2018/12/28 15:29:45 Error reading stderr: read |0: file already closed
2018/12/28 15:29:45 Took 0.008821 secs
2018/12/28 15:29:45 exit status 9
The issue is caused by the mapEnv function in config.go, which is called with a slice of strings of the environment variables in key=value form.
It splits the fprocess environment variable on the '=' separator to map fprocess as a key to the command as a value.
When the command itself contains a '=' sign, the string is split into more than two substrings, but the mapped value (the command run by of-watchdog) is based only on the first substring.
This results in fprocess=node --max_old_space_size=4096 index.js being mapped as mapped["fprocess"]="node --max_old_space_size", which causes the error.
Possible fixes for the mapEnv function in https://github.com/openfaas-incubator/of-watchdog/blob/85505a7210cf413e455f8a03d74ba94d9a9fcd30/config/config.go#L89-L101:
1. mapped[parts[0]] = strings.Join(parts[1:], "=")
2. parts := strings.SplitN(val, "=", 2)
Both would correctly produce the mapping mapped["fprocess"]="node --max_old_space_size=4096 index.js".
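A self-contained sketch of the SplitN fix (solution 2); mapEnv here is a simplified stand-in for the function in config.go:

```go
package main

import (
	"fmt"
	"strings"
)

// mapEnv splits each key=value entry with a limit of 2, so everything
// after the first '=' (including further '=' characters) stays intact.
func mapEnv(env []string) map[string]string {
	mapped := map[string]string{}
	for _, val := range env {
		parts := strings.SplitN(val, "=", 2)
		if len(parts) < 2 {
			continue // skip malformed entries without a value
		}
		mapped[parts[0]] = parts[1]
	}
	return mapped
}

func main() {
	m := mapEnv([]string{"fprocess=node --max_old_space_size=4096 index.js"})
	fmt.Println(m["fprocess"]) // node --max_old_space_size=4096 index.js
}
```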
It's possible to set environment variables in the function's YAML file as described here: https://github.com/openfaas-incubator/node10-express-template
For example node can set NODE_OPTIONS="--max-old-space-size=4096"
to use the option, though this may not be suitable for all use cases.
Benchmark for solution 1 (strings.Join):
func BenchmarkJoin(b *testing.B) {
val := "fprocess=node --max_old_space_size=4096 index.js"
parts := strings.Split(val, "=")
for n := 0; n < b.N; n++ {
strings.Join(parts[1:], "=")
}
}
Result:
Running tool: C:\Go\bin\go.exe test -benchmem -run=^$ benchmarks -bench ^(BenchmarkJoin)$
goos: windows
goarch: amd64
pkg: benchmarks
BenchmarkJoin-8 30000000 45.8 ns/op 48 B/op 1 allocs/op
PASS
ok benchmarks 1.557s
Success: Benchmarks passed.
Benchmark for solution 2 (strings.SplitN):
func BenchmarkSplitn(b *testing.B) {
val := "fprocess=node --max_old_space_size=4096 index.js"
for n := 0; n < b.N; n++ {
strings.SplitN(val, "=", 2)
}
}
Result:
Running tool: C:\Go\bin\go.exe test -benchmem -run=^$ benchmarks -bench ^(BenchmarkSplitn)$
goos: windows
goarch: amd64
pkg: benchmarks
BenchmarkSplitn-8 20000000 63.0 ns/op 32 B/op 1 allocs/op
PASS
ok benchmarks 1.495s
Success: Benchmarks passed.
Overall the use of strings.SplitN (solution 2) to solve this issue seems better.
git clone https://github.com/openfaas-incubator/of-watchdog.git
go get -u github.com/openfaas-incubator/of-watchdog/config
go get -u github.com/openfaas-incubator/of-watchdog/executor
go build
fprocess="node --max_old_space_size=4096 index.js" ./of-watchdog
I've deployed a Node.js function based on https://github.com/openfaas-incubator/node10-express-template which required an increase of the maximum memory allocation to run correctly.
Setting the --max_old_space_size=4096
option triggered this issue, which forced me to use the temporary workaround (solution 3) of setting it as an environment variable for the function in the YAML file.
docker version 18.06
I'm using Docker Swarm
(FaaS-netes)
Operating System and version (e.g. Linux, Windows, MacOS): Ubuntu 16.04, Windows 10 Build 17134
It would be nice to be able to limit the number of concurrent requests in flight. For example, if I have a workload where the amount of memory per request uses up to 2GB (let's say we disable GC), and I have 9 GB of memory, I can safely keep 4 requests in parallel. If I accidentally invoke the 5th request, my container will explode, killing everything.
The watchdog will continue to accept requests, and pass them on. It's up to the workload to figure out how to handle this.
Add a concurrency limit on the number of in-flight requests. This could just be middleware added in buildRequestHandler
. If more than N requests are in flight, it would return an HTTP 429.
One thing we may want to consider is to make it so that rather than returning a 429 immediately, we queue. I believe that if we do this, it's best we pull in https://godoc.org/golang.org/x/sync/semaphore, but there are tradeoffs here.
Per the security announcement https://groups.google.com/forum/#!topic/golang-announce/65QixT3tcmg we should update the docker build layers to at least Go 1.11.13
The builder layer in the Dockerfile should use golang:1.11
openfaas/faas#1291
openfaas/templates#170
openfaas/faas-netes#494
openfaas/nats-queue-worker#66
#78
https://github.com/openfaas-incubator/faas-idler/issues/32
openfaas/golang-http-template#28
openfaas-incubator/faas-federation#4
openfaas-incubator/vcenter-connector#27
openfaas-incubator/faas-memory#3
openfaas-incubator/faas-rancher#8
openfaas/faas-swarm#56
openfaas/ingress-operator#10
https://github.com/openfaas-incubator/openfaas-operator/issues/87
If you invoke a function (using of-watchdog HTTP mode) after it has been scaled to zero, a new function pod is created successfully, but the invocation returns a Server returned unexpected status code: 500 -
result.
If you look into the function pod log below:
Forking - node [bootstrap.js]
2020/03/10 11:23:19 Started logging stderr from function.
2020/03/10 11:23:19 Started logging stdout from function.
2020/03/10 11:23:19 OperationalMode: http
2020/03/10 11:23:19 Timeouts: read: 1m5s, write: 1m5s hard: 1m0s.
2020/03/10 11:23:19 Listening on port: 8080
2020/03/10 11:23:19 Writing lock-file to: /tmp/.lock
2020/03/10 11:23:19 Metrics listening on port: 8081
2020/03/10 11:23:21 Upstream HTTP request error: Post http://127.0.0.1:3000/: dial tcp 127.0.0.1:3000: connect: connection refused
2020/03/10 11:23:27 stdout: OpenFaaS Node.js listening on port: 3000
you can see that of-watchdog
threw an Upstream HTTP request error
6 seconds before the Express Node.js server started listening on port 3000.
When invoking a function that has been scaled to zero,
the function should be executed successfully and return the correct response.
Extra response time due to pod initialisation is acceptable.
Instead, when invoking a function that has been scaled to zero,
a Server returned unexpected status code: 500 -
response is returned.
Possible Solution 1: retry upstream requests that fail with an
Upstream HTTP request error
until the function process is listening or a deadline passes.
Possible Solution 2: tie the readiness probe to the function's internal HTTP server (e.g. via
/_/health
); before the internal HTTP server is up, the readiness probe should not return a 200 status code.
To reproduce: use the scale to zero
feature of faas-idler
and deploy the function with the com.openfaas.scale.zero=true
label.
This issue makes the scale to zero
feature unusable, as the first request after scaling to zero
will always fail.
CLI:
commit: ea687659ecf14931a29be46c4d2866899d36c282
version: 0.11.8
Gateway
uri: http://127.0.0.1:8080
version: 0.18.10
sha: 80b6976c106370a7081b2f8e9099a6ea9638e1f3
commit: Update Golang versions to 1.12
Provider
name: openfaas-operator
orchestration: kubernetes
version: 0.14.1
sha: e747b6ace86bc54184d899fa10cf46dada331af1
Host header should be forwarded to the upstream function.
Since the Host header is not part of the request.Header field, it is not copied by the copyHeaders function.
Do request.Host = r.Host
before copyHeaders(request.Header, &r.Header)
in http_runner.go
We need to construct URI links that depend on the Host header, and this information is lost when the function is called. It also seems that the same modification needs to be made in the gateway.
Docker version docker version
(e.g. Docker 17.0.05 ):
17.12.1-ce
Are you using Docker Swarm or Kubernetes (FaaS-netes)?
FaaS-netes
Operating System and version (e.g. Linux, Windows, MacOS):
CentOS 7
Link to your project or a code example to reproduce issue:
I tried to create sample express app from :
https://github.com/openfaas-incubator/node8-express-template
But I see the following error in the logs after deployment. I am new to Docker/OpenFaaS, so sincere apologies if this is due to a setup issue on my end.
faas-node-express.1.dp85hxbcd6i3@pi03 | Forking - node [index.js]
faas-node-express.1.dp85hxbcd6i3@pi03 | 2018/06/07 03:54:17 Started logging stderr from function.
faas-node-express.1.dp85hxbcd6i3@pi03 | 2018/06/07 03:54:17 Started logging stdout from function.
faas-node-express.1.dp85hxbcd6i3@pi03 | 2018/06/07 03:54:17 Error reading stdout: EOF
Docker version docker version
(e.g. Docker 17.0.05 ):18.05.0-ce
Are you using Docker Swarm or Kubernetes (FaaS-netes)? Docker Swarm
Operating System and version (e.g. Linux, Windows, MacOS): Raspian Lite
Link to your project or a code example to reproduce issue:
To enable the "batch job" use-case, users should be able to specify a "one shot" mode or parameter. This would allow unlimited requests to /healthz and /metrics, but only a single request to /
, after which the watchdog would shut down the process.
This is partially to work around limitations in Kubernetes jobs with daemons, web-servers and side-cars which keep the job in a "running" status.
kubernetes/kubernetes#25908
kubernetes/enhancements#753
Argo workflows does appear to work in "sidecar" mode without any additional changes to the watchdog, but I suspect building on Kubernetes Jobs would be cleaner from a dependencies point of view.
Example with figlet container:
https://twitter.com/alexellisuk/status/1148239010034311169
Example in Argo docs on sidecars:
https://github.com/argoproj/argo/blob/master/examples/README.md#sidecars
We could offer an env-var override to disable chunked encoding in HTTP mode. The side-effect is that we would have to cache the request in memory before proxying it to the upstream_url - but this also adds greater compatibility with frameworks like PHP Swoole.
PHP Swoole for instance cannot support transfer-encoding of chunked.
If the env-var is set, buffer the body in memory to determine its length, set Content-Length, and then forward it on to the upstream_url.
Currently, we are able to run fwatchdog
in a FROM scratch
image, but we cannot run the default healthcheck, since there is no [ or test
command available to check for a file's existence, as far as I can tell.
We could add a subcommand to fwatchdog
which performs the health check itself instead of relying on a [ or test
command being present, i.e.:
HEALTHCHECK --interval=2s CMD ["/fwatchdog", "healthcheck"]
How does serializing mode work?
Hi all, I use serializing mode and I get nothing on STDIN (while streaming mode works).
Overall, how do I use serializing mode? I want to read the request from stdin and set the status code, body, headers etc. inside my handler, then write the response to stdout. I think it's the best-suited mode, but I can get nothing.
POST http://openfaas_gateway:8000/function/name
{"foo": "bar"}
With a PHP script :
<?php
$stdin = file_get_contents("php://stdin");
// $stdin is empty ...
fwrite(STDOUT, $stdin);
// How to write an HTTP response correctly with stdout?
This Dockerfile with the script above.
FROM php:alpine
RUN apk add --no-cache git curl
RUN curl -sSLf https://github.com/openfaas-incubator/of-watchdog/releases/download/0.4.6/of-watchdog > /usr/bin/fwatchdog && chmod +x /usr/bin/fwatchdog
COPY index.php /usr/src/function
WORKDIR /usr/src/function
ENV function_process="php index.php"
ENV mode="serializing"
HEALTHCHECK --interval=3s CMD [ -e /tmp/.lock ] || exit 1
CMD ["fwatchdog"]
I want to manipulate the stdin request and set status code, body, headers ... inside my function with stdout.
Docker 18.09.1
Docker swarm
macOs
PHP 7.3
This issue is a feature request asking to add a catch-all capability to the static mode.
The static server should redirect every possible url pointed to the function to the root path.
Example: http:127.0.0.1:8080/function/hello/subpath
-> http:127.0.0.1:8080/function/hello
In Nginx, this is accomplished with the try_files
directive.
https://docs.nginx.com/nginx/admin-guide/web-server/serving-static-content/#trying-several-options
Currently, requesting any function subpath, i.e http:127.0.0.1:8080/function/hello/subpath
, will return a 404 error.
This capability is mandatory in the context of Single Page Applications, like React.js apps.
Using hash history
(adding #
to the URL) brings other issues and is not a valid solution for professional modern web applications.
A Single Page Application technically has only a single index.html, and the server should serve every possible URL pointed at the domain from this index.html located at the root.
In Nginx, we implement a catch-all rule for SPA like so:
location / {
try_files $uri $uri/ /index.html;
}
Minimal React.js Single Page Application packaged for OpenFaaS with the dockerfile template and the watchdog in static mode: https://github.com/Janaka-Steph/react-spa-openfaas
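In Go, the requested catch-all could be sketched like this, mirroring try_files with an index.html fallback; spaHandler is illustrative, and publish mirrors the variable name proposed for static mode:

```go
package main

import (
	"net/http"
	"os"
	"path/filepath"
)

// spaHandler serves files from the publish directory, falling back to
// index.html for any path that does not exist on disk, so client-side
// routers in single-page apps receive the app shell instead of a 404.
func spaHandler(publish string) http.Handler {
	fs := http.FileServer(http.Dir(publish))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		full := filepath.Join(publish, filepath.Clean(r.URL.Path))
		if _, err := os.Stat(full); err != nil {
			// Unknown path: let the SPA router handle it client-side.
			http.ServeFile(w, r, filepath.Join(publish, "index.html"))
			return
		}
		fs.ServeHTTP(w, r)
	})
}
```

Static mode could then mount this with http.ListenAndServe(":8080", spaHandler(publish)).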
Dear maintainers,
README says that execution timeout will be disabled if exec_timeout is set to 0.
Exec timeout for process exec'd for each incoming request (in seconds). Disabled if set to 0.
However, when I set exec_timeout to 0, my function terminates as soon as it starts.
The function should never be killed by a timeout while it is running;
instead, it terminates as soon as it starts.
docker swarm log says:
$ docker service logs -f myfunction
Upstream HTTP request error: Post http://127.0.0.1:5000/: context deadline exceeded
Upstream HTTP killed due to exec_timeout: 0s
Maybe this line causes immediate timeout.
https://github.com/openfaas-incubator/of-watchdog/blob/bae373954932a07d89ab926457d237f03f6c60dc/executor/http_runner.go#L127
The timeout should be set only when ExecTimeout has a non-zero value:
ctx := context.Background()
if f.ExecTimeout != 0 {
var cancel context.CancelFunc
ctx, cancel = context.WithTimeout(ctx, f.ExecTimeout)
defer cancel()
}
Add ENV exec_timeout="0s"
to the Dockerfile.
Docker version 18.09.1, build 4c52b90
Docker swarm
Linux (vagrant, vm.box = "bento/centos-7.4")
Thank you.
For of-watchdog
to take the place of the original watchdog, it should be fully backwards compatible with the original watchdog.
Support the environment variables cgi_headers
, marshal_request
, and combine_output
with their corresponding features.
The above features are not fully supported.
Migrate over the tests and code from the original watchdog.
Maintaining two versions of the watchdog doubles the amount of work one needs to do. Once of-watchdog
in serializing mode becomes backwards compatible, it can replace the original watchdog
while the other modes are worked on.
As an operator I want to enable HPAv2 auto-scaling with custom Pod metrics on Kubernetes.
Syncing the patch from:
Template creators would specify a new mode called static
that would allow them to serve static content that they specified over http.
Users cannot use the watchdog for serving static content. They would have to create their own static server or use an existing solution like nginx.
We would have a new mode called static
and a new variable called publish
, publish would have the relative path to the directory that the user wants to serve.
Something that we could add in the future is analytics, so that users can know which blog post or page was the most visited.
While creating a template for static sites I was not able to reuse the watchdog.
Docker version docker version
(e.g. Docker 17.0.05 ):
Are you using Docker Swarm or Kubernetes (FaaS-netes)?
Operating System and version (e.g. Linux, Windows, MacOS):
Link to your project or a code example to reproduce issue: