
Scuttle

Scuttle is a wrapper application that makes it easy to run containers next to Istio sidecars. It ensures the main application doesn't start until Envoy is ready, and that the Istio sidecar shuts down when the application exits. This is particularly useful for Jobs that need Istio sidecar injection, as the Istio pod would otherwise run indefinitely after the job has completed.

If provided an ENVOY_ADMIN_API environment variable, scuttle will poll indefinitely with backoff, waiting for Envoy to report itself as live, which implies it has loaded its cluster configuration (for example from an ADS server). Only then will it execute the command provided as an argument.

All signals are passed to the underlying application. Be warned that SIGKILL cannot be caught or forwarded, so this can leave behind an orphaned process.

When the application exits, scuttle will instruct Envoy to shut down immediately, unless NEVER_KILL_ISTIO_ON_FAILURE has been set and the exit code is non-zero.

Environment variables

Variable Purpose
ENVOY_ADMIN_API The address of Envoy's administration interface, in the format http://127.0.0.1:15000. If provided, scuttle will poll this URL at /server_info, waiting for Envoy to report itself as LIVE. If the address is local (127.0.0.1 or localhost), Envoy will also be instructed to shut down when the application exits cleanly.
NEVER_KILL_ISTIO If set to true, scuttle will not instruct Istio to exit under any circumstances.
NEVER_KILL_ISTIO_ON_FAILURE If set to true, scuttle will not instruct Istio to exit if the main binary exits with a non-zero exit code.
SCUTTLE_LOGGING If set to true, scuttle will log its steps to the console, which is helpful for debugging.
START_WITHOUT_ENVOY If set to true, scuttle will not wait for Envoy to be LIVE before starting the main application. However, it will still instruct Envoy to exit.
WAIT_FOR_ENVOY_TIMEOUT If set to a valid time.Duration string greater than 0 seconds, scuttle will wait up to that amount of time for Envoy before starting the main application. By default, it waits indefinitely. If QUIT_WITHOUT_ENVOY_TIMEOUT is also set, it takes precedence over this variable.
ISTIO_QUIT_API If provided, scuttle will send a POST to /quitquitquit at the given API, which should be in the format http://127.0.0.1:15020. This is intended for Istio v1.3 and higher. When not given, Istio will be stopped using a pkill command.
GENERIC_QUIT_ENDPOINTS If provided, scuttle will send a POST to each URL given. Multiple URLs are supported and must be provided as a CSV string, e.g. http://myendpoint.com or http://myendpoint.com,https://myotherendpoint.com. The response status code is logged (if logging is enabled) but is otherwise ignored: a 200 is treated the same as a 404 or 500. GENERIC_QUIT_ENDPOINTS is handled before Istio is stopped.
QUIT_WITHOUT_ENVOY_TIMEOUT If set to a valid duration, scuttle will exit if Envoy does not become available before the end of the timeout, and will not run the passed-in executable. If START_WITHOUT_ENVOY is also set, this variable is ignored. If WAIT_FOR_ENVOY_TIMEOUT is also set, this variable takes precedence.

How Scuttle stops Istio

Scuttle has two methods to stop Istio. You should configure Scuttle appropriately based on the version of Istio you are using.

Istio Version Method
1.3 and higher /quitquitquit endpoint
1.2 and lower pkill command

1.3 and higher

Version 1.3 of Istio introduced a /quitquitquit endpoint similar to Envoy's. By default this endpoint is available at http://127.0.0.1:15020, which is the pilot-agent service responsible for managing Envoy.

To enable this, set the environment variable ISTIO_QUIT_API to http://127.0.0.1:15020.

1.2 and lower

Versions 1.2 and lower of Istio have no supported method to stop Istio sidecars. As a workaround, scuttle stops Istio using the command pkill -SIGINT pilot-agent.

To enable this, you must add shareProcessNamespace: true to your Pod definition in Kubernetes. This allows Scuttle to stop the service running on the sidecar container.

Note: This method is used by default if ISTIO_QUIT_API is not set.

Example usage in your Job's Dockerfile

FROM python:latest
# Below command makes scuttle available in path
COPY --from=redboxoss/scuttle:latest /scuttle /bin/scuttle
WORKDIR /app
COPY /app/ ./
ENTRYPOINT ["scuttle", "python", "-m", "my_app"]

Credits

The original code is forked from the envoy-preflight project on GitHub, which works for Envoy but not for Istio sidecars.


scuttle's Issues

istio-proxy left running without app container ever starting - stuck pods

Hi,

We sometimes (maybe every 100th run) get logs like this for our CronJob-created pods:

2020-06-13T14:15:06.871856039Z scuttle: Logging is now enabled
2020-06-13T14:15:06.871942104Z scuttle: Blocking until envoy starts
2020-06-13T14:15:09.820310817Z scuttle: Blocking finished, envoy has started
2020-06-13T14:15:09.832039127Z scuttle: Received exit signal, exiting

When this happens it leaves istio-proxy running for some reason. This results in the job never finishing. Which means a new job will never start (because we don't allow more than one running job).

As you can see from the log messages above, scuttle seems to exit almost immediately after "Blocking finished, envoy has started" is logged.

We're running Scuttle 1.2.3, Istio 1.5 and Kubernetes 1.16.

Looking in main.go I noticed this comment:
// Signal received before the process even started. Let's just exit.

And that does indeed seem to be the case.

I can't find any relevant istio-proxy logs more than this line: 2020-06-13T14:15:08.986202Z info Envoy proxy is ready

Does anyone have similar experiences? What can be done to fix this?

I'm not fluent in golang so I don't feel comfortable digging further into the code / submitting a PR, but some suggestions maybe to help with debugging these kinds of issues:

  1. Split the following if statement into two separate ones: https://github.com/redboxllc/scuttle/blob/v1.3.0/main.go#L43 - One for the err == nil case and one for the errors.Is(err, context.Canceled) case. Add as much context as possible to the logging.
  2. More detailed logging here: https://github.com/redboxllc/scuttle/blob/v1.3.0/main.go#L70 (which signal was received, move comment Signal received before the process even started. Let's just exit. into log message)

@linjmeyer What are your thoughts? Do you have some experience in debugging this? (I noticed the same log message in #23)

Unknown SIGURG (urgent I/O condition) signal causing Scuttle / Istio to die

We are seeing a strange issue with scuttle in which a mysterious "urgent I/O condition" signal is sent to scuttle, causing the Envoy/Istio sidecar to stop as soon as it has started. The issue is summarized by these lines: the first shows that istio-proxy is ready, the second shows scuttle acknowledging this, the third shows the signal being received, and the rest show scuttle quitting.

2021-03-25 11:06:37.934	istio-proxy	2021-03-25T11:06:37.934856Z info Envoy proxy is ready 
2021-03-25 11:06:38.267	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Blocking finished, Envoy has started 
2021-03-25 11:06:38.295	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Received signal 'urgent I/O condition', exiting 
2021-03-25 11:06:38.295	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Kill received: (Action: Stopping Istio with API, Reason: ISTIO_QUIT_API is set, Exit Code: 1) 
2021-03-25 11:06:38.295	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Stopping Istio using Istio API 'http://127.0.0.1:15000' (intended for Istio >v1.2) 
2021-03-25 11:06:38.316	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Sent quitquitquit to Istio, status code: 200 

Here is a full export of the logs from the node and container, as well as our istiod pods:

@timestamp	Node / container logs	Log
2021-03-25 11:06:21.123	ec2node-x	ena 0000:00:07.0 eth2: Local page cache is disabled for less than 16 channels
2021-03-25 11:06:26.893	ec2node-x	http: superfluous response.WriteHeader call from github.com/docker/docker/api/server/httputils.WriteJSON (httputils_write_json.go:11) 
2021-03-25 11:06:32.581	ec2node-x	I0325 11:06:32.581736    7921 setters.go:77] Using node IP: "10.234.36.215"
2021-03-25 11:06:33.862	ec2node-x	{"level":"info","ts":"2021-03-25T11:06:33.862Z","caller":"/usr/local/go/src/runtime/proc.go:203","msg":"CNI Plugin version: v1.7.5 ..."}
2021-03-25 11:06:33.933	ec2node-x	I0325 11:06:33.932823    7921 prober.go:124] Readiness probe for "e24a559ffa452bb7e284a4f3690fa5d3_namespace(4e64d45d-52ea-4c19-a4b2-cda5b1b72c09):istio-proxy" failed (failure): Get http://100.66.110.107:15021/healthz/ready: dial tcp 100.66.110.107:15021: connect: connection refused
2021-03-25 11:06:35.933	ec2node-x	I0325 11:06:35.932833    7921 prober.go:124] Readiness probe for "e24a559ffa452bb7e284a4f3690fa5d3_namespace(4e64d45d-52ea-4c19-a4b2-cda5b1b72c09):istio-proxy" failed (failure): Get http://100.66.110.107:15021/healthz/ready: dial tcp 100.66.110.107:15021: connect: connection refused
2021-03-25 11:06:36.736	istio-proxy	2021-03-25T11:06:36.735981Z warning envoy runtime Unable to use runtime singleton for feature envoy.http.headermap.lazy_map_min_size 
2021-03-25 11:06:36.736	istio-proxy	2021-03-25T11:06:36.736016Z warning envoy runtime Unable to use runtime singleton for feature envoy.http.headermap.lazy_map_min_size 
2021-03-25 11:06:36.736	istio-proxy	2021-03-25T11:06:36.736506Z warning envoy runtime Unable to use runtime singleton for feature envoy.http.headermap.lazy_map_min_size 
2021-03-25 11:06:36.736	istio-proxy	2021-03-25T11:06:36.736543Z warning envoy runtime Unable to use runtime singleton for feature envoy.http.headermap.lazy_map_min_size 
2021-03-25 11:06:36.765	istio-proxy	2021-03-25T11:06:36.765549Z info xdsproxy Envoy ADS stream established 
2021-03-25 11:06:36.765	istio-proxy	2021-03-25T11:06:36.765651Z info xdsproxy connecting to upstream XDS server: istiod.istio-system.svc:15012 
2021-03-25 11:06:36.767	istio-proxy	2021-03-25T11:06:36.767262Z warning envoy main there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections 
2021-03-25 11:06:36.777	discovery	2021-03-25T11:06:36.777047Z info ads ADS: new connection for node:sidecar~100.66.110.107~e24a559ffa452bb7e284a4f3690fa5d3.namespace~namespace.svc.cluster.local-5 
2021-03-25 11:06:36.780	discovery	2021-03-25T11:06:36.780741Z info ads CDS: PUSH for node:e24a559ffa452bb7e284a4f3690fa5d3.namespace resources:196 
2021-03-25 11:06:36.846	discovery	2021-03-25T11:06:36.846468Z info ads EDS: PUSH for node:e24a559ffa452bb7e284a4f3690fa5d3.namespace resources:133 empty:0 cached:133/133 
2021-03-25 11:06:36.854	istio-proxy	2021-03-25T11:06:36.854774Z info sds resource:default new connection 
2021-03-25 11:06:36.854	istio-proxy	2021-03-25T11:06:36.854786Z info sds resource:ROOTCA new connection 
2021-03-25 11:06:36.854	istio-proxy	2021-03-25T11:06:36.854832Z info sds Skipping waiting for gateway secret 
2021-03-25 11:06:36.854	istio-proxy	2021-03-25T11:06:36.854837Z info sds Skipping waiting for gateway secret 
2021-03-25 11:06:36.933	istio-proxy	2021-03-25T11:06:36.933776Z info cache Root cert has changed, start rotating root cert for SDS clients 
2021-03-25 11:06:36.933	istio-proxy	2021-03-25T11:06:36.933801Z info cache GenerateSecret default 
2021-03-25 11:06:36.934	istio-proxy	2021-03-25T11:06:36.934135Z info sds resource:default pushed key/cert pair to proxy 
2021-03-25 11:06:37.055	istio-proxy	2021-03-25T11:06:37.054956Z info cache Loaded root cert from certificate ROOTCA 
2021-03-25 11:06:37.055	istio-proxy	2021-03-25T11:06:37.055140Z info sds resource:ROOTCA pushed root cert to proxy 
2021-03-25 11:06:37.083	discovery	2021-03-25T11:06:37.082941Z info ads LDS: PUSH for node:e24a559ffa452bb7e284a4f3690fa5d3.namespace resources:163 
2021-03-25 11:06:37.336	discovery	2021-03-25T11:06:37.332927Z info ads RDS: PUSH for node:e24a559ffa452bb7e284a4f3690fa5d3.namespace resources:60 
2021-03-25 11:06:37.934	istio-proxy	2021-03-25T11:06:37.934856Z info Envoy proxy is ready 
2021-03-25 11:06:38.267	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Blocking finished, Envoy has started 
2021-03-25 11:06:38.295	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Received signal 'urgent I/O condition', exiting 
2021-03-25 11:06:38.295	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Kill received: (Action: Stopping Istio with API, Reason: ISTIO_QUIT_API is set, Exit Code: 1) 
2021-03-25 11:06:38.295	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Stopping Istio using Istio API 'http://127.0.0.1:15000' (intended for Istio >v1.2) 
2021-03-25 11:06:38.316	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Sent quitquitquit to Istio, status code: 200 
2021-03-25 11:06:38.318	discovery	2021-03-25T11:06:38.318887Z info ads ADS: "100.66.110.107:58984" sidecar~100.66.110.107~e24a559ffa452bb7e284a4f3690fa5d3.namespace~namespace.svc.cluster.local-5 terminated with stream closed 
2021-03-25 11:06:38.318	istio-proxy	2021-03-25T11:06:38.318183Z warning envoy config StreamAggregatedResources gRPC config stream closed: 13,  
2021-03-25 11:06:38.318	istio-proxy	2021-03-25T11:06:38.318470Z info xdsproxy disconnected from XDS server: istiod.istio-system.svc:15012 
2021-03-25 11:06:38.319	istio-proxy	2021-03-25T11:06:38.319013Z warning envoy config StreamSecrets gRPC config stream closed: 13,  
2021-03-25 11:06:38.319	istio-proxy	2021-03-25T11:06:38.319029Z warning envoy config StreamSecrets gRPC config stream closed: 13,  
2021-03-25 11:06:38.319	istio-proxy	2021-03-25T11:06:38.319067Z info sds resource:ROOTCA connection is terminated: rpc error: code = Canceled desc = context canceled 
2021-03-25 11:06:38.319	istio-proxy	2021-03-25T11:06:38.319086Z error sds Remote side closed connection 
2021-03-25 11:06:38.319	istio-proxy	2021-03-25T11:06:38.319067Z info sds resource:default connection is terminated: rpc error: code = Canceled desc = context canceled 
2021-03-25 11:06:38.319	istio-proxy	2021-03-25T11:06:38.319115Z error sds Remote side closed connection 
2021-03-25 11:06:38.402	istio-proxy	2021-03-25T11:06:38.402249Z info Epoch 0 exited normally 
2021-03-25 11:06:38.402	istio-proxy	2021-03-25T11:06:38.402272Z info No more active epochs, terminating 
2021-03-25 11:06:38.460	ec2node-x	time="2021-03-25T11:06:38.460784673Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
2021-03-25 11:06:38.680	e24a559ffa452bb7e284a4f3690fa5d3	2021-03-25T11:06:38Z scuttle: Received signal 'urgent I/O condition', passing to child 
2021-03-25 11:06:38.872	ec2node-x	{"level":"info","ts":"2021-03-25T11:06:38.872Z","caller":"/usr/local/go/src/runtime/proc.go:203","msg":"CNI Plugin version: v1.7.5 ..."}
2021-03-25 11:06:38.991	ec2node-x	I0325 11:06:38.990897    7921 kubelet.go:1960] SyncLoop (PLEG): "e24a559ffa452bb7e284a4f3690fa5d3_namespace(4e64d45d-52ea-4c19-a4b2-cda5b1b72c09)", event: &pleg.PodLifecycleEvent{ID:"4e64d45d-52ea-4c19-a4b2-cda5b1b72c09", Type:"ContainerDied", Data:"80a75a8a226d881053607260c8837ac9c1627a888b1443aa4186528c22219898"}
2021-03-25 11:06:38.991	ec2node-x	I0325 11:06:38.991714    7921 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 80a75a8a226d881053607260c8837ac9c1627a888b1443aa4186528c22219898

We are using Scuttle v1.3.1.

My understanding from the man pages is that this signal is:

SIGURG       P2001      Ign     Urgent condition on socket (4.2BSD)

I am trying to get as much information as I can to understand this issue, but ultimately I would like to know: is it possible to ignore this signal? Or to increase logging to determine where it's coming from?

Any background on why this would occur is appreciated.

Please let me know,
Thank you!

Contribution guidelines

Hi folks,

Do you have a contribution guideline for things like:

  • How to create a PR for this repo
  • Contact point for maintainers
  • ...etc

Thanks!

QUIT_WITHOUT_ENVOY_TIMEOUT not working

I tried setting the environment variable QUIT_WITHOUT_ENVOY_TIMEOUT, but it appears that it is not being used. This is done in a Dockerfile like the following:

ENV QUIT_WITHOUT_ENVOY_TIMEOUT=15s
ENTRYPOINT ["/path/to/scuttle", "..."]

All I get is the following message indefinitely (NB: Envoy is not running)

2020-11-10T17:11:40Z scuttle: Scuttle 1.3.1 starting up, pid 1
2020-11-10T17:11:40Z scuttle: Logging is now enabled
2020-11-10T17:11:40Z scuttle: Blocking until Envoy starts

Propagate container failure to pod status

We have a pod that runs tests (main container) and an Istio sidecar. Once scuttle is added to the flow, even if the test container has failed tests, the entire pod enters a 'Completed' state as opposed to an 'Error' state (which occurs if scuttle is not present).
Is it possible to propagate the status of the failed test container to the pod?
e.g.

NAME   READY   STATUS   RESTARTS   AGE
test   0/2     Error    0          38m

and not:

NAME   READY   STATUS      RESTARTS   AGE
test   0/2     Completed   0          38m

error as seen in pod:

  containerStatuses:
  - containerID: docker://...
    image: istio/proxyv2:1.5.8
    name: istio-proxy
    state:
      terminated:
        containerID: docker://...
        exitCode: 0
        finishedAt: "2021-01-19T09:36:40Z"
        reason: Completed
        startedAt: "2021-01-19T09:07:19Z"
  - containerID: docker://...
    name: test
    state:
      terminated:
        containerID: docker://...
        exitCode: 120
        finishedAt: "2021-01-19T09:36:40Z"
        reason: Error
        startedAt: "2021-01-19T09:07:19Z"

lean pod yaml:

apiVersion: v1
kind: Pod
metadata:
  name: test
  annotations:
    sidecar.istio.io/inject: "true"
spec:
  restartPolicy: Never
  containers:
    - name: test
      command: ["scuttle", "/bin/sh", "-c"]
      env:
        - name: ENVOY_ADMIN_API
          value: "http://127.0.0.1:15000"
        - name: ISTIO_QUIT_API
          value: "http://127.0.0.1:15000"

Segfault when ISTIO_QUIT_API fails to respond

This is with Scuttle v1.2.1 (the latest as of this writing).

https://github.com/redboxllc/scuttle/blob/master/main.go#L135

scuttle: Stopping Istio using Istio API 'http://127.0.0.1:15020' (intended for Istio >v1.2)
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x7c2c07]

goroutine 1 [running]:
main.killIstioWithAPI()
	/home/runner/work/scuttle/scuttle/main.go:135 +0x287
main.kill(0x0)
	/home/runner/work/scuttle/scuttle/main.go:110 +0xcc
main.main()
	/home/runner/work/scuttle/scuttle/main.go:84 +0x31f

The code is attempting to dereference resp, which is nil.

scuttle will wait forever if it cannot reach envoy during quit

In a CronJob where the application ran under scuttle, the Istio Envoy proxy started and the job ran. The primary application exited and scuttle logged that it was posting the quitquitquit command to Envoy. However, it appears that the Istio proxy had, for whatever reason, already terminated. Scuttle continued to probe the missing Envoy API for hours, keeping the Job running and blocking further executions.

Feature Request: Update for Istio 1.3 and /quitquitquit endpoint

  • Istio 1.3 brings a /quitquitquit endpoint similar to Envoy
  • Feature should be added to support this endpoint

Proposal:

  • Add env variable for Istio version
  • When >=1.3 or not set, use /quitquitquit
  • When <1.3, use pkill
  • Add Istio version to logging output
  • Add this to the documentation/README. Additionally, document the shareProcessNamespace requirement when using pkill on Kubernetes.

why the package install occurred error

Hi @ALL,
I want to install the scuttle package, but the installation fails with an error like the one below:

image
What is happening with my R setup?
Why is the header file beachmat3/beachmat.h not found? It has me confused.
I have tried every way I can think of, but the question remains unsolved.
Any help would be kindly appreciated.
Best

Extend docker image to support arm architecture

Executing a Docker image built for the amd64 (aka Intel) architecture on arm64 (aka M1, M2) Macs is very slow, to the point of being impractical.

Docker allows a single Docker image to contain binaries for multiple architectures and to choose the appropriate one at build/run time.

#56 offers a solution.

Add `QUIT_WITHOUT_ENVOY_TIMEOUT` environment variable

We encountered a very interesting case the other day at work.

*This happened using Istio 1.3.3 but it also affects other versions of Istio.

In a big cluster with lots of namespaces and services, when a Job was started using scuttle to wrap the main process, the Envoy process of the istio-proxy sidecar was OOM killed. Despite this, scuttle kept waiting forever. So I suggest adding an environment variable that instructs scuttle to exit if Envoy hasn't become available within that timeframe.

Update to x/text 0.3.3

A security scan found that the Scuttle 1.3.6 binary is using version 0.3.0 of x/text, which is vulnerable to CVE-2020-14040. Could x/text be updated to at least version 0.3.3?

Document: How to use with Istio

These env vars:

        env:
        - name: ISTIO_QUIT_API
          value: http://127.0.0.1:15020
        - name: ENVOY_ADMIN_API 
          value: http://127.0.0.1:15000

work on Istio 1.3.3.

But without ENVOY_ADMIN_API, it hangs. Note its port, which is different from the Envoy admin API port given in the README.

Dynamically determine whether istio-proxy sidecar is present before blocking to wait for it to start

I have a use case that may be pertinent for others interested in using scuttle for their project. My use case revolves around software that wants to use scuttle conditionally, depending on whether Istio sidecar injection is installed or not, without the software (Kubernetes resources, Helm chart, etc.) having any knowledge about Istio being present or not.

  • The Kubernetes workload would be statically configured with the appropriate scuttle ENV vars (such as ENVOY_ADMIN_API and ISTIO_QUIT_API).
  • scuttle would conditionally enforce waiting for Envoy to start based on heuristics where scuttle would inspect its own Pod to determine if istio-proxy is present.
    • If the istio-proxy container is present in the Pod, scuttle will block and wait for ENVOY_ADMIN_API to respond, per its current behavior.
    • If the istio-proxy container is not present in the Pod, scuttle will effectively disable itself, as if the ENV vars were never provided, per its current behavior.

This I feel makes scuttle more robust in environments that cannot dynamically configure their ENV vars or command arguments based on the pre-determined knowledge that Istio is present or not, and yet need a solution to properly wait for the istio-proxy sidecar to be up and ready to go before they begin network activity.

I hope this makes sense. This could be a good first contribution by someone (like myself), and this heuristic check can itself be conditional for starters. It will involve importing the client-go Kubernetes client and doing intra-cluster inspection of its own Pod. Technically the shareProcessNamespace would allow for a solution without a kube-apiserver client, but I think it's more elegant to inspect the workload metadata versus the OS-level namespace.

BONUS: This same heuristic could be used to intelligently enable/disable a default value for ENVOY_ADMIN_API and ISTIO_QUIT_API. As noted in those two issues (#9 and #12), there is a concern that defaulting those variables will make it unpleasant to run the container in a testing or development fashion. Introspecting the Kubernetes workload (if Kubernetes even exists, and the istio-proxy sidecar exists) to flip on the default value might be a good approach to the problem.

Consider using Istio readiness check instead of Envoy admin API

Currently, the ENVOY_ADMIN_API is used to prevent application startup until Istio/Envoy is ready. Istio itself has a readiness probe (used by k8s health checks) at http://127.0.0.1:15020/healthz/ready. Looking at the master branch of Istio, this endpoint verifies that the Istio admin API is ready (or it will time out), and internally verifies that Envoy is operational.

I think using Istio's readiness endpoint makes more sense as we can be more confident that not only Envoy is ready, but Istio itself is too. Most likely this would mean deprecating ENVOY_ADMIN_API as the name of the variable wouldn't make sense.

Add timestamps to scuttle logging

When something goes wrong with scuttle, Istio or the underlying executable being run you can get logs like:

scuttle: Logging is now enabled
scuttle: Blocking until envoy starts
scuttle: Blocking finished, envoy has started
scuttle: Received exit signal, exiting

Timestamps would help in understanding what happened with logs like these. In the above, it's not possible to determine whether scuttle hung, how long it ran, etc.

Scuttle doesn't use a real logging package/framework right now, it may be better long term to investigate what logging options exist in the go ecosystem.

Verify Scuttle works with Istio 1.5

Istio 1.5 comes with a lot of changes, particularly around its architecture and the merging of several services into one binary.

It should be verified that Scuttle is compatible with Istio 1.5, and if not any bugs/changes should be raised as new GitHub issues.

Thank you

For open sourcing this project and providing a blog post about how to use it! It's very appreciated! ❤️
