bouncer's People

Contributors

allantaylor8907, arminaaki, arunsathiya, asvoboda, atatkin, domenickp, dreamlibrarian, holtwilkins, kevinloganbs, so0k, svc-excavator-bot

bouncer's Issues

AWS SDK supports AWS_REGION

Currently, bouncer's AWS client hardcodes an os.Getenv lookup for AWS_DEFAULT_REGION and overwrites the region in the AWS config object. However, the AWS SDK itself supports both AWS_REGION and AWS_DEFAULT_REGION.

This makes bouncer slightly less intuitive to use.

The fix for me was to use this null_resource:

resource "null_resource" "concourse_worker_bouncer" {
  # A map of arbitrary strings that, when changed, will force the null resource to be replaced, re-running any associated provisioners.
  # Used to trigger based on the evaluation of conditionally gated mutually exclusive launchtemplates
  triggers {
    lc_change = "${element(concat(aws_launch_template.concourse_worker_launchtemplate.*.id, aws_launch_template.concourse_worker_launchtemplate_ephemeral.*.id), 0)}"
  }

  provisioner "local-exec" {
    # Bounce all nodes in this ASG using the canary method
    command = "AWS_DEFAULT_REGION=${var.region} ./bouncerw canary -a '${aws_autoscaling_group.concourse_worker_asg.name}:${var.concourse_worker_instance_count}'"
  }
}
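
A minimal sketch of the lookup this issue asks for, assuming bouncer keeps resolving the region itself instead of leaving it to the SDK. The function name `resolveRegion` and the order (AWS_REGION first, then AWS_DEFAULT_REGION) are assumptions for illustration; which variable the SDK consults first, and under what settings, depends on the SDK version.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveRegion returns the first non-empty value among AWS_REGION and
// AWS_DEFAULT_REGION. The lookup function is injected so the logic can be
// exercised without touching the real environment.
// NOTE: hypothetical helper, not bouncer's actual code.
func resolveRegion(getenv func(string) string) (string, error) {
	for _, key := range []string{"AWS_REGION", "AWS_DEFAULT_REGION"} {
		if v := getenv(key); v != "" {
			return v, nil
		}
	}
	return "", errors.New("neither AWS_REGION nor AWS_DEFAULT_REGION is set")
}

func main() {
	region, err := resolveRegion(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using region", region)
}
```

With a fallback like this, the `AWS_DEFAULT_REGION=${var.region}` prefix in the workaround above would no longer be needed when AWS_REGION is already exported.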

Classification around Canary

Hi,

I was just doing some testing with this tool with an ALB and ASG and a Canary deployment. One thing I noticed was bouncer would assert instance health outside of the ASG.

Bouncer seemed to only assert that the instance had booted, not that it was fully operational within the ASG. At that point it started to drain connections and deregister the existing instance, which caused a blip of downtime before the ASG had asserted that the canary instance was healthy.

Is it expected that the "Canary" health is not asserted this way?

Wrapper script failing to download bouncer

It appears something has changed with bintray. This has been working in my environment for an extended period of time.

I am not able to get any response from:

BOUNCER_VERSION=$(curl -s "https://api.bintray.com/packages/palantir/releases/bouncer" | grep -Eoh '"latest_version":"\S*?"' | cut -d ':' -f 2 | cut -d ',' -f 1 | sed 's/"//g')

As such, the download command provides me with an empty file:

wget -q -O bouncer.tgz "https://palantir.bintray.com/releases/com/palantir/bouncer/bouncer/${BOUNCER_VERSION}/bouncer-${BOUNCER_VERSION}-linux-amd64.tgz"

I have tried switching things around to get the package directly from github like:

wget -q -O bouncer.tgz "https://github.com/palantir/bouncer/releases/download/${BOUNCER_VERSION}bouncer-${BOUNCER_VERSION}-linux-amd64.tgz"

I am able to download from github but then get a different error:

null_resource.server_bouncer_consul_cluster: Provisioning with 'local-exec'...
null_resource.server_bouncer_consul_cluster (local-exec): Executing: ["/bin/sh" "-c" "../../../bouncerw rolling -a 'XXXXXXXXXXXXXXX:5' 'stage'"]
null_resource.server_bouncer_consul_cluster (local-exec): BOUNCER_VERSION is not set. Looking for the latest bouncer release...
null_resource.server_bouncer_consul_cluster (local-exec): Installing bouncer version 0.10.0
module.consul_cluster.aws_launch_configuration.launch_configuration.deposed: Destruction complete after 0s
null_resource.server_bouncer_consul_cluster (local-exec): ../../../bouncerw: line 29: ./bouncer: cannot execute binary file: Exec format error
null_resource.server_bouncer_consul_cluster: Creation complete after 2s (ID: 578946282353217746)

[Feature Request] Canary Deploy Timeout: Reduce Desired Capacity To Original Value

Hey,

When using your tool in Canary mode and testing with new instances that fail the lifecycle hook and do not go into service in the ASG, bouncer just continues until it hits the default timeout (20 mins). Is there a way to make bouncer put the desired capacity back to its original value when it hits the timeout?

If this isn't possible could you add it as a feature request?

Cheers
Richard
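
The requested behavior could look roughly like the sketch below: remember the original desired capacity, scale up for the canary, and restore the original value if the health wait times out. Everything here is hypothetical (the `capacitySetter` interface, `canaryWithRollback`, the `recordingASG` fake, and the single-node scale-up step); it illustrates the rollback idea only and is not how bouncer is implemented.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// capacitySetter is a hypothetical stand-in for the call that updates an
// ASG's desired capacity; it is not bouncer's real interface.
type capacitySetter interface {
	SetDesiredCapacity(asg string, n int64) error
}

var errCanaryTimeout = errors.New("timed out waiting for canary to become healthy")

// canaryWithRollback raises desired capacity by one, polls healthy() until
// it returns true or the timeout elapses, and on timeout sets the desired
// capacity back to the original value before returning an error.
func canaryWithRollback(asg capacitySetter, name string, original int64,
	timeout, poll time.Duration, healthy func() bool) error {
	if err := asg.SetDesiredCapacity(name, original+1); err != nil {
		return fmt.Errorf("scaling up: %w", err)
	}
	deadline := time.Now().Add(timeout)
	for !healthy() {
		if time.Now().After(deadline) {
			// Timed out: restore the original value before reporting failure.
			if err := asg.SetDesiredCapacity(name, original); err != nil {
				return fmt.Errorf("rollback failed after timeout: %w", err)
			}
			return errCanaryTimeout
		}
		time.Sleep(poll)
	}
	return nil
}

// recordingASG is a fake used only to demonstrate the call sequence.
type recordingASG struct{ calls []int64 }

func (r *recordingASG) SetDesiredCapacity(_ string, n int64) error {
	r.calls = append(r.calls, n)
	return nil
}

func main() {
	asg := &recordingASG{}
	// The canary never becomes healthy, so the rollback path is taken.
	err := canaryWithRollback(asg, "example-asg", 2,
		50*time.Millisecond, 10*time.Millisecond, func() bool { return false })
	fmt.Println(err, asg.calls)
}
```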

Spot fleet support

Hi, I was wondering if there is any plan to add support for spot fleets (aws_spot_fleet_request)?

We use a mix of auto scaling groups and spot fleets and it would be great if you could cycle instances in a spot fleet in the same way you can currently with auto scaling groups using bouncer.

Thanks

[feature request] Add ability to check consul service health

Not all of our ASGs are behind ALBs, and the EC2 health check doesn't really show us much about the health of the application.

It would be neat to add an option to specify the Consul service of your application running on that ASG and use that to determine if Bouncer should kill the old instances or not.
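
As a rough sketch of what such a check might involve: Consul's `/v1/health/service/<name>` HTTP endpoint returns a JSON array with one entry per service instance, each carrying a `Node` object and a list of `Checks` with a `Status` field. The helper below (hypothetical name `passingNodes`) parses such a response body and keeps the nodes whose checks all report "passing". How bouncer would map Consul node names to EC2 instances is left open, and the node names in the sample are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// healthEntry holds only the fields used here from one element of the
// array returned by Consul's /v1/health/service/<name> endpoint.
type healthEntry struct {
	Node struct {
		Node string `json:"Node"`
	} `json:"Node"`
	Checks []struct {
		Status string `json:"Status"`
	} `json:"Checks"`
}

// passingNodes returns the names of nodes whose checks are all "passing".
// An entry with no checks at all is treated as unhealthy (a design choice
// made for this sketch).
func passingNodes(body []byte) ([]string, error) {
	var entries []healthEntry
	if err := json.Unmarshal(body, &entries); err != nil {
		return nil, fmt.Errorf("decoding health response: %w", err)
	}
	var nodes []string
	for _, e := range entries {
		ok := len(e.Checks) > 0
		for _, c := range e.Checks {
			if c.Status != "passing" {
				ok = false
				break
			}
		}
		if ok {
			nodes = append(nodes, e.Node.Node)
		}
	}
	return nodes, nil
}

func main() {
	// Sample body with made-up node names.
	sample := []byte(`[
		{"Node":{"Node":"node-new"},"Checks":[{"Status":"passing"}]},
		{"Node":{"Node":"node-old"},"Checks":[{"Status":"critical"}]}
	]`)
	nodes, err := passingNodes(sample)
	fmt.Println(nodes, err)
}
```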

Bouncer file not found issue

Details mentioned here: https://stackoverflow.com/questions/64820508/terraform-issue-in-gitlab

null_resource.server_canary_bouncer (local-exec): Executing: ["/bin/sh" "-c" "./bouncer canary -a 'my-asg':$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name 'my-asg' --query 'AutoScalingGroups[0].DesiredCapacity')"]
null_resource.server_canary_bouncer (local-exec): /bin/sh: ./bouncer: No such file or directory
Error: Error running command './bouncer canary -a 'my-asg':$(aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name 'my-asg' --query 'AutoScalingGroups[0].DesiredCapacity')': exit status 127. Output: /bin/sh: ./bouncer: No such file or directory
[terragrunt] 2020/11/12 12:16:31 Hit multiple errors:
exit status 1
Cleaning up file based variables
00:01
ERROR: Job failed: exit code 1

install docs

There's no clear documentation on how to install this tool.

ECS

More of a question, but I'm wondering if you've used bouncer with ECS clusters. If so, what's the recommended approach? If not, is there any interest in making bouncer ECS "aware"?

Getting errors when following the document to scale the ASGs

I'm new to bouncer. I'm following the README to bounce my ASGs but am getting errors doing so. I'm trying to bounce in serial mode:
./bouncer serial -a example-asg:2
INFO[0000] Beginning bouncer serial run
ERRO[0005] ASG desired capacity doesn't match expected starting value ASG=example-asg desired_capacity actual=1 desired_capacity given=2
FATA[0005] error validating initial ASG state

Can someone help me with the usage, and with scaling my ASG up and down?
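
Reading the error output above, the number after the colon appears to be treated as the expected starting desired capacity of the ASG rather than a target to scale to: the run stops because the ASG reports 1 while 2 was given. The sketch below reproduces that kind of check under that assumption; `asgSpec`, `parseASGSpec`, and `checkStartingCapacity` are hypothetical names, not bouncer's own.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// asgSpec is a hypothetical parsed form of an "-a name:count" argument.
type asgSpec struct {
	Name    string
	Desired int64
}

// parseASGSpec splits "example-asg:2" into its name and expected capacity.
func parseASGSpec(arg string) (asgSpec, error) {
	i := strings.LastIndex(arg, ":")
	if i <= 0 || i == len(arg)-1 {
		return asgSpec{}, fmt.Errorf("expected name:count, got %q", arg)
	}
	n, err := strconv.ParseInt(arg[i+1:], 10, 64)
	if err != nil {
		return asgSpec{}, fmt.Errorf("invalid count in %q: %w", arg, err)
	}
	return asgSpec{Name: arg[:i], Desired: n}, nil
}

// checkStartingCapacity mirrors the validation seen in the log: the count
// given on the command line must equal the ASG's current desired capacity.
func checkStartingCapacity(spec asgSpec, actual int64) error {
	if spec.Desired != actual {
		return fmt.Errorf("ASG %s: desired capacity actual=%d given=%d",
			spec.Name, actual, spec.Desired)
	}
	return nil
}

func main() {
	spec, err := parseASGSpec("example-asg:2")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(checkStartingCapacity(spec, 1)) // mismatch, as in the report
	fmt.Println(checkStartingCapacity(spec, 2)) // values agree
}
```

Under that reading, passing `example-asg:1` would match the ASG's current desired capacity in this report.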

Add support for AWS Launch Templates

Currently, bouncer only uses the LaunchConfigurationName to determine if an instance needs to be replaced.

This does not work for ASGs that use LaunchTemplates (aws-sdk ref):

func isInstanceOld(asgInst *autoscaling.Instance, ec2Inst *ec2.Instance, launchConfigName *string, force bool, startTime time.Time) bool {
	if asgInst.LaunchConfigurationName == nil {
		log.WithFields(log.Fields{
			"InstanceID": *asgInst.InstanceId,
		}).Debug("Instance marked as old because launch config is nil")
		return true
	}

	if *asgInst.LaunchConfigurationName != *launchConfigName {
		log.WithFields(log.Fields{
			"InstanceID":   *asgInst.InstanceId,
			"LaunchConfig": *asgInst.LaunchConfigurationName,
		}).Debug("Instance marked as old because of launch config")
		return true
	}

	// In force mode, mark any node that was launched before this runner was started as old
	if force {
		if startTime.After(*ec2Inst.LaunchTime) {
			log.WithFields(log.Fields{
				"InstanceID": *asgInst.InstanceId,
				"LaunchTime": *ec2Inst.LaunchTime,
			}).Debug("Instance marked as old because of launch time (force mode)")
			return true
		}
	}

	return false
}
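
One possible shape for a launch-template-aware comparison is sketched below. To stay self-contained it uses small local types (`launchTemplateSpec`, `asgInstance`, `asgConfig`) that only loosely mirror the SDK's structures, and `isInstanceOutdated` is a hypothetical name. It also assumes the ASG's template version has already been resolved to a concrete number; an ASG configured with a symbolic version such as `$Latest` would need that resolved before comparing.

```go
package main

import "fmt"

// launchTemplateSpec is a simplified, hypothetical stand-in for the SDK's
// launch template specification (ID plus a concrete version string).
type launchTemplateSpec struct {
	ID      string
	Version string
}

// asgInstance carries whichever launch source the instance was created from.
type asgInstance struct {
	InstanceID              string
	LaunchConfigurationName *string
	LaunchTemplate          *launchTemplateSpec
}

// asgConfig carries the launch source the ASG is currently configured with.
type asgConfig struct {
	LaunchConfigurationName *string
	LaunchTemplate          *launchTemplateSpec
}

// isInstanceOutdated reports whether the instance's launch source differs
// from what the ASG currently specifies, for either launch mechanism.
func isInstanceOutdated(inst asgInstance, want asgConfig) bool {
	switch {
	case want.LaunchTemplate != nil:
		return inst.LaunchTemplate == nil || *inst.LaunchTemplate != *want.LaunchTemplate
	case want.LaunchConfigurationName != nil:
		return inst.LaunchConfigurationName == nil ||
			*inst.LaunchConfigurationName != *want.LaunchConfigurationName
	default:
		// No launch source known for the ASG: err on the side of replacing.
		return true
	}
}

func main() {
	want := asgConfig{LaunchTemplate: &launchTemplateSpec{ID: "lt-example", Version: "3"}}
	stale := asgInstance{InstanceID: "i-old", LaunchTemplate: &launchTemplateSpec{ID: "lt-example", Version: "2"}}
	fresh := asgInstance{InstanceID: "i-new", LaunchTemplate: &launchTemplateSpec{ID: "lt-example", Version: "3"}}
	fmt.Println(isInstanceOutdated(stale, want), isInstanceOutdated(fresh, want))
}
```

The force-mode launch-time check from the function quoted above is independent of the launch source and could stay as it is.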

Molly guard ASGs against bouncer

We should have a mechanism to tag ASGs as not bouncer-compatible so we can't accidentally automatically ruin the world (e.g. in our environment oops i murdered GHE everyone wants to kill me.)
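
A tag-based guard could be as small as the sketch below. The tag key `bouncer:disabled` and the helpers `isProtected` and `splitByGuard` are hypothetical, chosen only to illustrate the idea; the tags are passed in as plain maps so the example does not depend on any AWS call.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// optOutTag is a hypothetical tag key chosen for illustration.
const optOutTag = "bouncer:disabled"

// isProtected reports whether an ASG's tags mark it as off-limits.
func isProtected(tags map[string]string) bool {
	return strings.EqualFold(tags[optOutTag], "true")
}

// splitByGuard partitions ASG names into those that may be bounced and
// those that must be refused, based on each ASG's tags.
func splitByGuard(asgTags map[string]map[string]string) (allowed, refused []string) {
	for name, tags := range asgTags {
		if isProtected(tags) {
			refused = append(refused, name)
		} else {
			allowed = append(allowed, name)
		}
	}
	// Sort for stable output, since map iteration order is random.
	sort.Strings(allowed)
	sort.Strings(refused)
	return allowed, refused
}

func main() {
	allowed, refused := splitByGuard(map[string]map[string]string{
		"web-asg": {"team": "platform"},
		"ghe-asg": {optOutTag: "true"},
	})
	fmt.Println("allowed:", allowed, "refused:", refused)
}
```

Refusing the whole run when any requested ASG is protected, rather than silently skipping it, would make an accidental invocation more visible.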

Better handling of AWS rate limiting

I've recently been seeing bouncer fail in our TF runs due to AWS rate limiting:

time="2019-07-16T17:07:20Z" level=fatal msg="error in run: error building ASGSet: Error getting information for ASG myservice: error getting AWS ASG object: Error describing ASGs: Throttling: Rate exceeded
	status code: 400, request id: 2c3a4f65-a7ec-11e9-a003-3ff6a1d0860b" 

While the AWS SDK includes some default retry logic, it seems that the number of retries or the backoff rate is not high enough. In addition to increasing that, it would be nice to make bouncer warn about but otherwise ignore throttling errors so that it isn't interrupted in the middle of operation. Since it is already polling based, skipping a single check due to an error shouldn't be a problem.

Alternately, maybe there is a way to reduce the number of API calls made during a run to reduce the chance of hitting rate limits.
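
The "warn but keep going" idea can be sketched as a small retry wrapper with exponential backoff. This is an illustration under assumptions, not bouncer's code: `pollWithBackoff` and `isThrottle` are hypothetical names, and throttling is detected by matching the "Throttling" text from the log above rather than by inspecting SDK error codes. The SDK's own retry settings are a separate knob that this sketch does not touch.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strings"
	"time"
)

// errThrottled imitates the error text seen in the log above.
var errThrottled = errors.New("Throttling: Rate exceeded")

// isThrottle uses a plain string match; a real implementation would more
// likely inspect the SDK's error code.
func isThrottle(err error) bool {
	return err != nil && strings.Contains(err.Error(), "Throttling")
}

// pollWithBackoff runs call up to attempts times. Throttling errors are
// logged as warnings and retried with a doubling delay; any other result
// (success or a different error) is returned immediately.
func pollWithBackoff(attempts int, base time.Duration, call func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	delay := base
	var err error
	for i := 1; i <= attempts; i++ {
		err = call()
		if !isThrottle(err) {
			return err
		}
		log.Printf("warning: throttled (attempt %d/%d)", i, attempts)
		if i < attempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("still throttled after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := pollWithBackoff(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errThrottled // first two calls are rate limited
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```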

darwin-arm64 support

Open request for packaging to support the darwin-arm64 architecture for M1 MacBooks.

Batch-canary not cleaning up extra nodes

I ran into a situation where my ASG got into this state:

level=info msg="Killing a batch of nodes" Extra nodes=1 Healthy nodes=8 Old nodes=0

It seems that bouncer got stuck in this state and never killed off the extra healthy new node and instead eventually continued to add new nodes to the ASG.

bouncer script not found.

Hi Team,
I'm trying to use bouncer, but something is missing here. Please let me know how to fix this.
version: 0.8.7

~/bouncer(master)$ ./bouncerw serial --help
Attempting to lock .bouncer_download_lock
Lock acquired
BOUNCER_VERSION is not set. Looking for the latest bouncer release...
Installing bouncer version 0.8.7
tar: bouncer: Cannot open: File exists
tar: Exiting with failure status due to previous errors
Releasing lock on .bouncer_download_lock
Lock released
./bouncerw: line 58: ./bouncer: Is a directory

Install latest version failed

Hi,
We are using bouncer for canary deployment.

We've updated from version 0.8.0 to 0.12.0 and are using the latest version of the bouncerw file.
There is a bug in line 39 with jq.
https://github.com/palantir/bouncer/blob/master/bouncerw#L39

Since pull request #116, jq is called without any arguments:
Lock acquired
BOUNCER_VERSION is not set. Looking for the latest bouncer release...
jq - commandline JSON processor [version 1.5-1-a5b5cbe]
Usage: jq [options] <jq filter> [file...]

jq is a tool for processing JSON inputs, applying the given filter to its JSON text inputs and producing the filter's results as JSON on standard output.
The simplest filter is ., which is the identity filter, copying jq's input to its output unmodified (except for formatting).
For more advanced filters see the jq(1) manpage ("man jq") and/or https://stedolan.github.io/jq

Some of the options include:
  -c  compact instead of pretty-printed output;
  -n  use null as the single input value;
  -e  set the exit status code based on the output;
  -s  read (slurp) all inputs into an array; apply filter to it;
  -r  output raw strings, not JSON texts;
  -R  read raw strings, not JSON texts;
  -C  colorize JSON;
  -M  monochrome (don't colorize JSON);
  -S  sort keys of objects on output;
  --tab  use tabs for indentation;
  --arg a v  set variable $a to value <v>;
  --argjson a v  set variable $a to JSON value <v>;
  --slurpfile a f  set variable $a to an array of JSON texts read from <f>;
See the manpage for more options.
Installing bouncer version
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
chmod: cannot access './bouncer': No such file or directory

As a workaround, we have set BOUNCER_VERSION to a fixed value.

Optional delay during canary deployment

Hi,
I love this util, just a request.

Is it possible to add a delay during canary deployment, please? E.g., when we select canary, it launches an additional EC2 replica, deploys the container, performs a health check, and then terminates the older EC2 instance. In this scenario, can we introduce a delay after the health check and before the old EC2 termination?

Thanks,
