
blacksmith's Introduction

Blacksmith

Building production grade services for Cloud Foundry

Stay tuned!

blacksmith's People

Contributors

alexanelli, christian-roggia, daviddob, dennisjbell, itsouvalas, jhunt, krutten, lnguyen, norman-abramovitz, pururval, thomasmitchell, wayneeseguin, xiujiao


blacksmith's Issues

BOSH Maintenance Tasks

Blacksmith needs some sort of background job (goroutine on a ticker?) that checks the BOSH director and marks service instances whose backing deployments have disappeared as such. Otherwise, we can get into a situation where there are no deployments on the director, but a service is "at the limit" with respect to the number of service deployments in the Vault index.

We also need another scheduled task (goroutine on a ticker?) to run bosh-cleanup against the BOSH director. This should be configurable, in case someone wants to manage their BOSH director out-of-band, and doesn't want Blacksmith "cleaning up" things they were using.
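A minimal sketch of the "goroutine on a ticker" idea, in Go. The names `missingDeployments` and `runEvery`, and the in-memory maps standing in for the Vault index and the director's deployment list, are assumptions made for illustration; this is not Blacksmith's actual code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// missingDeployments returns the IDs of tracked service instances whose
// deployment name is absent from the director's deployment list.
func missingDeployments(tracked map[string]string, onDirector []string) []string {
	present := make(map[string]bool, len(onDirector))
	for _, name := range onDirector {
		present[name] = true
	}
	var missing []string
	for instanceID, deployment := range tracked {
		if !present[deployment] {
			missing = append(missing, instanceID)
		}
	}
	return missing
}

// runEvery calls task once per interval until ctx is cancelled.
func runEvery(ctx context.Context, interval time.Duration, task func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			task()
		}
	}
}

func main() {
	// Hypothetical data standing in for the Vault index and the BOSH director.
	tracked := map[string]string{
		"instance-a": "redis-instance-a",
		"instance-b": "redis-instance-b",
	}
	onDirector := []string{"redis-instance-a"}

	// Stop the loop after a short time so the example terminates.
	ctx, cancel := context.WithTimeout(context.Background(), 350*time.Millisecond)
	defer cancel()

	runEvery(ctx, 100*time.Millisecond, func() {
		for _, id := range missingDeployments(tracked, onDirector) {
			fmt.Printf("instance %s has no backing deployment\n", id)
		}
	})
}
```

The same `runEvery` helper could drive the cleanup task, started only when a (hypothetical) configuration flag enables it, so operators who manage their director out-of-band can leave it off.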

Management UI Should Display Orgs / Spaces

It would be nice if the "Services" section could show org and space names, based on data in the CF CCDB.

This means we need to track it, first, and the operator will have to configure CF credentials for querying CCDB. This should be optional.

Show output of init scripts during service provision

As a person writing / tuning a forge, it would be nice to know why a service initialization script failed (unbound variable? James doesn't know how to use safe?)

Getting the stdout / stderr of the init script in the logs would be nice.
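One way this could look, sketched in Go with `os/exec`. How Blacksmith actually invokes init scripts is not shown here, so running the script through `sh -c` and the helper name `runInitScript` are assumptions.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runInitScript runs the script through sh and returns everything it wrote
// to stdout and stderr, along with any execution error.
func runInitScript(script string) (string, error) {
	cmd := exec.Command("sh", "-c", script)
	out, err := cmd.CombinedOutput()
	return string(out), err
}

func main() {
	// A deliberately failing script, for illustration only.
	out, err := runInitScript(`echo "fetching credentials"; echo "oops: unbound variable" >&2; exit 3`)
	if err != nil {
		fmt.Printf("init script failed: %s\n", err)
	}
	fmt.Printf("init script output:\n%s", out)
}
```

Logging the combined output next to the error gives the forge author the script's own message rather than just a failed exit status.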

Add support for global stores to shield backups feature

The SHIELD backups feature currently supports only named stores that belong to a tenant, whereas stores given by UUID can be either global or tenant-owned.

Example: if the store "S3" is global, it can be specified in the shield feature's store configuration only by UUID. If "S3" belongs to a tenant (and is therefore not global), it can be specified by either UUID or name.

Add support for referencing global stores here to support both FindStore and FindGlobalStore.
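A rough sketch of the fallback in Go. The `storeFinder` interface, the `Store` struct, and the simplified method signatures are hypothetical stand-ins; the real SHIELD client methods may take different arguments.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical, trimmed-down stand-in for a SHIELD store.
type Store struct {
	UUID   string
	Name   string
	Global bool
}

// storeFinder is a hypothetical interface; the real SHIELD client methods
// may take different arguments.
type storeFinder interface {
	FindStore(tenant, query string) (*Store, error)
	FindGlobalStore(query string) (*Store, error)
}

// resolveStore looks for a tenant store first and falls back to global stores.
func resolveStore(c storeFinder, tenant, query string) (*Store, error) {
	if s, err := c.FindStore(tenant, query); err == nil {
		return s, nil
	}
	s, err := c.FindGlobalStore(query)
	if err != nil {
		return nil, fmt.Errorf("store %q not found for tenant %q or globally: %w", query, tenant, err)
	}
	return s, nil
}

// fakeClient is an in-memory storeFinder used only for this illustration.
type fakeClient struct {
	tenantStores map[string]*Store
	globalStores map[string]*Store
}

var errNotFound = errors.New("no matching store")

func (f fakeClient) FindStore(tenant, query string) (*Store, error) {
	if s, ok := f.tenantStores[tenant+"/"+query]; ok {
		return s, nil
	}
	return nil, errNotFound
}

func (f fakeClient) FindGlobalStore(query string) (*Store, error) {
	if s, ok := f.globalStores[query]; ok {
		return s, nil
	}
	return nil, errNotFound
}

func main() {
	c := fakeClient{
		tenantStores: map[string]*Store{},
		globalStores: map[string]*Store{"S3": {UUID: "hypothetical-uuid", Name: "S3", Global: true}},
	}
	s, err := resolveStore(c, "tenant1", "S3")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("resolved %s (global=%t)\n", s.Name, s.Global)
}
```

Trying the tenant store first keeps existing configurations working unchanged; only lookups that would previously have failed reach the global search.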

Service creation fails due to "instance was last in '' state, BOSH task 0"

While deploying a new service via blacksmith, the BOSH deployment completed successfully, but the service status stalls at "create in progress".

Trying to debug shows the following:

2020-03-25 11:41:13.986 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] deployment started, BOSH task 5788039
2020-03-25 11:41:13.986 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] tracking service instance in the vault 'db' index
2020-03-25 11:41:13.989 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] updating service status in the vault
2020-03-25 11:41:13.990 DEBUG  [vault track ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] tracking action 'provision', task 5788039
2020-03-25 11:41:13.992 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] started provisioning
2020-03-25 11:41:17.379 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] last-operation check received; checking state of service deployment
2020-03-25 11:41:17.380 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] instance was last in '' state, BOSH task 0
2020-03-25 11:41:17.380 ERROR  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] invalid state '' found in the vault
2020-03-25 11:42:19.854 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] last-operation check received; checking state of service deployment
2020-03-25 11:42:19.856 DEBUG  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] instance was last in '' state, BOSH task 0
2020-03-25 11:42:19.856 ERROR  [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] invalid state '' found in the vault

The tracked action seems to be correct in vault:

# /var/vcap/packages/safe/bin/safe get secret/ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7/task
--- # secret/ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7/task
action: provision
params: '{}'
task: "5788039"

The deployment:

$ bosh vms -d postgresql-small-ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7
Using environment 'https://10.0.1.6:25555' as user 'admin'

Task 5793461. Done

Deployment 'postgresql-small-ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7'

Instance                                         Process State  AZ  IPs        VM CID                                VM Type  Active
postgresql/631c443e-9056-4860-bb6d-735f34fa290f  running        z1  10.0.1.99  7e173ab2-0d2b-45f7-8ad1-9ab9297c82e3  default  true

1 vms

Succeeded

We're using blacksmith-boshrelease v1.1.0 and the example above was done with postgresql-forge-boshrelease v0.3.1, but with redis-forge-boshrelease v0.4.2 it's the same result.

I can't find what could be wrong here, so any help or hints are appreciated.

Simple Management UI

As an operator of Blacksmith service brokers, I would like an authenticated (single-user HTTP Basic Auth is fine) web UI for performing the following admin-like tasks:

  • Seeing what services have been deployed
  • Retrieving manifests for those service deployments
  • Retrieving task output from those service deployments
  • Reviewing errors that the blacksmith daemon has encountered (currently only available in debug log)
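The single-user HTTP Basic Auth part of this request could be sketched in Go as a small middleware. The handler contents, realm name, and credentials below are placeholders chosen for illustration, not Blacksmith's configuration.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
)

// credentialsMatch compares the supplied and expected credentials using
// constant-time comparisons of the byte contents.
func credentialsMatch(gotUser, gotPass, wantUser, wantPass string) bool {
	userOK := subtle.ConstantTimeCompare([]byte(gotUser), []byte(wantUser)) == 1
	passOK := subtle.ConstantTimeCompare([]byte(gotPass), []byte(wantPass)) == 1
	return userOK && passOK
}

// basicAuth wraps next so that it is served only to requests carrying the
// expected Basic Auth credentials.
func basicAuth(user, pass string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		if !ok || !credentialsMatch(u, p, user, pass) {
			w.Header().Set("WWW-Authenticate", `Basic realm="blacksmith"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Placeholder UI handler; a real one would list services, manifests, etc.
	ui := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "deployed services would be listed here")
	})
	handler := basicAuth("operator", "example-password", ui)
	_ = handler // e.g. pass to http.ListenAndServe(":8080", handler)

	fmt.Println(credentialsMatch("operator", "example-password", "operator", "example-password"))
}
```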

Blacksmith not reporting failed deployment provisions to CF

Had an incident this morning where a CF user provisioned a service, the BOSH deployment failed, and Blacksmith told CF that it was good to go. The bind of course failed, because the credentials.yml merge couldn't grab the job IPs from the deployment metadata.

What should have happened: Blacksmith should have noticed that the provisioning failed, and let CF know that the service instance was bad. That way, operators could troubleshoot the correct problem.
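A sketch of the mapping this implies, in Go: translate the BOSH task state into the state string reported back to CF on a last-operation check. The function name `lastOperationState` is hypothetical, and treating unrecognized states as failures is a design choice made for this example rather than documented Blacksmith behavior.

```go
package main

import "fmt"

// lastOperationState maps a BOSH task state to the state a service broker
// reports for a last-operation check.
func lastOperationState(boshTaskState string) string {
	switch boshTaskState {
	case "done":
		return "succeeded"
	case "queued", "processing", "cancelling":
		return "in progress"
	default:
		// "error", "cancelled", "timeout", and anything unrecognized
		// (including an empty state) are surfaced as failures.
		return "failed"
	}
}

func main() {
	for _, s := range []string{"processing", "done", "error", ""} {
		fmt.Printf("BOSH task state %q -> %q\n", s, lastOperationState(s))
	}
}
```

Reporting "failed" for anything unexpected means a broken deployment shows up at provision time instead of later, when the bind fails.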

Make the log readout 'opt-in' or paginate it or something

Problem: In large environments, when opening the UI to get a quick view of quota usage or service instances, the page can take a long time to load because it's trying to load and display logs. If I'm not there to look at logs, the wait isn't useful.

Either make the logs a clickable link for when you need to see them, or paginate the logs to display fewer logs and let the page load faster.
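The pagination option could be sketched in Go as a helper that returns one page of log lines, newest page first. The name `logPage` and the zero-based page numbering are assumptions made for this example.

```go
package main

import "fmt"

// logPage returns one page of log lines. Page 0 holds the newest perPage
// lines, page 1 the perPage lines before those, and so on. Lines within a
// page stay in their original (oldest to newest) order.
func logPage(lines []string, page, perPage int) []string {
	if page < 0 || perPage <= 0 {
		return nil
	}
	end := len(lines) - page*perPage
	if end <= 0 {
		return nil
	}
	start := end - perPage
	if start < 0 {
		start = 0
	}
	return lines[start:end]
}

func main() {
	logs := []string{"line 1", "line 2", "line 3", "line 4", "line 5"}
	fmt.Println(logPage(logs, 0, 2)) // newest two lines
	fmt.Println(logPage(logs, 2, 2)) // oldest remaining line
}
```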

Blacksmith UI cannot handle service IDs != service names

Something in the front-end marrying of limit quota data to plans breaks down, and the whole UI won't load. Try overriding just the params.postgresql_service_name key to something like 'foo', in a Genesis deploy, to reproduce.

Thanks @cwb124 for the heads-up on this one!

Service update support?

It looks like the current service broker has not implemented update yet: https://github.com/blacksmith-community/blacksmith/blob/master/broker.go#L343

It looks like the upgrade branch has at least part of an implementation: https://github.com/blacksmith-community/blacksmith/blob/upgrade/broker.go#L345

What I'm seeing...

$ boss -k -T update -f cfe4175b-88f0-4452-b411-b6897a7c2402
<snip>
=================================
PATCH /v2/service_instances/cfe4175b-88f0-4452-b411-b6897a7c2402 HTTP/1.1
Host: 192.168.124.103:443
User-Agent: Go-http-client/1.1
Content-Length: 24
Authorization: Basic YmxhY2tzbWl0aDpnckhDVWIyczJpUmppc3ZZSm1lWGNIdUJhVEZDYzdjb0lxdW54WktJaU5haXBFeXRmZnFMTFdVN0pzMHdtVWx6
X-Broker-Api-Version: 2.14
Accept-Encoding: gzip

{"service_id":"service"}

=================================
HTTP/1.1 500 Internal Server Error
Content-Length: 34
Connection: keep-alive
Content-Type: application/json
Date: Fri, 25 Feb 2022 20:08:25 GMT
Keep-Alive: timeout=20
Server: nginx/1.19.0

{"description":"not implemented"}


!!! API 500 Internal Server Error

Thanks!

When deleting a service via CF cli that was created via blacksmith broker, deletion of service fails

When I try to run `cf delete-service`, I get the following error:

FAILED
Server error, status code: 502, error code: 10001, message: Service instance : Service broker error: http: no Location header in response

The deployment/VM is deleted on the blacksmith director, but cf thinks the service still exists and is essentially orphaned.

It can be resolved by using the -f flag on the end but let's be honest, that's exhausting.

Skip scheduling and descheduling of backups for non-supported services

Services that are not enabled for backup scheduling should skip the calls to CreateSchedule and DeleteSchedule entirely, to avoid misleading log entries. The current approach works, but the logs it produces are inaccurate, so a cleaner approach should be implemented.

Example: if rabbitmq is enabled but redis isn't, provisioning a redis instance should skip the calls to the SHIELD client entirely.
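A minimal Go sketch of the guard. The `backupScheduler` interface and its one-argument methods are hypothetical simplifications of the real CreateSchedule and DeleteSchedule calls, and the `enabled` map stands in for whatever configuration lists the backup-enabled services.

```go
package main

import "fmt"

// backupScheduler is a hypothetical interface; the real SHIELD client
// methods may take different arguments.
type backupScheduler interface {
	CreateSchedule(instanceID string) error
	DeleteSchedule(instanceID string) error
}

// countingScheduler is a fake that records how often it was called.
type countingScheduler struct{ creates, deletes int }

func (c *countingScheduler) CreateSchedule(string) error { c.creates++; return nil }
func (c *countingScheduler) DeleteSchedule(string) error { c.deletes++; return nil }

// scheduleIfEnabled calls CreateSchedule only for services with backups
// enabled and reports whether a call was made.
func scheduleIfEnabled(s backupScheduler, enabled map[string]bool, service, instanceID string) (bool, error) {
	if !enabled[service] {
		return false, nil
	}
	return true, s.CreateSchedule(instanceID)
}

// descheduleIfEnabled is the matching guard for DeleteSchedule.
func descheduleIfEnabled(s backupScheduler, enabled map[string]bool, service, instanceID string) (bool, error) {
	if !enabled[service] {
		return false, nil
	}
	return true, s.DeleteSchedule(instanceID)
}

func main() {
	enabled := map[string]bool{"rabbitmq": true}
	s := &countingScheduler{}
	for _, svc := range []string{"rabbitmq", "redis"} {
		called, err := scheduleIfEnabled(s, enabled, svc, "instance-for-"+svc)
		fmt.Printf("%s: scheduled=%t err=%v\n", svc, called, err)
	}
	fmt.Println("CreateSchedule calls:", s.creates)
}
```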

Blacksmith Doesn't Upload Manifest Releases

I put a url and sha1 in the forge manifest for a service deployment, and the deployment simply fails stating that the release does not exist. Apparently, this is handled specially via the CLI, not the director.

Update blacksmith to look at the releases top-level key in any manifest it deploys, and instruct the director to do a deploy.

Alternatively, blacksmith could just upload all the releases it can find when it boots (similar to stemcells), and the forges would just need to identify them specially.
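Either approach needs a step that works out which releases still have to be uploaded. A sketch in Go, assuming the manifest's releases key has already been parsed into a slice; the `Release` struct, the `releasesToUpload` helper, and the "name/version" lookup key are illustrative choices, and the example URL and checksum are placeholders.

```go
package main

import "fmt"

// Release mirrors the fields of one entry under a manifest's top-level
// releases key.
type Release struct {
	Name    string
	Version string
	URL     string
	SHA1    string
}

// releasesToUpload returns the manifest releases that carry a URL and are
// not already present on the director. onDirector is keyed by "name/version".
func releasesToUpload(manifest []Release, onDirector map[string]bool) []Release {
	var pending []Release
	for _, r := range manifest {
		if r.URL == "" {
			continue // nothing to fetch; the release must already exist
		}
		if onDirector[r.Name+"/"+r.Version] {
			continue // already uploaded
		}
		pending = append(pending, r)
	}
	return pending
}

func main() {
	manifest := []Release{
		{Name: "redis-forge", Version: "0.4.2", URL: "https://example.com/redis-forge-0.4.2.tgz", SHA1: "placeholder"},
		{Name: "bpm", Version: "1.0.0"},
	}
	onDirector := map[string]bool{"bpm/1.0.0": true}
	for _, r := range releasesToUpload(manifest, onDirector) {
		fmt.Printf("would upload %s/%s from %s\n", r.Name, r.Version, r.URL)
	}
}
```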
