Building production grade services for Cloud Foundry
Stay tuned!
Building production services
License: MIT License
Blacksmith needs some sort of background job (goroutine on a ticker?) that checks the BOSH director and marks service instances whose backing deployments have disappeared as such. Otherwise, we can get into a situation where there are no deployments on the director, but a service is "at the limit" w.r.t. number of service deployments in the Vault index.
We also need another scheduled task (goroutine on a ticker?) to run bosh-cleanup against the BOSH director. This should be configurable, in case someone wants to manage their BOSH director out-of-band, and doesn't want Blacksmith "cleaning up" things they were using.
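The reconciliation job described above could look roughly like the sketch below. The `Instance` type, `markMissing`, and `runEvery` are hypothetical names used for illustration; they are not taken from the Blacksmith code base, and fetching the deployment list from the director is left out.

```go
package main

import (
	"context"
	"time"
)

// Instance is a hypothetical record from the Vault index.
type Instance struct {
	ID         string
	Deployment string
	Missing    bool
}

// markMissing flags every instance whose deployment is absent from the
// deployment names reported by the BOSH director. It returns how many
// instances were newly flagged on this pass.
func markMissing(index []Instance, deployments []string) int {
	live := make(map[string]bool, len(deployments))
	for _, d := range deployments {
		live[d] = true
	}
	n := 0
	for i := range index {
		if !live[index[i].Deployment] && !index[i].Missing {
			index[i].Missing = true
			n++
		}
	}
	return n
}

// runEvery calls fn once per interval until ctx is cancelled.
func runEvery(ctx context.Context, interval time.Duration, fn func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			fn()
		}
	}
}
```

The same `runEvery` helper could drive the cleanup task, started only when an operator-controlled setting (assumed here, not an existing option) enables it.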
:enhancement:
It would be nice if the "Services" section was listed chronologically, by date provisioned, and if that field were visible in the table.
This means we need to track it, first.
It would be nice if the "Services" section could show org and space names, based on data in the CF CCDB.
This means we need to track it, first, and the operator will have to configure CF credentials for querying CCDB. This should be optional.
As a person writing / tuning a forge, it would be nice to know why a service initialization script failed (unbound variable? James doesn't know how to use safe?)
Getting the stdout / stderr of the init script in the logs would be nice.
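A minimal sketch of capturing both streams, assuming the init script is run as a child process; `runInit` is a hypothetical helper, not a function from Blacksmith.

```go
package main

import (
	"bytes"
	"os/exec"
)

// runInit runs the script at path with the given arguments and returns
// its stdout and stderr separately. err is non-nil when the script
// cannot be started or exits with a non-zero status.
func runInit(path string, args ...string) (stdout, stderr string, err error) {
	var outBuf, errBuf bytes.Buffer
	cmd := exec.Command(path, args...)
	cmd.Stdout = &outBuf
	cmd.Stderr = &errBuf
	err = cmd.Run()
	return outBuf.String(), errBuf.String(), err
}
```

Both strings can then be written to the broker log next to the error, so a forge author sees the failing line of the script rather than only its exit status.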
The SHIELD backups feature currently only supports named stores for tenants, while UUID-provided stores can be either global or belong to a tenant.
Example: if the store "S3" is a global store, it can only be specified in the store configuration of the shield feature by UUID; if the store "S3" belongs to a tenant, and is therefore not global, it can be specified in the store configuration by either UUID or name.
Add support for referencing global stores here to support both FindStore and FindGlobalStore.
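One way to support both lookups is to try them in order and return the first match. In the sketch below, `Store`, `lookupFunc`, and `resolveStore` are hypothetical; the function values stand in for calls such as FindStore and FindGlobalStore, whose real signatures are not shown in this issue.

```go
package main

import "fmt"

// Store is a hypothetical, trimmed-down view of a SHIELD store.
type Store struct {
	UUID   string
	Name   string
	Global bool
}

// lookupFunc stands in for a store lookup such as FindStore (already
// bound to a tenant) or FindGlobalStore. Real signatures may differ.
type lookupFunc func(query string) (*Store, error)

// resolveStore tries each lookup in order and returns the first store
// found. A lookup that errors is skipped so later lookups still run.
func resolveStore(query string, lookups ...lookupFunc) (*Store, error) {
	var lastErr error
	for _, find := range lookups {
		s, err := find(query)
		if err != nil {
			lastErr = err
			continue
		}
		if s != nil {
			return s, nil
		}
	}
	if lastErr != nil {
		return nil, fmt.Errorf("store %q not found: %w", query, lastErr)
	}
	return nil, fmt.Errorf("store %q not found", query)
}
```

Passing the tenant lookup first keeps today's behavior for tenant stores and only falls back to the global lookup when the tenant has no store by that name or UUID.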
While deploying a new service via blacksmith, the deployment completed fine, but the service status is stuck at create in progress.
Trying to debug shows the following:
2020-03-25 11:41:13.986 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] deployment started, BOSH task 5788039
2020-03-25 11:41:13.986 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] tracking service instance in the vault 'db' index
2020-03-25 11:41:13.989 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] updating service status in the vault
2020-03-25 11:41:13.990 DEBUG [vault track ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] tracking action 'provision', task 5788039
2020-03-25 11:41:13.992 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7 postgresql/postgresql-small] started provisioning
2020-03-25 11:41:17.379 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] last-operation check received; checking state of service deployment
2020-03-25 11:41:17.380 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] instance was last in '' state, BOSH task 0
2020-03-25 11:41:17.380 ERROR [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] invalid state '' found in the vault
2020-03-25 11:42:19.854 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] last-operation check received; checking state of service deployment
2020-03-25 11:42:19.856 DEBUG [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] instance was last in '' state, BOSH task 0
2020-03-25 11:42:19.856 ERROR [ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7] invalid state '' found in the vault
The tracked action seems to be correct in vault:
# /var/vcap/packages/safe/bin/safe get secret/ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7/task
--- # secret/ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7/task
action: provision
params: '{}'
task: "5788039"
The deployment:
$ bosh vms -d postgresql-small-ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7
Using environment 'https://10.0.1.6:25555' as user 'admin'
Task 5793461. Done
Deployment 'postgresql-small-ad0106c4-03b9-4717-9ae6-a90b3e1ecdb7'
Instance Process State AZ IPs VM CID VM Type Active
postgresql/631c443e-9056-4860-bb6d-735f34fa290f running z1 10.0.1.99 7e173ab2-0d2b-45f7-8ad1-9ab9297c82e3 default true
1 vms
Succeeded
We're using blacksmith-boshrelease v1.1.0 and the example above was done with postgresql-forge-boshrelease v0.3.1, but with redis-forge-boshrelease v0.4.2 it's the same result.
I can't find what could be wrong here, so any help or hints are appreciated.
As an operator of Blacksmith service brokers, I would like an authenticated (single-user HTTP Basic Auth is fine) web UI for performing the following admin-like tasks:
For troubleshooting, I would like to see the credentials for a service, from the management UI, much like we display the manifest.yml and task.logs.
Had an incident this morning where a CF user provisioned a service, the BOSH deployment failed, and Blacksmith told CF that it was good to go. The bind of course failed, because the credentials.yml merge couldn't grab the job IPs from the deployment metadata.
What should have happened: Blacksmith should have noticed that the provisioning failed, and let CF know that the service instance was bad. That way, operators could troubleshoot the correct problem.
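A sketch of the mapping this implies, from the state of the BOSH deploy task to the state reported on a last-operation poll. The three result strings are the ones defined by the Open Service Broker API; `lastOperationState` is a hypothetical name, and treating unknown task states as failures is an assumption made here.

```go
package main

// Last-operation states defined by the Open Service Broker API.
const (
	InProgress = "in progress"
	Succeeded  = "succeeded"
	Failed     = "failed"
)

// lastOperationState maps a BOSH task state to the state the broker
// should report. Only a task that finished cleanly counts as success.
func lastOperationState(boshTaskState string) string {
	switch boshTaskState {
	case "done":
		return Succeeded
	case "queued", "processing", "cancelling":
		return InProgress
	default:
		// "error", "cancelled", "timeout", and anything unrecognized
		// (including an empty state) are reported as failures.
		return Failed
	}
}
```

With a mapping like this, a failed deployment surfaces to CF as a failed service instance instead of failing later at bind time.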
Problem: In large environments, when opening up the UI to get a quick view of quota usage or service instances, the page can take a long time to load because it's trying to load and display logs. If I'm not there to look at logs, the wait isn't useful.
Either make the logs a clickable link for when you need to see them, or paginate the logs to display fewer logs and let the page load faster.
When blacksmith is installed in an environment whose director has UAA enabled, blacksmith will stop working once the UAA token it holds expires.
Something in the front-end marrying of limit quota data to plans breaks down, and the whole UI won't load. Try overriding just the params.postgresql_service_name key to something like 'foo', in a Genesis deploy, to reproduce.
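One possible fix is to cache the token together with its expiry and fetch a new one shortly before it lapses. The sketch below assumes a `fetch` callback that returns a token and its lifetime in seconds; `tokenCache` and its fields are hypothetical and do not reflect how Blacksmith's BOSH client is actually structured.

```go
package main

import (
	"sync"
	"time"
)

// tokenCache hands out a cached token and refreshes it when it is
// within leeway seconds of expiring.
type tokenCache struct {
	mu        sync.Mutex
	token     string
	expiresAt int64 // Unix seconds
	leeway    int64 // refresh this many seconds before expiry
	fetch     func() (token string, ttlSeconds int64, err error)
	now       func() int64 // injectable clock, Unix seconds
}

// newTokenCache builds a cache on the wall clock with a 30s leeway.
func newTokenCache(fetch func() (string, int64, error)) *tokenCache {
	return &tokenCache{
		fetch:  fetch,
		leeway: 30,
		now:    func() int64 { return time.Now().Unix() },
	}
}

// Token returns the cached token, calling fetch first if there is no
// token yet or the current one is about to expire.
func (c *tokenCache) Token() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.token != "" && c.now()+c.leeway < c.expiresAt {
		return c.token, nil
	}
	tok, ttl, err := c.fetch()
	if err != nil {
		return "", err
	}
	c.token = tok
	c.expiresAt = c.now() + ttl
	return tok, nil
}
```

Every request to the director would call `Token()` rather than reusing a token obtained at startup.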
Thanks @cwb124 for the heads-up on this one!
It looks like the current service broker has not implemented update yet: https://github.com/blacksmith-community/blacksmith/blob/master/broker.go#L343
It looks like the upgrade branch has at least part of an implementation: https://github.com/blacksmith-community/blacksmith/blob/upgrade/broker.go#L345
What I'm seeing...
$ boss -k -T update -f cfe4175b-88f0-4452-b411-b6897a7c2402
<snip>
=================================
PATCH /v2/service_instances/cfe4175b-88f0-4452-b411-b6897a7c2402 HTTP/1.1
Host: 192.168.124.103:443
User-Agent: Go-http-client/1.1
Content-Length: 24
Authorization: Basic YmxhY2tzbWl0aDpnckhDVWIyczJpUmppc3ZZSm1lWGNIdUJhVEZDYzdjb0lxdW54WktJaU5haXBFeXRmZnFMTFdVN0pzMHdtVWx6
X-Broker-Api-Version: 2.14
Accept-Encoding: gzip
{"service_id":"service"}
=================================
HTTP/1.1 500 Internal Server Error
Content-Length: 34
Connection: keep-alive
Content-Type: application/json
Date: Fri, 25 Feb 2022 20:08:25 GMT
Keep-Alive: timeout=20
Server: nginx/1.19.0
{"description":"not implemented"}
!!! API 500 Internal Server Error
Thanks!
When I try to run `cf delete-service`, I get the following error:
FAILED
Server error, status code: 502, error code: 10001, message: Service instance : Service broker error: http: no Location header in response
The deployment/VM is deleted on the blacksmith director, but cf thinks the service still exists and is essentially orphaned.
It can be resolved by using the -f flag on the end but let's be honest, that's exhausting.
Services that are not supported for backup scheduling should skip calls to CreateSchedule and DeleteSchedule entirely to avoid misleading logs. The current approach works, but the logs are not accurate, and a better approach should be implemented.
Example: if rabbitmq is enabled but redis isn't, provisioning a redis instance should skip the calls to the SHIELD client entirely.
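The guard could be as small as the sketch below. `scheduleFunc` and `withBackups` are hypothetical names; the function value stands in for a SHIELD client call such as CreateSchedule or DeleteSchedule, whose real signatures are not shown here.

```go
package main

// scheduleFunc stands in for a SHIELD client call such as
// CreateSchedule or DeleteSchedule.
type scheduleFunc func(instanceID string) error

// withBackups runs call only when the service type is enabled for
// backups. It reports whether the SHIELD call was made, so the caller
// can log "skipped" rather than a misleading success or failure.
func withBackups(enabled map[string]bool, service, instanceID string, call scheduleFunc) (bool, error) {
	if !enabled[service] {
		return false, nil
	}
	return true, call(instanceID)
}
```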
I put a url and sha1 in the forge manifest for a service deployment, and the deployment simply fails stating that the release does not exist. Apparently, this is handled specially via the CLI, not the director.
Update blacksmith to look at the releases top-level key in any manifest it deploys, and instruct the director to do a deploy.
Alternatively, blacksmith could just upload all the releases it can find when it boots (similar to stemcells), and the forges would just need to identify them specially.