hackoregon / team-budget
Repo for the Budget team's backend code
License: MIT License
Create data endpoint for the Bureau totals for a Service Area
Filtered by
Bureau | Fiscal Year | Category | Amount
Emergency Comm. | 2014-15 | Personal | $140,034,432
Emergency Comm. | 2014-15 | IMS | $153,234,432
Emergency Comm. | 2014-15 | EMS | $134,344,432
Emergency Mgmt. | 2014-15 | Personal | $34,024,432
Emergency Mgmt. | 2014-15 | IMS | $43,444,334
Emergency Mgmt. | 2014-15 | EMS | $85,343,324
PPB | 2014-15 | Personal | $123,343,234
PPB | 2014-15 | IMS | $163,344,324
PPB | 2014-15 | EMS | $173,432,342
I haven't been very closely involved with the actual budget DB we are extracting our data from, but every now and then I'll compare my local api responses with the production ones and I see a large difference in the number of rows in certain tables.
Just review the number of records in each table in the production DB to ensure it's accurate. Must happen before demo-day.
Brian Grant and I have been working all this week to find a way to stabilize the Docker containers in AWS.
Problem is, the containers deploy and start fine, but the app inside them doesn't respond quickly enough for the AWS "Health Check" to consider them healthy containers, so they never get scheduled in rotation by the load balancers. Instead, AWS automatically starts new containers from the same deployed image and replaces the "unhealthy" containers. However, since the image is the same every time, AWS can never get "healthy" containers.
End result: Budget is running on ancient code, and can't be updated in AWS until the code in the container image stabilizes.
Current hypothesis (advanced by Dan, and which fits all the behaviours I've observed to date) is this: looking at the code in views.py, it appears the app is loading all unfiltered models at startup, which means it's loading entire database tables at startup. And since some of the tables are outsized, this slows the startup until it exceeds all reasonable timeouts for determining that the contents of the container are healthy enough to be put into rotation.
According to the second article below, "Django load all applications and their 'models.py' at startup and executes its code."
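If that hypothesis holds, the fix is to keep the data loading out of module scope. The sketch below needs no Django at all; it just illustrates the mechanics: work at module scope runs at import time (i.e. container startup), while work inside a function runs only per request. `expensive_fetch` is a hypothetical stand-in for forcing a full-table queryset, e.g. `list(SomeModel.objects.all())`.

```python
# Illustrative sketch only (no Django required): why module-level data
# loading delays startup. expensive_fetch() stands in for forcing a
# full-table queryset such as list(SomeModel.objects.all()).

def expensive_fetch():
    # Imagine a multi-million-row table scan here.
    return [row for row in range(5)]

# Anti-pattern: this executes during "import views", i.e. during container
# startup, before the AWS health check can ever see a responsive app.
EAGER_ROWS = expensive_fetch()

def ocrb_view():
    # Preferred: defer the fetch until a request arrives. Django querysets
    # are lazy by default unless forced at module scope with list(), len(),
    # or iteration.
    return expensive_fetch()
```

Moving every forced queryset out of module scope (or leaving querysets lazy) should let the container answer the health check before any data is touched.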
Dan fired a number of resources my way, and while I'd love to use them to resolve the problem, this isn't a great use of my limited Python skills. This could use some app dev expertise. Can any of you folks help out?
Here are the links:
https://docs.djangoproject.com/en/1.10/topics/performance/
http://kartowicz.com/dryobates/2014-10/queries-run-django-projects-startup/
https://gun.io/blog/fast-as-fuck-django-part-1-using-a-profiler/
We have at least two developers on the Budget team who are using Windows machines for local development work.
When @mxmoss (using Windows 10 + Docker Toolbox for Windows) tried to run docker-compose up from the current "dockerize" branch of this project, he encountered this error:
moss@DESKTOP-9LE20OE MINGW64 /c/develop/python/team-budget/budget_proj (dockerize)
$ docker-compose up
Starting budgetproj_web_1
ERROR: for web Cannot start service web: oci runtime error: container_linux.go:247: starting container process caused "exec: \"/code/docker-entrypoint.sh\": stat /code/docker-entrypoint.sh: no such file or directory"
ERROR: Encountered errors while bringing up the project.
moss@DESKTOP-9LE20OE MINGW64 /c/develop/python/team-budget/budget_proj (dockerize)
Further testing on my part with other Docker containers on Windows 10 + Docker Toolbox for Windows produced similar errors (due to the same problem: the script can't be found inside the container).
These same Docker containers build and run successfully on Linux, macOS, and macOS + Docker Toolbox for Mac configurations.
Modify History Table, Functional Area Name column length
Functional Area Name in history table - the field isn't long enough for the data.
Needs to be greater than 48 characters
Consider: Review lengths of other columns?
(Aaron had a tool to analyze the CSV files for data lengths)
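A minimal sketch of the change, assuming a Django model backs the history table (the model and field names here are guesses, not the repo's actual code):

```python
# Hypothetical models.py change: widen the column well past 48 characters.
class History(models.Model):
    functional_area_name = models.CharField(max_length=100)  # was too short

# Then generate and apply the schema migration:
#   python3 manage.py makemigrations
#   python3 manage.py migrate
```

If other columns turn out to be too short as well (per the CSV-length analysis), they can be widened in the same migration.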
As Jim pointed out in his comments on PR #86, the /ocrb and /history endpoints are behaving poorly.
"When accessing the services on AWS, I get a 502 Gateway Error
for these two URLs:
# Returns 502
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/ocrb/
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/history/
but I get 200 OK and data when I supply query parameters. For example:
# Returns 200 OK
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/ocrb/?fy=2015-16
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/history/?fiscal_year=2015-16&bureau_code=PS
I am wondering if there is too much data for the bare endpoints /ocrb and /history. That could definitely be a problem for /history, which is a fairly large data set. In contrast, I do not see any problems with bare /kpm:
# No problems with these URLs:
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/kpm/
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/code/
I've since observed 504 errors as well (Gateway Timeout), but haven't inferred a pattern.
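One mitigation worth considering (an assumption on my part, not a diagnosis): enable DRF's pagination so the bare endpoints never try to serialize the whole table in a single response. This is standard Django REST Framework configuration in settings.py:

```python
# settings.py - cap the payload of unfiltered list endpoints so a bare
# /history request returns a page, not the entire table.
REST_FRAMEWORK = {
    'DEFAULT_PAGINATION_CLASS':
        'rest_framework.pagination.PageNumberPagination',
    'PAGE_SIZE': 100,  # arbitrary starting point; tune per endpoint
}
```

With this in place, clients would page through /history via ?page=2, ?page=3, and so on.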
As of the "secrets" branch, we are now enabling Budget team developers to configure their local instance of the Django app to use the AWS EC2-hosted PostgreSQL instance as the data layer.
The new configuration (from the secrets branch) has no impact on the code in views.py (e.g. find_ocrb_data(), find_kpm_data()) that pulls in data through the local CSV files. Thus work needs to be done to switch the API code so that each endpoint (e.g. /kpm/, /ocrb/ and /summary/) pulls its data from whichever PostgreSQL instance (AWS or local) is currently configured in the running app.
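As a sketch of the target shape - the class, model, and serializer names here are assumptions, not the repo's actual code - each endpoint would read through the ORM, so whichever database settings.py currently points at (AWS or local) is used transparently:

```python
# Hypothetical DRF view: because the queryset goes through the ORM, the
# endpoint automatically uses whichever PostgreSQL instance is configured.
from rest_framework import generics

class OcrbList(generics.ListAPIView):
    queryset = OcrbData.objects.all()      # assumed model name
    serializer_class = OcrbSerializer      # assumed serializer name
```

The CSV-reading helpers in views.py would then be retired once each endpoint is switched over.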
To generate the CYOA UI for users, the back end will take in the following as data inputs:
This means we'll need tabular data (probably rendered out to JSON) that captures this information. Most or all of this data should be available from the Budget in Brief document.
Django assumes our webserver (nginx?) will serve the static files in production, which is why our Swagger views are all messed up.
The simplest fix, as far as I can tell, is to add and enable dj-static:
https://github.com/kennethreitz/dj-static
Following the directions in the readme should be sufficient.
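Per the dj-static readme, the change amounts to wrapping the WSGI application in Cling so Django serves its own static files (Swagger's CSS/JS included). A sketch of what our wsgi.py would look like:

```python
# wsgi.py - per the dj-static README: wrap the WSGI application in Cling
# so static files are served by the app itself rather than a webserver.
from django.core.wsgi import get_wsgi_application
from dj_static import Cling

application = Cling(get_wsgi_application())
```

dj-static would also need to be added to requirements.txt, and STATIC_ROOT set in settings.py.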
Current automation has not yet populated the docker-push.sh script being called in the after_success section of the .travis.yml file.
Currently experiencing ImportError: cannot import name 'project_config' in Travis build:
https://travis-ci.org/hackoregon/team-budget/builds/210124740
When I run the same ./budget_proj/bin/start-proj.sh script on my local computer as the script called by .travis.yml, I get the same error.
Running docker exec -it [container_id] /bin/bash, I can see that the project_config.sh script exists in /code/budget_proj/, which can only occur if the aws s3 cp command completes successfully.
So the question is, exactly which process is running that causes this final output in the Travis log:
File "/code/budget_proj/settings.py", line 15, in <module>
from . import project_config
ImportError: cannot import name 'project_config'
Is it docker-compose -f budget_proj/docker-compose.yml up --build from start-proj.sh? Or python3 manage.py migrate in docker-entrypoint.sh?
Does the import need to be qualified with budget_proj. for python(3) to find the project_config? Do we have to change from . to some other reference?

Example line:
f = '../Data/Budget_in_Brief_OCRB_data_All_Years.csv'
Should be using sys.path and the file path to find the CSV, not just a string.
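One way to do this (a sketch, not the repo's code): resolve the CSV relative to the module's own location instead of the process working directory, so the path works no matter where the app is launched from.

```python
import os

def resolve_csv(module_dir):
    """Build a path to the CSV relative to the code's own location rather
    than the process working directory."""
    return os.path.normpath(
        os.path.join(module_dir, '..', 'Data',
                     'Budget_in_Brief_OCRB_data_All_Years.csv'))

# In the app itself one would call:
#   resolve_csv(os.path.dirname(os.path.abspath(__file__)))
```

This keeps the '../Data/...' layout from the example line but anchors it to the source file instead of the shell's cwd.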
As @jimtyhurst asked in PR47, we have so far arbitrarily assigned a version of Python to use in the Django app.
Currently the Dockerfile specifies FROM python:3.5, whereas the current version of Python is 3.6.0.
Is there any reason not to use Python 3.6?
Online budget data goes back to Brass Reports for FY 2005-06
Create data endpoint for the total budget by Service Area
Filter by Fiscal Year
Result: See example (below). Amount in dollars
Concern: How to create a summary endpoint?
Example results (amounts not accurate):
Service Area | Fiscal Year | Category | Amount
Pub. Safety | 2014-15 | Personal | $140,034,432
Pub. Safety | 2014-15 | IMS | $153,234,432
Pub. Safety | 2014-15 | EMS | $134,344,432
Parks | 2014-15 | Personal | $34,024,432
Parks | 2014-15 | IMS | $43,444,334
Parks | 2014-15 | EMS | $85,343,324
Pub. Util | 2014-15 | Personal | $123,343,234
Pub. Util | 2014-15 | IMS | $163,344,324
Pub. Util | 2014-15 | EMS | $173,432,342
As @jimtyhurst asked in PR47, do we need to run migrate and import every time the Django app starts?
Jim says, "Each time the image starts, it runs the migrations and imports the data as specified in docker-entrypoint.sh, right? I don't think it should do those things. In general, the application should be configured to access an existing database, so why should we run the migrations and import the data every time the web app starts?"
As to the former question, from what I see in PgAdmin at the moment, the tables appear to contain only the number of rows of data that we have in the source CSVs. So at least for the moment, this gives us room to figure out the more deterministic approach.
As to the latter, I can imagine at least three scenarios we might have to deal with:
as a Budget team developer, I want to work from a local PostgreSQL installation (a) to reduce the lag time for each query and (b) to protect other developers from unfinished and unvetted changes I'm experimenting with in any API development work I do.
as a Budget team developer, I want to use an automated script to perform the creation of the Production version of the Budget database.
as a Budget team developer, I want to use an automated script to perform database additions (migrations/schema additions and data additions/imports).
as a Budget team developer, I want to use an automated script to perform necessary data changes such as migrations/schema alterations (not additions but changes to existing schema objects) and data transformations (changes to existing data).
Examine the detailed naming and footnotes for each Service Area's Key Performance Measures.
It is possible that the naming of one or more KPMs, or the text of the accompanying footnote, changes from year to year.
If that is true, we cannot use a single column for "Key Performance Measure" and "Footnote" in a single table that captures the data over multiple years.
Use the data from Budget in Brief FY 2016-17 and FY 2015-16 as a baseline to validate if the text data for these fields is exactly the same from year to year.
Work is proceeding on the "second half" of the Budget in Brief documents - all the data representing the budget breakdown by Service Area.
What are we going to do with the "first half" of these documents?
I recommend we take some proposals to our contact(s) in the City Budget Office, alongside a demo of the Service Area representations in whatever state they're at.
Options include:
As a consumer of the API, I want to be able to retrieve budget data from multiple years, so that I can present it to people who want to review the budget.
As developers have been learning Django and developing iteratively, the API now has redundant endpoints that demonstrate different styles of implementation. For example, /ocrb, /ocrb-prod, and /summary all return budget data from the "Operating and Capital Requirements by Bureau" (OCRB) tables in the "Budget in Brief" brochures.
/ocrb was developed first, as a quick prototype that reads data directly from CSV files. It served that purpose well and allowed developers to demonstrate an API during the very first week that coding started. Unfortunately, it works around the Django framework's Model, which makes it difficult to implement filtering and sorting, because it is not using framework classes for the list of objects.
/summary was developed next, taking advantage of the importcsv.py script, which allowed us to load the CSV files into a relational database. Initially, we used the embedded sqlite3 database for development, but the code works unchanged with the current PostgreSQL database running independently on an AWS EC2 instance. /summary was written during the second week; it enhanced the API to allow query parameters to filter the data, and it returns the data sorted in a standard order. It allows case-insensitive matching of query parameter values to field values, but requires case-sensitive matching of query parameter names to field names.
/ocrb-prod was developed the next week, taking advantage of the data which by this time was deployed in a PostgreSQL database on an AWS EC2 instance. This implementation uses an alternative approach for handling query parameters, which makes for more concise code and can handle future query parameters without code changes. It requires exact case-sensitive matching of query parameter values to field values and of query parameter names to field names.
There are two endpoints for Key Performance Measure (KPM) data with parallel implementations to the corresponding OCRB endpoints:
/kpm reads data from a CSV file.
/kpm-prod reads data from the database configured in project_config.py.
It is time to reconcile those different implementations, refactor, and present a simple API for accessing budget data with just one endpoint for OCRB data and one endpoint for Key Performance Measures (KPM).
It might just be as simple as choosing the latest implementation, but there are a few factors to consider, such as:
Based on our limited testing of Linux developer machines in Issue #81, it appears that at present, Linux hosts cannot successfully build the docker container, so we'll go with Python 3.5 for now.
If it becomes necessary or desirable to use Python 3.6 or later, we'll have to re-investigate the ability of Linux hosts to build the docker container locally.
So just ran into this scenario:
We want to pass query params to a view that do not map to a model field. An example of this would be ?format=json, which is one way DRF can decide to render a JSON response rather than HTML for the explorable API. With our current query filtering we get an error like this:
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/ocrb/?format=json
when it should just return a JSON response, like the KPM endpoint does (no filtering implemented):
http://hacko-integration-658279555.us-west-2.elb.amazonaws.com/budget/kpm/?format=json
Possible solutions:
Number 2 will have to cover:
I vote for Number 1, as I see it as the better solution. That said, Number 2 sounds like more fun.
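Whichever option wins, the core of a whitelist-style fix can be tiny: build the ORM filter kwargs only from query parameters that name actual model fields, so DRF-reserved parameters like format pass through untouched. A sketch (the field names are assumptions):

```python
# Only forward query parameters that name real model fields; anything else
# (e.g. ?format=json, which DRF handles itself) is ignored by the filter.
MODEL_FIELDS = {'fiscal_year', 'service_area_code', 'bureau_code'}  # assumed

def build_filter_kwargs(query_params):
    return {key: value for key, value in query_params.items()
            if key in MODEL_FIELDS}
```

In a real view, MODEL_FIELDS could be derived from the model's _meta.get_fields() rather than hard-coded.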
Goal: enable Travis to test the APIs running in the Docker container, e.g. using the python manage.py test pattern.
Approach selected: having run into a dead-end enabling the "import . from project_config" approach in issue #58, let's pursue the alternative approach raised at the Budget Team meeting on 2017-03-13: "source env.sh".
In this implementation, rather than download the project_config.py file and import it in settings.py, we'll download an env.sh file in which all settings are exported as environment variables that can be accessed by settings.py (and other bits of code).
This also shares implementation with the proposal by backend developers to allow them to switch database layers from AWS to any local database from their developer computers.
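On the Python side, the change amounts to reading the exported variables in settings.py. A sketch, using os.environ with fallbacks (the variable names here are illustrative, not the team's actual names):

```python
import os

# settings.py - read values that "source env.sh" exported; variable names
# are illustrative placeholders.
DATABASE_HOST = os.environ.get('BUDGET_DB_HOST', 'localhost')
DATABASE_NAME = os.environ.get('BUDGET_DB_NAME', 'budget')
DATABASE_PASSWORD = os.environ.get('BUDGET_DB_PASSWORD', '')
```

This is also what makes the AWS-vs-local database switch trivial: a developer just exports different values before starting the app.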
At the 2017-02-06 Budget team meeting, the group converged on the first User Story to start building as a team.
It takes the form of this story map:
One significant question to answer later: which revenue sources do not "scale linearly"? Which revenue sources are sent to specific {Service Areas, Bureaus}?
For the initial version of this Story, we decided to assume that all proportions are uniform (e.g. for every $$ contributed in taxes, the % allocated to e.g. Parks and Recreation would be calculated as a straight % of overall city spending).
Note: this story could have three variants: one for Property Owners, one for Businesses and one for Renters.
Start building out a Django/DRF API that will return the initially-JSON'ized data that Megan is working on.
This will allow us to iterate quickly on what the front end team members will need to be able to render the CYOA UI, and will also allow the devops team members to get the automation ball rolling for the API layer.
We need to reconcile the number of records in the CSV input files with the records in the staging database.
At the moment, our settings.py file is tightly coupled with our deployment details. If you don't know some of these secret things, you can't run manage.py test or manage.py runserver locally.
To fix this, we'll implement multiple settings.py files, one per required environment. We'll update manage.py to default to the development settings unless the DJANGO_SETTINGS_MODULE environment variable is set.
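The manage.py side of this is the standard Django idiom: setdefault only applies when the variable is unset, so any explicitly exported DJANGO_SETTINGS_MODULE wins. (The module path below is an assumption about how we'll name the per-environment settings.)

```python
import os

# manage.py - fall back to the development settings unless the caller has
# already set DJANGO_SETTINGS_MODULE; the module path is an assumption.
os.environ.setdefault('DJANGO_SETTINGS_MODULE',
                      'budget_proj.settings.development')
```

Travis and the AWS deploy would then export e.g. budget_proj.settings.production before invoking manage.py.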
The team has now built working API endpoints that emit data when requested.
Let's generate a Docker container around these APIs so they can be run on a local developer box. Once the "local build" is working, we can quickly move to the "and migrate the container to AWS ECS" to demonstrate a Budget API working in the cloud.
We will enable developers and the staging/integration environment to connect to the AWS EC2-hosted PostgreSQL instance.
To ensure that database secrets are never published to GitHub or Docker Hub, we need to separate these secrets out of the settings.py file. We will implement the model recommended by the DevOps squad, already implemented in the backend service pattern.
User Story: As a customer who is also involved in cable access production, I want to know how much $$ goes to the Mt Hood Cable Regulatory Commission. I guess this should be shown per year, so I can see the funding trend.
Steps to get the data:
Here is a portion of the result...
{ "code_type": "division_code", "code": "CBMH", "description": "Mt Hood Cable Regulatory Commission" },
http://127.0.0.1:8000/history/?division_code=CBMH&fiscal_year=2011-12
Get a bunch of data elements similar to this:
{ "fund_center_code": "CBMH000001", "fund_code": "GENERAL", "functional_area_code": "CDCC00", "object_code": "EMS", "fund_center_name": "Mt. Hood Cable Regulatory Commission", "fund_name": "General Fund", "functional_area_name": "Cable Communications", "accounting_object_name": "External Materials and Services", "service_area_code": "CD", "program_code": "CDCC", "sub_program_code": "CDCC00", "fund_center": "CBMH000001", "division_code": "CBMH", "bureau_code": "CB", "bureau_name": "Office for Community Technology", "fiscal_year": "2011-12", "amount": 0 },
{ "fund_center_code": "CBMH000001", "fund_code": "SPEC_REV", "functional_area_code": "CDCC00", "object_code": "IMS", "fund_center_name": "Mt. Hood Cable Regulatory Commission", "fund_name": "Special Revenue", "functional_area_name": "Cable Communications", "accounting_object_name": "Internal Materials and Services", "service_area_code": "CD", "program_code": "CDCC", "sub_program_code": "CDCC00", "fund_center": "CBMH000001", "division_code": "CBMH", "bureau_code": "CB", "bureau_name": "Office for Community Technology", "fiscal_year": "2011-12", "amount": 99187 },
Questions/ Comments:
This summary would return:
In PR #20 the "NA" values in the Amount field of the KPM data were dropped.
Question for the City Budget Office: are "NA" KPM values meaningfully distinct from a blank KPM value?
If the City Budget Office deems it equivalent in the representation we develop in the future, it's fine to leave that data out of the KPM endpoint data. If they deem it a meaningful distinction, we will need to find a way to represent that data somehow to those consuming the KPM endpoint.
I performed a walkthrough of the new README in advance of tomorrow's Hackathon, to make sure I know what others will be using and to see if there are any issues I could clear up.
I discovered one issue that I don't know how to immediately solve, so rather than delay the solution I'm posting my finding in hopes that others might know how to solve it.
I'm following this version of the README and got through to Step 4 of "setting up your development environment". When I run the ./budget_proj/manage.py makemigrations command, I receive this error in return:
./budget_proj/manage.py makemigrations
Traceback (most recent call last):
File "./budget_proj/manage.py", line 22, in <module>
execute_from_command_line(sys.argv)
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/core/management/__init__.py", line 367, in execute_from_command_line
utility.execute()
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/core/management/__init__.py", line 359, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/core/management/__init__.py", line 208, in fetch_command
klass = load_command_class(app_name, subcommand)
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/core/management/__init__.py", line 40, in load_command_class
module = import_module('%s.management.commands.%s' % (app_name, name))
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 986, in _gcd_import
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 665, in exec_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/core/management/commands/makemigrations.py", line 11, in <module>
from django.db.migrations.autodetector import MigrationAutodetector
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/migrations/autodetector.py", line 13, in <module>
from django.db.migrations.questioner import MigrationQuestioner
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/migrations/questioner.py", line 12, in <module>
from .loader import MigrationLoader
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/migrations/loader.py", line 10, in <module>
from django.db.migrations.recorder import MigrationRecorder
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/migrations/recorder.py", line 12, in <module>
class MigrationRecorder(object):
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/migrations/recorder.py", line 26, in MigrationRecorder
class Migration(models.Model):
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/migrations/recorder.py", line 27, in Migration
app = models.CharField(max_length=255)
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/models/fields/__init__.py", line 1043, in __init__
super(CharField, self).__init__(*args, **kwargs)
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/db/models/fields/__init__.py", line 166, in __init__
self.db_tablespace = db_tablespace or settings.DEFAULT_INDEX_TABLESPACE
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/conf/__init__.py", line 53, in __getattr__
self._setup(name)
File "/Users/mike/code/~hackoregon/team-budget/budget_venv/lib/python3.5/site-packages/django/conf/__init__.py", line 39, in _setup
% (desc, ENVIRONMENT_VARIABLE))
django.core.exceptions.ImproperlyConfigured: Requested setting DEFAULT_INDEX_TABLESPACE, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.
Is there a missing step where we'd run source bin/env.sh or similar to populate DJANGO_SETTINGS_MODULE? Or is this an irreproducible artifact of my ever-shifting dev environment?
PR #64 trivially tests the /ocrb endpoint.
Similar testing should be added for the remaining endpoints.
I talked to Megan at the end of the 2017-02-06 Budget team meeting, and we tried to think of some small “spike” or experiment that the database team members could perform. The best idea we thought of that would help move our knowledge forward fastest is to continue down the road Ron pursued today, by trying to find the BRASS snapshot location of any cell of data from any table or bullet in the Budget In Brief document.
If we get any hits, that would start to untangle the relationship between BRASS tables and the Budget In Brief. If not, it ups the urgency on meeting with Shannon to build up some SQL queries to generate that data.
Let's assume for the moment that we enable the default Django endpoints (by running the python3 manage.py migrate command during the Docker build, which enables, for example, the /admin endpoint).
Assuming these are not desired or secure to run in cloud (Integration/Production), it would be good to have an actual strategy for mitigating their exposure.
Ideally disabling them entirely in Django would be good. If that can't be done (and so far Django's docs aren't helping), then blocking access from the Internet is next best - either by putting a routing filter in place in the container (gunicorn?), or some kind of container policy if possible, or EC2 security policy that blocks requests from even getting into the container.
Generate a skeleton wireframe for 2017-02-13 Budget team meeting
Jay McGrath swapped in pytest for django's test runner a month ago on the Housing backend, and his reasoning was to increase the flexibility in storing the test scripts - django test runner expects all tests in [app_dir]/tests.py, whereas pytest can handle test files stored anywhere in the project.
We don't have a big swath of tests to manage so far, but if we do then this looks like a scalable solution to managing a big collection of tests.
If it turns out this is worth pursuing, or if others wish to look into it, here's the core docs:
http://pytest-django.readthedocs.io/en/latest/
Please write documentation so that other people can recreate what you have accomplished. For example: moving SQL server image to PostgreSQL. Write down the steps you take. Document what you are doing including mistakes and crazy error messages. This is a learning process. We learn from each other.
I'm dealing with a bunch of fallout of trying to get Docker + Travis + the Budget Django app's pathing to work together for a successful build. See DevOps Issue 34 for details.
One idea I'm coming to is that once I figure out all the path dependencies that result from moving the Docker files into saner locations, I'd like to remove the hard-coded path dependencies that I'm building in to get this sucker to build again.
So far I've hard-coded /budget_proj/bin/ into the following files:
My half-baked idea is to insert some kind of PATH="$PATH:/budget_proj/bin"; export PATH command somewhere that it will do the most good (i.e. for all commands running in the Travis container).
Perhaps something like this will be needed to reduce the hard-coded pathing that is creeping into the Docker container runtime commands as well.
Create data endpoint for showing budget data change over time
Want to show the change in budget totals by service area for each fiscal year
Result is in dollars per service area per fiscal year
Consider that this data will be used in a Sankey diagram or stream graph
We don't have revenue data. Can we get that data?
Not required for MVP
We have scraped the tabular data for Service Areas from the past two years' Budget in Brief documents - you see that in the Data folder in this repo (https://github.com/hackoregon/team-budget/tree/master/Data).
Next step is to scrape the text data on those Service Area pages to enable us to emit it in an API endpoint, so that it can be rendered inline alongside the OCRB and KPM data. The data includes:
Any tool that works will do. The tool used to scrape the tabular data was Tabula; it's unknown at the moment whether Tabula would work for text data, or whether a simple cut-and-paste would be good enough.
Question for City Budget Office contacts: must the SIMP bullets be displayed every time in the same order as they are presented in the Budget in Brief PDF documents?
Remove all ~300 tables from AWS EC2-hosted PostgreSQL that were previously imported from the BRASS database export.
This will enable us to re-use the PostgreSQL instance for the Budget-in-Brief data and the data export we received from City Budget Office the week of 2017-03-06.
Currently when running the budget_proj/app without a migrate step, the following message shows up after runserver:
python3 budget_proj/manage.py runserver
Performing system checks...
System check identified no issues (0 silenced).
You have 13 unapplied migration(s). Your project may not work properly until you apply the migrations for app(s): admin, auth, contenttypes, sessions.
Run 'python manage.py migrate' to apply them.
February 19, 2017 - 21:41:45
Django version 1.10.5, using settings 'budget_proj.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Not Found: /
The message of note is:
You have 13 unapplied migrations...
I see various projects do or do not include a step like this before runserver - don't know if that's required here:
python3 budget_proj/manage.py migrate
PR #64 trivially tests the /ocrb endpoint.
Issue #65 can replicate that testing for the other endpoints.
DevOps class Assignment 5 and 6 presented some simple yet effective automated endpoint testing. The Housing backend team have been implementing some more sophisticated automated testing of their endpoints.
Our automation pipeline should ensure that we automatically deploy updated code to the staging/integration environment only when the endpoints are still responding with valid data. [It remains to be decided whether and under what more stringent conditions to automatically deploy code to the not-yet-available production environment.]
For example:
- If the /ocrb endpoint is expected to respond with ~200 JSON records for the FY 2015-16 fiscal year, and all of a sudden the response drops to a single record (e.g. some kind of error message), that should be considered a failed build.
- If the /ocrb endpoint is expected to respond with numeric values for the "amount" field and instead it is sending alphanumeric data, that should be considered a failed build (e.g. maybe some columns in the model got mixed up).
A small handful of automated tests to ensure that (a) JSON is being emitted (not some 500 error), (b) a reasonable number of records are being emitted ("reasonable" varying by endpoint, obviously), and (c) the data is structured the way it is intended sounds to me like a minimum acceptable set of tests to consider the build "still emitting valid data responses".
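A minimal, standard-library sketch of such a check (the thresholds and field name mirror the /ocrb example above, but the exact numbers are assumptions):

```python
import json

def response_is_sane(body, min_records=150, max_records=400):
    """True when the body parses as a JSON list, the record count is
    plausible, and every record's 'amount' field is numeric.
    Thresholds are illustrative, not measured."""
    try:
        records = json.loads(body)
    except ValueError:
        return False  # not JSON at all, e.g. an HTML 500/502 page
    if not isinstance(records, list):
        return False
    if not (min_records <= len(records) <= max_records):
        return False  # suspicious record count, e.g. a lone error message
    return all(isinstance(rec.get('amount'), (int, float))
               for rec in records)
```

The CI step would fetch each endpoint, run a check like this, and fail the build on a False.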
We can progressively add more tests as the endpoint logic gets more sophisticated, and as we encounter any issues with code deployed to the staging/integration environment.
At this stage of development, and excluding rare moments when we will perform "pre-Demo Day" demos outside of the Hack Oregon audience, I see no reason to otherwise prevent new commits-to-master from being automatically deployed to our staging/integration environment in AWS.
There is active discussion on the team about whether we even need the power of a PostgreSQL database for the "Service Area Budgets" card of our MVP (assuming this is the most likely application for us to launch at Demo Day).
At the moment we:
Questions are:
This is a reasonable question to ask. The trade-off of not using the standard database may be worth the benefits we gain.
Load the current CSV of OCRB data into the Budget team's AWS EC2-hosted PostgreSQL instance.