This repository contains documentation for developers including:
- Writing Scrapers using Pupa
- Open Civic Data's Data Type Specifications
- Open Civic Data Proposals
Read these docs at https://open-civic-data.readthedocs.io/en/latest/
Django app powering the Open Civic Data API
License: Other
A membership may have contact details. However, there is no API endpoint that publishes these contact details.
For example, the datetime for organizations is saved in a different format than the one for bills.
E.g.,
2014-08-14T12:00:28.274
vs
2015-04-09T00:24:06.485123+00:00
If this is a serialization issue, the output should be standardized to the same pattern that the OCD API accepts in the query string.
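One way to standardize the two formats above, as a minimal sketch rather than what imago actually does: parse each value and re-emit it as UTC with microsecond precision. The function name `to_utc_iso` is hypothetical, and treating a timestamp without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone


def to_utc_iso(value):
    """Re-emit an ISO 8601 timestamp as UTC with microsecond precision."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps stored without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat(timespec="microseconds")


print(to_utc_iso("2014-08-14T12:00:28.274"))
print(to_utc_iso("2015-04-09T00:24:06.485123+00:00"))
```

With this, both the organization-style and the bill-style values come out in the same `YYYY-MM-DDTHH:MM:SS.ffffff+00:00` shape.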
Go through the code and tighten it up.
I've done a few starts and stops. Clean out unneeded cruft.
Stuff like ?name=Obama, where it did a name__icontains. There was some typing type stuff. Let's bring that back.
Infra is mostly in place with adjust_filters (sorta). This is just syntax sugar for a full name__icontains
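A rough sketch of the sugar described above: a bare query parameter is rewritten to the full lookup before filtering. The `SHORTHAND_LOOKUPS` table and `expand_shorthand_filters` are hypothetical names for illustration, not the existing adjust_filters code.

```python
# Hypothetical table of bare parameter names and the lookups they stand for.
SHORTHAND_LOOKUPS = {"name": "name__icontains"}


def expand_shorthand_filters(params, shorthand=SHORTHAND_LOOKUPS):
    """Rewrite shorthand keys to full lookups; leave every other key alone."""
    return {shorthand.get(key, key): value for key, value in params.items()}


print(expand_shorthand_filters({"name": "Obama", "page": "2"}))
```

Keys that are already full lookups (for example name__icontains itself) pass through unchanged, so the shorthand stays purely additive.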
We should probably have the vote listings show less information
http://ocd.datamade.us/ocd-vote/2d1ad08c-83af-4795-991b-239d7f027821/
{
"voter_name": "Balcer, James",
"voter":
{
"image": "",
"sort_name": "",
"created_at": "2015-12-25T16:26:25.575",
"id": "ocd-person/b0d09c1d-4b9c-445f-bee5-848f0aac04b8",
"gender": "",
"locked_fields": "[]",
"national_identity": "",
"family_name": "",
"given_name": "",
"summary": "",
"birth_date": "",
"death_date": "",
"extras": "{}",
"name": "Balcer, James",
"biography": "",
"updated_at": "2015-12-25T16:26:25.575"
},
"option": "yes",
"note": ""
},
At the least, we should remove "locked_fields". I think we should cut even further back. Maybe only include the ocd_id.
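To make the "only include the ocd_id" option concrete, here is a small sketch that trims a vote entry shaped like the one above. The function `slim_vote` and the output key `voter_id` are assumptions for illustration, not an agreed format.

```python
def slim_vote(vote):
    """Keep the vote's own fields and replace the nested voter with its id."""
    voter = vote.get("voter") or {}
    return {
        "voter_name": vote.get("voter_name"),
        "voter_id": voter.get("id"),  # hypothetical key name
        "option": vote.get("option"),
        "note": vote.get("note"),
    }


example = {
    "voter_name": "Balcer, James",
    "voter": {
        "id": "ocd-person/b0d09c1d-4b9c-445f-bee5-848f0aac04b8",
        "locked_fields": "[]",
        "name": "Balcer, James",
    },
    "option": "yes",
    "note": "",
}
print(slim_vote(example))
```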
I'm often interested in finding bills that have a current status, i.e. stuck in a particular committee.
Right now I can search for legislation that has ever been referred.
http://10.42.2.102/bills/?actions__description=Referred
Is there a way to limit this query to just the legislation where the last action was a referral?
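Until the API supports this, a client-side workaround could look like the sketch below. It assumes each bill in the response carries an "actions" list whose entries have "date" and "description" keys, and that the dates sort correctly as strings (ISO format); the function name is hypothetical.

```python
def bills_with_last_action(bills, needle):
    """Return the bills whose most recent action mentions `needle`."""
    matches = []
    for bill in bills:
        actions = bill.get("actions") or []
        if not actions:
            continue
        # Assumption: ISO-formatted dates, so string order is date order.
        last = max(actions, key=lambda action: action["date"])
        if needle.lower() in last["description"].lower():
            matches.append(bill)
    return matches


bills = [
    {"id": "a", "actions": [
        {"date": "2015-01-01", "description": "Introduced"},
        {"date": "2015-02-01", "description": "Referred to committee"},
    ]},
    {"id": "b", "actions": [
        {"date": "2015-01-01", "description": "Referred to committee"},
        {"date": "2015-03-01", "description": "Passed"},
    ]},
]
print([bill["id"] for bill in bills_with_last_action(bills, "Referred")])
```

Bill "b" was referred at one point but has since passed, so only "a" is returned.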
for new contributors
so that people can figure out how to use the API
should be doable in OCD land, wasn't before
We were only adding related bills to agenda items when it was a "votable" event, assuming that informational and presentation agenda items didn't have related bills.
2016-03-14T05:39:45.732774+00:00 heroku[router]: at=info method=GET path="/ocd-event/f00dbad3-1537-4877-95af-7698492340e4" host=toronto-ocd-api.herokuapp.com request_id=eac1fb43-a602-449d-906a-1910b40c99cf fwd="174.117.101.98" dyno=web.1 connect=0ms service=3ms status=301 bytes=248
2016-03-14T05:39:45.730297+00:00 app[web.1]: WARNING Not Found: /ocd-event/f00dbad3-1537-4877-95af-7698492340e4
2016-03-14T05:39:45.832606+00:00 app[web.1]: ERROR Internal Server Error: /ocd-event/f00dbad3-1537-4877-95af-7698492340e4/
2016-03-14T05:39:45.832617+00:00 app[web.1]: Traceback (most recent call last):
2016-03-14T05:39:45.832618+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/core/handlers/base.py", line 149, in get_response
2016-03-14T05:39:45.832622+00:00 app[web.1]: response = wrapped_callback(request, *callback_args, **callback_kwargs)
2016-03-14T05:39:45.832619+00:00 app[web.1]: response = self.process_exception_by_middleware(e, request)
2016-03-14T05:39:45.832620+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/core/handlers/base.py", line 147, in get_response
2016-03-14T05:39:45.832632+00:00 app[web.1]: return self.dispatch(request, *args, **kwargs)
2016-03-14T05:39:45.832632+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/views/generic/base.py", line 68, in view
2016-03-14T05:39:45.832634+00:00 app[web.1]: return bound_func(*args, **kwargs)
2016-03-14T05:39:45.832635+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
2016-03-14T05:39:45.832633+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/utils/decorators.py", line 67, in _wrapper
2016-03-14T05:39:45.832635+00:00 app[web.1]: return view_func(*args, **kwargs)
2016-03-14T05:39:45.832636+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/utils/decorators.py", line 63, in bound_func
2016-03-14T05:39:45.832637+00:00 app[web.1]: return func.__get__(self, type(self))(*args2, **kwargs2)
2016-03-14T05:39:45.832638+00:00 app[web.1]: response = super(Endpoint, self).dispatch(request, *args, **kwargs)
2016-03-14T05:39:45.832637+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/views.py", line 106, in dispatch
2016-03-14T05:39:45.832639+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/django/views/generic/base.py", line 88, in dispatch
2016-03-14T05:39:45.832640+00:00 app[web.1]: return fn(self, request, *args, **kwargs)
2016-03-14T05:39:45.832639+00:00 app[web.1]: return handler(request, *args, **kwargs)
2016-03-14T05:39:45.832640+00:00 app[web.1]: File "/app/.heroku/src/imago-master/imago/helpers.py", line 137, in _
2016-03-14T05:39:45.832641+00:00 app[web.1]: return fn(self, request, *args, **kwargs)
2016-03-14T05:39:45.832641+00:00 app[web.1]: File "/app/.heroku/src/imago-master/imago/helpers.py", line 122, in _
2016-03-14T05:39:45.832642+00:00 app[web.1]: File "/app/.heroku/src/imago-master/imago/helpers.py", line 391, in get
2016-03-14T05:39:45.832642+00:00 app[web.1]: serialized = serialize(obj, **config)
2016-03-14T05:39:45.832643+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 205, in serialize
2016-03-14T05:39:45.832643+00:00 app[web.1]: exclude=exclude, fixup=fixup)
2016-03-14T05:39:45.832644+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 109, in serialize_model
2016-03-14T05:39:45.832644+00:00 app[web.1]: data[k] = serialize(getattr(obj, k), **v)
2016-03-14T05:39:45.832645+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 193, in serialize
2016-03-14T05:39:45.832645+00:00 app[web.1]: return [subs(i) for i in src.all()]
2016-03-14T05:39:45.832645+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 193, in <listcomp>
2016-03-14T05:39:45.832646+00:00 app[web.1]: return [subs(i) for i in src.all()]
2016-03-14T05:39:45.832646+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 190, in subs
2016-03-14T05:39:45.832647+00:00 app[web.1]: exclude=exclude, fixup=fixup)
2016-03-14T05:39:45.832647+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 205, in serialize
2016-03-14T05:39:45.832648+00:00 app[web.1]: exclude=exclude, fixup=fixup)
2016-03-14T05:39:45.832648+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 109, in serialize_model
2016-03-14T05:39:45.832648+00:00 app[web.1]: data[k] = serialize(getattr(obj, k), **v)
2016-03-14T05:39:45.832649+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 193, in serialize
2016-03-14T05:39:45.832649+00:00 app[web.1]: return [subs(i) for i in src.all()]
2016-03-14T05:39:45.832650+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 193, in <listcomp>
2016-03-14T05:39:45.832650+00:00 app[web.1]: return [subs(i) for i in src.all()]
2016-03-14T05:39:45.832650+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 190, in subs
2016-03-14T05:39:45.832651+00:00 app[web.1]: exclude=exclude, fixup=fixup)
2016-03-14T05:39:45.832651+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 205, in serialize
2016-03-14T05:39:45.832651+00:00 app[web.1]: exclude=exclude, fixup=fixup)
2016-03-14T05:39:45.832652+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/restless/models.py", line 109, in serialize_model
2016-03-14T05:39:45.832652+00:00 app[web.1]: data[k] = serialize(getattr(obj, k), **v)
2016-03-14T05:39:45.832652+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.4/site-packages/opencivicdata/models/event.py", line 121, in entity_name
2016-03-14T05:39:45.832671+00:00 app[web.1]: AttributeError: 'Bill' object has no attribute 'name'
2016-03-14T05:39:45.832667+00:00 app[web.1]: return self.bill.name
2016-03-14T05:39:45.891105+00:00 heroku[router]: at=info method=GET path="/ocd-event/f00dbad3-1537-4877-95af-7698492340e4/" host=toronto-ocd-api.herokuapp.com request_id=6edf762e-6ce7-4ada-84fd-3245d7e4d6ed fwd="174.117.101.98" dyno=web.1 connect=0ms service=92ms status=500 bytes=210
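The traceback ends in entity_name reading a name attribute that Bill does not have. One possible guard, shown here as a standalone sketch and not as the actual opencivicdata fix, is to fall back to other attributes. The helper `display_name` is hypothetical; the fallback order assumes bills expose identifier and title, as in the OCD bill spec.

```python
def display_name(obj):
    """Return the first non-empty of name, identifier, title; else ''."""
    for attr in ("name", "identifier", "title"):
        value = getattr(obj, attr, None)
        if value:
            return value
    return ""


class FakeBill:
    """Stand-in for a bill object: it has no `name` attribute."""
    identifier = "HB 1"
    title = "An example bill"


print(display_name(FakeBill()))
```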
As of now, it's impossible to sort search endpoints by created_at and similar datetimes. This may have something to do with #49.
The documentation for organization search says that the API will search for substrings. Doesn't seem to be happening.
These implementations are currently broken. If they're determined to be necessary, they need to be re-implemented. Otherwise, they should be removed from the docs.
Rather than dump whole objects, doing ?fields=foo should return a default view of foo rather than all the fields.
Attempting to get this event: http://api.opencivicdata.org/ocd-event/005454ab-ddcd-4d95-bd22-1e121f0a73a1/ returns a 500 Error
The event is listed under http://api.opencivicdata.org/events/?jurisdiction_id=ocd-jurisdiction/country:us/state:il/place:chicago/government
curl -X "GET" "https://api.opencivicdata.org/bills/?sponsors.id=ocd-person%2F3b5daa7c-f127-472f-9bde-39d80f18e907&fields=id%2Ctitle"
will return an HTTP 500.
should be possible now
(let it go less than default)
From old-imago
upstream as many chunks of helpers as we can, pretty neat stuff in there.
As of now, the boundaries endpoint is the only endpoint that uses offset + limit pagination. All the others use page.
For consistency's sake and ease of usage, I think it'd be best to change the paging on the boundaries endpoint to page, if possible.
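The two schemes map onto each other directly, so the change could be a thin translation layer. A minimal sketch, assuming pages are 1-indexed and a default page size of 100 (both assumptions; `page_to_offset_limit` is a hypothetical helper):

```python
def page_to_offset_limit(page, per_page=100):
    """Translate a 1-indexed page number into an (offset, limit) pair."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return (page - 1) * per_page, per_page


print(page_to_offset_limit(1))
print(page_to_offset_limit(3, per_page=50))
```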
rather than an ugly exception
We know when an event is scheduled for, and we know whether it was confirmed. Imago, not pupa, should determine whether the event has passed.
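A sketch of what computing this at read time might look like. It assumes timezone-aware start times and the status strings "confirmed" and "passed"; `effective_status` is a hypothetical helper, not existing imago code.

```python
from datetime import datetime, timezone


def effective_status(status, start_time, now=None):
    """Report a confirmed event as "passed" once its start time is behind us."""
    now = now or datetime.now(timezone.utc)
    if status == "confirmed" and start_time < now:
        return "passed"
    return status


now = datetime(2016, 3, 14, tzinfo=timezone.utc)
earlier = datetime(2016, 3, 1, tzinfo=timezone.utc)
later = datetime(2016, 4, 1, tzinfo=timezone.utc)
print(effective_status("confirmed", earlier, now=now))
print(effective_status("confirmed", later, now=now))
```

Other statuses are returned untouched, so a cancelled event stays cancelled regardless of its date.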
On my local machine:
// 20150521104713
// http://10.42.2.102/ocd-person/076e5dae-ca9a-460a-bcae-7ff925b620d8/
{
"traceback": "Traceback (most recent call last):\n File \"/projects/api.opencivicdata.org/src/imago/imago/helpers.py\", line 94, in get_fields\n ret = {x: fwrap(root[x]) for x in concrete}\n File \"/projects/api.opencivicdata.org/src/imago/imago/helpers.py\", line 94, in <dictcomp>\n ret = {x: fwrap(root[x]) for x in concrete}\nKeyError: 'given_name'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/projects/api.opencivicdata.org/virt/lib/python3.4/site-packages/restless/views.py\", line 106, in dispatch\n response = super(Endpoint, self).dispatch(request, *args, **kwargs)\n File \"/projects/api.opencivicdata.org/virt/lib/python3.4/site-packages/django/views/generic/base.py\", line 87, in dispatch\n return handler(request, *args, **kwargs)\n File \"/projects/api.opencivicdata.org/src/imago/imago/helpers.py\", line 137, in _\n return fn(self, request, *args, **kwargs)\n File \"/projects/api.opencivicdata.org/src/imago/imago/helpers.py\", line 122, in _\n return fn(self, request, *args, **kwargs)\n File \"/projects/api.opencivicdata.org/src/imago/imago/helpers.py\", line 381, in get\n related, config = get_fields(self.serialize_config, fields=fields)\n File \"/projects/api.opencivicdata.org/src/imago/imago/helpers.py\", line 96, in get_fields\n raise FieldKeyError(*e.args)\nimago.helpers.FieldKeyError: <FieldKeyError: given_name>\n",
"error": "<FieldKeyError: given_name>"
}
needs to be re-added.
Ensure that specs are totally covered.
I'm guessing we should prefetch some stuff rather than select for each.
We've got the select keys always, so this should be doable.
500: curl -X "GET" "https://api.opencivicdata.org/bills/?fields=id%2Ctitle%2Clegislative_session_id"
(yes I'm setting X-APIKEY)
legislative_session_id is in the default fields on the BillDetail view.
I did a first pass, it's got pretty sane defaults, but we should go through and see if there's anything we want to cut/add.
are others using imago?
I don't think we're likely to use this for Open States' new API, in large part b/c this is based on https://github.com/dobarkod/django-restless which hasn't seen an update in ~2 years. (we went down this road w/ django-piston, leading to a fair amount of regret)
Probably about equally feature-complete, I have https://github.com/openstates/ocd-jsonapi/ which is based on django-rest-framework (a well supported & maintained library) and which is compatible with the http://jsonapi.org/ spec. I'm not sure if we'll go that route either but I figured I'd mention it as an option.
(I'm also somewhat interested in exploring something GraphQL inspired for Open States API v2, if that is of interest to anyone else)
catch exceptions, define an error format, etc.
there are server errors for some event pages (but not all) from the nyc events listing
example: ocd-event/001297df-b47c-40b2-a1b0-f31afb8082a1 (3rd event in the listing)
I get the gut feeling we can DoS / avoid pagination limits by hitting a related field on an object detail / list view.
We should limit the number of elements we return on related entities.
(I'm thinking jurisdiction.events or something, not in particular, but to give an example of the route you can use)
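One simple shape for such a limit, as a sketch only: cap the related items and report whether anything was cut off so the client knows to page. The name `cap_related` and the default of 100 are assumptions.

```python
from itertools import islice


def cap_related(items, limit=100):
    """Return at most `limit` items plus a flag saying whether more existed."""
    # Read one extra item so truncation can be detected without a full scan.
    capped = list(islice(items, limit + 1))
    return capped[:limit], len(capped) > limit


print(cap_related(range(5), limit=3))
```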
We should be able to flexibly search in datetime fields. The docs suggest that one could use everything from year alone to full datetime down to fractional seconds.
For example, this works:
http://api.opencivicdata.org/bills/?created_at=2015-04-09T00:24:06.485090
But this (and other variations on datetime) doesn't work:
http://api.opencivicdata.org/bills/?updated_at=2015
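One way to support partial values is to turn each one into a half-open range and filter on that range. The sketch below is not imago's parser; the accepted formats and the name `datetime_range` are assumptions.

```python
from datetime import datetime, timedelta

# Accepted formats, from coarsest to finest, with the precision each implies.
_FORMATS = [
    ("%Y", "year"),
    ("%Y-%m", "month"),
    ("%Y-%m-%d", "day"),
    ("%Y-%m-%dT%H", "hour"),
    ("%Y-%m-%dT%H:%M", "minute"),
    ("%Y-%m-%dT%H:%M:%S", "second"),
    ("%Y-%m-%dT%H:%M:%S.%f", "microsecond"),
]

_STEPS = {
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
    "second": timedelta(seconds=1),
    "microsecond": timedelta(microseconds=1),
}


def _range_end(start, precision):
    """Return the exclusive end of the period that begins at `start`."""
    if precision == "year":
        return start.replace(year=start.year + 1)
    if precision == "month":
        if start.month == 12:
            return start.replace(year=start.year + 1, month=1)
        return start.replace(month=start.month + 1)
    return start + _STEPS[precision]


def datetime_range(value):
    """Map a partial datetime string to a half-open (start, end) pair."""
    for fmt, precision in _FORMATS:
        try:
            start = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return start, _range_end(start, precision)
    raise ValueError("unrecognized datetime: %r" % value)


print(datetime_range("2015"))
print(datetime_range("2015-04-09T00:24:06.485090"))
```

A query such as ?updated_at=2015 would then become "start <= updated_at < end" with the pair returned for "2015".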
It would be very handy to tell what type of organization a person is a member of. Particularly useful for distinguishing between party and other types of orgs.
doing 100+ queries, mostly against vote count, probably just need a prefetch call
It'd be nice if these were standard across all search endpoints. Here're some examples:
Divisions:
Jurisdictions:
Organizations: