openelections / clarify
Discover and parse results for jurisdictions that use Clarity-based election systems.
License: MIT License
Here is my simple script to try to pull unofficial precinct results for the 2015 governor's race from the KY Clarity site.
import os
import zipfile

import clarify
import wget

j = clarify.Jurisdiction(url='http://results.enr.clarityelections.com/KY/15261/30235/en/summary.html', level='state')
subs = j.get_subjurisdictions()
name = 'detail'
end = '.zip'


def getkyresults():
    for x, sub in enumerate(subs):
        print(sub.report_url('txt'))
        # The script expects each download to land in detailtxt.zip
        wget.download(sub.report_url('txt'))
        with zipfile.ZipFile('detailtxt.zip') as hm:
            hm.extractall()
        # Rename both files so the next county's download doesn't overwrite them
        os.rename('detailtxt.zip', name + str(x) + end)
        os.rename('detail.txt', name + str(x) + '.txt')
You can see that file 4 is a BadZipfile. This is because it's a link to a 404 page.
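One way to keep a loop like this going past a county whose download is really an error page is to check the file before unpacking it. The following is a minimal sketch, assuming the download has already been saved to disk; the helper name `extract_if_zip` is made up for illustration and is not part of clarify.

```python
import zipfile


def extract_if_zip(path, dest):
    """Extract `path` into `dest`; return False if it is not a zip archive."""
    # zipfile.is_zipfile checks the file's magic number, so a saved
    # HTML error page (such as a 404 response) is rejected here.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        archive.extractall(dest)
    return True
```

A caller could log and skip any county for which this returns False instead of stopping on the BadZipfile exception.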
The clarity web pages now seem to request a url like
http://results.enr.clarityelections.com/CO/63746/current_ver.txt?rnd=0.8333623006146083
in order to find out what the most recent version of the data is. It returns a single number, like "183878", which is then used to find the most recent version of the data, e.g.
http://results.enr.clarityelections.com/CO/63746/183878/Web01/en/summary.html#
The Clarify library should have a method to return that url or the like, so that the most recent state-wide xml detail file can be retrieved.
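A rough sketch of what such a lookup could look like, using only the standard library. The function names are hypothetical and not part of clarify, the `Web01/en/summary.html` layout is taken from the single example URL above and may not hold for every election, and the `rnd` query parameter is left out on the assumption that it only defeats caching.

```python
from urllib.request import urlopen

BASE_URL = "http://results.enr.clarityelections.com"


def current_version_url(state, election_id):
    """URL of the small text file that holds the latest version number."""
    return "{}/{}/{}/current_ver.txt".format(BASE_URL, state, election_id)


def summary_url(state, election_id, version):
    """Summary page URL for a given version (layout assumed from the example)."""
    return "{}/{}/{}/{}/Web01/en/summary.html".format(
        BASE_URL, state, election_id, version)


def read_url(url):
    """Fetch a URL and return its body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def latest_summary_url(state, election_id, read=read_url):
    """Look up the current version, then build the matching summary URL.

    `read` is injectable so the lookup can be exercised without network access.
    """
    version = read(current_version_url(state, election_id)).strip()
    return summary_url(state, election_id, version)
```

For the Colorado example, `latest_summary_url("CO", "63746")` would request `current_ver.txt` and, given the response "183878", return the summary URL shown above without the trailing `#`.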
I find libraries that raise exceptions early and often when users pass in invalid or unsupported input much more pleasant to work with.
I'm developing a partial rewrite of the jurisdiction.py constructor to raise exceptions if:
Since raising Exceptions in a potential future version where none are raised now is a breaking change, I'm submitting this Issue to gather feedback before submitting the PR.
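To make the idea concrete, here is a small sketch of fail-fast argument checking. It is not the proposed patch: the function name, the set of accepted levels, and the specific checks are all assumptions chosen for illustration.

```python
from urllib.parse import urlparse

# Assumed for illustration; the real set of supported levels may differ.
VALID_LEVELS = ("state", "county")


def validate_jurisdiction_args(url, level):
    """Raise ValueError early instead of failing later during a request."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be an absolute http(s) URL: {!r}".format(url))
    if level not in VALID_LEVELS:
        raise ValueError(
            "level must be one of {}: {!r}".format(VALID_LEVELS, level))
```

A constructor could call a helper like this first, so that a typo in `level` surfaces immediately rather than as a confusing error further down the line.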
Useful info:
Questions:
Ideas:
When calling get_subjurisdictions(). This occurs when running tests. Turns out it's a feature of requests, not a bug.
The tests using http were failing with connection refused. I replaced the URLs with https in the test file, and the GET works, but the tests now fail because of an https/http mismatch in Jurisdiction#_clarity_subjurisdiction_url.
Does anyone have insight into why this Parser unit test is erroring?
======================================================================
FAIL: test__get_or_create_result_jurisdiction (test_parser.TestParser)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../clarify/tests/test_parser.py", line 72, in test__get_or_create_result_jurisdiction
    self.assertEqual(parser._result_jurisdictions, [ result_jurisdiction ])
AssertionError: Lists differ: [Resu[154 chars]rted=None, precincts_reporting_percent=None, level='precinct')] != [Resu[154 chars]rted=None, precincts_reporting_percent=None, level='county')]
First differing element 0:
Resul[152 chars]orted=None, precincts_reporting_percent=None, level='precinct')
Resul[152 chars]orted=None, precincts_reporting_percent=None, level='county')
Diff is 892 characters long. Set self.maxDiff to None to see it.
----------------------------------------------------------------------
I have tried to fix it but can't figure out whether the left or the right side of the assertion is the wrong one.
The original use case didn't cover downloading and unpacking the XML results. Still, I think it would be a lot easier for volunteers to deal with counties or states that use Clarity systems if they could just run a command to download the results as CSV and then use simpler scripts, or manual edits, for any post-processing.
@chagan and I started working on this at the #NICAR17 hackathon.
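As a sketch of the CSV-writing half of such a command, the snippet below flattens result rows with the standard library's csv module. The `Row` tuple and its field names are stand-ins invented for this example; they are not clarify's actual result type, and the sample values are made up.

```python
import csv
import io
from collections import namedtuple

# Stand-in for a parsed result; field names are assumptions for illustration.
Row = namedtuple("Row", ["jurisdiction", "contest", "choice", "vote_type", "votes"])


def write_results_csv(rows, fileobj):
    """Write a header line followed by one CSV line per result row."""
    writer = csv.writer(fileobj)
    writer.writerow(Row._fields)
    for row in rows:
        writer.writerow(row)


# Example with made-up data, written to an in-memory buffer.
buf = io.StringIO()
write_results_csv(
    [Row("03 Bradley", "U.S. President - DEM", "Some Candidate", "Election Day", 12)],
    buf,
)
```

A real command would pass an open file (or stdout) instead of the in-memory buffer.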
Tasks
clarify results (@ghing)
To speed up tests and also be kind to the remote servers, test URL discovery by mocking the HTTP responses.
The patterns and packages described in Mocking Python Requests with Responses seem like a good start.
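The same idea can be sketched with nothing but the standard library. Note the substitution: this uses `unittest.mock` to patch `urllib.request.urlopen` rather than the Responses package with requests, and the function under test is a hypothetical stand-in, not clarify code.

```python
import io
import urllib.request
from unittest import mock


def fetch_current_version(url):
    """Return the stripped body of a URL (hypothetical code under test)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8").strip()


def test_fetch_current_version_without_network():
    # BytesIO supports read() and the context-manager protocol,
    # so it can stand in for the HTTP response object.
    fake_response = io.BytesIO(b"183878\n")
    with mock.patch("urllib.request.urlopen",
                    return_value=fake_response) as fake_urlopen:
        version = fetch_current_version("http://example.invalid/current_ver.txt")
    assert version == "183878"
    fake_urlopen.assert_called_once_with("http://example.invalid/current_ver.txt")
```

Because the patched function never opens a socket, the test runs quickly and sends nothing to the remote servers.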
Provide a loader class, more as a reference implementation for a full workflow.
Stress in docs that this isn't the most efficient implementation (e.g. no caching).
(moved from #13)
Looking at precinct-level results from the 2014 and 2016 primaries in Sarpy County, Nebraska -- in both cases, I'm getting a vote type KeyError when I try to access the Parser instance. I'm running Python 3.4.
The XML file for the 2014 primary is here.
Here's the code I'm stepping through for the 2014 results, up to the point where it breaks:
from __future__ import print_function

from zipfile import ZipFile

try:
    from StringIO import StringIO as ioDuder
except ImportError:
    from io import BytesIO as ioDuder

import unicodecsv
import requests
import clarify

county = 'Sarpy'
url = 'http://results.enr.clarityelections.com/NE/Sarpy/51545/184335/en/summary.html'
election_type = 'primary'


def clarify_sarpy():
    '''
    1. Fetch zipped XML file.
    2. Unzip in memory.
    3. Load into Clarify.
    4. Loop over results, write to file.
    '''

    # discover path to zipfile and fetch
    s = clarify.Jurisdiction(url=url, level='county')
    r = requests.get(s.report_url('xml'), stream=True)
    z = ZipFile(ioDuder(r.content))

    # hand off to clarify
    p = clarify.Parser()
    p.parse(z.open('detail.xml'))

    for result in p.results:
        print(result)
Here's the traceback:
Traceback (most recent call last):
  File "parse_2014_primary_sarpy_precinct.py", line 132, in <module>
    clarify_sarpy()
  File "parse_2014_primary_sarpy_precinct.py", line 35, in clarify_sarpy
    p.parse(z.open('detail.xml'))
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 48, in parse
    self._contests = self._parse_contests(tree, self._result_jurisdiction_lookup)
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 256, in _parse_contests
    return [self._parse_contest(el, result_jurisdiction_lookup) for el in contest_els]
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 256, in <listcomp>
    return [self._parse_contest(el, result_jurisdiction_lookup) for el in contest_els]
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 283, in _parse_contest
    for r in self._parse_no_choice_results(contest_el, result_jurisdiction_lookup, contest):
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 322, in _parse_no_choice_results
    subjurisdiction = result_jurisdiction_lookup[subjurisdiction_el.attrib['name']]
KeyError: 'ABSENTEE'
Thanks for taking a look!
Last night I ran into an issue with a LiveVoterTurnout XML file that Clarify wouldn't parse.
Basically, each Contest had an Intrastate New Resident precinct listed, but the master Precincts list didn't include it.
This led to a KeyError:
Traceback (most recent call last):
  File "parserClarity.py", line 294, in <module>
    p.parse( results_file )
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 51, in parse
    self._contests = self._parse_contests(tree)
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 284, in _parse_contests
    return [self._parse_contest(el) for el in contest_els]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 284, in <listcomp>
    return [self._parse_contest(el) for el in contest_els]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 312, in _parse_contest
    for c in self._parse_choices(contest_el, contest):
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 370, in _parse_choices
    for c_el in contest_el.xpath('Choice')]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 370, in <listcomp>
    for c_el in contest_el.xpath('Choice')]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 409, in _parse_choice
    subjurisdiction = self.get_result_jurisdiction(subjurisdiction_el.attrib['name'])
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 219, in get_result_jurisdiction
    return self._result_jurisdiction_lookup[name]
KeyError: 'Intrastate New Resident'
I'm attaching the ZIP file for reference. If I get some time next week I may try to see if I can find a solution and submit a pull request, but wanted to let you know about it.
I'm getting an error when I try to install using pip install clarify (same error when running pip install git+git://github.com/openelections/clarify.git):
Collecting clarify
Using cached https://files.pythonhosted.org/packages/ba/c0/8ccd65549a17ca116acdcd93a8279743cb88649dbe1b41eb6d4d6e12e374/Clarify-0.4.0.tar.gz
Complete output from command python setup.py egg_info:
error in Clarify setup command: 'tests_require' must be a string or list of strings containing valid project/version requirement specifiers; Expected version spec in unittest2; python_version < '3.4' at ; python_version < '3.4'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/l4/fb0tgmgs4f9838724x0ybv6w0000gp/T/pip-install-oNS2PN/clarify/
<County> elements in addition to <Precinct>.
<ElectionVoterTurnout> in addition to <VoterTurnout>.
jurisdiction instead of precinct.
level attribute to Jurisdiction class.
We try to grab the lists of precincts from the VoterTurnout element and then look them up when setting the jurisdiction of vote objects. However, the precinct may not exist in the VoterTurnout entity, but may exist as a Precinct element under a VoteType element. This appears in the results file 20120522__ar__primary__van_buren__precinct.xml.
This happens in Parser._parse_contest() and results in a KeyError.
I've already fixed this in my working copy and am just creating this issue for reference. I'll push the change to GitHub and PyPI when I get home tonight.
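The mismatch can be reproduced with a tiny, made-up XML fragment. This is only a sketch: the sample document is a simplified assumption about the file layout, the function name is hypothetical, and the approach shown (collecting precinct names from both places) is one possible fix, not necessarily the one that was pushed.

```python
import xml.etree.ElementTree as ET

# Simplified, invented sample: one precinct appears only under a VoteType.
SAMPLE = """<ElectionResult>
  <VoterTurnout>
    <Precincts>
      <Precinct name="03 Bradley" />
    </Precincts>
  </VoterTurnout>
  <Contest text="U.S. President - DEM">
    <Choice text="Some Candidate">
      <VoteType name="Election Day">
        <Precinct name="03 Bradley" votes="12" />
        <Precinct name="Intrastate New Resident" votes="1" />
      </VoteType>
    </Choice>
  </Contest>
</ElectionResult>"""


def build_precinct_lookup(root):
    """Map precinct name -> record, using every Precinct element in the file."""
    lookup = {}
    for el in root.iter("Precinct"):
        name = el.get("name")
        # Keep the first record seen for each name; later duplicates are ignored.
        lookup.setdefault(name, {"name": name, "level": "precinct"})
    return lookup


root = ET.fromstring(SAMPLE)
# Names listed under VoterTurnout only, which is what triggers the KeyError.
turnout_only = {el.get("name")
                for el in root.findall("./VoterTurnout/Precincts/Precinct")}
lookup = build_precinct_lookup(root)
```

Here `turnout_only` lacks "Intrastate New Resident", while `lookup` contains it, so a later lookup by that name would succeed.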
Add support for looking up contests by text:
parser = Parser()
parser.parse(f)
contest = parser.get_contest("U.S. President - DEM")
Add support for looking up precincts by name:
parser = Parser()
parser.parse(f)
precinct = parser.get_jurisdiction("03 Bradley")
Add support for getting all results for a jurisdiction.
parser = Parser()
parser.parse(f)
precinct = parser.get_jurisdiction("03 Bradley")
results = precinct.results
Like West Virginia, which clarify fails to parse and return any information from.
Recently, when I use clarify, Python has been spitting out this warning:
UnknownTimezoneWarning: tzname CDT identified but not understood.
Pass "tzinfos" argument in order to correctly return a timezone-aware datetime.
In a future version, this will raise an exception.
I think it's from this line in clarify:
Line 67 in 12b4eb5
Anyway, it's not an exception yet, but I wanted to bring that to your attention. I don't know if something like this might be a solution, though obviously that specific answer is overkill.
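For reference, the warning text matches what python-dateutil emits, and its parser accepts a `tzinfos` mapping from abbreviation to UTC offset. A minimal sketch, assuming the timestamp is parsed with `dateutil.parser.parse`; the sample timestamp string and the choice of abbreviations are made up for illustration.

```python
from datetime import timedelta

from dateutil import parser as date_parser

# Abbreviation -> UTC offset in seconds. CDT is UTC-5 and CST is UTC-6;
# other US abbreviations would need to be added the same way.
TZINFOS = {"CDT": -5 * 3600, "CST": -6 * 3600}

# Invented sample timestamp; with tzinfos supplied, the result is
# timezone-aware and no UnknownTimezoneWarning is raised for CDT.
stamp = date_parser.parse("11/4/2014 10:30:00 PM CDT", tzinfos=TZINFOS)
```

A small fixed mapping like this covers only the abbreviations listed; anything else would still trigger the warning.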