
clarify's People

Contributors

carbonphyber, dlu, dwillis, ghing, gphemsley, tamilyn

clarify's Issues

_parse_url not producing the correct report detail URLs

Here is my simple script to try and pull unofficial precinct results from the 2015 governors race from the KY Clarity site.

import os
import zipfile

import clarify
import wget

j = clarify.Jurisdiction(url='http://results.enr.clarityelections.com/KY/15261/30235/en/summary.html', level='state')
subs = j.get_subjurisdictions()

name = 'detail'
end = '.zip'

def getkyresults():
    for x, sub in enumerate(subs):
        url = sub.report_url('txt')
        print(url)
        # wget saves each download as detailtxt.zip in the working directory
        wget.download(url)
        # Extract detail.txt from the archive into the working directory
        with zipfile.ZipFile('detailtxt.zip') as hm:
            hm.extractall()
        # Rename both files so the next iteration does not overwrite them
        os.rename('detailtxt.zip', name + str(x) + end)
        os.rename('detail.txt', name + str(x) + '.txt')

You can see that file 4 is a BadZipfile. This is because it's a link to a 404 page.
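One way to keep a loop like this from crashing on a 404 response is to check that the downloaded bytes really are a ZIP archive before extracting. The sketch below uses only the standard library; `extract_if_zip` is a hypothetical helper name and is not part of clarify or wget.

```python
import io
import zipfile


def extract_if_zip(payload, dest_dir):
    """Extract payload into dest_dir if it is a ZIP archive.

    Returns True when the archive was extracted and False when the
    payload is something else (for example, the HTML of a 404 page).
    """
    buf = io.BytesIO(payload)
    if not zipfile.is_zipfile(buf):
        return False
    buf.seek(0)
    with zipfile.ZipFile(buf) as archive:
        archive.extractall(dest_dir)
    return True
```

A caller could log the URLs for which this returns False, which would make the bad report URLs easy to list.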

Add support for current_ver

The Clarity web pages now seem to request a URL like

http://results.enr.clarityelections.com/CO/63746/current_ver.txt?rnd=0.8333623006146083

in order to find out what the most recent version of the data is. It returns a single number, like "183878", which is then used to find the most recent version of the data, e.g.

http://results.enr.clarityelections.com/CO/63746/183878/Web01/en/summary.html#

The Clarify library should have a method that returns that URL (or the version number), so that the most recent statewide XML detail file can be retrieved.
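A minimal sketch of what such helpers might look like, assuming the URL layout shown in the two examples above (including the `Web01` path segment) holds for other elections. The function names are hypothetical and do not exist in clarify.

```python
from urllib.request import urlopen


def current_ver_url(base_url):
    """Return the current_ver.txt URL for an election base URL."""
    return base_url.rstrip('/') + '/current_ver.txt'


def summary_url(base_url, version):
    """Build the summary page URL for a given version number.

    Assumes the '<base>/<version>/Web01/en/summary.html' layout
    seen in the example above.
    """
    return '{}/{}/Web01/en/summary.html'.format(base_url.rstrip('/'), version)


def fetch_current_version(base_url, timeout=10):
    """Download current_ver.txt and return the version as a string."""
    with urlopen(current_ver_url(base_url), timeout=timeout) as response:
        return response.read().decode('utf-8').strip()
```

Here `base_url` would be something like `http://results.enr.clarityelections.com/CO/63746`.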

Stricter validation of constructor parameters

I find libraries that raise exceptions early and often when users pass in invalid or unsupported parameters much more pleasant to work with.

I'm developing a partial rewrite of the jurisdiction.py constructor to raise exceptions if:

  • the url parameter is invalid
  • the string is not a URL
  • the url string is not from a supported hostname

Raising exceptions where none are raised now would be a breaking change in a future version, so I'm submitting this issue to gather feedback before submitting the PR.
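To make the proposal concrete, here is a rough sketch of the kind of check described in the list above. It is not the code from the planned PR. The helper name is hypothetical, and the set of supported hostnames is an assumption based on the URLs that appear in these issues.

```python
from urllib.parse import urlparse

# Assumption: the only hostname seen in these issues.
SUPPORTED_HOSTNAMES = {'results.enr.clarityelections.com'}


def validate_clarity_url(url):
    """Raise ValueError unless url is an HTTP(S) URL on a supported host."""
    if not isinstance(url, str) or not url:
        raise ValueError('url must be a non-empty string')
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https') or not parsed.hostname:
        raise ValueError('not an absolute HTTP(S) URL: {!r}'.format(url))
    if parsed.hostname not in SUPPORTED_HOSTNAMES:
        raise ValueError('unsupported hostname: {!r}'.format(parsed.hostname))
    return parsed
```

Using a single exception type keeps the example short; a real implementation might prefer distinct exception classes for each case.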

Fix broken unit test for parser

Does anyone have insight into why this Parser unit test is failing?

======================================================================
FAIL: test__get_or_create_result_jurisdiction (test_parser.TestParser)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../clarify/tests/test_parser.py", line 72, in test__get_or_create_result_jurisdiction
    self.assertEqual(parser._result_jurisdictions, [ result_jurisdiction ])
AssertionError: Lists differ: [Resu[154 chars]rted=None, precincts_reporting_percent=None, level='precinct')] != [Resu[154 chars]rted=None, precincts_reporting_percent=None, level='county')]

First differing element 0:
Resul[152 chars]orted=None, precincts_reporting_percent=None, level='precinct')
Resul[152 chars]orted=None, precincts_reporting_percent=None, level='county')

Diff is 892 characters long. Set self.maxDiff to None to see it.

----------------------------------------------------------------------

I have looked into fixing it but can't tell whether the left side or the right side of the assertion is the incorrect one.

Add CLI interface

The original use case didn't consider how to download and unpack the XML results. I think it would be much easier for volunteers dealing with counties or states that use Clarity systems if they could run a single command to download the results as CSV, then use simpler scripts or manual edits for any post-processing.

@chagan and I started working on this at the #NICAR17 hackathon.

Tasks

  • Figure out which values from the Clarity detail XML to include in CSV and how to present them (e.g. # of precincts reporting)
  • Test! Test! Test! At the very least we should have an integration test for both a state and county XML file.
  • Write utility function for parsing jurisdiction levels from Clarity URLs (@chagan)
  • Write utility function for retrieving lowest jurisdiction level from Clarity URL (@chagan)
  • Figure out why there are rows with no vote counts when calling clarify results (@ghing)
  • Update CLI to pull levels from url
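As a starting point for the level-parsing task, here is a rough heuristic based only on the two URL shapes that appear in these issues: state URLs have a numeric election ID right after the state code, while county URLs have a county name there. The function name is hypothetical, and the heuristic is an assumption that may not hold for every Clarity URL.

```python
from urllib.parse import urlparse


def level_from_url(url):
    """Guess 'state' or 'county' from the path of a Clarity URL."""
    segments = [s for s in urlparse(url).path.split('/') if s]
    if len(segments) < 2:
        raise ValueError('unexpected Clarity URL path: {!r}'.format(url))
    # segments[0] is the state code. A numeric second segment is read as
    # an election ID (state level); anything else as a county name.
    return 'state' if segments[1].isdigit() else 'county'
```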

KeyError when attempting to access Parser

(moved from #13)

Looking at precinct-level results from the 2014 and 2016 primaries in Sarpy County, Nebraska -- in both cases, I'm getting a vote type KeyError when I try to access the Parser instance. I'm running Python 3.4.

The XML file for the 2014 primary is here.

Here's the code I'm stepping through for the 2014 results, up to the point where it breaks:

from __future__ import print_function
from zipfile import ZipFile

try:
    from StringIO import StringIO as ioDuder
except ImportError:
    from io import BytesIO as ioDuder

import unicodecsv
import requests
import clarify


county = 'Sarpy'

url = 'http://results.enr.clarityelections.com/NE/Sarpy/51545/184335/en/summary.html'

election_type = 'primary'

def clarify_sarpy():
    '''
    1. Fetch zipped XML file.
    2. Unzip in memory.
    3. Load into Clarify.
    4. Loop over results, write to file.
    '''

    # discover path to zipfile and fetch
    s = clarify.Jurisdiction(url=url, level='county')
    r = requests.get(s.report_url('xml'), stream=True)
    z = ZipFile(ioDuder(r.content))

    # hand off to clarify
    p = clarify.Parser()
    p.parse(z.open('detail.xml'))

    for result in p.results:
        print(result)

Here's the traceback:

Traceback (most recent call last):
  File "parse_2014_primary_sarpy_precinct.py", line 132, in <module>
    clarify_sarpy()
  File "parse_2014_primary_sarpy_precinct.py", line 35, in clarify_sarpy
    p.parse(z.open('detail.xml'))
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 48, in parse
    self._contests = self._parse_contests(tree, self._result_jurisdiction_lookup)
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 256, in _parse_contests
    return [self._parse_contest(el, result_jurisdiction_lookup) for el in contest_els]
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 256, in <listcomp>
    return [self._parse_contest(el, result_jurisdiction_lookup) for el in contest_els]
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 283, in _parse_contest
    for r in self._parse_no_choice_results(contest_el, result_jurisdiction_lookup, contest):
  File "/home/cjwinchester/.virtualenvs/clarify/local/lib/python3.4/site-packages/clarify/parser.py", line 322, in _parse_no_choice_results
    subjurisdiction = result_jurisdiction_lookup[subjurisdiction_el.attrib['name']]
KeyError: 'ABSENTEE'

Thanks for taking a look!

KeyError when precinct isn't listed under <precincts>

Last night I ran into an issue with a LiveVoterTurnout XML file that Clarify wouldn't parse.

Basically, each Contest had an Intrastate New Resident precinct listed, but the master Precincts list didn't include it.

This led to a KeyError:

Traceback (most recent call last):
  File "parserClarity.py", line 294, in <module>
    p.parse( results_file )
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 51, in parse
    self._contests = self._parse_contests(tree)
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 284, in _parse_contests
    return [self._parse_contest(el) for el in contest_els]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 284, in <listcomp>
    return [self._parse_contest(el) for el in contest_els]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 312, in _parse_contest
    for c in self._parse_choices(contest_el, contest):
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 370, in _parse_choices
    for c_el in contest_el.xpath('Choice')]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 370, in <listcomp>
    for c_el in contest_el.xpath('Choice')]
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 409, in _parse_choice
    subjurisdiction = self.get_result_jurisdiction(subjurisdiction_el.attrib['name'])
  File "/Users/kirkman/.virtualenvs/elections3/lib/python3.7/site-packages/clarify/parser.py", line 219, in get_result_jurisdiction
    return self._result_jurisdiction_lookup[name]
KeyError: 'Intrastate New Resident'

I'm attaching the ZIP file for reference. If I get some time next week I may try to see if I can find a solution and submit a pull request, but wanted to let you know about it.

lincoln-county.zip

Install fails

I'm getting an error when I try to install using pip install clarify (same error when running pip install git+git://github.com/openelections/clarify.git):

Collecting clarify
  Using cached https://files.pythonhosted.org/packages/ba/c0/8ccd65549a17ca116acdcd93a8279743cb88649dbe1b41eb6d4d6e12e374/Clarify-0.4.0.tar.gz
    Complete output from command python setup.py egg_info:
    error in Clarify setup command: 'tests_require' must be a string or list of strings containing valid project/version requirement specifiers; Expected version spec in unittest2; python_version < '3.4' at ; python_version < '3.4'
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/l4/fb0tgmgs4f9838724x0ybv6w0000gp/T/pip-install-oNS2PN/clarify/
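The error message suggests that this setuptools version rejects the `; python_version < '3.4'` marker inside `tests_require`. Assuming that is the cause, one possible workaround is to decide in plain Python whether to add `unittest2`, so that no marker string is needed. The helper below is hypothetical and is not taken from the project's setup.py; any other test dependencies the project has would go in the same list.

```python
import sys


def build_tests_require(version_info=sys.version_info):
    """Return test requirements without using environment markers."""
    requirements = []  # other test dependencies, if any, would go here
    if tuple(version_info[:2]) < (3, 4):
        # unittest2 is only wanted on interpreters older than 3.4
        requirements.append('unittest2')
    return requirements
```

The result would then be passed as `tests_require=build_tests_require()` in the `setup()` call.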

Support parsing state summary

  • Look for <County> elements in addition to <Precinct>.
  • <ElectionVoterTurnout> in addition to <VoterTurnout>.
  • Change models to be more general, jurisdiction instead of precinct.
  • Add level attribute to Jurisdiction class.

Precinct lookup fails when precinct not in VoterTurnout element

We try to grab the lists of precincts from the VoterTurnout element and then look them up when setting the jurisdiction of vote objects. However, the precinct may not exist in the VoterTurnout entity, but may exist as a Precinct element under a VoteType element. This appears in the results file 20120522__ar__primary__van_buren__precinct.xml.

This happens in Parser._parse_contest() and results in a KeyError.

I've already fixed this in my working copy and am just creating this issue for reference. I'll push the change to GitHub and PyPI when I get home tonight.
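For reference, the general shape of a get-or-create lookup that avoids this kind of KeyError is sketched below. It is not the fix that was pushed. `ResultJurisdiction` here is a simplified stand-in for the library's model, and `JurisdictionLookup` is a hypothetical name.

```python
from collections import namedtuple

# Simplified stand-in for the library's result jurisdiction model.
ResultJurisdiction = namedtuple('ResultJurisdiction', ['name', 'level'])


class JurisdictionLookup:
    """Name-keyed lookup that creates entries missing from the master list."""

    def __init__(self, jurisdictions=()):
        self._by_name = {j.name: j for j in jurisdictions}

    def get_or_create(self, name, level='precinct'):
        try:
            return self._by_name[name]
        except KeyError:
            # Not in the turnout list: register it instead of failing.
            created = ResultJurisdiction(name=name, level=level)
            self._by_name[name] = created
            return created
```

With this pattern, a precinct that appears only under a VoteType element is added on first sight and reused on later lookups.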

Parser API enhancements

Add support for looking up contests by text:

parser = Parser()
parser.parse(f)
contest = parser.get_contest("U.S. President - DEM")

Add support for looking up precincts by name.

parser = Parser()
parser.parse(f)
precinct = parser.get_jurisdiction("03 Bradley")

Add support for getting all results for a jurisdiction.

parser = Parser()
parser.parse(f)
precinct = parser.get_jurisdiction("03 Bradley")
results = precinct.results

UnknownTimezoneWarning

Recently, when I use clarify, Python has been spitting out this warning:

UnknownTimezoneWarning: tzname CDT identified but not understood.
Pass "tzinfos" argument in order to correctly return a timezone-aware datetime.
In a future version, this will raise an exception.

I think it's from this line in clarify:

return dateutil.parser.parse(tree.xpath('/ElectionResult/Timestamp')[0].text)

Anyway, it's not an exception yet, but I wanted to bring that to your attention. I don't know if something like this might be a solution, though obviously that specific answer is overkill.
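A lighter-weight option might be to pass a small `tzinfos` mapping to `dateutil.parser.parse`, as the warning itself suggests. In the sketch below the abbreviation table covers only two US Central abbreviations and would need extending for other zones. The timestamp string in the usage note is made up for illustration and may not match the format in real result files.

```python
from dateutil import parser as date_parser

# Assumption: offsets in seconds for the abbreviations we expect to see.
TZINFOS = {
    'CDT': -5 * 3600,
    'CST': -6 * 3600,
}


def parse_timestamp(text):
    """Parse a timestamp string into a timezone-aware datetime."""
    return date_parser.parse(text, tzinfos=TZINFOS)
```

For example, `parse_timestamp('11/4/2014 10:30:00 PM CDT')` returns a datetime whose UTC offset is minus five hours, without emitting the warning.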
