
Comments (15)

JoshData commented on July 28, 2024

Hi. I'm not seeing that. Can you provide the command line args you're using and also an example filename? Thanks!

from congress.

swt83 commented on July 28, 2024

So out of many examples, one is data/108/votes/2004/h405/data.json. If I do a search of the document, one of the legislator ids will be "0000000". I scraped it last night using ./run votes --congress=108 --session=2004 --force.


konklone commented on July 28, 2024

Yeah, I see it. Run ./run votes --vote_id=h405-108.2004 and then look at data/108/votes/2004/h405. One of the voters is:

{
  "display_name": "Butterfield", 
  "id": "0000000", 
  "party": "D", 
  "state": "NC"
}

The 0's appear in the original data:
http://clerk.house.gov/evs/2004/roll405.xml

Something to report to the Clerk, I think.
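
For anyone who wants to check their own scraped output for this, here is a minimal sketch that walks a vote's data.json and collects any voter record carrying the all-zero ID. It assumes only what the excerpt above shows (voter records are JSON objects with an "id" key); the function names are hypothetical and not part of the scraper.

```python
import json

PLACEHOLDER_ID = "0000000"  # the all-zero ID seen in the Clerk's XML


def find_placeholder_voters(node):
    """Recursively collect dicts whose "id" equals the placeholder ID.

    The walk is deliberately generic so it does not depend on the exact
    layout of data.json beyond "voters are dicts with an 'id' key".
    """
    found = []
    if isinstance(node, dict):
        if node.get("id") == PLACEHOLDER_ID:
            found.append(node)
        for value in node.values():
            found.extend(find_placeholder_voters(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_placeholder_voters(item))
    return found


def scan_vote_file(path):
    """Load one vote JSON file and return its placeholder voter records."""
    with open(path) as f:
        return find_placeholder_voters(json.load(f))
```

Calling `scan_vote_file("data/108/votes/2004/h405/data.json")` after scraping should return the Butterfield record shown above if the upstream data is still unfixed.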


JoshData commented on July 28, 2024

Ahha. I hadn't scraped that far back. On GovTrack I used to fall back to name/state (so, fwiw, the data is complete there: http://www.govtrack.us/data/us/108/rolls/h2004-405.xml).


konklone commented on July 28, 2024

Okay. I'll report it to the Clerk.
On Feb 28, 2013 9:07 AM, "Joshua Tauberer" wrote:

> Ahha. I hadn't scraped that far back. On GovTrack I used to fall back to
> name/state (so, fwiw, the data is complete there:
> http://www.govtrack.us/data/us/108/rolls/h2004-405.xml).


JoshData commented on July 28, 2024

I committed a check for 0000000. Maybe we want to make it an error condition?


konklone commented on July 28, 2024

OK, I've made 0000000 an error condition. I also made improperly parsing the legis-num an error condition, and added "MOTION" to the list of acceptable values it can have.

(The ticket can stay open 'til the data's fixed.)
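
Treating the placeholder as an error could look roughly like the sketch below. This is not the committed code; `InvalidVoterID` and `check_voter_id` are hypothetical names, and the only assumption about the data is the voter record shape shown earlier in the thread.

```python
class InvalidVoterID(ValueError):
    """Raised when a voter record carries the all-zero placeholder ID."""


def check_voter_id(voter, vote_id):
    """Return the voter unchanged, or raise if its ID is the placeholder.

    Failing loudly keeps bad IDs out of the output instead of writing
    "0000000" into data.json where it can go unnoticed.
    """
    if voter.get("id") == "0000000":
        raise InvalidVoterID(
            "vote %s: placeholder ID for %s (%s-%s)"
            % (
                vote_id,
                voter.get("display_name"),
                voter.get("party"),
                voter.get("state"),
            )
        )
    return voter
```

The trade-off is that a single bad record stops that vote from being written at all, which is why a targeted fix comes up later in the thread.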


konklone commented on July 28, 2024

This hasn't been fixed yet, and I've confirmed (myself, and with the Clerk) that it only affects this one person, and only a specific time frame: vote No. 405 through No. 544. (The first vote he took following his special election, until the end of that Congress.)

Given that, should I simply hardcode a fix in the scraper for that value for that time?


dwillis commented on July 28, 2024

I vote yes.


konklone commented on July 28, 2024

Yeah, I think it can only reduce the amount of error in the scraper's output, even in the long run. I'll do this.


swt83 commented on July 28, 2024

But if they make the same error w/ a different member, then we won't be able to catch it.


konklone commented on July 28, 2024

Well, we'd catch it the same way we caught this. And right now, this is causing a big swathe of invalid data. It seems unlikely to happen for another member, especially since we now know the cause - that the guy was specially elected mid-session. So as long as we only do it for House votes between these two numbers in that year, the only way it'll fail us is if it develops for someone else during that specific time period. So the worst case is we'll be in the same situation we're in right now, and the best (and most likely) case is it's all fixed.
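
To make the scoping concrete, here is a rough sketch of a fix limited to 2004 House votes No. 405 through 544. The function name and its parameters are hypothetical, the extra match on name and state is an added safety assumption beyond what is described above, and the real bioguide ID is passed in by the caller rather than guessed here.

```python
def patch_known_placeholder(voter, chamber, year, number, real_id):
    """Swap in real_id only for the one known-bad case.

    Scope: House ("h") votes in 2004 numbered 405 through 544, where the
    record has the all-zero ID and looks like the member in question.
    Anything outside that window is returned untouched, so a new
    occurrence elsewhere would still surface as an error.
    """
    in_window = chamber == "h" and year == 2004 and 405 <= number <= 544
    is_target = (
        voter.get("id") == "0000000"
        and voter.get("display_name") == "Butterfield"
        and voter.get("state") == "NC"
    )
    if in_window and is_target:
        # Return a copy so the caller's original record is not mutated.
        return dict(voter, id=real_id)
    return voter
```

Because the patch only fires inside the window, a placeholder ID for anyone else, or at any other time, falls through to whatever error handling is already in place.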


GPHemsley commented on July 28, 2024

Did they give any indication that they would fix the issue upstream?


konklone commented on July 28, 2024

Yes, but only "at some point".


JoshData commented on July 28, 2024

This still isn't fixed upstream, btw.

I've replaced the previous fix with a more generic name lookup in 08f4025. (Through the 107th Congress there were no bioguide IDs listed for anyone!)
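
The code in 08f4025 is not reproduced in this thread, so the following is only a sketch of what a generic name/state fallback might look like. `resolve_voter_id` and the `legislators_by_name_state` mapping are hypothetical; the mapping is assumed to go from `(display_name, state)` to a list of candidate bioguide IDs and would need to be built from a legislator dataset.

```python
def resolve_voter_id(voter, legislators_by_name_state):
    """Return a usable ID for a voter, falling back to name/state lookup.

    If the record already has a real ID, use it. Otherwise (missing or
    all-zero ID) look the voter up by display name and state, and only
    accept the result when exactly one legislator matches.
    """
    voter_id = voter.get("id")
    if voter_id and voter_id != "0000000":
        return voter_id

    key = (voter.get("display_name"), voter.get("state"))
    matches = legislators_by_name_state.get(key, [])
    if len(matches) == 1:
        return matches[0]

    # Zero or several candidates: refuse to guess.
    raise LookupError(
        "cannot resolve %r: %d candidate(s)" % (key, len(matches))
    )
```

Refusing to guess on ambiguous matches keeps the fallback from quietly assigning a vote to the wrong member, for example when two members from the same state share a last name.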

