Code Monkey home page Code Monkey logo

fiftystates's Introduction

Overview

About the Project

The goal of the Fifty State Project is to build scrapers and parsers in order to get as much state legislative data as possible in one place.

For details on the reasons for the project and goals behind the project see the project announcement.

To stay up to date and communicate with other contributors to the project visit the Fifty State Project Google Group.

For an editable overview of each state's progress visit the Sunlight Labs Wiki.

Project Goals

  1. Collect URLs of State Legislature and Legislative Information Pages [done]
  2. Grab legislators and legislation
    1. Build scrapers and obtain data files for legislation in each of the fifty states
    2. create sponsor relationship between legislators and legislation
  3. Grab votes
    1. Build scrapers and obtain data files for legislator votes on legislation
    2. create voting relationship between legislators and legislation
  4. Build tools on top of data

Usage (proposed)

Valid options:
  • --year: a year or years the parser should attempt
  • --all: Attempt to parse years from 1969-2009
  • --upper: Parse upper chamber
  • --lower: Parse lower chamber
The vision is that the flow will look something like this:

$ ./scripts/nc/get_legislation --year=2009 --upper

Contributing

If you are interested in contributing the recommended procedure is to check on the Sunlight Labs Wiki and in the repository to see where your state is. The next step is to announce your interest on the Fifty State Project Google Group (this is where you can ask questions and make suggestions regarding the project).

Managing a State

Once you have claimed a state on the wiki and mailing list you should probably maintain your own fork of the project on github.

Please avoid making changes to files in other states/etc. on your state branch. Stick to editing files in the scripts/your_state directory and where necessary in any relevant utils directories.

Whenever your state script works as it should announce it on the mailing list and someone will merge your changes into the core.

Licensing

As of June 15th 2009 the Fifty State Project is licensed under the GPLv3 license

See LICENSING for the full terms of the GPLv3.

Requirements

Although we have previously allowed you to write parsers in your language of choice, for the sake of maintenance we highly encourage you to write your parsers in Python. Currently Python is the only language we are supporting with our documentation and tools. If you would like to contribute in a language other than Python, please send an email to Fifty State Project Google Group so we can discuss the issue.

For details on how scripts should be written and how they should run see scripts/pyutils/README.

If you are completely unfamiliar with Python there is other things you can do to help with the government transparency movement. Ruby developers are encouraged to work on the Congrelate Project. For other project ideas please join the Sunlight Labs Google Group.

Dependencies

  • Python (2.5+)
  • BeautifulSoup
  • html5lib
  • simplejson if on Python 2.5
  • (this list is out of date, refer to specific scripts/state directories for dependencies)

fiftystates's People

Contributors

csun-sait avatar jamesturk avatar jdunck avatar konklone avatar mikejs avatar rshapiro avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Forkers

leword

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.