Code Monkey home page Code Monkey logo

pacer_lib's Introduction

More extensive documentation can be found at (http://pacer-lib.readthedocs.org)

Motivation and Overview
=======================
Morning.

[pacer_lib](https://pypi.python.org/pypi/pacer_lib/2.0) is a library that has
been designed to facilitate empirical legal research that uses data from the
[Public Access to Court Electronic Records (PACER)] (http://www.pacer.gov) 
database by providing easy-to-use objects for scraping, parsing and searching
PACER docket sheets and documents.

We developed *pacer_lib* in order to solve problems that arose naturally
during the course of our research and our goal is to make it easy to:

* **download a large number of specific documents from PACER**

    To locate and download multiple files on PACER requires a lot of manual
    labour, so one of the first things that we developed was a way to 
    programatically interface with the PACER Case Locator so that we could
    script all of our docket and document requests.

* **store downloaded dockets in a sensible and scalable way**
    
    PACER charges a fee for every page and document you access. If you have a
    project of any reasonable size and limited means, it becomes extremely 
    important to keep track of what files you have already downloaded (lest you
    inadvertently download the file twice). We create a well-documented folder
    structure and a equally well-documented unique identifier that can be 
    quickly disaggregated into a PACER search query or PACER case number.

* **extract information and create datasets from these dockets**
    
    We needed to create datasets for regression and textual analysis, so we 
    baked in the process of converting relatively unstructured data (.html 
    docket sheets) into more structured data (.csv for docket sheets and .json 
    objects for other meta information).

pacer_lib's People

Contributors

zhangchuck avatar synsypa avatar

Watchers

Sachin avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.