i990-and-nih-exporter's Introduction

README

The code in this repository is an exercise in working with public data from two main sources. The first source is IRS Form 990, "Return of Organization Exempt from Income Tax." The 990 is an informational form that non-profit organizations (NPOs) must file with the Internal Revenue Service (IRS) each year. The completed forms are public records as a matter of federal law. The second source is the National Institutes of Health (NIH) Exporter, which gives access to information about all individual grants made by the NIH. That information includes the direct and indirect costs paid to grantees each year for each grant.

NIH grant data can be downloaded from the NIH Exporter web tool. Data files are available for each fiscal year since 1985 in either CSV or XML format. I've chosen to bulk-download the NIH "Project" data files in CSV format; see 01-load-data.R for details. Another option would have been to use the NIH REST API, documented in a PDF entitled Reporter API Data Elements. However, I consider the bulk download approach simpler and more efficient. With the bulk data on my machine, I do the filtering and variable selection locally using regular dplyr operations. If you prefer API-based downloads, you might check out repoRter.nih, an R package by Michael Barr that was released in Feb. 2022. It's at an early stage of development and looks fairly bare-bones, but it will almost certainly save you some headaches.
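As a rough illustration of the local filtering and variable selection described above, here is a minimal sketch. It is not the code in 01-load-data.R. The column names (ORG_NAME, FY, DIRECT_COST_AMT, INDIRECT_COST_AMT) are assumptions about the layout of the "Project" CSV files, and the small table is invented stand-in data so that the example runs without a download.

```r
library(dplyr)

# Toy stand-in for one or more fiscal years of "Project" data.
# Column names are assumed; all values are invented for illustration.
# With real data you would read the downloaded CSV instead,
# e.g. with readr::read_csv() on whatever file you saved locally.
projects <- tibble(
  APPLICATION_ID    = c(1L, 2L, 3L, 4L),
  ORG_NAME          = c("EXAMPLE UNIVERSITY", "EXAMPLE UNIVERSITY",
                        "OTHER INSTITUTE", "EXAMPLE UNIVERSITY"),
  FY                = c(2020L, 2020L, 2020L, 2021L),
  DIRECT_COST_AMT   = c(100000, 250000, 80000, 120000),
  INDIRECT_COST_AMT = c(60000, 150000, 40000, 70000)
)

# Keep only the grantee of interest (filtering), then keep only
# the columns needed for later analysis (variable selection).
grantee_costs <- projects %>%
  filter(ORG_NAME == "EXAMPLE UNIVERSITY") %>%
  select(ORG_NAME, FY, DIRECT_COST_AMT, INDIRECT_COST_AMT)

print(grantee_costs)
```

Because the full files are already on disk, changing the filter or the selected columns only means re-running the pipeline locally rather than issuing new API requests.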

A convenient place to get IRS 990s is the ProPublica Nonprofit Explorer. You need to search for each NPO you are interested in and download its form 990 individually for each tax year of interest. The form for a specific NPO/tax-year will be available either as a scanned PDF, if it was filed on paper, or as an XML file, if it was filed electronically. All NPOs are required to file electronically from 2019 forward. Another option would be to use the IRS search and bulk download tools.
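For forms that are available as XML, individual fields can be pulled out with the xml2 package. The sketch below is only an illustration under stated assumptions: the toy document is invented, and the element names (ReturnHeader, TaxYr, Filer, EIN) and the namespace are assumptions about the e-file layout that should be checked against a real downloaded file.

```r
library(xml2)

# Invented, heavily simplified stand-in for an e-filed 990.
# Element names and namespace are assumptions; the EIN is a placeholder.
toy_990 <- '<Return xmlns="http://www.irs.gov/efile">
  <ReturnHeader>
    <TaxYr>2020</TaxYr>
    <Filer><EIN>000000000</EIN></Filer>
  </ReturnHeader>
</Return>'

# read_xml() accepts a file path, a URL, or literal XML as used here.
doc <- read_xml(toy_990)

# Drop the default namespace so plain XPath expressions match.
# xml_ns_strip() modifies the document in place.
xml_ns_strip(doc)

ein      <- xml_text(xml_find_first(doc, "//Filer/EIN"))
tax_year <- xml_text(xml_find_first(doc, "//ReturnHeader/TaxYr"))

print(c(ein = ein, tax_year = tax_year))
```

Forms that exist only as scanned PDFs cannot be read this way and would need manual transcription or OCR.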

Possibly useful resources are IRS 990

Notes

Note that this repo does NOT contain the NIH or IRS data files themselves, because they are fairly large and can be obtained elsewhere. Other public data referenced here includes:

Contributors

davebraze
