miner

A script for downloading, unpacking, and converting public online data files (with other sources to be added soon) to open formats.

The goal of this app is to develop a bash interface, modeled after Homebrew (brew.sh), that gives anyone free and open access to data available on the web. The hope is to make three things easy:

  1. Search sources of data available online
  2. Download data and load it into a database of your choice in a common format
  3. Liberate public data by making it open data on your computer, so you can use it in any format rather than a proprietary one (like the US Census, which uses Access)

miner & dat

What is the difference between miner and dat? We don't see ourselves as competitors. Rather, we are working on parallel and complementary projects. Here is what we see as the differences:

  • miner focuses on using a formula (map) to download raw data files straight from their original sources and process them. dat focuses on building collaborative, version-controlled datasets.
  • While miner is in early development, it should be fully operational soon and aims to be a very small application. dat is also in early development and aims to be a much more robust and comprehensive data collaboration tool.
  • Cleaned data (which dat would allow people to share) is great for software projects! However, researchers often need raw datasets so they can choose their own cleaning methods and ensure quality.
  • Some data is public but not yet copyleft/open, so downloading your own copy is the only legal way to use it. Sharing a repository privately may be allowed, but permission to share it publicly would be difficult to get.
  • miner makes it possible to pull raw data regularly and note when data has changed (sometimes for good and perhaps sometimes for less good reasons).
  • miner might be used as one tool to easily dump data into dat.
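
The idea of pulling raw data regularly and noting when it has changed can be illustrated with a checksum comparison. The following is only a minimal sketch in Python, not miner's actual implementation (miner itself is a bash tool). The function names `sha256_of` and `record_if_changed` and the idea of a separate state file are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_if_changed(data_file: Path, state_file: Path) -> bool:
    """Compare data_file's digest with the one stored in state_file.

    Hypothetical helper: returns True (and updates state_file) when the
    digest differs or none was recorded yet, and False when unchanged.
    """
    current = sha256_of(data_file)
    previous = state_file.read_text().strip() if state_file.exists() else None
    if current == previous:
        return False
    state_file.write_text(current + "\n")
    return True
```

Calling `record_if_changed` after each download of the same source would flag the runs in which the publisher's file differs from the previously fetched copy. It does not say what changed or why.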

One way to look at it is that miner exists because of today's non-standards-based, mixed-license, individually or organizationally hosted dataset world. dat could be seen as the forerunner of the open knowledge / open data world.

Contributors

alexanderjfink, ericpp
