census2016.datapack's Introduction

Census2016.DataPack

Extended R package for the 2016 Australian Census

This is the development site for an R package containing all 2016 Census data released by the ABS through its data packs. The data will be too large to host through CRAN.

Motivation

I think the best way to explain my motivation for producing this package is to show you a variable name from Table G38 of the data pack:

Se_d_r_or_t_h_t_Tot_NofB_0_ib

There are two problems with the data packs:

  1. The variable names are arcane.
  2. The data is not tidy: subtotals and subvariables lurk among the variable names.

The goals of this package are:

  1. To tidy the data so that the tables are normalized.
  2. To provide readable variable names at all costs.
  3. To provide predictable table names and structure to support autocompletion.

Specification

Table names

  1. Measure columns are in CamelCase, with an optional suffix for upper/lower bounds (.min and .max).
  2. All table names:
    1. start with [A-Z0-9]{3}, representing the geographic extent of the key (e.g. tables starting with LGA are summaries of Local Government Areas, those starting with STE are summaries of states/territories)
    2. followed by two underscores
    3. followed by the names of the measure columns in CamelCase separated by underscores
    4. and finish with an underscore and the value column name (unless the value column is persons, in which case it is omitted).
  3. The measure columns are in alphabetical order (except for subitems).
  4. The value columns are in lower snake_case. (TODO)
  5. Tables never contain subtotals.
  6. Tables are ordered by the key and then by the measure columns.
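The naming rules above can be sketched in R. This is only an illustration under stated assumptions: `make_table_name()` and `is_valid_table_name()` are hypothetical helpers, not functions exported by the package, and the measure names `Age` and `Sex` and the value column `dwellings` are made up for the example. The exception for subitems in rule 3 is not handled.

```r
# Hypothetical helper (not part of the package): assemble a table name
# from its parts, following the naming rules listed above.
make_table_name <- function(geo, measures, value = "persons") {
  # Rule 2.1: a three-character geographic prefix such as LGA or STE.
  stopifnot(grepl("^[A-Z0-9]{3}$", geo), length(measures) >= 1L)
  # Rule 3: measure columns in alphabetical order (C-locale sort;
  # the subitem exception is ignored in this sketch).
  measures <- sort(measures, method = "radix")
  # Rules 2.2 and 2.3: two underscores, then measures joined by underscores.
  name <- paste0(geo, "__", paste(measures, collapse = "_"))
  # Rule 2.4: append the value column unless it is "persons".
  if (!identical(value, "persons")) {
    name <- paste0(name, "_", value)
  }
  name
}

# Hypothetical check that a name has the overall shape described above.
is_valid_table_name <- function(x) {
  grepl("^[A-Z0-9]{3}__[A-Z][A-Za-z0-9]*(_[A-Za-z][A-Za-z0-9]*)*$", x)
}

make_table_name("LGA", c("Sex", "Age"))
# "LGA__Age_Sex"
make_table_name("STE", "Age", value = "dwellings")
# "STE__Age_dwellings"
is_valid_table_name("LGA_Age")
# FALSE (only one underscore after the prefix)
```

Because every name starts with the geographic prefix followed by two underscores, typing `LGA__` at the console should narrow autocompletion to the Local Government Area tables.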

In addition:

  1. The package tarball should be under 100 MB (so that it can be uploaded to a drat repository).

census2016.datapack's People

Contributors

ellisp, hughparsonage

census2016.datapack's Issues

FamilyComposition should be reordered

Currently, the levels are ordered as follows:

[1] Couple family with no children  Couple family with children     One parent family with children
[4] Other family                    Lone-person household           Group household                
6 Levels: Couple family with no children < Couple family with children < ... < Group household
