DataValueArrays.jl

THIS PACKAGE IS NO LONGER BEING MAINTAINED. ALL THE FUNCTIONALITY FROM THIS PACKAGE HAS BEEN MOVED INTO https://github.com/davidanthoff/DataValues.jl.

Overview

DataValueArrays.jl provides the DataValueArray{T, N} type and its respective interface for use in storing and managing data with missing values.

DataValueArray{T, N} is implemented as a subtype of AbstractArray{DataValue{T}, N} and inherits functionality from the AbstractArray interface.

Missing Values

The central contribution of DataValueArrays.jl is a data structure that uses a single type, namely DataValue{T}, to represent both present and missing values. DataValue{T} is a specialized container type that holds either zero or one value. A DataValue{T} object that contains a value represents a present value of type T that, under other circumstances, might have been missing, whereas an empty DataValue{T} object represents a missing value that would have been of type T had it been present.

Indexing into a DataValueArray{T} is thus "type-stable" in the sense that getindex(X::DataValueArray{T}, i) will always return an object of type DataValue{T} regardless of whether the returned entry is present or missing. In general, this behavior more robustly supports the Julia compiler's ability to produce specialized lower-level code than do analogous data structures that use a token NA type to represent missingness.
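The idea of a container holding zero or one value can be illustrated with a minimal sketch. The Maybe{T} type and the ispresent and unwrap helpers below are hypothetical names invented for this example; they are not the package's DataValue{T} or its API, only a stand-in that shows why a present and a missing entry can share one concrete type:

```julia
# Hypothetical stand-in for DataValue{T}: holds either zero or one value of type T.
struct Maybe{T}
    hasvalue::Bool
    value::T
    # Empty container: `value` is left uninitialized and must never be read.
    Maybe{T}() where {T} = new{T}(false)
    # Full container.
    Maybe{T}(x) where {T} = new{T}(true, convert(T, x))
end

# Convenience constructor that infers T from the argument.
Maybe(x::T) where {T} = Maybe{T}(x)

ispresent(m::Maybe) = m.hasvalue

# Return the wrapped value, or `default` when the container is empty.
unwrap(m::Maybe, default) = m.hasvalue ? m.value : default

present = Maybe(2)
absent = Maybe{Int}()

# Both objects have the same concrete type, so code that receives either one
# can be compiled for Maybe{Int} alone.
println(typeof(present) == typeof(absent))
println(unwrap(present, 0), " ", unwrap(absent, 0))
```

Because the missing case is encoded in a field rather than in a separate type, a function that returns a Maybe{Int} has a single inferred return type whether or not a value is present.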

Constructors

There are a number of ways to construct a DataValueArray object. Passing a single Array{T, N} object to the DataValueArray() constructor will create a DataValueArray{T, N} object with all present values:

julia> DataValueArray(collect(1:5))
5-element DataValueArray{Int64,1}:
 1
 2
 3
 4
 5

To indicate that certain values ought to be represented as missing, one can pass an additional Array{Bool, N} argument; any index i for which the latter argument contains a true entry will return a missing value from the resultant DataValueArray object:

julia> X = DataValueArray([1, 2, 3, 4, 5], [true, false, false, true, false])
5-element DataValueArray{Int64,1}:
#NULL
    2
    3
#NULL
    5

Note that the sizes of the two Array arguments passed to the above constructor method must be equal.

DataValueArrays are designed to look and feel like regular Arrays where possible and appropriate. Thus displaying a DataValueArray object prints the values of present entries and a #NULL designator for missing entries. It is important to note, however, that there is no such #NULL object, and that indexing into a DataValueArray always returns a DataValue object, regardless of whether the entry at the specified index is missing or present:

julia> X[1]
DataValue{Int64}()

julia> X[2]
DataValue(2)

One can initialize an empty DataValueArray object by calling DataValueArray(T, dims), where T is the desired element type of the resultant DataValueArray and dims is either a tuple or sequence of integer arguments designating the size of the resultant DataValueArray:

julia> DataValueArray(Char, 3, 3)
3x3 DataValueArray{Char,2}:
 #NULL  #NULL  #NULL
 #NULL  #NULL  #NULL
 #NULL  #NULL  #NULL

Indexing

Indexing into a DataValueArray{T} is just like indexing into a regular Array{T}, except that the returned object will always be of type DataValue{T} rather than type T. One can expect any indexing pattern that works on an Array to work on a DataValueArray. This includes using a DataValueArray to index into any container object that sufficiently implements the AbstractArray interface:

julia> A = [1:5...]
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

julia> X = DataValueArray([2, 3])
2-element DataValueArray{Int64,1}:
 2
 3

julia> A[X]
2-element Array{Int64,1}:
 2
 3

Note, however, that attempting to index into any such AbstractArray with a null value will incur an error:

julia> Y = DataValueArray([2, 3], [true, false])
2-element DataValueArray{Int64,1}:
 #NULL
     3

julia> A[Y]
ERROR: NullException()
 in _checkbounds at /Users/David/.julia/v0.4/DataValueArrays/src/indexing.jl:73
 in getindex at abstractarray.jl:424
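The rule that a missing index cannot be used for lookup can be sketched with plain arrays. The index_with_mask function below is a hypothetical helper written for this illustration, not part of DataValueArrays.jl, and it throws an ArgumentError where the package reports a NullException:

```julia
# Hypothetical helper, not part of DataValueArrays.jl: index `A` with integer
# indices `idx`, where `isnull[k] == true` marks idx[k] as missing.
function index_with_mask(A::AbstractVector, idx::Vector{Int}, isnull::Vector{Bool})
    length(idx) == length(isnull) ||
        throw(DimensionMismatch("idx and isnull must have the same length"))
    # A missing index has no position to look up, so refuse rather than guess.
    any(isnull) && throw(ArgumentError("cannot index with a missing value"))
    return A[idx]
end

A = collect(1:5)
idx = [2, 3]
isnull = [true, false]

# Dropping the missing entries first makes the lookup well defined.
present_idx = idx[.!isnull]
println(index_with_mask(A, present_idx, fill(false, length(present_idx))))
```

Filtering out the missing entries before indexing, as in the last lines, is one way to avoid the error; whether silently dropping them is appropriate depends on the analysis at hand.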

DataValueArray Implementation Details

Under the hood of each DataValueArray{T, N} object are two fields: a values::Array{T, N} field and an isnull::Array{Bool, N} field:

julia> fieldnames(DataValueArray)
2-element Array{Symbol,1}:
 :values
 :isnull

The isnull array designates whether indexing into an X::DataValueArray at a given index i ought to return a present or missing value.
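The two-field layout described above can be sketched as follows. MaskedArray and Maybe are hypothetical names chosen for this example; the sketch assumes only what the text states (a values array, a parallel isnull array of equal size, and element-wise wrapping on access) and is not the package's actual source:

```julia
# Hypothetical stand-in for DataValue{T}: zero or one value of type T.
struct Maybe{T}
    hasvalue::Bool
    value::T
    Maybe{T}() where {T} = new{T}(false)           # empty; `value` stays uninitialized
    Maybe{T}(x) where {T} = new{T}(true, convert(T, x))
end

# Hypothetical stand-in for DataValueArray{T, N}: values plus a parallel mask.
struct MaskedArray{T,N} <: AbstractArray{Maybe{T},N}
    values::Array{T,N}
    isnull::Array{Bool,N}
    function MaskedArray(values::Array{T,N}, isnull::Array{Bool,N}) where {T,N}
        # The two arrays must describe the same set of indices.
        size(values) == size(isnull) ||
            throw(DimensionMismatch("values and isnull must have the same size"))
        new{T,N}(values, isnull)
    end
end

# Minimal AbstractArray interface: size plus linear getindex.
Base.size(X::MaskedArray) = size(X.values)
Base.IndexStyle(::Type{<:MaskedArray}) = IndexLinear()

# The mask decides whether the stored value is wrapped or an empty container is returned.
function Base.getindex(X::MaskedArray{T}, i::Int) where {T}
    X.isnull[i] ? Maybe{T}() : Maybe{T}(X.values[i])
end

X = MaskedArray([1, 2, 3, 4, 5], [true, false, false, true, false])
println(X[1].hasvalue)
println(X[2].hasvalue, " ", X[2].value)
println(count(x -> !x.hasvalue, X))
```

Keeping the mask in a separate Bool array means the values array stays a plain Array{T, N}; the entry stored at a masked position is simply never read.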

History

This package started as a fork of NullableArrays.jl and owes a lot to the fantastic work contributors have done in that repository to get things started.

Contributors

abhijithch, andreasnoack, andyferris, ararslan, bkamins, cjprybol, davidagold, davidanthoff, iainnz, johnmyleswhite, mbauman, nalimilan, quinnj, ranjanan, ratanrsur, scottpjones, tkelman
