data-juggler's People

Contributors

arrayout, caesarsol, dependabot[bot], glippi, ilariaventurini, lucafalasco, nofishlikeian, serenag

data-juggler's Issues

TODO: data-juggler 1.0

const datasetRaw = [
  { height: 190, gender: 'male', timeOfMeasure: 1552397833139 },
  { height: 170, gender: 'female', age: 22, timeOfMeasure: 1552397832139 },
  { height: 164, gender: 'female', age: 20, timeOfMeasure: 15523912333139 },
  { height: 176, gender: 'female', age: 12 }
]

function autoTypeColumn(datasetColumn) {
  if (...) return { type: 'continuous', max: 100, min: 0, nullable: true }
  else if (...) return { type: 'categorical', enum: ['male', 'female', 'neutral'], nullable: false }
  else if (...) return { type: 'date', max: 000, min: 000, readFormat: 'Y-m-d', nullable: true }
}

function autoCleanData(dataset, missingData = []) {
  /*
    - Transform missing data in `null` (do it also for undefined)
    - Transform strings containing numbers to Numbers
    - Missing days or temporal periods? Maybe add nulls
  */
  return [...]
}

const columnTypes = {
  // height: { type: 'continuous', min: 164, max: 190 },
  // gender: { type: 'categorical', enum: ['male', 'female', 'neutral'], nullable: false },
  // age: { type: 'continuous', ... },
  // timeOfMeasure: { type: 'date', max: 000, min: 000, readFormat: 'Y-m-d', nullable: true },
  
  [x]: autoTypeColumn(data.map(d => d[x])),
}

const dataset = dataJuggle({
  dataset: autoCleanData(datasetRaw, ['', 'N/A']), 
  columnTypes,
  formatters: null, // The default formatters for each data type will be added
  // virtualColumns, // FUTURE
})

dataset[0] === {
  height: { raw: 190, scaled: 1 },
  timeOfMeasure: { ... },
}
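The two helpers in the TODO above are only outlined. As a minimal sketch of what they might do, assuming that "missing" means `undefined`, `null`, or any value listed in `missingData`, and that a column counts as continuous when every present value is a number, they could look like the following. Date detection is deliberately left out, since a bare number cannot be told apart from a timestamp without extra hints; the inferred `min`, `max`, and `enum` values are illustrative and not the library's actual behaviour.

```javascript
// Hypothetical sketch of the cleaning step: not the library's implementation.
function autoCleanData(dataset, missingData = []) {
  // Collect every column name that appears in at least one row.
  const columns = [...new Set(dataset.flatMap(row => Object.keys(row)))]

  return dataset.map(row => {
    const cleaned = {}
    for (const column of columns) {
      const value = row[column]
      if (value === undefined || value === null || missingData.includes(value)) {
        // Missing data (including absent keys) becomes null.
        cleaned[column] = null
      } else if (
        typeof value === 'string' &&
        value.trim() !== '' &&
        Number.isFinite(Number(value))
      ) {
        // Strings containing numbers become Numbers.
        cleaned[column] = Number(value)
      } else {
        cleaned[column] = value
      }
    }
    return cleaned
  })
}

// Hypothetical sketch of type inference for a single column of values.
function autoTypeColumn(datasetColumn) {
  const present = datasetColumn.filter(v => v !== null && v !== undefined)
  const nullable = present.length < datasetColumn.length

  if (present.length > 0 && present.every(v => typeof v === 'number')) {
    return {
      type: 'continuous',
      min: Math.min(...present),
      max: Math.max(...present),
      nullable,
    }
  }
  // Assumption: anything that is not purely numeric is treated as categorical.
  return { type: 'categorical', enum: [...new Set(present)], nullable }
}

const cleaned = autoCleanData(
  [
    { height: '190', gender: 'male' },
    { height: 170, gender: 'N/A', age: 22 },
  ],
  ['', 'N/A']
)
const heightType = autoTypeColumn(cleaned.map(d => d.height))
const genderType = autoTypeColumn(cleaned.map(d => d.gender))
```

Filling absent keys with `null` keeps every row the same shape, which makes the later per-column `map` calls straightforward.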

Compilation problem using `commonjs`/`es6`

tsconfig.json has an option called module, whose value can be commonjs, es6, ...

If data-juggler sets module: "commonjs", you can run:

  • yarn test
  • yarn build
  • npx ts-node ./benchmark-test/create-dataset.ts
  • npx ts-node ./benchmark-test/main.ts

but if you run views, you get this error:

referenceError

So, change module: "commonjs" to module: "es6", then:

  • yarn build
  • run views.

Example of integration with mobx-state-tree?

Hello sir, how do I use this library with the really popular library mobx-state-tree?

Here is an example of what I'm doing today

export const Data = t
  .model('Data', {
    values: t.frozen(VALUES as ValueType[]),
  })

export const State = t
  .model('State', {
    data: t.optional(Data, {}),
    // other substates...
  })

const state = State.create({})

First steps... 🚶‍♂️

Design going forward

The library is coming along and this is a way to gather some feedback before implementing the more advanced features.

The purpose... of the library is to force us to define, in a single place, the properties of the data we want to propagate through the application, and to abstract away some of the getters and scaling. When you join a project, you know that every component has access to all the properties of the datum, including a scaled value between 0 and 1, which could become the standard. There is no need to browse an ad hoc dataStore, because the properties can only have been declared in one place.

This library could solve issues like the following: you plotted yet another ugly scatterplot and need to add a tooltip, but you did not propagate the raw or the formatted data, so now you have to change the code in multiple places. With this library (in theory), either the property is already there or you just add it at the source.

So far... the library works with three basic inputs: two required ones, an instance of CSV-like data and an object describing the variables' "meta-data", and an optional set of functions that define custom datum properties. For example:

 data = [
  { height: 190, gender: 'male', timeOfMeasure: 1552397833139 },
  { height: 170, gender: 'female', age: 22, timeOfMeasure: 1552397832139 },
  { height: 164, gender: 'female', age: 20, timeOfMeasure: 15523912333139 },
  { height: 176, gender: 'female', age: 12 }
];

types = {
  height: 'continuous',
  gender: 'categorical',
  age: 'continuous',
  timeOfMeasure: 'date'
};

formatter = {
  height: [{
    property: 'feet',
    compute: (datum) => datum * 0.0328084
  },
  {
    property: 'rescaled',
    compute: (datum, min, max) => datum / max
  }],
  timeOfMeasure: [{
    property: 'year',
    compute: (day) => day.format('YYYY')
  }]
}

// dataStoreFactory is imported eventually
const storeStateAndVariousNames = dataStoreFactory(data, types, formatter)

This gives you nice properties. Getters per "column"...

storeStateAndVariousNames.height[0] // { raw: 190, scaled: 1, feet: 6.233596, rescaled: 1 }
storeStateAndVariousNames.timeOfMeasure[0] // { dateTime: dayjs.Dayjs(this.raw), isValid: true, iso: '2019-03-12T14:37:13+01:00', raw: 1552397833139, scaled: 1 }

... some stats ...

storeStateAndVariousNames.stats.height // { min: 164, max: 190 }
storeStateAndVariousNames.stats.gender // { frequencies: { male: 1, female: 3 } }

... and more coming up hopefully.
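To make the shape of those getters and stats concrete, here is a minimal sketch of how the entries for one continuous column might be built. `buildContinuousColumn` is a hypothetical helper, not part of the library, and it assumes that `scaled` is a min-max normalisation into [0, 1], which agrees with the `height` example above but is not confirmed by it.

```javascript
// Hypothetical helper, not the library's implementation: builds the
// per-row entries and the stats for one continuous column.
function buildContinuousColumn(rows, name, formatters = []) {
  const values = rows.map(row => row[name]).filter(v => typeof v === 'number')
  const min = Math.min(...values)
  const max = Math.max(...values)

  const entries = rows.map(row => {
    const raw = row[name]
    // Rows without a numeric value get an empty entry.
    if (typeof raw !== 'number') return { raw: null, scaled: null }

    // Assumption: `scaled` is min-max normalisation into [0, 1].
    const scaled = max === min ? 0 : (raw - min) / (max - min)
    const entry = { raw, scaled }

    // Each formatter adds one custom property to the datum.
    for (const { property, compute } of formatters) {
      entry[property] = compute(raw, min, max)
    }
    return entry
  })

  return { entries, stats: { min, max } }
}

const rows = [{ height: 190 }, { height: 170 }, { height: 164 }, { height: 176 }]
const height = buildContinuousColumn(rows, 'height', [
  { property: 'feet', compute: datum => datum * 0.0328084 },
  { property: 'rescaled', compute: (datum, min, max) => datum / max },
])
```

Passing `min` and `max` to every `compute` function mirrors the `rescaled` formatter above, which relies on the column maximum.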

Going forward... there are certain things that we would like to implement and that we are certain will come:

  • Virtual columns, each computed by a function that takes the whole "row" as its only argument, in the fashion of the formatter object (as proposed by @YeasterEgg)

  • Detaching from mobx-state-tree
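Since virtual columns are not implemented yet, the following is only a sketch of one possible shape, assuming they reuse the `{ property, compute }` convention of the formatter object but receive the whole row. The names `virtualColumns`, `addVirtualColumns`, and `isAdult` are hypothetical.

```javascript
// Hypothetical shape: each virtual column is computed from the whole row.
const virtualColumns = [
  {
    property: 'isAdult',
    compute: row => typeof row.age === 'number' && row.age >= 18,
  },
]

// Returns new rows with the virtual properties added; the input is not mutated.
function addVirtualColumns(rows, columns) {
  return rows.map(row => {
    const extended = { ...row }
    for (const { property, compute } of columns) {
      extended[property] = compute(row)
    }
    return extended
  })
}

const people = [
  { height: 170, gender: 'female', age: 22 },
  { height: 176, gender: 'female', age: 12 },
  { height: 190, gender: 'male' },
]
const withVirtual = addVirtualColumns(people, virtualColumns)
```

Computing from the original `row` rather than from `extended` keeps virtual columns independent of each other, at the cost of not allowing one virtual column to build on another.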

The point of this issue is to learn which use cases and problems from your projects you would like the library to address, and what would speed up your workflow. For example, would you want to pass a fetching function in order to abstract this junk?

[ENH] Logarithmic scale

The datum should also contain a logScaled property:

{...a, logScaled: number}

where logScaled is the logarithmically scaled and normalised value.
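The request does not say how the logarithmic value should be normalised. One plausible reading, sketched here as an assumption, is to apply min-max normalisation to the natural logarithms, which maps the column minimum to 0 and the maximum to 1 and is only defined for strictly positive values. `logScale` is a hypothetical helper name.

```javascript
// Hypothetical helper: min-max normalisation applied to natural logarithms.
// Returns null when the logarithm is undefined or the range is empty.
function logScale(raw, min, max) {
  if (raw <= 0 || min <= 0 || max <= min) return null
  return (Math.log(raw) - Math.log(min)) / (Math.log(max) - Math.log(min))
}

// Extending an existing datum `a` as in the request above.
const a = { raw: 176, scaled: (176 - 164) / (190 - 164) }
const datum = { ...a, logScaled: logScale(a.raw, 164, 190) }
```

Returning `null` for non-positive inputs is one option; a real implementation might instead shift the data or reject the column up front.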
