
data.okfn.org-new's People

Contributors

acouch, anuveyatsu, bnvk, danfowler, davidmiller, deiz, floppy, geraldb, harry-wood, jamiekt, kriskusano, lauriej, ldodds, ljoelle, mchelen, mikanebu, pdehaye, peterdesmet, pwalsh, rufuspollock, sanjaypoyzer, stevage, sxren, tfmorris, the42, todrobbins, tuukka, vitorbaptista, waldoj, yannael

data.okfn.org-new's Issues

Recline graph specs to vega view spec code

  • I think we should focus on doing this on the backend with Node.js before the code goes to the frontend
  • Should be very simple to test
  • Support all recline graph options (see the recline flot graph documentation) - line graph, bar chart (both ways)
    • Multiple series (multiple lines)

How do we handle inlining the "resource" data - do we want to use datapackage-render-js?

"Data" page layout is messed up

Under Country and Regional Analyses (CRA) - UK Government Finances and Hotline SOS Démocratie 2013 in the Community Datasets list, there are two gigantic lists of sources that should be shortened for the sake of design and page layout.

Looking at it more closely, it seems both packages have been poorly packaged. Even though this is the community part, I think we have to do something about it, as it makes the website unreadable.

http://prntscr.com/b884pm

http://prntscr.com/b884zk

Nice Owner/User page in data section

From @rgrp on June 7, 2014 14:50

Pages like: /data/{username}

Suggest

  • Pull their gravatar and name (from github API)
    • Do we want to cache this?
  • List their data packages (as per normal list)
    • Bonus for allowing quick filtering a la the main listing page

Note that the "core" user will be special ... (we will need to set their gravatar specially)

User Stories

As a Data User I want to see all the Data Packages produced by a particular organization or user so that I can find new ones that are relevant to me

  • Especially useful for core datasets
  • List all datasets (no pagination for now!)

As a Data User I want to see all the Data Packages produced by a particular organization or user so that I can get a sense of the quality of their work

Copied from original issue: frictionlessdata/frictionlessdata.io#111

Normalize licenses and license names and display in dataset view

From @rgrp on July 3, 2013 9:6

At the moment it is not clear exactly what is required for licenses: some of the time we just have ids, and other times names and urls. We want to ensure that, given an id, we always have a name and url - we could look this up from licenses.opendefinition.org ...

In terms of the interface we want to also handle the unknown case (should that ever happen!!)

This would be part of the tools datapackage normalize code.
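
A minimal sketch of what such a normalize step could look like. The lookup table below is a hypothetical stand-in for data that would be fetched from licenses.opendefinition.org, and `normalize_license` and `UNKNOWN` are illustrative names, not part of any existing tool.

```python
# Hypothetical lookup table; in practice this would be loaded from
# licenses.opendefinition.org. The two entries are examples only.
KNOWN_LICENSES = {
    "CC0-1.0": {
        "name": "CC0 1.0",
        "url": "https://creativecommons.org/publicdomain/zero/1.0/",
    },
    "CC-BY-4.0": {
        "name": "Creative Commons Attribution 4.0",
        "url": "https://creativecommons.org/licenses/by/4.0/",
    },
}

# Placeholder shown in the interface for the unknown case.
UNKNOWN = {"id": None, "name": "License not specified", "url": None}


def normalize_license(entry, known=KNOWN_LICENSES):
    """Return a dict with id, name and url for a license.

    `entry` may be an id string, a partial dict, or None.
    """
    if isinstance(entry, str):
        entry = {"id": entry}
    entry = dict(entry or {})
    if not (entry.get("id") or entry.get("name") or entry.get("url")):
        return dict(UNKNOWN)
    license_id = entry.get("id")
    looked_up = known.get(license_id, {})
    return {
        "id": license_id,
        # Values already in the descriptor win; the lookup only fills gaps.
        "name": entry.get("name") or looked_up.get("name") or license_id,
        "url": entry.get("url") or looked_up.get("url"),
    }
```

An id that is missing from the table keeps its id as the display name and gets no url, so the dataset view can still render something sensible.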

Copied from original issue: frictionlessdata/frictionlessdata.io#55

Datasets Data URLs and API generally

From @rgrp on February 24, 2013 18:26

This issue is about the URL / API structure for accessing data (and metadata) from the data packages.

Current Situation

  • For stuff under /data/: /data/{dataset}/datapackage.json and /data/{dataset}.csv
  • For other stuff either at /tools/view/ or /community/ via: http://data.okfn.org/tools/dataproxy/?url={path-to-csv} (though this is not much different from datapipes.okfnlabs.org/csv/raw/?url=.... and leaves much to be desired)

Proposal

/data/ + /community/ data packages

For /data/ and /community/ data packages:

/.../{dataset}/datapackage.json     # the datapackage.json file

## data urls
/.../{dataset}/r/{resource-name-or-order}.{format}  

so e.g.

/.../gdp/r/annual.csv   # resource name
/.../gdp/r/0.csv           # resource by index

Formats that we should support would be:

  • {format} = csv | json | html | raw (by default)
  • {resource-name} = name as in resources entry. (Also allow index, e.g. 0 for first resource, 1 for second resource etc, as in the example above).
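
As an illustration of the proposed structure, here is a rough sketch of how a server might parse such a path. The function name and the returned dict shape are assumptions made for this example, and the pattern is simplified: it does not allow dots inside resource names.

```python
import re

# Matches .../{dataset}/r/{resource-name-or-index}[.{format}]
RESOURCE_PATH = re.compile(
    r"/(?P<dataset>[^/]+)/r/(?P<resource>[^/.]+)"
    r"(?:\.(?P<format>csv|json|html|raw))?$"
)


def parse_resource_path(path):
    """Split a data URL path into dataset, resource and format.

    Returns None if the path does not follow the proposed structure.
    """
    match = RESOURCE_PATH.search(path)
    if match is None:
        return None
    resource = match.group("resource")
    if resource.isascii() and resource.isdigit():
        resource = int(resource)  # resource addressed by index
    return {
        "dataset": match.group("dataset"),
        "resource": resource,
        # "raw" is the default when no extension is given.
        "format": match.group("format") or "raw",
    }
```

With this sketch, `/data/gdp/r/annual.csv` resolves to the resource named `annual` in CSV form, and `/data/gdp/r/0` resolves to the resource at index 0 in raw form.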

Addressing individual elements

Longer-term we could support addressing individual elements, e.g. addressing rows or cells in a dataset:

.../gdp/r/annual/5/        # row 5 of this dataset, rendered as HTML by default
.../gdp/r/annual/5.csv  # in CSV format
.../gdp/r/annual/5/year/  # cell in row 5, field year (in HTML form by default)

.../{dataset}/r/{resource-name-or-index}/{row-index-or-primary-key}[.html | .csv | .json]
.../{dataset}/r/{resource-name-or-index}/{row-index}/{field-name-or-index}[.html | .csv | .json]

Questions:

  • How do we distinguish row index from primary key when both are numerical (which takes precedence?) - I'd argue PK should take precedence and we have e.g. i:{number} for the index
    • That said index is always possible whereas primary key may be absent ...
  • Support for ranges - see approach to this in datapipes
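
One possible reading of the precedence rule discussed above, sketched under assumptions: `resolve_row` is a hypothetical helper, rows are plain dicts, and indexes are taken to be 0-based (the issue does not settle this).

```python
def resolve_row(rows, key, primary_key=None):
    """Find a row from a URL segment.

    - "i:{number}" always means a row index.
    - Otherwise, if the resource declares a primary key, the segment is
      treated as a primary key value (PK takes precedence).
    - Without a primary key, a bare number falls back to the row index.
    """
    if key.startswith("i:"):
        return rows[int(key[2:])]
    if primary_key is not None:
        for row in rows:
            if str(row[primary_key]) == key:
                return row
        raise KeyError(key)
    return rows[int(key)]
```

The explicit `i:` prefix keeps index access available even when a numerical primary key is present, which addresses the ambiguity raised in the first question.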

Data packages somewhere online

We follow something similar to the other case but instead of data package name in the url we move the data package url to the query string:

/api/datapackage.json?url={datapackage-url}
/api/data/{resource-name-or-index}.{format}?url={datapackage-url}

# e.g. this returns first resource as CSV
/api/data/0.csv?url=https://raw.github.com/datasets/browser-stats/master/datapackage.json
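
Because the data package url travels in the query string, a client should percent-encode it. A small sketch, assuming the `/api/data/...` layout proposed above; the helper names are made up for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def api_data_url(datapackage_url, resource=0, fmt="csv"):
    """Build the proposed API path for one resource of a remote data package."""
    query = urlencode({"url": datapackage_url})  # percent-encodes ':' and '/'
    return f"/api/data/{resource}.{fmt}?{query}"


def datapackage_url_from(api_url):
    """Recover the data package url on the server side."""
    return parse_qs(urlsplit(api_url).query)["url"][0]
```

Encoding matters once the data package url itself carries a query string, since an unencoded `&` would otherwise be read as a separate parameter of the API request.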

Discussion

  • data.json is the serialization in the most obvious way - i.e. convert to a hash
    • alternatively, provide this in a results-style format (and include the schema)
  • Should we use download attribute to set filename ...?
    • Not needed in above
  • (Now supported) How do we handle multiple data resources / files?
    • worry about that in the future - so only support first resource for the moment (this is good as it privileges single resource data packages ...)

Appendix

Alternatives

Alternatively could be:

{dataset}/{filename}.csv
{dataset}/{filename}.json (CORS enabled ...)

Or

{dataset}/data.csv

Think the former is better ...

Copied from original issue: frictionlessdata/frictionlessdata.io#19

Replace Recline with Vega views

Recline spec to Vega

      "id": "Graph",
      "type": "Graph",
      "state": {
        "graphType": "lines",
        "group": "Date",
        "series": [ "VIXClose" ]
      }

=> Vega(-lite)

      "id": "Graph",
      "type": "vega-lite",
      "spec": {
         // appropriate vega spec here ... e.g. for example above a line graph ...
      }
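
A rough sketch of the conversion, written in Python for brevity even though the issue suggests Node.js. It assumes the Recline graph types are `lines`, `points`, `lines-and-points`, `columns` (vertical bars) and `bars` (horizontal bars); `recline_to_vega_lite` and its `group_type` parameter are hypothetical, and data inlining is left out entirely.

```python
# Assumed mapping from Recline graphType to a Vega-Lite mark.
RECLINE_TO_MARK = {
    "lines": "line",
    "points": "point",
    "lines-and-points": {"type": "line", "point": True},
    "columns": "bar",  # vertical bars
    "bars": "bar",     # horizontal bars (axes are swapped below)
}


def recline_to_vega_lite(view, group_type="temporal"):
    """Convert a Recline graph view into a vega-lite view (no data attached).

    `group_type` is the Vega-Lite type of the group field; it cannot be
    derived from the Recline state alone, so it is an explicit assumption.
    """
    state = view["state"]
    graph_type = state.get("graphType", "lines")
    layers = []
    for series in state["series"]:  # one layer per series
        group = {"field": state["group"], "type": group_type}
        value = {"field": series, "type": "quantitative"}
        if graph_type == "bars":
            encoding = {"x": value, "y": group}
        else:
            encoding = {"x": group, "y": value}
        layers.append({"mark": RECLINE_TO_MARK[graph_type], "encoding": encoding})
    return {"id": view["id"], "type": "vega-lite", "spec": {"layer": layers}}
```

Using one layer per series keeps multiple-series graphs simple; a fold transform with a color encoding would be an alternative worth comparing.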

Data version API

From @trickvi on June 20, 2013 12:37

I would like to be able to use data.okfn.org as an intermediary between my software and the data packages it uses and be able to quickly check whether there's a new version available of the data (e.g. if I've cached the package on a local machine).

There are ways to do it with the current setup:

  1. Download the datapackage.json descriptor file, parse it and get the version there and check it against my local version. Problems:
    • This solution relies on humans updating the version, and there may be no consistency in how they do so, since the data package standard only describes the version attribute as: "a version string conforming to the Semantic Versioning requirement"
    • I have to fetch the whole datapackage.json (it's not big I know but why download all that extra data I might not even want)
  2. Go around data.okfn.org and look directly at the github repository. Problems:
    • I have to find out where the repo is, use git and do a lot of extra stuff (I don't care how the data packages are stored, I just want a simple interface to fetch them)
    • What would be the point of data.okfn.org/data? In my mind it collects data packages and provides a consistent interface to get the data packages irrespective of how they are stored.

I propose that data.okfn.org provide an internal mechanism that lets users quickly check whether a new version has been released. This does not have to be an API. We could leverage HTTP's caching mechanism using an ETag header containing a hash value. This hash value could be, e.g., the sha of the head ref object served via the GitHub API:

https://api.github.com/repos/datasets/cpi/git/refs/heads/master

Software that works with data packages could then implement a caching strategy and just send a request with an If-None-Match header along with a GET request for datapackage.json to either get a new version of the descriptor (and look at the version in that file) or just serve the data from its cache.
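
The exchange could look roughly like the sketch below. Both functions are hypothetical: `respond` stands for the server deciding between 200 and 304, and `fetch_descriptor` for a client with a simple dict cache. The network call is injected as `send_request` so the logic can be followed without a live server, and If-None-Match handling is simplified to a single strong ETag.

```python
def respond(body, head_sha, if_none_match=None):
    """Server side: answer a GET for datapackage.json.

    `head_sha` is the hash used as the ETag, e.g. the sha of the head ref.
    """
    etag = '"%s"' % head_sha
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""  # nothing changed, no body sent
    return 200, {"ETag": etag}, body


def fetch_descriptor(cache, send_request):
    """Client side: return the descriptor, reusing the cached copy on 304.

    `send_request(headers)` performs the GET and returns
    (status, headers, body).
    """
    headers = {}
    if "etag" in cache:
        headers["If-None-Match"] = cache["etag"]
    status, response_headers, body = send_request(headers)
    if status == 304:
        return cache["body"]
    cache["etag"] = response_headers["ETag"]
    cache["body"] = body
    return body
```

Only the first request transfers the descriptor; later requests cost a header round trip until the hash changes.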

Copied from original issue: frictionlessdata/frictionlessdata.io#51

Create dataset groups (tagging)

I'd like to have all datasets relating to a topic grouped together (e.g. all datasets related to maritime transportation).
This could be done by adding tags as keywords in the datapackage.json and having pages on the website which load the list of datasets by keyword.
Example: http://data.okfn.org/data/keywords/container would load all datasets related to containers (container codes, IMDG code, and so on)
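
A keyword page could be backed by a simple index built from the descriptors, sketched here with a hypothetical `index_by_keyword` helper. It relies on the `keywords` array in datapackage.json, and lowercasing keywords is an assumption made so that URLs stay case-insensitive.

```python
from collections import defaultdict


def index_by_keyword(descriptors):
    """Map each keyword to the sorted names of the datasets that carry it."""
    index = defaultdict(list)
    for descriptor in descriptors:
        for keyword in descriptor.get("keywords", []):
            index[keyword.strip().lower()].append(descriptor["name"])
    return {keyword: sorted(names) for keyword, names in index.items()}
```

A request for /data/keywords/container would then list whatever the index holds under `container`.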
