csvw.org's People

Contributors

canwaf, rickmoynihan, robsteranium

csvw.org's Issues

Invite contributors

We currently ask for input about tools but we should also invite other submissions - particularly for guides or examples.

We could put an invite in the footer and perhaps add contributing guidelines to this repo.

Improve rationale for CSVW

One thing I think the site would benefit from is improving the rationale for CSVW a bit more.

The page here is a good start and provides a brief summary of some problems with CSV:

https://swirrl.github.io/csvw.org/guides/why-use-csvw.html

However, to summarise crudely (and so highlight the issue), the argument for CSVW as presented reads a bit like this:

  1. CSV has a bunch of problems (many dialects, only one datatype, parsing issues, etc.)...
  2. but it's open, so yay, we can have 3 star data...
  3. but you really want 5 stars, so you need CSVW!

i.e. we jump straight into the 5 star model and don't describe how CSVW solves any of the stated problems with CSV. I think for most users this is the low-hanging fruit. Linking with identifiers and connecting over the web are definitely benefits, but it would be good to expand on fixing the problems EVERYONE has with CSV first :-)

I think this can largely be solved by riffing on the headings we have on the front page, before we get into the linked data story:

[Screenshot of the front-page headings, 2021-10-26]

Indeed it might be worth de-emphasising the linked data bits, or separating them out from the low-hanging fruit.

Add more tools with brief summary / assessment

Here are some tools I discovered -- though I haven't yet looked at or tried any of them in any real depth:

At some point it would be worth assessing them for completeness etc before offering recommendations or guidance.

Perhaps we could use the wiki to maintain an ad hoc list more easily, and eventually use the site for a promoted / vetted list?

Add header:false to grit bin example?

I'm new to CSVW, having started looking at it after your ODI Lunchtime Lecture, and I noticed that the example grit bin data from Data Mill North doesn't have a header row. From the W3C page, it looks as though a parser assumes there will be one header row by default, so would the example lose the first row of grit_bins.csv? If I've understood the W3C page correctly, adding the following would keep the first row:

"dialect": {
    "header": false
}

Is it necessary or worth explicitly including that in the example on either the front page or the "How to make CSVW" page?
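To make the concern concrete, here is a minimal sketch of how a consumer might honour the `header` flag. It is not taken from any CSVW tool: `read_rows` and the sample rows are hypothetical, and the only thing borrowed from the spec is that `header` defaults to true.

```python
import csv
import io


def read_rows(csv_text, metadata):
    """Return (header, data_rows), honouring the CSVW dialect's `header` flag."""
    dialect = metadata.get("dialect", {})
    # The first row is treated as a header unless the dialect says otherwise.
    has_header = dialect.get("header", True)
    rows = list(csv.reader(io.StringIO(csv_text)))
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


# Hypothetical stand-in for a header-less file; not the real grit bin data.
SAMPLE = "42,Example Road,53.8,-1.5\n43,Sample Lane,53.9,-1.6\n"

# Without the dialect entry, the first data row is consumed as a header.
header_default, rows_default = read_rows(SAMPLE, {})

# With "header": false, every row is kept as data.
header_off, rows_off = read_rows(SAMPLE, {"dialect": {"header": False}})

print(len(rows_default), len(rows_off))
```

Under these assumptions the default reading silently drops one record, which is the behaviour the question is worried about.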

Hierarchical (or simply linked/inherited) metadata

Hi! Thanks for the cool site. A few years ago, at a standards-unification conference for biological data, we looked at CSV on the Web as something we should all adopt. A lot of our data came from repeated experiments, so we started talking about a hierarchical version, i.e. you would:

  1. Create a directory (on a disk, at a URL, in a zip file etc.) with a metadata file marking it as a special "resource"
  2. Have a metadata file for this directory (e.g. saying "lab = ...") and further metadata files for subdirectories ("experiment type = ...", "cell_type = ...", "temperature = ..."), so that each subdirectory could either add to or overwrite the parent directory's metadata fields
  3. Finally have the CSV metadata, which "inherits" all the metadata from the directory it's stored in

Do you know if there have been any efforts like this? Or some other mechanism to achieve similar goals (e.g. a field in the JSON that says "please also include all of the stuff at this URI")?

Thanks in advance, sorry for abusing the issue system.
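The inheritance idea described in this issue could be sketched roughly as follows. This is an illustration only, not an existing CSVW feature: the per-directory file name `metadata.json`, the `inherited_metadata` helper, and the shallow merge are all assumptions. The one borrowed convention is looking for the CSV's own metadata at `<file>-metadata.json`.

```python
import json
import tempfile
from pathlib import Path

METADATA_NAME = "metadata.json"  # hypothetical per-directory file name


def inherited_metadata(root, csv_path):
    """Merge metadata from `root` down to the CSV, with deeper levels winning."""
    root = Path(root).resolve()
    csv_path = Path(csv_path).resolve()
    # Build the chain of directories from the root to the CSV's own directory.
    directories = [root]
    for part in csv_path.parent.relative_to(root).parts:
        directories.append(directories[-1] / part)
    candidates = [directory / METADATA_NAME for directory in directories]
    # The CSV's own metadata file is applied last.
    candidates.append(csv_path.with_name(csv_path.name + "-metadata.json"))
    merged = {}
    for candidate in candidates:
        if candidate.is_file():
            # Shallow merge: a key set lower down replaces the parent's value.
            merged.update(json.loads(candidate.read_text(encoding="utf-8")))
    return merged


# Usage with made-up values in a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    experiment = root / "experiment-1"
    experiment.mkdir()
    (root / METADATA_NAME).write_text(
        json.dumps({"lab": "Example Lab", "temperature": "20C"}), encoding="utf-8"
    )
    (experiment / METADATA_NAME).write_text(
        json.dumps({"cell_type": "example", "temperature": "37C"}), encoding="utf-8"
    )
    data = experiment / "results.csv"
    data.write_text("a,b\n1,2\n", encoding="utf-8")
    result = inherited_metadata(root, data)

print(result)
```

A shallow merge keeps the sketch short. Nested structures such as a table schema would need a deliberate merge rule, which is a design decision this sketch does not attempt to settle.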

Add guides on consuming CSVW

I think we really need to add some guides that demonstrate and motivate the consumption side, to form a complete story. Perhaps COW and CSVLint can currently fulfil this side?

It would be nice, for example, to add a guide on consuming the grit bins example and showing the benefits there in R, Python, Ruby etc…

Generally I think the biggest bang for the buck for CSVW is supporting a consumer-side CSVW profile which just adds basic datatype coercion support to the CSV files themselves. I guess you’d need to support URI templates too, so you could have URI as a datatype.

I'm not sure what tooling currently fulfils this use case, but I think if CSVW is to succeed this is a missing piece.
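As a rough idea of what such a consumer-side profile might look like, here is a minimal sketch that coerces cells using the `datatype` declared for each column in the metadata. The `read_typed` helper, the converter table, and the sample data are hypothetical. It covers only a handful of datatypes, assumes a single header row, and ignores formats, URI templates, and validation.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Converters for a few CSVW built-in datatypes; anything else stays a string.
CONVERTERS = {
    "string": str,
    "integer": int,
    "decimal": Decimal,
    "number": float,
    "double": float,
    "date": date.fromisoformat,
    "boolean": lambda text: {"true": True, "1": True, "false": False, "0": False}[text],
}


def read_typed(csv_text, metadata):
    """Return a list of dicts with cells coerced per the column datatypes."""
    columns = metadata["tableSchema"]["columns"]
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row (assumes the default dialect)
    typed_rows = []
    for row in reader:
        record = {}
        for column, cell in zip(columns, row):
            datatype = column.get("datatype", "string")
            # A datatype may also be an object; fall back to its "base".
            if isinstance(datatype, dict):
                datatype = datatype.get("base", "string")
            convert = CONVERTERS.get(datatype, str)
            # Treat the empty string as a missing value.
            record[column["name"]] = None if cell == "" else convert(cell)
        typed_rows.append(record)
    return typed_rows


# Made-up metadata and data for illustration; not the real grit bins files.
METADATA = {
    "url": "bins.csv",
    "tableSchema": {
        "columns": [
            {"name": "id", "datatype": "integer"},
            {"name": "location", "datatype": "string"},
            {"name": "latitude", "datatype": "number"},
            {"name": "installed", "datatype": "date"},
        ]
    },
}
SAMPLE = "id,location,latitude,installed\n42,Example Road,53.8,2020-01-31\n43,Sample Lane,,\n"

typed = read_typed(SAMPLE, METADATA)
print(typed[0])
```

Even this much gives the consumer integers, floats and dates instead of strings, which is the benefit a consumption guide could show off.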

Longer term it might be worth categorising tools for publication/consumption/validation/transformation etc.
