csvz's People

Contributors

secretgeek

Forkers

doekman 0xflotus

csvz's Issues

csvz-meta-relations composite keys

Sometimes (not often) keys are defined as a combination of two columns. In the columns.csv data it would be possible to identify two columns as being "primary-key". Should this mean that the combination of those two columns constitutes a primary key?

Likewise, how would a foreign key relationship in the relations.csv data targeting that primary key be represented?
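One possible reading of the first question, sketched in Python: every column flagged "primary-key" for a table is collected, in file order, into a single composite key. The header names `table`, `column`, and `key` are assumptions made for illustration and are not taken from the spec. The sketch leaves the relations.csv question open.

```python
import csv
import io
from collections import defaultdict

# Hypothetical _meta/columns.csv content; the header names are assumed.
COLUMNS_CSV = """table,column,key
order_lines,order_id,primary-key
order_lines,line_no,primary-key
order_lines,quantity,
orders,order_id,primary-key
"""

def primary_keys(columns_csv_text):
    """Group every column marked primary-key by table, preserving file order."""
    keys = defaultdict(list)
    for row in csv.DictReader(io.StringIO(columns_csv_text)):
        if row["key"] == "primary-key":
            keys[row["table"]].append(row["column"])
    # Under this reading, a table with more than one entry has a composite key.
    return {table: tuple(cols) for table, cols in keys.items()}

KEYS = primary_keys(COLUMNS_CSV)
```

Here `KEYS["order_lines"]` is a two-column tuple, which a consumer could treat as one composite key.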

Make the spec more meta

I think the specification is now too broad.

Imagine the spec only says that there can be meta csv, tables, columns and relations tables (specifying the names). Tool makers can then come up with profiles that describe which data goes into which meta-tables, and what the semantics are. These profiles can then be registered and published in this repository. There could be discussion, of course.

You could have an ANSI-96-SQL-IMPORT-EXPORT profile (hope they come up with a better name). But you could also have a MY_OPEN_SOURCE_FORM_APPLICATION profile, that describes how data on forms are validated.

The advantage for tool makers (implementers): you only create what is being used. I do think there should be a csvz implementers forum.

So I propose a more minimalistic base specification, with extensions as profiles. The _meta/csv.csv profile can be built in and be called "localized" profile (if you want to call it that).

What do you think?

How can a .tar.z file full of csvs fit into the csvz specs?

Related to #4 ...

Example:

If someone:

  1. unzipped a compliant .csvz file, to a folder “MyData”
  2. Ran tar -z “MyData” (Todo: correct syntax here to specify output name Eg MyData.csv.t.z ??)

...then what standards would this now comply with?

Suggestion: there could be an optional fragment

csvz-0-tz

...which also invites/allows other mutually exclusive 0 sub standards....

csvz-0-t

... for a compliant tar that is not gzip’d

csvz-0-7z

...for a compliant 7z file? (Details needed for such)
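For reference, the two steps in the example could be reproduced with Python's standard `tarfile` module instead of the command line. This is only a sketch: the folder contents, the `name` column in tables.csv, and the output name `MyData.tar.gz` are made up for the demo, and nothing here settles which fragment the result would comply with.

```python
import tarfile
import tempfile
from pathlib import Path

def tar_gz_folder(folder, archive_path):
    """Pack the contents of an unzipped csvz folder into a gzip-compressed tar."""
    folder = Path(folder)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Store paths relative to the folder, as they appeared in the zip.
                tar.add(path, arcname=path.relative_to(folder).as_posix())
    return archive_path

def list_members(archive_path):
    """Return the sorted member names of a gzip-compressed tar."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())

# Demo with a throwaway folder standing in for the unzipped "MyData".
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "MyData"
    (data / "_meta").mkdir(parents=True)
    (data / "states.csv").write_text("code,name\nQLD,Queensland\n")
    (data / "_meta" / "tables.csv").write_text("name\nstates\n")
    archive = tar_gz_folder(data, Path(tmp) / "MyData.tar.gz")
    MEMBERS = list_members(archive)
```

Opening with mode `"w"` instead of `"w:gz"` would give the uncompressed tar variant.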

MIME type

As adoption inevitably grows, a MIME type should be appropriately registered.

Is .csv.z an acceptable variant?

If I take a single csv file, and then zip it to end up with a .csv.z, is it conformant with the basic standard, or must the file end with a .csvz extension?

Add `csvz-meta-meta`

A file can have

_meta/meta.csv

describing which of the standards you claim conformance with. e.g.

fragment          conformance  notes
csvz-0            strict       this csvz file claims strict adherence with csv-0
csvz-meta-tables  strict       yes we have a _meta folder with a tables.csv file in it

...If they only claimed those two rules were followed, then it would be up to the consumer to read the files and determine for themselves how to make sense of them.

(Suggestion: tools could generate this, or at least a draft of this, and tools could use it to configure their own expectations.)
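A consumer might read such a file along these lines. This is a minimal sketch that assumes the three column names from the example (`fragment`, `conformance`, `notes`); the helper names are hypothetical.

```python
import csv
import io

# Hypothetical _meta/meta.csv, using the three columns from the example.
META_CSV = """fragment,conformance,notes
csvz-0,strict,this csvz file claims strict adherence with csv-0
csvz-meta-tables,strict,yes we have a _meta folder with a tables.csv file in it
"""

def claimed_fragments(meta_csv_text):
    """Map each claimed fragment to its stated conformance level."""
    reader = csv.DictReader(io.StringIO(meta_csv_text))
    return {row["fragment"]: row["conformance"] for row in reader}

CLAIMS = claimed_fragments(META_CSV)

def claims(fragment, level="strict"):
    """True if the file claims the given fragment at the given level."""
    return CLAIMS.get(fragment) == level
```

A tool could then check `claims("csvz-meta-tables")` before looking for `_meta/tables.csv`, and fall back to its own inspection for any fragment that is not claimed.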

Could there be a csvz *folder* without zip?

When you unzip a csvz file, you end up with a folder with csv-files.

Can this be considered a csvz-container?
I think it could be useful. For example, when putting datafiles into git.

To differentiate a csvz-container from a folder containing some csv files, I propose such a folder to have the .csvd extension. So if you extract the my_data.csvz file, you get the my_data.csvd folder.

Alternatively, the folder could keep the same extension (thus not introducing a new one). The disadvantage is that you can't extract a .csvz file into the same folder without deleting or moving the original file.

Tools are not expected to open these .csvd-folders directly. It's only to denote that when zipped, it's automatically a .csvz file.
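The "when zipped, it's automatically a .csvz file" idea could look like this in Python. This is a sketch under the assumption that the archive simply takes the folder's name with the extension swapped; `zip_csvd` is a hypothetical helper, not part of any csvz tool.

```python
import tempfile
import zipfile
from pathlib import Path

def zip_csvd(csvd_folder):
    """Zip a my_data.csvd folder into a sibling my_data.csvz archive."""
    folder = Path(csvd_folder)
    if folder.suffix != ".csvd":
        raise ValueError("expected a folder ending in .csvd")
    target = folder.with_suffix(".csvz")
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Member names are relative to the folder root.
                zf.write(path, arcname=path.relative_to(folder).as_posix())
    return target

# Demo with a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "my_data.csvd"
    folder.mkdir()
    (folder / "states.csv").write_text("code,name\nQLD,Queensland\n")
    archive = zip_csvd(folder)
    ARCHIVE_NAME = archive.name
    with zipfile.ZipFile(archive) as zf:
        NAMES = zf.namelist()
```

Because the archive is written next to the folder rather than inside it, the folder and the file can coexist, which is the point of using a separate extension.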

Haven't defined zip file

The specification doesn't specify what a Zip file is.

At the least, there should be a link to either the 2015 ISO standard or alternatively Pkware's technical note (including a version number).

Is there a GZip variant?
Similarly, are there 7-Zip, bzip2, and rzip variants?
Is Zip64 supported, to allow more than 4GB files?

csvz-meta-columns column ordinal

With column headers being optional in RFC 4180, it might also be useful to specify (or require) the column ordinal in the meta-columns file. This would allow attaching headers to a csv via the column metadata.

For that matter, it would be useful to include a "HasHeaders" boolean column in the tables metadata.
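To illustrate how the two proposed fields could work together, here is a small Python sketch. The field names `ordinal` and `has_headers`, the 1-based ordinals, and the in-memory layout of the metadata are all assumptions made for the example.

```python
import csv
import io

# Hypothetical metadata; "ordinal" and "has_headers" are the proposed additions.
COLUMNS_META = [
    {"table": "states", "column": "name", "ordinal": 2},
    {"table": "states", "column": "code", "ordinal": 1},
]
TABLES_META = {"states": {"has_headers": False}}

def read_table(name, csv_text):
    """Read a table, taking headers from the file or from the column metadata."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if TABLES_META[name]["has_headers"]:
        header, rows = rows[0], rows[1:]
    else:
        # No header row in the file: order the metadata columns by ordinal.
        cols = sorted(
            (c for c in COLUMNS_META if c["table"] == name),
            key=lambda c: c["ordinal"],
        )
        header = [c["column"] for c in cols]
    return [dict(zip(header, row)) for row in rows]

ROWS = read_table("states", "QLD,Queensland\nNSW,New South Wales\n")
```

Without the ordinal, the order of rows in the columns metadata would have to carry this information implicitly.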

`meta-per-file` -- allow individual meta files for each file?

Have you considered using a columns meta file per-table instead of putting all columns into a single csv?

So instead of:
_meta/tables.csv
_meta/columns.csv
states.csv
cities.csv

It would be something like:
_meta/tables.csv
_meta/states_columns.csv
_meta/cities_columns.csv
states.csv
cities.csv

The advantage is that it would be easier to get the schema for a single table.

Update toc

(What is a way to do that automatically in vs code?)

csvz-meta-columns column types

The spec for this file feels rather useless unless some minimal set of standard types is defined. A well-defined schema would allow a database import tool to construct the appropriate table in the database. Without a standard set, tools would need to fall back to "string" whenever they encountered an unknown type.

I would propose as a minimum:

  • boolean (true/false, 0/1)
  • int (byte/short/long?)
  • float (float/double)
  • date (datetime)
  • string (might be worth specifying ascii vs unicode)
  • binary (Base64)

Possibly also include:

  • guid
  • time (timespan/duration)
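As a rough illustration of how an import tool might use the minimum set, with the "string" fallback for unknown types, consider the Python sketch below. The type names follow the list above; treating dates as ISO 8601 text and binary as standard Base64 are assumptions, since the proposal does not fix the encodings.

```python
import base64
from datetime import datetime

def parse_bool(text):
    """Accept true/false and 0/1, as suggested in the list above."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"not a boolean: {text!r}")

# One parser per proposed minimum type (the encodings are assumptions).
PARSERS = {
    "boolean": parse_bool,
    "int": int,
    "float": float,
    "date": datetime.fromisoformat,
    "string": str,
    "binary": base64.b64decode,
}

def convert(value, type_name):
    """Convert a csv cell; unknown type names fall back to string."""
    return PARSERS.get(type_name, str)(value)
```

For example, `convert("42", "int")` yields an integer, while a cell with an unrecognized type such as `"money"` is returned unchanged as text.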
