
csv's Introduction

CSV

CSV is a universal JavaScript CSV parser designed specifically to be simple, fast, and spec compliant.

Features

  • RFC Compliant
  • ECMAScript Module
  • TypeScript Compatible

Imports

This package works isomorphically in browser and server-side JavaScript.

Browser

Import directly from the local path or a CDN

<script type="module">
import { parse } from 'path/to/csv/index.js'
</script>

The minified version can be imported the same way

<script type="module">
import { parse } from 'path/to/csv/index.min.js'
</script>

Node

Install the package

npm install @vanillaes/csv

Import using the module path

import { parse } from '@vanillaes/csv'

Usage

CSV.parse()

Takes a string of CSV data and converts it to a two-dimensional array of [entries][values]

Arguments

CSV.parse(csv, {options}, reviver(value, row, col)) : [entries][values]

  • csv - the CSV string to parse
  • options
    • typed - infer types (default false)
  • reviver [1] - a custom function to modify the output (default (value) => value)

[1] Values for row and col are 1-based.

Example

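import { parse } from '@vanillaes/csv';

// Values are separated by commas; quoting a whole line would turn it into a single value.
const csv = `header1,header2,header3
aaa,bbb,ccc
zzz,yyy,xxx
`;
const parsed = parse(csv);
console.log(parsed);
// [
//   [ "header1", "header2", "header3" ],
//   [ "aaa", "bbb", "ccc" ],
//   [ "zzz", "yyy", "xxx" ]
// ]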

CSV.stringify()

Takes a two-dimensional array of [entries][values] and converts it to CSV

Arguments

CSV.stringify(array, {options}, replacer(value, row, col)) : string

  • array - the input array to stringify
  • options
    • eof - add a trailing newline at the end of file (default true)
  • replacer [1] - a custom function to modify the values (default (value) => value)

[1] Values for row and col are 1-based.

Example

import { stringify } from '@vanillaes/csv';

const data = [
  [ "header1", "header2", "header3" ],
  [ "aaa", "bbb", "ccc" ],
  [ "zzz", "yyy", "xxx" ]
];
const stringified = stringify(data);
console.log(stringified);
// header1,header2,header3
// aaa,bbb,ccc
// zzz,yyy,xxx

TypeScript

Typings are generated from JSDoc using TypeScript. They are 100% compatible with VSCode IntelliSense and will work seamlessly with TypeScript.

csv's People

Contributors

coltonehrman, evanplaice

csv's Issues

Define Modules

How should the code be structured and separated into different modules?

Considerations:

  • keep them context-specific (ex map API endpoint to module)
  • optimize for tree-shaking

Feature request: support options for custom separator and delimiter values

The issue

"CSV" files created on Windows may use a number of different separators and delimiters, depending on the regional settings of the host. This also applies to certain Linux distributions (looking at you, Ubuntu).
The current implementation supports only the comma separator (,) and the double-quote delimiter ("), as stated in the RFC.

Requested feature

It should be possible to define custom separator and delimiter values, via the option object. These values should be optional and default to comma and double-quote.
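
One way the requested options could behave is sketched below. The function name splitRecord and the option names separator and delimiter are assumptions made for illustration; they are not part of the current API, and the sketch handles a single record only.

```javascript
// Hypothetical helper, not part of @vanillaes/csv: splits ONE record into values.
// `separator` and `delimiter` default to the RFC 4180 comma and double quote.
function splitRecord (line, { separator = ',', delimiter = '"' } = {}) {
  const values = [];
  let value = '';
  let quoted = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (quoted) {
      if (ch === delimiter && line[i + 1] === delimiter) {
        value += delimiter; // a doubled delimiter is an escaped literal
        i++;
      } else if (ch === delimiter) {
        quoted = false; // closing delimiter
      } else {
        value += ch;
      }
    } else if (ch === delimiter && value === '') {
      quoted = true; // opening delimiter at the start of a value
    } else if (ch === separator) {
      values.push(value);
      value = '';
    } else {
      value += ch;
    }
  }
  values.push(value);
  return values;
}

console.log(splitRecord("a;'b;c';d", { separator: ';', delimiter: "'" }));
// → ['a', 'b;c', 'd']
console.log(splitRecord('x,"y ""z""",w'));
// → ['x', 'y "z"', 'w']
```

Because both options have defaults, existing callers that pass no options would keep the RFC behaviour.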

Ignore Leading/Trailing Whitespace in Values

Currently, the parser isn't very forgiving when it comes to excess whitespace.

Changes

  • ignore leading whitespace on unquoted values
  • ignore trailing whitespace on unquoted values
  • ignore leading whitespace on quoted values
  • ignore trailing whitespace on quoted values
  • full test coverage for the new functionality
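
The four trimming cases above can be sketched with a single helper. The name trimField is hypothetical, and the sketch assumes that for quoted values only the whitespace outside the quotes is ignored, while whitespace inside the quotes is kept.

```javascript
// Hypothetical helper: normalizes one raw value as it appeared between separators.
function trimField (raw) {
  const trimmed = raw.trim();
  const isQuoted = trimmed.length >= 2 &&
    trimmed.startsWith('"') &&
    trimmed.endsWith('"');
  // Unquoted value: leading and trailing whitespace is simply dropped.
  if (!isQuoted) return trimmed;
  // Quoted value: strip the enclosing quotes and unescape doubled quotes.
  return trimmed.slice(1, -1).replace(/""/g, '"');
}

console.log(trimField('  aaa '));      // → 'aaa'
console.log(trimField(' " bbb " '));   // → ' bbb '
console.log(trimField('"say ""hi"""')); // → 'say "hi"'
```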

How to use with node <13.2

The README mentions that a CommonJS bundle is included for backward compatibility with Node <= 13.2, but the package does not even install on Node < 13.2: yarn add fails with a Node version error.

So how are we supposed to use this package on Node < 13.2?

Thanks.

Newline at end of file should be optional?

Hi - thanks for sharing your great CSV handling code.
One thing - I think the spec allows for not having a newline at the end of the file:
2. The last record in the file may or may not have an ending line break. For example:
aaa,bbb,ccc CRLF
zzz,yyy,xxx
but I don't think your code does yet. Hopefully easily fixed.

Implement Optional Type Inference

By default, CSV is stringly-typed, i.e. everything is a string. Since stringly-typed values aren't very useful in a software context, this feature exists to add automatic type inference to the parser.

Changes

  • add a private inferType() function
  • add options.typed to CSV.parse() [default: false]
  • hook inferType() into the parser

Specifics:

The type inference castToScalar() function from the jquery-csv package can be reused here.

castToScalar: function (value, state) {
  var hasDot = /\./;
  if (isNaN(value)) {
    return value;
  } else {
    if (hasDot.test(value)) {
      return parseFloat(value);
    } else {
      var integer = parseInt(value);
      if (isNaN(integer)) {
        return null;
      } else {
        return integer;
      }
    }
  }
}

For readability, it's probably a good idea to refactor this as a switch statement.
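
A possible shape for that refactor is sketched below. It follows the branches of castToScalar() but is not a drop-in copy: the explicit radix of 10 is an addition, and the name inferType is taken from the change list above rather than from existing code.

```javascript
// Sketch of a switch-based inferType(); assumes `value` is always a string.
function inferType (value) {
  switch (true) {
    case isNaN(value):
      return value; // not numeric, keep the string as-is
    case /\./.test(value):
      return parseFloat(value); // numeric with a decimal point
    default: {
      // Radix 10 is an addition; castToScalar() calls parseInt without one.
      const integer = parseInt(value, 10);
      return Number.isNaN(integer) ? null : integer; // '' becomes null
    }
  }
}

console.log(inferType('abc')); // → 'abc'
console.log(inferType('1.5')); // → 1.5
console.log(inferType('42'));  // → 42
console.log(inferType(''));    // → null
```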

Source: $.castToScalar()

Implement 'replacer' on CSV.stringify()

The replacer allows users to hook into the formatter with their own function.

It should expose the

  • value being formatted
  • row of the value being formatted
  • column of the value being formatted

Changes

  • add row/col state tracking to CSV.stringify
  • hook the optional function call into the formatter
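
The row/col tracking and the replacer hook could look roughly like this. The function stringifyWithReplacer is a hypothetical stand-in, not the library's formatter, and it assumes values never need quoting so that the example stays focused on the hook.

```javascript
// Hypothetical stand-in for the formatter, not the library's implementation.
// Assumes values contain no commas, quotes, or line breaks, so quoting is skipped.
function stringifyWithReplacer (rows, replacer = (value) => value) {
  return rows
    .map((entry, r) => entry
      // row and col are passed 1-based, matching the documented convention
      .map((value, c) => String(replacer(value, r + 1, c + 1)))
      .join(','))
    .join('\n');
}

const out = stringifyWithReplacer(
  [['name', 'score'], ['ann', 0.5]],
  // Leave the header row alone; format the second column as a percentage.
  (value, row, col) => (row > 1 && col === 2 ? `${value * 100}%` : value)
);
console.log(out);
// name,score
// ann,50%
```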

Add Runtime Input Checks

Check the inputs to ensure they're the correct types. Throw if not

Checklist

  • CSV.parse inputs
  • CSV.stringify inputs
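
A minimal sketch of such checks follows. The helper names and error messages are hypothetical; only the parameter lists mirror the documented signatures of CSV.parse and CSV.stringify.

```javascript
// Hypothetical guard helpers; names and messages are illustrative only.
function assertType (ok, message) {
  if (!ok) throw new TypeError(message);
}

function assertParseInputs (csv, options = {}, reviver = (value) => value) {
  assertType(typeof csv === 'string', 'csv must be a string');
  assertType(options !== null && typeof options === 'object', 'options must be an object');
  assertType(typeof reviver === 'function', 'reviver must be a function');
}

function assertStringifyInputs (array, options = {}, replacer = (value) => value) {
  // Every entry must itself be an array of values.
  assertType(Array.isArray(array) && array.every((entry) => Array.isArray(entry)),
    'array must be a two-dimensional array');
  assertType(options !== null && typeof options === 'object', 'options must be an object');
  assertType(typeof replacer === 'function', 'replacer must be a function');
}

assertParseInputs('a,b,c');        // passes silently
assertStringifyInputs([['a', 'b']]); // passes silently
```

Calling assertParseInputs(42) or assertStringifyInputs(['a']) would throw a TypeError.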

Implement 'reviver' on CSV.parse()

The reviver parameter allows users to hook a custom function into the parser.

It should provide the following

  • the value being parsed
  • the row of the value being parsed
  • the column of the value being parsed

The row/col state tracking is already included in the parser.

Changes

  • hook the reviver into the parser implementation
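
To illustrate where the hook sits relative to the row/col state, here is a deliberately simplified parser loop. The function parseSimple and its ctx object are hypothetical and ignore quoting entirely; the real parser's state machine is not reproduced here.

```javascript
// Simplified illustration only: no quoting support, commas and line breaks split values.
function parseSimple (csv, reviver = (value) => value) {
  const ctx = { row: 1, col: 1, entry: [], output: [] };
  for (const line of csv.split(/\r?\n/)) {
    if (line === '') continue; // skip blank lines, including the one after a trailing newline
    for (const value of line.split(',')) {
      // The hook: every value passes through the reviver with its 1-based position.
      ctx.entry.push(reviver(value, ctx.row, ctx.col));
      ctx.col++;
    }
    ctx.output.push(ctx.entry);
    ctx.entry = [];
    ctx.row++;
    ctx.col = 1;
  }
  return ctx.output;
}

const rows = parseSimple('h1,h2\n1,2\n', (value, row) => (row === 1 ? value : Number(value)));
console.log(rows); // → [['h1', 'h2'], [1, 2]]
```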

Documentation for reviver row and col parameters

README.md documents order of reviver arguments as

reviver(value, col, row)

but implementation seems to pass value, row, col, at least for CSV.parse.

Also, documentation doesn't say whether row and col are 1-based or 0-based. Seems to be 1.

Define API

What major endpoints should this expose?

Considerations:

  • simple is 👍
  • familiar is 👍
  • intuitive is 👍

Parse raises `CSVError: Illegal state` on comma in quoted field. (Not RFC compliant?)

Hi all! This is good work and I'm happy to find a pure-ES CSV parser.

I have some data I'm trying to parse, but it throws errors. Here is an example:

export var town06_csv = `
x,y,z,pitch,yaw,roll,formatted
586.8056030273438, -10.063207626342773, 0.29999998211860657, 0.0, -179.58056640625, 0.0, "[586.806, -10.063, 0.300, 0.000, -179.581, 0.000]"
584.8312377929688, -13.57775592803955, 0.29999998211860657, 0.0, -179.58056640625, 0.0, "[584.831, -13.578, 0.300, 0.000, -179.581, 0.000]"
`

Note the last entry is "[586.806, -10.063, 0.300, 0.000, -179.581, 0.000]", i.e. a single entry enclosed in double quotes. I import this CSV string and the parser, and run parse like this:

<script type="module">
  import { town06_csv } from './raw_spawn_data.js';
  import { parse } from './escsv.js'
  var town06_spawnpoints = parse(town06_csv);
</script>

But we get an error Uncaught Error: CSVError: Illegal state [row:3, col:7]! Per RFC4180 ( https://www.rfc-editor.org/rfc/rfc4180#section-2 ), this should be allowed, because the field is enclosed by a double quote. As far as I can tell, there is no error in my CSV, and it parses correctly in other parsers, e.g. in OpenOffice:

image

Specifically, it throws the error on line 67 (case 2, un-delimited input, defaults to state 4.)

Stringify: Double quotes wrongly serialized

Hi, great library!

Found an issue when serializing a field with more than one double quote.
For instance, a value of Dwayne "The Rock" Johnson is serialized as "Dwayne ""The Rock" Johnson".
When trying to parse that with CSV.parse it throws CSVError: Illegal state.

Code snippet:
const CSV = require('@vanillaes/csv');
const data = [
  [ "header1", "header2", "header3" ],
  [ "aaa", "bbb", 'Dwayne "The Rock" Johnson' ],
  [ "zzz", "yyy", "xxx" ]
];
const stringified = CSV.stringify(data);
const rows = CSV.parse(stringified);

Throws:
Error: CSVError: Illegal state [row:2, col:3]
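
The reported output is consistent with only the first double quote being escaped, which is what String.prototype.replace does when given a string pattern. Whether that is the actual cause in the library has not been verified here; the two helper functions below are hypothetical and only contrast the two behaviours.

```javascript
// A string pattern replaces only the FIRST match.
const escapeFirstOnly = (value) => `"${value.replace('"', '""')}"`;
// A global regex replaces EVERY match, which is what RFC 4180 escaping needs.
const escapeEvery = (value) => `"${value.replace(/"/g, '""')}"`;

const name = 'Dwayne "The Rock" Johnson';
console.log(escapeFirstOnly(name)); // "Dwayne ""The Rock" Johnson"
console.log(escapeEvery(name));     // "Dwayne ""The Rock"" Johnson"
```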

TypeScript "circular definition" error on import

When importing csv-es into a TypeScript project via import CSV from 'csv-es', the following errors appear at compile time:

ERROR in node_modules/csv-es/index.d.ts(28,14): error TS2303: Circular definition of import alias 'parse'.
node_modules/csv-es/index.d.ts(29,14): error TS2303: Circular definition of import alias 'stringify'.

Scaffolding

What sort of repo maintenance/tooling should this project include?

Testing

tape.js + tap-spec

Tape was a great choice for jquery-csv.

It:

  • is small
  • has a minimal API
  • outputs in TAP format, which can be piped into a ton of different transforms
  • is compatible with ES Modules (with workarounds)

Tape-es for tape w/ ESM support

Should code-coverage be included?

No

Linting

SemiStandard

Continuous Integration

TravisCI works well for checking if the build is broken but this package should include auto-publish to NPM on tag push.

Circle-CI is a pretty good alternative.

GitHub Actions

Minification

None

Implement CSV.stringify() options.headers

HOLD: This feature is on hold, I'm not 100% convinced it's a necessary addition

If options.headers is set to true, the first row of values should be excluded.

Changes

  • include the headers option
  • hook it into the formatter

Bug with single-column CSV text not ending with a trailing line break

Hi, it seems there is a bug where single column csv content that does not end with a line break gets the last "cell" value deleted when parsed. I have created a PR where I updated the tests to expose the issue, but do not have enough knowledge to fix the issue. I have tried updating the code at https://github.com/vanillaes/csv/blob/main/index.js#L104-L107 to honour flushing of the last value using if (ctx.value !== '') instead but that seems to break some other tests.

Any ideas on how to fix this would be appreciated.

Cheers

Add to contributors

@coltonehrman Add yourself to the contributors list and -- if you want -- the license.

BTW, package.json only allows one author/project so I added myself as a contributor instead.

Implement CSV.parse() options.headers

HOLD: This feature is on hold, I'm not 100% convinced it's a necessary addition

If set, options.headers should take the headers into account. If type inference is enabled, the headers row should be skipped.
