Comments (6)

karimhm commented on July 30, 2024

The following link might be interesting as well: https://bitbucket.org/ewanhiggs/csv-game/src/master/

from simdcsv.

dongx-psu commented on July 30, 2024

Are there any benchmark numbers for this project? This is very interesting. I will definitely consider adapting it to FishStore when this library becomes a bit more stable.

MarkPflug commented on July 30, 2024

Hello,
I'll submit my own library for comparison: Sylvan.Data.Csv. I believe it is currently the fastest CSV parser in the .NET ecosystem. I recently added a SIMD fast path that processes unquoted fields and falls back to the single data path when a quoted field is encountered. This was my first exposure to SIMD, so I'm sure there's room for improvement in that logic, but it was a pretty significant improvement over the non-SIMD code. The library is encoding-agnostic, so I'm sure it could be made even faster if it had a code path specialized for processing UTF-8 bytes instead of .NET chars, but I don't want to compromise the ergonomics of the API to do so. On my machine it processes ~1GB/sec of UTF-8 encoded CSV data when just counting rows/fields.
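
For anyone unfamiliar with the fast-path/fallback pattern described above, here is a rough sketch of the general idea in Python. It is not Sylvan.Data.Csv's code, and it contains no real SIMD: bulk `bytes.count` calls over fixed-size chunks stand in for the vector comparisons. The function name `count_rows_fields` and the `chunk_size` parameter are made up for illustration, and the input is assumed to be well-formed, comma-delimited, `\n`-terminated CSV.

```python
QUOTE, COMMA, NEWLINE = ord('"'), ord(","), ord("\n")


def count_rows_fields(data, chunk_size=64):
    """Count rows and fields in CSV bytes (hypothetical helper, not a library API)."""
    commas = newlines = 0
    in_quotes = False  # carried across chunks so quoted fields may span them
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        if not in_quotes and QUOTE not in chunk:
            # Fast path: no quotes in this chunk, so every comma and newline
            # is structural and can be counted in bulk.
            commas += chunk.count(b",")
            newlines += chunk.count(b"\n")
            continue
        # Fallback: walk the chunk one byte at a time, tracking quote state.
        # An escaped quote ("") toggles the state twice, which cancels out.
        for byte in chunk:
            if byte == QUOTE:
                in_quotes = not in_quotes
            elif not in_quotes:
                if byte == COMMA:
                    commas += 1
                elif byte == NEWLINE:
                    newlines += 1
    # A final record without a trailing newline still counts as a row.
    rows = newlines + (1 if data and not data.endswith(b"\n") else 0)
    return rows, commas + rows


if __name__ == "__main__":
    sample = b'a,b,c\n1,"x,y\nz",3\n'
    print(count_rows_fields(sample, chunk_size=4))
```

The small `chunk_size` in the usage line is only there to force a quoted field to straddle a chunk boundary; a real SIMD version would use the vector register width instead.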

lemire commented on July 30, 2024

Thanks @MarkPflug

liquidaty commented on July 30, 2024

I'll add one I maintain: https://github.com/liquidaty/zsv

I'm sure that with the intellectual firepower on this repo, you can beat zsv, and perhaps someone already has. So far, though, the only parsers I've seen do so are ones that can't handle real-world CSV variants the way Excel does, which as a practical matter is (imho) the best de facto standard for CSV parsing.

nietras commented on July 30, 2024

https://github.com/nietras/Sep is my project. It has detailed benchmarks and uses csFastFloat. Sep does not support every feature that some CSV parsers do, but it has an API tailored to machine learning use cases. I'm working on unquoting/unescaping, which is all that is missing.
