Comments (5)

MarkPflug commented on June 19, 2024

What is your PowerShell module intending to do? You shouldn't have to do special batching at all. The CsvDataReader allocates a single working buffer internally (which is configurable in size) and that is all the memory that it will ever use, regardless of how big the input file is. Conceptually, you can think of it as operating on a batch size of 1. Each time you call Read, a single record is processed.
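To illustrate the "batch size of 1" idea, here is a minimal sketch of reading a large file one record at a time. The file name `data.csv` and the buffer size are assumptions chosen for illustration; `BufferSize` is the option that controls the single internal working buffer.

```csharp
using System;
using Sylvan.Data.Csv;

// Assumption: "data.csv" is a hypothetical input file with a header row.
// BufferSize sets the size of the reader's single working buffer;
// 64 KB here is an arbitrary illustrative value.
var options = new CsvDataReaderOptions { BufferSize = 0x10000 };

using var csv = CsvDataReader.Create("data.csv", options);

long count = 0;
while (csv.Read())
{
    // Only the current record is available at this point;
    // earlier records are not retained by the reader.
    var firstField = csv.GetString(0);
    count++;
}

Console.WriteLine($"Read {count} records.");
```

Memory use in a loop like this should stay roughly constant regardless of file size, as long as the caller does not accumulate the records itself.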

from sylvan.

opustecnica commented on June 19, 2024

Perfect!
A little bit of history is due. I have been using the PowerShell built-in Import-Csv for all of my CSV needs. Slow, but part of the default install. Suddenly, a few multi-gigabyte CSV sources appeared and … the default rapidly became too slow and memory hungry. After a bit of googling, I ran into https://www.joelverhagen.com/blog/2020/12/fastest-net-csv-parsers and … the performance of your library made it a quick buy.
Mostly, the module will concentrate on importing data into DataTables that will then be inserted into a DB. Same thing for export operations. The batching in these cases will happen at the DataTable level to avoid individual commits.
I have the core of the module already working and it is now time for cleanup and publishing. Will keep you posted. Thank you again.


MarkPflug commented on June 19, 2024

If you are trying to load data into SQL Server (or any other relational database), I would suggest you avoid using an intermediate DataTable and just feed the CsvDataReader directly to SqlBulkCopy (other providers have similar capabilities). Here is an example:

public void SqlBulkLoadSample()

SqlBulkCopy.WriteToServer accepts a DbDataReader, so there's no need to load into a DataTable first. The key here is that the DbDataReader needs to provide a schema that conforms to the target SQL table. The linked example shows how you can apply the schema of the table in SQL to the CsvDataReader.
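For readers without the linked example at hand, the approach can be sketched roughly as follows. This is not the original sample: the method name `LoadCsv` and its parameters are hypothetical, and it assumes the Microsoft.Data.SqlClient provider and an already open connection. It reads the target table's column schema with an empty query, applies that schema to the CSV reader, and streams the rows into SqlBulkCopy.

```csharp
using Microsoft.Data.SqlClient;
using Sylvan.Data.Csv;

static class BulkLoadSketch
{
    // Hypothetical helper: tableName and csvFile are supplied by the caller.
    // Assumes conn is already open.
    public static void LoadCsv(SqlConnection conn, string tableName, string csvFile)
    {
        // Query zero rows just to obtain the target table's column schema.
        // tableName is concatenated into SQL here, so it must come from a trusted source.
        using var cmd = conn.CreateCommand();
        cmd.CommandText = $"select top 0 * from {tableName}";
        using var schemaReader = cmd.ExecuteReader();
        var columns = schemaReader.GetColumnSchema();
        schemaReader.Close();

        // Apply the table's schema to the CSV reader so that column types
        // conform to what the target table expects.
        var options = new CsvDataReaderOptions { Schema = new CsvSchema(columns) };
        using var csv = CsvDataReader.Create(csvFile, options);

        // Stream the records straight into the table; no DataTable is built.
        using var bulkCopy = new SqlBulkCopy(conn);
        bulkCopy.DestinationTableName = tableName;
        bulkCopy.BulkCopyTimeout = 0; // no timeout, for large files
        bulkCopy.WriteToServer(csv);
    }
}
```

Because the CSV reader is consumed as a stream, rows are pulled one at a time as SqlBulkCopy requests them, which keeps memory use independent of the file size.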


opustecnica commented on June 19, 2024

Copy that. SQL Server is definitely a target, together with Postgres.


MarkPflug commented on June 19, 2024

@opustecnica shoot me a notification when your PowerShell module is released. I'd be interested in reviewing it.
