Comments (5)
What is your PowerShell module intending to do? You shouldn't have to do special batching at all. The CsvDataReader allocates a single working buffer internally (which is configurable in size) and that is all the memory that it will ever use, regardless of how big the input file is. Conceptually, you can think of it as operating on a batch size of 1. Each time you call Read, a single record is processed.
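A minimal sketch of what that streaming read looks like in C#, assuming the Sylvan.Data.Csv package is referenced. The inline sample data and the 64 KB buffer size are made up for illustration; they are not recommended settings.

```csharp
using System;
using System.IO;
using Sylvan.Data.Csv;

// Hypothetical sample data; in practice this would be a large file on disk.
var text = "id,name\n1,alpha\n2,beta\n";

// BufferSize sets the size of the reader's internal working buffer.
// 64 KB is an arbitrary value chosen for this sketch.
var options = new CsvDataReaderOptions { BufferSize = 64 * 1024 };

using var csv = CsvDataReader.Create(new StringReader(text), options);

// Look up a column by its header name once, outside the loop.
var nameIdx = csv.GetOrdinal("name");

long count = 0;
while (csv.Read())
{
    // Each Read() call advances to exactly one record.
    Console.WriteLine($"{csv.GetInt32(0)}: {csv.GetString(nameIdx)}");
    count++;
}

Console.WriteLine($"Records read: {count}");
```

Nothing here accumulates rows, so memory use should stay flat however long the input is, unless the calling code collects the records itself.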
from sylvan.
Perfect!
A little bit of history is due. I have been using the PowerShell built-in Import-Csv for all of my CSV needs. Slow, but part of the default install. Suddenly, a few multi-gigabyte CSV sources appeared and … the default rapidly became too slow and memory hungry. After a bit of googling, I ran into https://www.joelverhagen.com/blog/2020/12/fastest-net-csv-parsers and … the performance of your library made it an easy choice.
Mostly, the module will concentrate on importing data into DataTables that will then be inserted into a DB. Same thing for export operations. The batching in these cases will happen at the DataTable level to avoid individual commits.
I have the core of the module already working and it is now time for cleanup and publishing. Will keep you posted. Thank you again.
from sylvan.
If you are trying to load data into SQL Server (or any other relational database), I would suggest you avoid using an intermediate DataTable and just feed the CsvDataReader directly to SqlBulkCopy (other providers have similar capabilities). Here is an example:
SqlBulkCopy.WriteToServer accepts a DbDataReader, so there's no need to load into a DataTable first. The key here is that the DbDataReader needs to provide a schema that conforms to the target SQL table. The linked example shows how you can apply the schema of the table in SQL to the CsvDataReader.
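The link itself did not survive in this copy of the thread, so what follows is only a rough sketch of the approach described, not the linked example. It assumes the Sylvan.Data.Csv and Microsoft.Data.SqlClient packages; the connection string, the table name `MyTargetTable`, the file `data.csv`, and the batch size are all hypothetical placeholders.

```csharp
using System.Collections.ObjectModel;
using System.Data.Common;
using Microsoft.Data.SqlClient;
using Sylvan.Data.Csv;

// Hypothetical placeholders: adjust to your environment.
const string connectionString =
    "Server=.;Database=MyDb;Integrated Security=true;TrustServerCertificate=true";
const string targetTable = "MyTargetTable";
const string csvPath = "data.csv";

using var conn = new SqlConnection(connectionString);
conn.Open();

// Select zero rows from the target table just to obtain its column schema.
ReadOnlyCollection<DbColumn> tableSchema;
using (var cmd = conn.CreateCommand())
{
    cmd.CommandText = $"select top 0 * from {targetTable}";
    using var schemaReader = cmd.ExecuteReader();
    tableSchema = schemaReader.GetColumnSchema();
}

// Apply the table's schema to the CSV reader so that its columns report
// the types the target table expects instead of plain strings.
var options = new CsvDataReaderOptions { Schema = new CsvSchema(tableSchema) };
using var csv = CsvDataReader.Create(csvPath, options);

using var bulkCopy = new SqlBulkCopy(conn)
{
    DestinationTableName = targetTable,
    BatchSize = 50000,   // arbitrary value for this sketch
    BulkCopyTimeout = 0  // no timeout, for very large files
};

// Rows stream from the CSV file to the server; no DataTable is built.
bulkCopy.WriteToServer(csv);
```

Without explicit column mappings, SqlBulkCopy matches columns by ordinal position, so this sketch assumes the CSV columns appear in the same order as the table's columns.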
from sylvan.
Copy that. SQL is definitely a target together with Postgres.
from sylvan.
@opustecnica shoot me a notification when your PowerShell module is released. I'd be interested in reviewing it.
from sylvan.