billgraziano / csvdatareader
Fast streaming CSV file reader for PowerShell implemented as a DataReader.
Use case: Adding a Client ID or source identifier or the file name itself to the columns returned.
Is there a compiled one?
Exception calling "WriteToServer" with "1" argument(s): "The given ColumnMapping does not match up with any column in the source or destination."
What am I missing here?
If I have a CSV with a notes field of some sort and that notes field includes newline characters, CsvDataReader interprets those as signifying the beginning of a new row of data, even though the newline characters are inside the text qualifiers. This causes an IndexOutOfRangeException to be thrown.
I would love to see the tool be able to qualify not just the separator character but newline characters as well to handle this scenario.
E.g.
id,details,asdsAttempts
0,Harmonic oscillation at T+5 minutes Premature engine shutdown at T+7 min 30 s,0
1,"Initially scheduled for 23–25 Sep, carried dummy payload – mass simulator,
165 kg (originally intended to be RazakSAT).",0
2,Broke up after successful water landing,0
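One way this could be handled, sketched here as an assumption rather than as the library's actual implementation: keep appending physical lines to the current record while the number of text qualifiers seen so far is odd, since an odd count means a quoted field is still open. The class and method names (`QuotedRecordReader`, `ReadRecord`) are hypothetical and do not exist in CsvDataReader.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of CsvDataReader.
static class QuotedRecordReader
{
    // Reads one logical CSV record, joining physical lines while a
    // quoted field is still open. Returns null at end of input.
    public static string ReadRecord(TextReader reader, char qualifier = '"')
    {
        string line = reader.ReadLine();
        if (line == null) return null;

        var record = new StringBuilder(line);
        int qualifiers = CountChar(line, qualifier);

        // An odd running count means we are inside a quoted field.
        // Escaped qualifiers ("") add two, so they do not change parity.
        while (qualifiers % 2 != 0)
        {
            string next = reader.ReadLine();
            if (next == null) break; // unterminated quote at end of input
            record.Append('\n').Append(next);
            qualifiers += CountChar(next, qualifier);
        }
        return record.ToString();
    }

    // Counts occurrences of a character in a string.
    private static int CountChar(string text, char c)
    {
        int count = 0;
        foreach (char ch in text)
            if (ch == c) count++;
        return count;
    }
}
```

With the sample above, the two physical lines of row 1 would come back as a single record containing an embedded newline, which the existing field-splitting logic could then process as one row.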
Maybe something like
// Requires: using System.IO; using System.Net;
public CsvDataReaderWeb(string url)
{
    // Open the remote CSV as a stream and read the header row from it.
    WebClient client = new WebClient();
    Stream webStream = client.OpenRead(url);
    _stream = new StreamReader(webStream);
    _headers = _stream.ReadLine().Split(',');
}
Right now column names are case-sensitive. I need to either use a new storage mechanism or do something like this: http://stackoverflow.com/questions/8935161/how-to-add-a-case-insensitive-option-to-array-indexof
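A minimal sketch of the second option, assuming the headers stay in a plain `string[]` as in the constructors shown in these issues. `HeaderLookup` and `GetOrdinal` are illustrative names, not the library's own members.

```csharp
using System;

// Hypothetical helper, not part of CsvDataReader.
static class HeaderLookup
{
    // Returns the zero-based index of the named column, ignoring case,
    // or -1 if no header matches.
    public static int GetOrdinal(string[] headers, string name)
    {
        return Array.FindIndex(
            headers,
            h => string.Equals(h, name, StringComparison.OrdinalIgnoreCase));
    }
}
```

If a different storage mechanism were preferred instead, a `Dictionary<string, int>` constructed with `StringComparer.OrdinalIgnoreCase` would give the same case-insensitive behavior with constant-time lookups.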
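Hi,
for my experiments I already added it like this (an additional constructor):
public CsvDataReader(string fileName, string separator, string[] columns)
{
    if (!File.Exists(fileName))
        throw new FileNotFoundException();
    _stream = new StreamReader(fileName);
    // Use the caller-supplied column names instead of reading a header row.
    _headers = columns;
    // A pipe is a regex metacharacter, so escape it before building the pattern.
    if (separator == "|")
        separator = "\\" + separator;
    // Split on separators that are not inside double-quoted fields.
    _CsvRegex = new Regex(separator + "(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))", RegexOptions.Compiled);
}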
Use Case: Identify all rows that won't parse. Maybe return a result that contains only the invalid rows, probably with all the columns returned as a single column. Basically, check the column count after reading a row and spit back the invalid ones.
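The column-count check described above could look roughly like the following. This is a sketch under stated assumptions, not the library's code: `RowValidator`, `FindInvalidRows`, and `CountFields` are hypothetical names, the text qualifier is assumed to be a double quote, and each physical line is treated as one row.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of CsvDataReader.
static class RowValidator
{
    // Returns the raw text of every row whose field count differs from
    // expectedColumns, so each invalid row comes back as a single value.
    public static List<string> FindInvalidRows(
        TextReader reader, int expectedColumns, char separator = ',')
    {
        var invalid = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (CountFields(line, separator) != expectedColumns)
                invalid.Add(line);
        }
        return invalid;
    }

    // Counts fields in a line, ignoring separators inside double quotes.
    private static int CountFields(string line, char separator)
    {
        int fields = 1;
        bool inQuotes = false;
        foreach (char c in line)
        {
            if (c == '"') inQuotes = !inQuotes;
            else if (c == separator && !inQuotes) fields++;
        }
        return fields;
    }
}
```

Returning the raw line keeps the result shape independent of how badly a row is broken, which matches the idea of returning all the columns as a single column.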
If I attempt to import a CSV with a header row of:
"DisplayName","PrimarySMTPAddress"
then an error is thrown:
Exception calling "WriteToServer" with "1" argument(s): "The given ColumnName 'DisplayName' does not match up with any column in data source."
This can be worked around by using
"`"DisplayName`""
in the column mapping, but it would be preferable for the same optional text qualifier logic that is used for the data rows to be used for the header rows as well. My source for this CSV puts double quotes around every single cell whether it is a header or data cell.
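As an illustration of what applying the text qualifier logic to the header row might involve, here is a sketch that splits the header line with the same quote-aware pattern used in the constructor quoted elsewhere in these issues, then strips the surrounding qualifiers. `HeaderParser` and `ParseHeaders` are hypothetical names, and using `Regex.Escape` for the separator is an assumption of this sketch rather than what the library does.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of CsvDataReader.
static class HeaderParser
{
    // Splits a header line on separators that fall outside double quotes,
    // then removes the surrounding quotes from each header name.
    public static string[] ParseHeaders(string headerLine, string separator = ",")
    {
        var splitter = new Regex(
            Regex.Escape(separator) + "(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

        string[] headers = splitter.Split(headerLine);
        for (int i = 0; i < headers.Length; i++)
        {
            string h = headers[i].Trim();
            if (h.Length >= 2 && h[0] == '"' && h[h.Length - 1] == '"')
            {
                // Drop the outer qualifiers and unescape doubled quotes.
                h = h.Substring(1, h.Length - 2).Replace("\"\"", "\"");
            }
            headers[i] = h;
        }
        return headers;
    }
}
```

With header names cleaned this way, a column mapping could refer to `DisplayName` directly instead of embedding escaped quotes in the name.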
I am using the compiled DLL you provided in "releases." I did not compile my own.
Control this via a settable property. If set, the reader will throw an exception on a bad column count but will include either the last good row or the bad row in the exception.
My script runs well if the script is called from e:\xxx and the CsvDataReader.dll is in e:\xxx, but if I move the ps1 script and CsvDataReader.dll to e:\xxx\newpath then I'm getting this error:
Exception calling "LoadFrom" with "1" argument(s): "Could not load file or assembly 'file:///E:\xxx\newpath\CsvDataReader.dll' or one of its dependencies. The system cannot find the file specified."
The file isn't blocked when I look in its properties, and it has the same permissions as the file in the parent directory. Do I need to GAC this thing?
Control this via a settable property. Any failed rows are skipped and returned via a string array property at the end.