Comments (8)
"" is processed as escaped ", like "\"" in Ruby's String.
This is a common convention in CSV. See also RFC 4180: https://datatracker.ietf.org/doc/html/rfc4180#section-2
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
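The rule quoted above can be checked with a short script. This is only a minimal sketch using the RFC's own example line and the default CSV options; the variable name `row` is just for illustration.

```ruby
require "csv"

# The RFC 4180 example: the doubled "" inside the second field
# is read back as a single literal double quote.
row = CSV.parse_line('"aaa","b""bb","ccc"')
p row
# => ["aaa", "b\"bb", "ccc"]
```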
from csv.
@kou thanks for looking into this. Makes sense.
I suppose a way forward could be to provide an option to tell the parser that no escaping is being used in the data. I could look into that if that's something you'd be open to adding.
Though I'm actually thinking that in my case it'd be easier to implement a TSV parser as tabs aren't allowed in values... but it'd be nice to provide a common way to deal with such cases.
from csv.
Can we use col_sep: "\t"?
from csv.
@kou not sure what you mean, the problem is the same with col_sep: "\t". In that case the \t becomes part of the string.
from csv.
I thought that your original data uses \t as a separator and that none of the columns include \t:
The original source is actually using tabs to separate the columns (which aren't allowed in the data)
from csv.
Yes, that's correct. But as the CSV parser converts "" to ", the same issue happens:
CSV.parse_line(%Q{"size: 2" "\t'time: 3''\t"test"}, col_sep: "\t", liberal_parsing: true)
=> ["\"size: 2\" \"", "'time: 3''", "test"]
CSV.parse_line(%Q{"size: 2""\t'time: 3''\t"test"}, col_sep: "\t", liberal_parsing: true)
=> ["\"size: 2\"\t'time: 3''\t\"test\""]
Are you suggesting to adjust the CSV parser to behave differently if tab is used as col_sep?
As I mentioned, I could probably just split the string on "\t" and hence not use the CSV parser at all / implement a simplified TSV parser. I was more wondering if you'd consider adjusting the CSV parser to handle this.
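For illustration, the split-based approach mentioned above might look roughly like this. It is only a sketch under the stated assumption that tabs never occur inside values; the sample line is the one from the example above with a trailing newline added, and the names `line` and `fields` are made up for the example.

```ruby
# Assumption: a tab can only ever be a column separator, so no quoting
# or escaping rules are needed and a plain split is enough.
line = %Q{"size: 2""\t'time: 3''\t"test"\n}

# The -1 limit keeps trailing empty columns instead of dropping them.
fields = line.chomp.split("\t", -1)
p fields
# => ["\"size: 2\"\"", "'time: 3''", "\"test\""]
```

The quotation marks stay part of each value here, so nothing is unescaped or merged.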
from csv.
Ah, you just used CSV not TSV as an example to show your use case, right?
for some reason decided to quote each value
Is it your choice for parsing? Or is it decided by a person who created the original source?
(It seems that the first and third columns are quoted by " and the second column is quoted by ' not ". Is it intentional?)
If it's your choice, how about just stopping it and using quote_char: nil?
pp CSV.parse_line(%Q{size: 2" \ttime: 3'\ttest}, col_sep: "\t", quote_char: nil)
# ["size: 2\" ", "time: 3'", "test"]
pp CSV.parse_line(%Q{size: 2"\ttime: 3'\ttest}, col_sep: "\t", quote_char: nil)
# ["size: 2\"", "time: 3'", "test"]
from csv.
@kou right, I suppose I used the quote_char so I don't need to strip the quotation marks afterwards, but that should be easy to do. Thanks for your help!
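A rough sketch of that stripping step, in case it helps someone else. `strip_outer_quotes` is a hypothetical helper, not part of the csv library, and it assumes each value is wrapped in exactly one matching pair of " or ' characters.

```ruby
require "csv"

# Hypothetical helper: removes one matching pair of surrounding quotes
# and leaves everything inside the value untouched.
def strip_outer_quotes(field)
  return field unless field.is_a?(String) && field.length >= 2
  return field unless ['"', "'"].include?(field[0]) && field[0] == field[-1]

  field[1..-2]
end

line = %Q{"size: 2""\t'time: 3''\t"test"}

# quote_char: nil makes the parser treat " as an ordinary character.
row = CSV.parse_line(line, col_sep: "\t", quote_char: nil)
         .map { |field| strip_outer_quotes(field) }
p row
# => ["size: 2\"", "time: 3'", "test"]
```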
from csv.