ruby / csv
CSV Reading and Writing
Home Page: https://ruby.github.io/csv/
License: BSD 2-Clause "Simplified" License
reproduction code:
require 'csv'
puts CSV::VERSION
csv = CSV.new(<<~CSV, headers: true, return_headers: true)
head1,head2,head3
aaa,bbb,ccc
ddd,ee"e.fff
ggg,hhh,iii
CSV
until csv.eof?
begin
p csv.gets
rescue CSV::MalformedCSVError => e
p e
end
end
ruby 2.6.2:
$ ruby -v && ruby /tmp/csv.rb
ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-linux]
3.0.4
#<CSV::Row "head1":"head1" "head2":"head2" "head3":"head3">
#<CSV::Row "head1":"aaa" "head2":"bbb" "head3":"ccc">
#<CSV::MalformedCSVError: Illegal quoting in line 3.>
Traceback (most recent call last):
3: from /tmp/csv.rb:1:in `each'
2: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:236:in `parse'
1: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:236:in `new'
/home/krororo/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/row.rb:35:in `initialize': undefined method `size' for nil:NilClass (NoMethodError)
ruby 2.5.3:
$ ruby -v && ruby /tmp/csv.rb
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-linux]
2.4.8
#<CSV::Row "head1":"head1" "head2":"head2" "head3":"head3">
#<CSV::Row "head1":"aaa" "head2":"bbb" "head3":"ccc">
#<CSV::MalformedCSVError: Illegal quoting in line 3.>
#<CSV::Row "head1":"ggg" "head2":"hhh" "head3":"iii">
The expected behavior is to get data after an error occurs.
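One possible way to keep reading after a malformed row is to parse each physical line on its own, so that an error on one line cannot disturb the parser state for later lines. This is only a sketch: `parse_lenient` is a hypothetical helper name (not part of the csv library), it assumes no quoted field contains an embedded line break, and it does not build CSV::Row objects from the header line.

```ruby
require "csv"

# Hypothetical helper: parses each physical line independently.
# Assumes no quoted field spans more than one line.
def parse_lenient(text)
  rows = []
  errors = []
  text.each_line.with_index(1) do |line, line_number|
    begin
      rows << CSV.parse_line(line)
    rescue CSV::MalformedCSVError => e
      # Record the failure and move on to the next line.
      errors << [line_number, e.message]
    end
  end
  [rows, errors]
end

rows, errors = parse_lenient(<<~DATA)
  head1,head2,head3
  aaa,bbb,ccc
  ddd,ee"e.fff
  ggg,hhh,iii
DATA
p rows
p errors.map(&:first)
```

The trade-off is that line numbers refer to physical lines rather than CSV records, which only coincide when fields never contain line breaks.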
As the title says, passing an IO object that supports pos but not rewind to CSV.new or CSV.parse can cause silent data corruption (specifically, loss of the first 1024 bytes of input) if row_sep is not explicitly specified.
I originally ran into this problem on JRuby, while trying to use a non-markable Java InputStream (specifically, a java.util.zip.ZipInputStream) wrapped with .to_io as the input to CSV.parse. However, it's also possible to demonstrate the problem with a simple wrapper around StringIO:
require 'csv'
require 'stringio'
require 'forwardable'
class DummyIO
extend Forwardable
def_delegators :@io, :gets, :read, :pos # no seek or rewind!
def initialize(data)
@io = StringIO.new(data)
end
end
csv = (1..10).map do |row|
(1..100).map { |col| "row#{row}col#{col}" }.to_a.join(",")
end.to_a.join("\n")
CSV.new(DummyIO.new(csv)).each do |row|
puts row.inspect
end
The output of this code, which should begin with ["row1col1", "row1col2", "row1col3", instead begins with ["ol4", "row2col5", "row2col6", "row2col7".
Ideally, the row_sep autodetection should be rewritten to save the data it reads, so that it can be later parsed without having to seek or rewind the input at all. This would allow any non-seekable input streams to be safely used without needing any special-case hacks.
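Until the autodetection works without rewinding, one workaround is to read the whole stream into a String and parse that instead of the IO. The sketch below assumes the input fits in memory; `ForwardOnlyIO` and `parse_forward_only` are hypothetical names used only for illustration.

```ruby
require "csv"
require "stringio"
require "forwardable"

# Stand-in for a forward-only stream (no seek, no rewind).
class ForwardOnlyIO
  extend Forwardable
  def_delegators :@io, :gets, :read, :pos

  def initialize(data)
    @io = StringIO.new(data)
  end
end

# Hypothetical helper: read everything first, then parse the String,
# so the parser never needs to reposition the stream.
def parse_forward_only(io, **options)
  CSV.parse(io.read, **options)
end

data = (1..10).map { |row|
  (1..100).map { |col| "row#{row}col#{col}" }.join(",")
}.join("\n")

rows = parse_forward_only(ForwardOnlyIO.new(data))
p rows.first.first(3)
```

This gives up streaming, so it is not a substitute for the fix proposed above when inputs are large.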
Currently CSV is a single-file library, and a 2300+ LOC file is difficult to manage. We should probably follow the RubyGems proposal on how to structure the gem.
Pros:
Cons:
Plan:
With this we will have 5 files, and there will still be work to do via future refactors (a 1700+ LOC file still looks like a red flag) and unit tests.
I have a file that uses utf-16le encoding.
require 'csv'
CSV.foreach('shifenzheng.txt', encoding: "utf-16le") do |row|
p row
sleep 10
end
I got the following error:
ArgumentError: invalid byte sequence in UTF-16LE
from ~/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/csv.rb:2046:in `=~'
I tried to:
CSV.foreach('shifenzheng.txt', encoding: 'utf-16le:utf-8') do |row|
p row; sleep 10;
end
This worked.
I tried to:
fp = File.open 'shifenzheng.txt', 'rb', encoding: 'utf-16le'
fp.readline
This worked well, too.
So I looked into the csv code; it seems CSV::foreach does not accept an argument for reading by bytes.
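An alternative to passing an encoding pair is to decode the file up front and hand UTF-8 text to the parser. The following is a sketch under a few assumptions: the file is small enough to read at once, it has no byte order mark (a leading BOM would end up in the first field), and `read_utf16le_csv` is a hypothetical helper name.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: read raw bytes, label them as UTF-16LE,
# transcode to UTF-8, then parse the resulting text.
def read_utf16le_csv(path)
  text = File.binread(path)
             .force_encoding(Encoding::UTF_16LE)
             .encode(Encoding::UTF_8)
  CSV.parse(text)
end

Tempfile.create(["sample", ".txt"]) do |file|
  file.binmode
  file.write("name,city\n山田,東京\n".encode(Encoding::UTF_16LE))
  file.flush
  p read_utf16le_csv(file.path)
end
```

If the bytes are not valid UTF-16LE, String#encode raises an encoding error before the CSV parser is involved, which makes the failure easier to attribute.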
When I run the following test code, Ruby 2.6.1's CSV library returns a different result from Ruby 2.5.3.
require 'csv'
csv = <<-CSV
alice,有栖,ありす,,,,,,,,
CSV
rows = CSV.parse(csv, headers: true)
p rows.headers
$ ruby -v csv.rb
ruby 2.6.1p33 (2019-01-30 revision 66950) [x86_64-darwin18]
["alice", "有栖", "ありす", nil, nil, nil, nil, nil, nil, nil, nil]
$ ruby -v csv.rb
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
[]
I haven't figured out how to solve the problem, but I found some conditions that reproduce it:
File.open is used
the skip_lines option is present
v3.0.1 works correctly. It happens in v3.0.2 and after.
Test case:
def test_three_bytes_chars
Tempfile.create(['temp', '.csv']) do |tempfile|
tempfile.close
path = tempfile.path
text = "\xE5\x93\x88\xE5\x9B\x89"
File.open(path, "w") do |csv|
row = [text].join(',').concat("\n")
row_count = (32 * 1024) / (StringIO.new(row).length) + 1
csv << (row * row_count)
end;
assert_equal(
[text],
CSV.read(File.open(path), headers: true, :skip_lines => /\A#/).to_a.last
)
end
end
Results in Ruby 2.5.5:
# Running tests:
[102/756] TestCSVFeatures#test_three_bytes_chars = 0.06 s
1) Failure:
TestCSVFeatures#test_three_bytes_chars [/Users/jeff/project/csv/test/csv/test_features.rb:341]:
<["哈囉"]> expected but was
<["囉"]>.
[142/756] TestCSVFeatures::DifferentOFS#test_three_bytes_chars = 0.06 s
2) Failure:
TestCSVFeatures::DifferentOFS#test_three_bytes_chars [/Users/jeff/project/csv/test/csv/test_features.rb:341]:
<["哈囉"]> expected but was
<["囉"]>.
Finished tests in 2.863357s, 264.0258 tests/s, 2669.9430 assertions/s.
756 tests, 7645 assertions, 2 failures, 0 errors, 0 skips
ruby -v: ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-darwin18]
Is this Ruby's official CSV library now? I tried using FasterCSV with ruby 2.4.0, but it bails with an error requiring I use "ruby 1.9 csv" instead. Since this code is available under the official ruby/csv namespace on GitHub, I assume that this is that gem, but there's no documentation, and the gemspec says that 2.5.0 is the minimum required ruby version. Should I be looking at the FasterCSV docs? Is this just that, directly moved over?
If I'm in the wrong place, please point me to the docs for whatever CSV gem that works with 2.4.0.
For example:
require "csv"
p CSV.parse('1, 2,3 , 4 ,5', remove_spaces_around_column_separator: true) # Needs shorter name
# [["1", "2", "3", "4", "5"]]
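Until such an option exists, a field converter can give a similar result. This is a sketch, not the proposed feature: `parse_stripped` is a hypothetical helper, and unlike an option scoped to the column separator it also trims whitespace inside quoted fields.

```ruby
require "csv"

# Hypothetical helper: trims surrounding whitespace from every String field.
def parse_stripped(text, **options)
  strip_field = ->(field) { field.is_a?(String) ? field.strip : field }
  CSV.parse(text, converters: [strip_field], **options)
end

p parse_stripped("1, 2,3 , 4 ,5")
```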
Hello,
I'm creating this issue to learn more about the methods exposed by the CSV object that are delegated to the underlying IO object, including #eof?.
For ruby < 2.6, whenever we called #readline, #eof? would return the state of the IO and tell whether the end of the file had been reached. However, this behaviour has changed in ruby >= 2.6, and it seems that we can no longer rely on #eof? to tell the state of the IO when we want to call readline on the CSV object.
Is this the intended behaviour? If so, shouldn't #eof? (and possibly other methods delegated to the IO) be deprecated, or at least carry a notice in the documentation that we shouldn't rely on it (them)?
If not, I'd be happy to give it a try and fix it.
Thanks
ruby 2.6.1p33 (2019-01-30 revision 66950) [x86_64-darwin16]
For the following CSV test.csv file:
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11,12
and the following code snippet:
require 'csv'
csv = ::CSV.open('./test.csv', {headers: true, col_sep: ','})
p csv.eof?
csv.readline
p csv.eof?
csv.readline
p csv.eof?
csv.readline
p csv.eof?
I got the following results with 2.6.1:
false
true
true
true
but using ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin16]
I got:
false
false
false
true
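A loop that does not consult #eof? at all avoids the difference between the two versions. The sketch below relies on CSV#shift returning nil once no rows remain; `remaining_rows` is a hypothetical helper name.

```ruby
require "csv"

# Hypothetical helper: collects rows until CSV#shift returns nil,
# so the loop does not depend on the position of the underlying IO.
def remaining_rows(csv)
  rows = []
  while (row = csv.shift)
    rows << row
  end
  rows
end

csv = CSV.new(<<~DATA, headers: true, col_sep: ",")
  a,b,c,d
  1,2,3,4
  5,6,7,8
  9,10,11,12
DATA
p remaining_rows(csv).map(&:to_h)
```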
This is one of the most confusing exceptions in the stdlib. I'd expect something like CSV::MalformedCSVError, but I get ArgumentError. I have to use something like this in our code to be sure I don't catch anything other than an illegal character on a given line:
loop do
row = csv.gets
break unless row
...
rescue ArgumentError => exception
raise if exception.message != 'invalid byte sequence in UTF-8'
...
end
I have a CSV file with "forced quotes" and a UTF-8 BOM (\xEF\xBB\xBF) which CSV cannot read after a rewind. I get "CSV::MalformedCSVError: Illegal quoting in line 1."
My UTF-8 CSV file with BOM:
File.open('bom_test.csv', 'w') do |io|
io.write("\xEF\xBB\xBF\"Name\",\"City\"\n\"John Doe\",\"New York\"")
end
Reproduce error:
# Case 1
csv = CSV.open('bom_test.csv', 'r:BOM|UTF-8', {headers: true})
csv.shift
# => #<CSV::Row "Name":"John Doe" "City":"New York">
csv.rewind
csv.shift
# => CSV::MalformedCSVError (Illegal quoting in line 1.)
# Case 2
csv = CSV.open('bom_test.csv', 'r:BOM|UTF-8', {headers: true})
csv.readline
# => #<CSV::Row "Name":"John Doe" "City":"New York">
csv.rewind
csv.readline
# => CSV::MalformedCSVError (Illegal quoting in line 1.)
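When the file fits in memory, one workaround is to let File strip the BOM while reading and to parse the resulting String, so that nothing needs to be rewound. This is a sketch of that idea; `read_bom_csv` is a hypothetical helper name.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: File removes the BOM via the BOM|UTF-8 mode,
# and the parsed table can be iterated any number of times.
def read_bom_csv(path)
  text = File.open(path, "r:BOM|UTF-8", &:read)
  CSV.parse(text, headers: true)
end

Tempfile.create(["bom_test", ".csv"]) do |file|
  file.binmode
  file.write("\xEF\xBB\xBF\"Name\",\"City\"\n\"John Doe\",\"New York\"")
  file.flush
  table = read_bom_csv(file.path)
  2.times { table.each { |row| p row.to_h } }
end
```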
Both gem versions on RubyGems, 0.0.1 and 0.1.0, now require ruby 2.5.0-dev. Neither released gem can be installed with released versions of ruby like 2.3.x.
This may work:
@@ -360,12 +320,14 @@ class CSV
if @liberal_parsing
@unquoted_value = Regexp.new("[^".encode(@encoding) +
escaped_column_separator +
- "\r\n]+".encode(@encoding))
+ escaped_row_separator +
+ "]+".encode(@encoding))
else
@unquoted_value = Regexp.new("[^".encode(@encoding) +
escaped_quote_character +
escaped_column_separator +
- "\r\n]+".encode(@encoding))
+ escaped_row_separator +
+ "]+".encode(@encoding))
end
@cr_or_lf = Regexp.new("[\r\n]".encode(@encoding))
@not_line_end = Regexp.new("[^\r\n]+".encode(@encoding))
If we don't have a quote character, we can use line.split(column_separator).
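The split-based fast path could look roughly like the sketch below. `split_unquoted_line` is a hypothetical helper, not code from the library, and it assumes the input contains no quote characters at all.

```ruby
# Hypothetical helper for input known to contain no quote characters.
def split_unquoted_line(line, column_separator = ",")
  # The -1 limit keeps trailing empty fields, which String#split
  # drops by default.
  line.chomp.split(column_separator, -1)
end

p split_unquoted_line("a,,b,\n")
```

Note that this yields empty strings for empty fields, whereas the CSV parser returns nil for them, so a real fast path would still need to map "" to nil to stay compatible.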
First of all, I don't know whether the following CSV is in a valid format.
a,""b""
Ruby 2.5's CSV library handles it well.
However, Ruby 2.6's shows different behavior.
$ ruby -v t.rb
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-darwin18]
Traceback (most recent call last):
5: from t.rb:7:in `<main>'
4: from /Users/watson/.rbenv/versions/2.6.0/lib/ruby/2.6.0/csv.rb:683:in `parse'
3: from /Users/watson/.rbenv/versions/2.6.0/lib/ruby/2.6.0/csv.rb:1180:in `read'
2: from /Users/watson/.rbenv/versions/2.6.0/lib/ruby/2.6.0/csv.rb:1180:in `to_a'
1: from /Users/watson/.rbenv/versions/2.6.0/lib/ruby/2.6.0/csv.rb:1171:in `each'
/Users/watson/.rbenv/versions/2.6.0/lib/ruby/2.6.0/csv/parser.rb:273:in `parse': Do not allow except col_sep_split_separator after quoted fields in line 1. (CSV::MalformedCSVError)
$ ruby -v t.rb
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
[["a", "\"b\""]]
require 'csv'
csv =<<CSV
a,""b""
CSV
p CSV.parse(csv)
When trying to parse a file that contains extra whitespace at the end of the line, the parser fails with CSV::MalformedCSVError (Any value after quoted field isn't allowed in line 1.)
Other programs (Numbers, OpenOffice, etc) seem to have no trouble with the file.
CSV gem version: 3.1.0
Ruby version: 2.5.1
File contents (note that there is one extra whitespace character after the closing double quote of header_2):
"header_1","header_2"
"value_1","value_2"
$ irb
2.5.1 :001 > require 'csv'
=> true
2.5.1 :002 > csv = CSV.open('bad_file.csv', headers: true)
=> <#CSV io_type:File io_path:"bad_file.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"" headers:true>
2.5.1 :003 > csv.each { |row| puts row.inspect }
Traceback (most recent call last):
11: from .rvm/rubies/ruby-2.5.1/bin/irb:11:in '<main>'
10: from (irb):3
9: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv.rb:1243:in 'each'
8: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv.rb:1243:in 'each'
7: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:303:in 'parse'
6: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:779:in 'parse_quotable_loose'
5: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:28:in 'each_line'
4: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:28:in 'each_line'
3: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:31:in 'block in each_line'
2: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:818:in 'block in parse_quotable_loose'
1: from .rvm/gems/ruby-2.5.1/gems/csv-3.1.0/lib/csv/parser.rb:869:in 'parse_quotable_robust'
CSV::MalformedCSVError (Any value after quoted field isn't allowed in line 1.)
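As a stopgap, trailing spaces and tabs can be removed before the text reaches the parser. This is a sketch with a hypothetical helper name, `strip_trailing_whitespace`; it also trims trailing whitespace from unquoted last fields and at line breaks inside multi-line quoted fields, so it only suits data where that is acceptable.

```ruby
require "csv"

# Hypothetical helper: drops spaces and tabs that sit directly
# before a line end or the end of the text.
def strip_trailing_whitespace(text)
  text.gsub(/[ \t]+(?=\r?\n|\z)/, "")
end

raw = %("header_1","header_2" \n"value_1","value_2"\n)
p CSV.parse(strip_trailing_whitespace(raw))
```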
We have a .CSV from a vendor that quotes numeric fields that should be treated as strings with =" -- for example (excerpt from a real file)
USD,="6161015399000000",54585234
I'm going to experiment with a custom converter. If that works, I'll close this issue.
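A different approach from a custom converter is to rewrite the ="..." fields as ordinary quoted fields before parsing. The sketch below does that with a regular expression; `unwrap_formula_strings` is a hypothetical helper, and it assumes the wrapped value contains no double quotes and that the pattern never occurs inside another quoted field.

```ruby
require "csv"

# Hypothetical helper: turns ="value" at the start of a field into "value".
# Assumes the value itself contains no double quotes.
def unwrap_formula_strings(line)
  line.gsub(/(^|,)="([^"]*)"/) { "#{$1}\"#{$2}\"" }
end

p CSV.parse_line(unwrap_formula_strings('USD,="6161015399000000",54585234'))
```

Because no numeric converter is applied, the long account number stays a String, which matches the vendor's intent.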
Currently the project has no test coverage report. One would be nice to have for maintaining and improving the CSV gem.
Pros:
Cons:
Plan (if accepted):
Is this behavior correct when cloning a CSV::Row?
irb(main):006:0> row = CSV::Row.new([:name], ['Andre'])
=> #<CSV::Row name:"Andre">
irb(main):007:0> row[:name]
=> "Andre"
irb(main):008:0> dup_row = row.dup
=> #<CSV::Row name:"Andre">
irb(main):009:0> dup_row[:name]
=> "Andre"
irb(main):010:0> dup_row.delete(:name)
=> [:name, "Andre"]
irb(main):011:0> dup_row[:name]
=> nil
irb(main):012:0> row['name']
=> nil
I expected that modifying the duplicated row would not change the original object.
Using ruby 2.5.1p57
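On versions where #dup shares the underlying storage, an independent copy can be built from the row's headers and fields. This is a sketch; `copy_row` is a hypothetical helper, and the field values themselves are still shared objects, only the row structure is new.

```ruby
require "csv"

# Hypothetical helper: builds a new row from fresh header and field arrays.
def copy_row(row)
  CSV::Row.new(row.headers, row.fields, row.header_row?)
end

row = CSV::Row.new([:name], ["Andre"])
copy = copy_row(row)
copy.delete(:name)
p row[:name]
p copy[:name]
```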
Since Ruby's internal encoding is UTF-8, I expect CSV.generate to return a UTF-8 string as well.
However, CSV.generate returns an ASCII-8BIT string by default.
Line 537 in 9b81ece
It seems that String.new returns an ASCII-8BIT string.
I want to know whether this is the right behavior in Ruby 2.6's CSV library. Thank you!!!
$ ruby -v t.rb
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-darwin18]
#<Encoding:ASCII-8BIT>
"\xE5\xBA\x97\xE5\x90\x8D,\xE5\x90\x8D\xE7\xA7\xB0,\xE7\xA8\x8E\xE9\xA1\x8D\xE5\x8C\xBA\xE5\x88\x86,\xE6\x95\xB0\xE9\x87\x8F,\xE7\xA8\x8E\xE9\xA1\x8D\n"
$ ruby -v t.rb
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
#<Encoding:UTF-8>
"店名,名称,税額区分,数量,税額\n"
require 'csv'
csv_string = CSV.generate do |csv|
csv << ["店名", "名称", "税額区分", "数量", "税額"]
end
p csv_string.encoding
p csv_string
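If the output does come back as ASCII-8BIT, the bytes can be relabeled as UTF-8 after generation. The sketch below wraps CSV.generate in a hypothetical helper, `generate_utf8`, and only relabels when the bytes form valid UTF-8; it does not change what the library itself returns.

```ruby
require "csv"

# Hypothetical helper: relabels binary output as UTF-8 when the bytes
# are valid UTF-8; otherwise the output is returned unchanged.
def generate_utf8
  output = CSV.generate { |csv| yield csv }
  if output.encoding == Encoding::ASCII_8BIT
    relabeled = output.dup.force_encoding(Encoding::UTF_8)
    output = relabeled if relabeled.valid_encoding?
  end
  output
end

csv_string = generate_utf8 do |csv|
  csv << ["店名", "名称", "税額区分", "数量", "税額"]
end
p csv_string.encoding
p csv_string
```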
Given CSV.generate_line(["",""], col_sep: '|'), the resulting ""|"" is unexpected. The expected result is a simple |.
It appears the issue is caused by
Line 1438 in ba560e4
If the current behavior is on purpose (which seems to be the case), could an option be introduced to not quote empty values, such as quote_empty: true?
(Also, does anyone know the smallest workaround for this that doesn't break legitimate double-double-quotes in CSV quoting?)
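One small workaround that leaves legitimate doubled quotes alone is to turn empty strings into nil before generating, since nil fields are written as empty and unquoted. This is a sketch with a hypothetical helper name, `generate_line_without_empty_quotes`; the cost is that empty strings and nil can no longer be told apart when the line is read back.

```ruby
require "csv"

# Hypothetical helper: nil fields are written as empty unquoted fields,
# so empty strings are mapped to nil before the line is generated.
def generate_line_without_empty_quotes(fields, **options)
  CSV.generate_line(fields.map { |f| f == "" ? nil : f }, **options)
end

p generate_line_without_empty_quotes(["", ""], col_sep: "|")
p generate_line_without_empty_quotes(["a", "", 'say "hi"'], col_sep: "|")
```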
I have noticed a different behaviour between Ruby <= 2.4.3 and Ruby 2.5.0 for the #open method.
If you create an empty file for writing and do not write any line to that CSV file, Ruby <= 2.4.3 doesn't write anything (the file stays empty), but Ruby 2.5.0 writes the headers.
$ ruby -v
ruby 2.4.3p205 (2017-12-14 revision 61247) [x86_64-darwin17]
$ irb
irb(main):001:0> require "csv"
=> true
irb(main):002:0> CSV.open("ruby-2.4.3.csv", "wb", headers: ["name", "surname"], write_headers: true) { }
=> nil
irb(main):003:0> `cat ruby-2.4.3.csv`
=> ""
$ ruby -v
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
$ irb
irb(main):001:0> require "csv"
=> true
irb(main):002:0> CSV.open("ruby-2.5.0.csv", "wb", headers: ["name", "surname"], write_headers: true) { }
=> nil
irb(main):003:0> `cat ruby-2.5.0.csv`
=> "name,surname\n"
In the examples I'm using an empty block, but in a real application you will probably have an if statement, something like this:
CSV.open(...) do |csv|
csv << ["hello"] if condition
end
Hi there, I updated my app to ruby-2.5 and have some problems with passing the encoding bom|utf-8 to the csv parser.
I created sample Dockerfiles for this:
With ruby-2.4.3 everything works fine:
FROM ruby:2.4.3
RUN echo 'test' > test.csv
RUN echo "require 'csv'\ncsv = CSV.read('test.csv', encoding: 'bom|utf-8')\n p csv[0][0]" > script.rb
CMD ruby script.rb
output is "test"
But for ruby-2.5.0 an encoding error happens:
FROM ruby:2.5.0
RUN echo 'test' > test.csv
RUN echo "require 'csv'\ncsv = CSV.read('test.csv', encoding: 'bom|utf-8')\n p csv[0][0]" > script.rb
CMD ruby script.rb
Error is:
/usr/local/lib/ruby/2.5.0/csv.rb:1532:in `find': unknown encoding name - bom|utf-8 (ArgumentError)
from /usr/local/lib/ruby/2.5.0/csv.rb:1532:in `initialize'
from /usr/local/lib/ruby/2.5.0/csv.rb:1280:in `new'
from /usr/local/lib/ruby/2.5.0/csv.rb:1280:in `open'
from /usr/local/lib/ruby/2.5.0/csv.rb:1346:in `read'
from script.rb:2:in `<main>'
Is it possible to make the following error message less cryptic? I have no idea what col_sep_split_separator is (it doesn't appear in the documentation or in the parameters), and the wording is not clear at all:
Line 687 in 000221d
In the past CSV.parse('"a"bc"') returned "CSV::MalformedCSVError: Unclosed quoted field on line 1.", which states clearly that a quote is unclosed.
Now, it returns "Do not allow except col_sep_split_separator after quoted fields in line 1.", and I think most people wouldn't know what is causing the problem.
opts = {row_sep: "|\n", col_sep: ","}
CSV.parse(CSV.generate(opts) { |csv|
csv << ["yes, it's true"];
csv << [ "CSV is broken"];
csv << ["uhoh!"]; }, opts)
In the example above, the parser will raise a CSV::MalformedCSVError (Unquoted fields do not allow new line <"\n"> in line 2.) error.
It will succeed if any of the following are done:
force_quotes: true is passed to generate
row_sep: "\n" is used
Therefore, the problem seems to be the combination of a non-standard row_sep and a field that is quoted in one row and unquoted in the next.
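The first workaround listed above can be sketched as a round trip in which every field is quoted on output, so the parser never sees an unquoted field next to the custom row separator. `round_trip_with_forced_quotes` is a hypothetical helper name used only for this illustration.

```ruby
require "csv"

# Hypothetical helper: generates with force_quotes: true, then parses
# the result back with the same separators.
def round_trip_with_forced_quotes(rows, **opts)
  generated = CSV.generate(**opts, force_quotes: true) do |csv|
    rows.each { |row| csv << row }
  end
  CSV.parse(generated, **opts)
end

opts = { row_sep: "|\n", col_sep: "," }
rows = [["yes, it's true"], ["CSV is broken"], ["uhoh!"]]
p round_trip_with_forced_quotes(rows, **opts) == rows
```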
The CSV parser seems to call @input.string if @input is a StringIO, but I think it should check whether @input.pos is zero before doing that.
require 'csv'
require 'stringio'
strio = StringIO.new(<<'EOF')
aaa,b,c
EOF
p strio.read(2) #=> "aa"
p strio.pos #=> 2
p CSV.parse_line(strio) #=> ["aaa", "b", "c"] (["a", "b", "c"] is expected)
This problem was found by @katsyoshi and presented on ruby-jp.slack.com. I tracked it down and identified the causing commit as eeab2ed, which was between v3.0.1 and v3.0.2.
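On affected versions, a caller can sidestep the problem by reading the next line from the StringIO itself and passing that String to the parser. This is a sketch with a hypothetical helper, `parse_line_from_current_pos`, and it assumes the record does not span multiple lines.

```ruby
require "csv"
require "stringio"

# Hypothetical helper: reads one line from the current position and
# parses only that text. Returns nil at end of input.
def parse_line_from_current_pos(io)
  line = io.gets
  line && CSV.parse_line(line)
end

strio = StringIO.new("aaa,b,c\n")
strio.read(2) # consume "aa"
p parse_line_from_current_pos(strio)
```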
When iterating through multiple converters, if any single converter returns a symbol, subsequent converters are skipped, as seen in the #convert_fields docs below:
#
# Processes +fields+ with <tt>@converters</tt>, or <tt>@header_converters</tt>
# if +headers+ is passed as +true+, returning the converted field set. Any
# converter that changes the field into something other than a String halts
# the pipeline of conversion for that field. This is primarily an efficiency
# shortcut.
#
def convert_fields(fields, headers = false)
# see if we are converting headers or fields
converters = headers ? @header_converters : @converters
fields.map.with_index do |field, index|
converters.each do |converter|
break if field.nil?
field = if converter.arity == 1 # straight field converter
converter[field]
else # FieldInfo converter
header = @use_headers && !headers ? @headers[index] : nil
converter[field, FieldInfo.new(index, lineno, header)]
end
break unless field.is_a? String # short-circuit pipeline for speed
end
field # final state of each field, converted or original
end
end
However the docs for the #initialize options don't relay that same requirement about converter arrays:
# <b><tt>:converters</tt></b>:: An Array of names from the Converters
# Hash and/or lambdas that handle custom
# conversion. A single converter
# doesn't have to be in an Array. All
# built-in converters try to transcode
# fields to UTF-8 before converting.
# The conversion will fail if the data
# cannot be transcoded, leaving the
# field unchanged.
This leads a reader to believe that you can pass converters like so:
:header_converters => [:symbol, lambda { |h| h.do_some_stuff }]
But this will result in lambda { |h| h.do_some_stuff } never executing.
I could reverse the order of my converters, as shown below, and then both would execute, but the intended result depends on the correct order.
:header_converters => [lambda { |h| h.do_some_stuff }, :symbol]
My workaround:
Because I didn't want to duplicate your :symbol lambda (and risk falling out of sync with subsequent changes), I had to write a custom lambda, that as a part of its definition invokes the default symbol lambda:
def self.symbol_and_squeezed_lambda
lambda do |header|
native_symbol_lambda = CSV::HeaderConverters[:symbol]
first_conversion = native_symbol_lambda.call(header)
first_conversion.to_s.squeeze('_').to_sym
end
end
...
:header_converters => symbol_and_squeezed_lambda
All of this could only be found by putting debuggers inside csv.rb and tracing my way down to and through the convert_fields method before I realized why my second/custom converter wasn't being executed.
Possible changes:
1. Document this behavior in the #initialize options
2. Revisit the break in #convert_fields
Even if change 1 is implemented, it would only help clarity, and would still require the use of a workaround as I showed above.
If the headers: true option is given, CSV.parse() returns a CSV::Table object with Ruby 2.5.0, even if an empty string is passed as the argument.
However, in Ruby 2.6 this behavior changed incompatibly.
I want to know whether this is expected. Thank you!!
$ ruby -v ttt.rb
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-darwin18]
[]
Traceback (most recent call last):
ttt.rb:4:in `<main>': undefined method `headers' for []:Array (NoMethodError)
$ ruby -v ttt.rb
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
#<CSV::Table mode:col_or_row row_count:1>
[]
require 'csv'
row = CSV.parse('', headers: true)
p row
p row.headers
I tried running Activeadmin's tests against ruby master, and it all went good, except for a single failure, related to encodings.
I researched a bit and it all comes down to the following change in behaviour:
$ ruby -EUTF-8:UTF-8 -ve 'require "csv"; puts CSV.generate_line(["おはようございます".encode("Shift_JIS")], encoding: Encoding::Shift_JIS).encoding'
ruby 2.6.5p106 (2019-08-29 revision 67799) [x86_64-linux]
last_commit=Revert "merge revision(s) 53e9908d8afc7f03109b0aafd1698ab35f512b05: [Backport #15916]"
Shift_JIS
$ ruby -EUTF-8:UTF-8 -ve 'require "csv"; puts CSV.generate_line(["おはようございます".encode("Shift_JIS")], encoding: Encoding::Shift_JIS).encoding'
ruby 2.7.0dev (2019-09-09T12:27:40Z master 89c5d5a64e) [x86_64-linux]
/home/deivid/.rbenv/versions/master/lib/ruby/gems/2.7.0/gems/csv-3.1.1/lib/csv.rb:568: warning: The last argument is used as the keyword parameter
/home/deivid/.rbenv/versions/master/lib/ruby/gems/2.7.0/gems/csv-3.1.1/lib/csv.rb:915: warning: for `initialize' defined here
UTF-8
I'm experiencing a regression in the CSV library on upgrading to ruby 2.6.0:
Consider:
require 'csv'
CSV.parse("John Doe Male\nJane Doe Female", col_sep: " ")
In ruby 2.5.1, this returns:
=> [["John Doe", "Male"], ["Jane Doe", "Female"]]
In ruby 2.6.0, this raises an exception (with a TODO in the message?)
CSV::MalformedCSVError (TODO: Meaningful message in line 1.)
require 'csv'
data = CSV.new(<<~ROWS, headers: true)
Name,Department,Salary
Bob,Engineering,1000
ROWS
data.each do |row|
puts row.to_s
end
puts 'second loop'
data.each do |row|
puts row.to_s
end
ruby csv_bug.rb
Bob,Engineering,1000
second loop
Name,Department,Salary <--- 🔥this should not be here 🔥
Bob,Engineering,1000
ruby csv_bug.rb
Bob,Engineering,1000
second loop
ruby csv_bug.rb
Bob,Engineering,1000
second loop
I may take a crack at fixing this myself, but I'm almost certain this isn't desired behavior.
It seems the row pointer is being rewound when calling .each twice, but on the second iteration the rewind includes the header line, even though I've specified headers: true.
Side note: in the future I think .each should always rewind the row counter after execution, which appears to be what it's trying to do in 2.6.0, but it's accidentally including the header row.
Bob should ask for a raise
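When the data needs to be traversed more than once, parsing it into a CSV::Table first avoids relying on how #each rewinds. This is a sketch of that alternative, not a fix for the behaviour described above.

```ruby
require "csv"

# CSV.parse with headers: true returns a CSV::Table, which holds the
# parsed rows in memory and can be iterated repeatedly.
table = CSV.parse(<<~ROWS, headers: true)
  Name,Department,Salary
  Bob,Engineering,1000
ROWS

2.times do |pass|
  puts "pass #{pass + 1}"
  table.each { |row| puts row.to_s }
end
```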
I want to compare performance between multiple csv versions.
benchmark-ips doesn't support the feature.
Hello,
We validate users' CSV files. One of the validation steps is validating the encoding of each line.
Our validation method looks something like this:
def valid_and_parse_lines
line_number = 2 # count start after headers
loop do
row = csv.gets
break unless row
line = line_initialization(row)
add_error(line_number, line.errors.full_messages) if line.invalid?
rescue ArgumentError => exception
raise if exception.message != 'invalid byte sequence in UTF-8'
add_error(line_number, 'This line contains invalid UTF-8 characters')
ensure
line_number += 1
end
csv.rewind
end
The CSV is loaded this way
@csv ||= ::CSV.new(file || content, encoding: Encoding::UTF_8, headers: true)
What is the issue? If some of the lines in the first 1kB block (used by stdlib's CSV parser for some detection) are invalid, some lines are skipped.
Here is an example:
labels_invalid_iso8859-1_encoding.csv.zip
We always open the files as UTF-8; this CSV contains two lines with an illegal character in ISO-8859-1. The second and fourth lines (after the header) contain the illegal character. In this case, only the second line is detected; the fourth one is unfortunately skipped. There is no problem if there are other invalid lines after the first 1kB block.
It seems the bug relates to these lines in csv.rb
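The encoding check can also be done independently of the CSV parser, which sidesteps the 1kB detection block entirely. The sketch below scans the raw bytes line by line with String#valid_encoding?; `invalid_utf8_line_numbers` is a hypothetical helper, and it counts physical lines starting at 1 (header included), not CSV records.

```ruby
# Hypothetical helper: reports the 1-based numbers of physical lines
# whose bytes are not valid UTF-8.
def invalid_utf8_line_numbers(bytes)
  invalid = []
  # String#b returns a binary copy, so the caller's string is untouched.
  bytes.b.each_line.with_index(1) do |line, number|
    invalid << number unless line.force_encoding(Encoding::UTF_8).valid_encoding?
  end
  invalid
end

data = "h1,h2\na,b\n\xE9,c\nd,e\nf,\xE8\n"
p invalid_utf8_line_numbers(data)
```

Splitting on the newline byte is safe for UTF-8 input because that byte never appears inside a multi-byte sequence.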
Here's the documentation of the #read method:
Slurps the remaining rows and returns an Array of Arrays.
Let's check this with the following sample script:
# frozen_string_literal: true
require 'csv'
data = StringIO.new <<~DATA
foo,1
bar,2
baz,3
DATA
csv = CSV.new(data)
puts "Shift 1: #{csv.shift.inspect}"
puts "Shift 2: #{csv.shift.inspect}"
puts "Rest : #{csv.readlines.inspect}"
$ ruby -v
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
$ ruby sample-csv.rb
Shift 1: ["foo", "1"]
Shift 2: ["bar", "2"]
Rest : [["baz", "3"]]
Perfect - using ruby-2.5.3 this is all fine.
Running the same example with ruby-2.6.2 shows that #read returns all rows of the file instead of just the remaining ones:
$ ruby -v
ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-darwin18]
$ ruby sample-csv.rb
Shift 1: ["foo", "1"]
Shift 2: ["bar", "2"]
Rest : [["foo", "1"], ["bar", "2"], ["baz", "3"]]
The same code as #82. #eof? raises CSV::MalformedCSVError. The expected behavior is that it returns a true or false value.
backtrace:
$ ruby /tmp/csv.rb
3.0.9
#<CSV::Row "head1":"head1" "head2":"head2" "head3":"head3">
#<CSV::Row "head1":"aaa" "head2":"bbb" "head3":"ccc">
Traceback (most recent call last):
7: from /tmp/csv.rb:1:in `each'
6: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:303:in `parse'
5: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:779:in `parse_quotable_loose'
4: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:28:in `each_line'
3: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:28:in `each_line'
2: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:31:in `block in each_line'
1: from /home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:818:in `block in parse_quotable_loose'
/home/krororo/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/csv-3.0.9/lib/csv/parser.rb:879:in `parse_quotable_robust': Illegal quoting in line 3. (CSV::MalformedCSVError)
$ ruby -r csv -e 'puts CSV::VERSION'
3.0.1
$ ruby -v
ruby 2.6.0rc1 (2018-12-06 trunk 66253) [x86_64-darwin18]
This seems to be a recent regression.
In Ruby 2.6:
> CSV.parse("\n", {headers:'s'}) { |x| puts x }
Traceback (most recent call last):
11: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/bin/irb:23:in `<main>'
10: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/bin/irb:23:in `load'
9: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/gems/2.6.0/gems/irb-0.9.6/exe/irb:11:in `<top (required)>'
8: from (irb):5
7: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv.rb:693:in `parse'
6: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv.rb:1147:in `each'
5: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv.rb:1205:in `shift'
4: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv.rb:1205:in `loop'
3: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv.rb:1239:in `block in shift'
2: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv.rb:1239:in `new'
1: from /Users/cabeer/.rubies/ruby-2.6.0-rc1/lib/ruby/2.6.0/csv/row.rb:32:in `initialize'
NoMethodError (undefined method `each' for nil:NilClass)
In Ruby 2.5:
CSV.parse("\n", {headers:'s'}) { |x| puts x }
=> nil
Code to reproduce:
irb(main):015:0> data = CSV.open('FL_insurance_sample.csv', headers: true)
=> <#CSV io_type:File io_path:"FL_insurance_sample.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\r" quote_char:"\"" headers:true>
irb(main):016:0> data.map {|row| row['policyID'] }
=> ["119736", "448094", "206893", "333743", "172534", "785275", "995932", "223488", "433512", "142071", "253816", , "198381", "746777", "144396", "263732", "696203", "200041", "127791", "945454", "918783", "515966", "430825", "190906", "166568", "353031", "551567", "133628", "441954", "747072", "701430", "420763", "618391", "214094", "253272", "836769", "396418", "212363", "246838", "758281", "551473", "999148", "983276", "751862", "347062", "894809", "762476", "747976", "685314", "387815", "991621", "998347", "996182", "840804", "783604", "253701", "840979", "861594", "779435", "900859", "186330", "512609", "432937", "412242", "597608", "213466", "517820", "980965", "191055", "510055", "843311", "816735", "337433", "623083", "528326", "334012", "531464", "128205", "982934", "366858", "917861", "210083", "586410", "237321", "910610", "617475", "902121", "182262", "945617", "156266", "588078", "688316", "382076", "121582", "563196", "643286", "142017", "879091", "337675", "379732", "932035", "312404", "747081", "901144", "196907", "388423", "856044", "539127", "700397", "661275", "139221", "937181", "543206", "341736", "754850", "866762", "356525", "871304", "758309", "803289", "922786", "549990", "168802", "546453", "813885", "353826", "938192", "302211", "331518", "631541", "373350", "840399", "107484", "134321", "245082", "895779", "614634", "731255", "870031", "240901", "663297", "191014", "325852", "121870", "751791", "273278", "381736", "625288", "315535", "774751", "328683", "423065", "213789", "844864", "662432", "776510", "375098", "161060", "465293", "704983", "378969", "983689", "541087", "525184", "864811", "312073", "830827", "603600", "830638", "755799", "414134", "733306", "483113", "201599", "115553", "188952", "923142", "394238", "884711", "463999", "575587", "971233", "735553", "827885", "842735", "542544", "489799", "179917", "743678", "598714", "895885", "313786", "535248", "706708", "771872", "413294", "408522", "971371", "340752", "348885", 
"889842", "606357", "738016", "231530", "223769", "242550", "649523", "732806", "506127", "682614", "383061", "394483", "100734", "233198", "407586", "777849", "368612", "127678", "120313", "381935", "143107", "805832", "687756", "349748", "169703", "160845", "114716", "639372", "962341", "300088", "447463", "837999", "296462", "921167", "609153", "375107", "566456", "910667", "822256", "306936", "176160", "409658", "413864", "445152", "168986", "288291", "180190", "668451", "614810", "502082", "706039", "306321", "373798", "469052", "481459", "390921", "196529", "740303", "538916", "994961", "409875", "312502", "972492", "973803", "392941", "483132", "941267", "979434", "395199", "935112", "328328", "988479", "938390", "531723", "390295", "800333", "896027", "159053", "850624", "501210", "206613", "215176", "348557", "449758", "149149", "630004", "400230", "820739", "392800", "331634", "289487", "944227", "309870", "781075", "138800", "238602", "766085", "742674", "539986", "970167", "790549", "726087", "466491", "137084", "770955", "296576", "375634", "666525", "885400", "294715", "121406", "667352", "221935", "420569", "121608", "219338", "656313", "719612", "622576", "288272", "483654", "206575", "431741", "574975", "916215", "250085", "498521", "437546", "336215", "850733", "779536", "806053", "215624", "615260", "495971", "514202", "538761", "213484", "143092", "708325", "155509", "746709", "819848", "309835", "139184", "290655", "827012", "318336", "258417", "701302", "142034", "328942", "347135", "909114", "563415", "279517", "556606", "651400", "727059", "696150", "617858", "435154", "526134", "168517", "338250", "183015", "153278", "805805", "123298", "220620", "111828", "809267", "100832", "833501", "855220", "780279", "736500", "769111", "552700", "261355", "232697", "871943", "902178", "642392", "879254", "662332", "322056", "412153", "661157", "340623", "316160", "436138", "117516", "648515", "845348", "369196", "524989", "731292", "270408", "146599", 
"334837", "713639", "994148", "311195", "122957", "792909", "354608", "669472", "606218", "997046", "762150", "598122", "743490", "354377", "423012", "193658", "932158", "145157", "917300", "139283", "677487", "945412", "360081", "942850", "788310", "715861", "597274", "568609", "289654", "280820", "673647", "327821", "950932", "998455", "540523", "424217", "699408", "687347", "680327", "597117", "266549", "673994", "436373", "835868", "214893", "649680", "415864", "119912", "595269", "767070", "676536", "852629", "389349", "152953", "955461", "237748", "761602", "896196", "119126", "384059", "390978", "274235", "771305", "726249", "593040", "492618", "930951", "481346", "441607", "969756", "260394", "344306", "303488", "420736", "999074", "436388", "225877", "239600", "372980", "426550", "668292", "579643", "719414", "909116", "154943", "767203", "969840", "250163", "357693", "202254", "891615", "210155", "406098", "636233", "660021", "908793", "285202", "575821", "905410", "234566", "648800", "118922", "999511", "460983", "746058", "914824", "828851", "215511", "689551", "646941", "239894", "601473", "295575", "284817", "247418", "956631", "274394", "241568", "710879", "389832", "981075", "472817", "270291", "625006", "730065", "127439", "200161", "972046", "731927", "189177", "805881", "689675", "147990", "857740", "650870", "171696", "173411", "134855", "253880", "867210", "201080", "402131", "974455", "703811", "766146", "651977", "395336", "309592", "724503", "133652", "932447", "640925", "268779", "234219", "776191", "862405", "439550", "610152", "636589", "663394", "203390", "460155", "202669", "680696", "932413", "577961", "402465", "441804", "427207", "357789", "962575", "481067", "378030", "227970", "312359", "622542", "654076", "799638", "301168", "704199", "995995", "586496", "166251", "627242", "704509", "392017", "744159", "452889", "877439", "636980", "996522", "929753", "956015", "669029", "228166", "677953", "401301", "268640", "300856", "933265", 
"891070", "511306", "645925", "616982", "719750", "610089", "629045", "685168", "578509", "775294", "210493", "914012", "852931", "727832", "140749", "596550", "859377", "584257", "155589", "514183", "872822", "651684", "779165", "158884", "144149", "383577", "261091", "444338", "327171", "371347", "951686", "447162", "188345", "818636", "614898", "669081", "935876", "468469", "542386", "395441", "303562", "825549", "970082", "716335", "254640", "614480", "738905", "524306", "734078", "427755", "484434", "995789", "105029", "939080", "963857", "269554", "359282", "246752", "204754", "659602", "334792", "674289", "384509", "475176", "332849", "803588", "230743", "411740", "937207", "848833", "504245", "734996", "230134", "816763", "322162", "321155", "613838", "869481", "384695", "983908", "111917", "176957", "470811", "352835", "481615", "906547", "748767", "389516", "594771", "786732", "612028", "989756", "699478", "415463", "351135", "496510", "485677", "677003", "105608", "158676", "777817", "812774", "159089", "786677", "259373", "451475", "882753", "783377", "437430", "827369", "596250", "285886", "458484", "486112", "101338", "232095", "342224", "886136", "903258", "520030", "277195", "181268", "128861", "502760", "848887", "548440", "201566", "744482", "390109", "511392", "408803", "411806", "344504", "433345", "272099", "163684", "719955", "876995", "743399", "595305", "910435", "131728", "372630", "273180", "618362", "653880", "939368", "320372", "695248", "622358", "248006", "717616", "678182", "399488", "739362", "760498", "447188", "255060", "964099", "853251", "280691", "436217", "137690", "364869", "622931", "144721", "879044", "293936", "791778", "771997", "610413", "110834", "810838", "501151", "132290", "790532", "793536", "252450", "390685", "263106", "222923", "478696", "644404", "936549", "726970", "637467", "506061", "405009", "797729", "342656", "808248", "586817", "297047", "377880", "960034", "556888", "440763", "827514", "261396", "971391", 
"329887", "790204", "593181", "567130", "778905", "565101", "124465", "254681", "730075", "469022", "427994", "904768", "952446", "802386", "443430", "306353", "425957", "984289", "841525", "671320", "790892", "387385", "485814", "803291", "575869", "274205", "508824", "176891", "201277", "242745", "272543", "354967", "417685", "216543", "547186", "766555", "340437", "960468", "578597", "316406", "730243", "340294", "348693", "198785", "296809", "466837", "931973", "478208", "327460", "755449", "246887", "790477", "807314", "852868", "757484", "539258", "125300", "515247", "362625", "441367", "428148", "251047", "613819", "494107", "634777", "984559", "388907", "444433", "136473", "928946", "276598", "957251", "437315", "880804", "278989", "223853", "526575", "914254", "485361", "689055", "662290", "311286", "830307", "886048", "100316", "302424", "902058", "465430", "704037", "465920", "485918", "211655", "461546", "956939", "223206", "808184", "263920", "481056", "885148", "639151", "939684", "730817", "881633", "351383", "194298", "892232", "332022", "568246", "363008", "463483", "679828", "958563", "349028", "483460", "267578", "951972", "230094", "107453", "310848", "480667", "376972", "329020", "687753", "839888", "248834", "462344", "583299", "460519", "733735", "267190", "408444", "941525", "701341", "670477", "569296", "186507", "141517", "424714", "569209", "950233", "484152", "196725", "124654", "911722", "765060", "688938", "273297", "248817", "226968", "840749", "599513", "352034", "459881", "596506", "419655", "823571", "700978", "425548", "416477", "997627", "904796", "366928", "620351", "284239", "790773", "220438", "608209", "979905", "430332", "342163", "686993", "871485", "675816", "536685", "623673", "374046", "227293", "544721", "187389", "233931", "979777", "854640", "713166", "622713", "760557", "386420", "990463", "186358", "364912", "352489", "549727", "271003", "814262", "896386", "865501", "453724", "408265", "520733", "993917", "343211", 
"850825", "378860", "682241", "640824", "807855", "349153", "336877", "491718", "963762", "110199", "560859", "971647", "299010", "731013", "183167", "170649", "677574", "980724", "361902", "361949", "762422", "230449", "626481", "596965", "663507", "893473", "144278", "666220", "145040", "350628", "143023", "347038", "306792", "479443", "139293", "928529", "406883", "904481", "800218", "431173", "779054", "995049", "726733", "980126", "353282", "208099", "776568", "219138", "572410", "976645", "243505", "378722", "397940", "502466", "878346", "508637", "982298", "489930", "858457", "339167", "172384", "123923", "554572", "198361", "302493", "111020", "387113", "681245", "335083", "198084", "974019", "189732", "919237", "703813", "992140", "442886", "323672", "209448", "551371", "754333", "803578", "860005", "244382", "557428", "946133", "954047", "751178", "983825", "385875", "521593", "850201", "742445", "688673", "950319", "804282", "964764", "946971", "381999", "703464", "767342", "295825", "511947", "497446", "724330", "715643", "646193", "301733", "512304", "837657", "789826", "631172", "693823", "478886", "996504", "113209", "628450", "296773", "332837", "723813", "973935", "902448", "967792", "991555", "199723", "709705", "581618", "228671", "792113", "362668", "924918", "810749", "916616", "971782", "251835", "647540", "565470", "643886", "970032", "705570", "886873", "109862", "280283", "176659", "605167", "522273", "161421", "964810", "148513", "537098", "939518", "120991", "642958", "102002", "484307", "539985", "660338", "680100", "240664", "948418", "915706", "700735", "521481", "399117", "850152", "267551", "475548", "708729", "875533", "754962", "254136", "520684", "811927", "307890", "441071", "960973", "134341", "830075", "194015", "344814", "594806", "819822", "747511", "274748", "671128", "546588", "281041", "388236", "583457", "634761", "992828", "840869", "753794", "394096", "993901", "785862", "567375", "503438", "428168", "824970", "182953", 
"470499", "309731", "190203", "706567", "449478", "355644", "651125", "629544", "851607", "778665", "869078", "737972", "644493", "206766", "349230", "539320", "696388", "250787", "122420", "613556", "703519", "275369", "332121", "931804", "384379", "740074", "103223", "492213", "246117", "401497", "702684", "672475", "683842", "752251", "256535", "177567", "889944", "167508", "902637", "529761", "824464", "852690", "289551", "714503", "339215", "916316", "562662", "859555", "155867", "326736", "386572", "980608", "403239", "610427", "358889", "328732", "575764", "467957", "664848", "412356", "422141", "604590", "805329", "450931", "266753", "551411", "649329", "992384", "580316", "803768", "905727", "107888", "916778", "942342", "768796", "531150", "293430", "209503", "102398", "903095", "636279", "359690", "891085", "409967", "678303", "645243", "749435", "611857", "430900", "755749", "274616", "544936", "583970", "580276", "542777", "886730", "744670", "772025", "456572", "128131", "271209", "300811", "473009", "526005", "860797", "685747", "123937", "183879", "512260", "691247", "883852", "555449", "570707", "781009", "151838", "181634", "439778", "168008", "968580", "884571", "905912", "128146", "344782", "552814", "500262", "785630", "933238", "503844", "460236", "700206", "891489", "304021", "433793", "464021", "491115", "944950", "838298", "672001", "224331", "676822", "206441", "370773", "394818", "852520", "876010", "593070", "179262", "369830", "828704", "455732", "934502", "470145", "589653", "709858", "660969", "725797", "669238", "868735", "238679", "888350", "537673", "269818", "833285", "556472", "853960", "691111", "902545", "638372", "538203", "963299", "370091", "682597", "430806", "143612", "237956", "245985", "183788", "103810", "278243", "878953", "394030", "965651", "454585", "658942", "481064", "970015", "664155", "504852", "465283", "901625", "997150", "261097", "575697", "209405", "319675", "508721", "495724", "548607", "454012", "588111", 
"569500", "972877", "746177", "415821", "900320", "866150", "790880", "362729", "788533", "727899", "326368", "653519", "814948", "724926", "284110", "676635", "868212", "767054", "506214", "897926", "677790", "101036", "233192", "544426", "268827", "635219", "265871", "552478", "247639", "617777", "227225", "741059", "113309", "154954", "315104", "282118", "901589", "475454", "733014", "872342", "977297", "996797", "930412", "828167", "835470", "130599", "556323", "653783", "199843", "408604", "235265", "330735", "632165", "681096", "547242", "242005", "566613", "974414", "597501", "329516", "182197", "266075", "885528", "970498", "587057", "742260", "815587", "888748", "651160", "900301", "431495", "291761", "392571", "296579", "814503", "397603", "278859", "927946", "307199", "473551", "359896", "630799", "115071", "813157", "446877", "549092", "869635", "181190", "952536", "234730", "990532", "437376", "868414", "560023", "531018", "862279", "913677", "368497", "806491", "956474", "393855", "236809", "233360", "852903", "411412", "638511", "619853", "677532", "203973", "864999", "470074", "454081", "679930", "276744", "689881", "219881", "281812", "681382", "187152", "420392", "254782", "983500", "612229", "613407", "380142", "935588", "203097", "771476", "881297", "474249", "832086", "377902", "605866", "571393", "801676", "520599", "317574", "481235", "942125", "658270", "776764", "549397", "929053", "634233", "747131", "995193", "381321", "995419", "501720", "523558", "650297", "871903", "846959", "898645", "631406", "344642", "180988", "476557", "495613", "813476", "110207", "349233", "355679", "342917", "763188", "684404", "523215", "961479", "872886", "846234", "365135", "420233", "390802", "278448", "901435", "788395", "870434", "584847", "440239", "847832", "177345", "285210", "879413", "107783", "336897", "972481", "860546", "268835", "578855", "678603", "603821", "947816", "286578", "591676", "787369", "935010", "712366", "210708", "692014", "758505", 
"888231", "435390", "323832", "389142", "950551", "836880", "850076", "641388", "342919", "419446", "757410", "584791", "430552", "233436", "770307", "453741", "805159", "655713", "632168", "236753", "814965", "416373", "458001", "421258", "487337", "705392", "724971", "525340", "274641", "725522", "948462", "931725", "297000", "734660", "906872", "209844", "846660", "744254", "785151", "557521", "670304", "210163", "208187", "881910", "715199", "439458", "441512", "560925", "436617", "189453", "800460", "958971", "650415", "807845", "150155", "189513", "518477", "562655", "316784", "327359", "449637", "519610", "753517", "303809", "348892", "730950", "238662", "481664", "546767", "663326", "413119", "885963", "787349", "214885", "799851", "958812", "711408", "766275", "982804", "177639", "182295", "360593", "455558", "498511", "803252", "942252", "642072", "893642", "979888", "385335", "782042", "172382", "267767", "694601", "612302", "508808", "879262", "503798", "166146", "527060", "922058", "944141", "132263", "908240", "454703", "904151", "966797", "128830", "122025", "396173", "983748", "986174", "379626", "463284", "686919", "831575", "475571", "245018", "253022", "597738", "923068", "928173", "860443", "844073", "110751", "215972", "254044", "881800", "826248", "441838", "123081", "417943", "834306", "724894", "991408", "680719", "609567", "499385", "370022", "689051", "873363", "997492", "891852", "495074", "480977", "902538", "126856", "312850", "880863", "926739", "265135", "713991", "709214", "856409", "715519", "275781", "476956", "182028", "307144", "970874", "640315", "319900", "304610", "727625", "611749", "908085", "284304", "515383", "209934", "439491", "114641", "329658", "323610", "280448", "311781", "164040", "494483", "998895", "291130", "651752", "193711", "249527", "611084", "781807", "674839", "449262", "921003", "587925", "755440", "200577", "469159", "942176", "881788", "485164", "547747", "331888", "571760", "172126", "470768", "664373", 
"441872", "389978", "376364", "542479", "640211", "422963", "977255", "613747", "380308", "915355", "383141", "164909", "627420", "759506", "955914", "114477", "431157", "861984", "762453", "582444", "710826", "143379", "473557", "567547", "733330", "984266", "689889", "165580", "192397", "898470", "393024", "857508", "862971", "469168", "863406", "198867", "743079", "473889", "683427", "106321", "448154", "770864", "689741", "667310", "705671", "831095", "181430", "176997", "842701", "336744", "910596", "876682", "182231", "825424", "526224", "296132", "194544", "372807", "109022", "746495", "529196", "540751", "946516", "598909", "801828", "707890", "543857", "114550", "583414", "104959", "806361", "934501", "637915", "870261", "444277", "711042", "155229", "941628", "854460", "426518", "565545", "913552", "734960", "829687", "162975", "679151", "524631", "458718", "147509", "760033", "248427", "809305", "607091", "691454", "136509", "295142", "126875", "135566", "758919", "919634", "924677", "216954", "543931", "127115", "450072", "568353", "838315", "494884", "374662", "317978", "443783", "586472", "619279", "670758", "220374", "533430", "868629", "128756", "333988", "260039", "938522", "948241", "476849", "573272", "939271", "669724", "326868", "888466", "955649", "350333", "265649", "672496", "932711", "813138", "387562", "763690", "144432", "956980", "606510", "191943", "902403", "408994", "706339", "993218", "854831", "359057", "384404", "704979", "565064", "185266", "832418", "118372", "355246", "354474", "952889", "862741", "147554", "379234", "335634", "674044", "278463", "744141", "941104", "369810", "622825", "159490", "949620", "589581", "290570", "115262", "157439", "820427", "208245", "153652", "807437", "205089", "928704", "847055", "443358", "885202", "421647", "288215", "656479", "474918", "239241", "123075", "640862", "742091", "127947", "977282", "663177", "113510", "587466", "536877", "254997", "247253", "437996", "916505", "644885", "730168", 
"954535", "564735", "441080", "873449", "968925", "907594", "775788", "611644", "406061", "618002", "482864", "567849", "751165", "144854", "466348", "315491", "795920", "809607", "474800", "256436", "947732", "408682", "356370", "507120", "127256", "167047", "400444", "342254", "817780", "487950", "667514", "513680", "405754", "532214", "293233", "754154", "249388", "283201", "248770", "672690", "420683", "198679", "351513", "149468", "145769", "601460", "592320", "669544", "762657", "380367", "288147", "182418", "375290", "805557", "829455", "316463", "927046", "850288", "885654", "826742", "941158", "967866", "873747", "829382", "430259", "438032", "303496", "847942", "218356", "452794", "981439", "598715", "163921", "818157", "251458", "671867", "427176", "403131", "764637", "674131", "657414", "527597", "142447", "802496", "696236", "896153", "919260", "884541", "243532", "623924", "213207", "247464", "988513", "350424", "673107", "122904", "852799", "778713", "152959", "565397", "908162", "687004", "497350", "680301", "778222", "643726", "346776", "525302", "594959", "963097", "255200", "905563", "845050", "673682", "347556", "323644", "849213", "315733", "370714", "813661", "601144", "958101", "430723", "161858", "883297", "220889", "337019", "287974", "909550", "100922", "296485", "376743", "451369", "994229", "549339", "197726", "987427", "993682", "976312", "900034", "491015", "617324", "239357", "681327", "352403", "164006", "251771", "503103", "655363", "585539", "887739", "403857", "454622", "528824", "553390", "781776", "752454", "208268", "334746", "345341", "174580", "970196", "748393", "916913", "592832", "576225", "925826", "708420", "175701", "925767", "930709", "983768", "950355", "233788", "353200", "875871", "124583", "952771", "821284", "948967", "430600", "253223", "694013", "235606", "318077", "652306", "930767", "877872", "213778", "844059", "247133", "360152", "998644", "193200", "676176", "917098", "388192", "943356", "218764", "814489", 
"507992", "260882", "539377", "266526", "123549", "829398", "460058", "856342", "692354", "615248", "860257", "931737", "433714", "248035", "526493", "997868", "358238", "451520", "761310", "771480", "101934", "806089", "216998", "948101", "253386", "428457", "724191", "583553", "856147", "555415", "920290", "137015", "403589", "953377", "374030", "336624", "780782", "838573", "585047", "546503", "429947", "674337", "308168", "437807", "732936", "381680", "186331", "271843", "835867", "831567", "254867", "548875", "561547", "571309", "328654", "252040", "444929", "197179", "633582", "592768", "876103", "862211", "790275", "621015", "895668", "135611", "167834", "812700", "325223", "677144", "669129", "293268", "532415", "251466", "894944", "664654", "907057", "998569", "792947", "502869", "725878", "230998", "423210", "389940", "664326", "240123", "331172", "556670", "846632", "322952", "176662", "402035", "605361", "443281", "805272", "241436", "995966", "522413", "161754", "990849", "474544", "324243", "477231", "792493", "412141", "524599", "885997", "420433", "729792", "395368", "205934", "578516", "872862", "114347", "475670", "443621", "673952", "646454", "950759", "429724", "424668", "540318", "491387", "653920", "924154", "495756", "979918", "773304", "546476", "993585", "535339", "690173", "506624", "736529", "180830", "949466", "425185", "147649", "347367", "552384", "811257", "674211", "166223", "765687", "282794", "593914", "324812", "608083", "423580", "387933", "904424", "761790", "598443", "484654", "527134", "176490", "418718", "380161", "472396", "618259", "768626", "458435", "765917", "916187", "162788", "120519", "311330", "387890", "208316", "160228", "874004", "698690", "745209", "110742", "534704", "200400", "774390", "538228", "813558", "818077", "159155", "333123", "757197", "920104", "960185", "725351", "673832", "930523", "498099", "305181", "325705", "466339", "971724", "954511", "540123", "213307", "216606", "874075", "971470", "366404", 
"876618", "405650", "491600", "568134", "242703", "384502", "453529", "901763", "156912", "684723", "424691", "492823", "203236", "676080", "299949", "909870", "973531", "734936", "470691", "110865", "475900", "226393", "210736", "305466", "894356", "885456", "538335", "609325", "137546", "277745", "295124", "621716", "132178", "120869", "140522", "727284", "236135", "787194", "852514", "343463", "780366", "866819", "199429", "119518", "747453", "440966", "194655", "951443", "702469", "793790", "438939", "999934", "250883", "385392", "608329", "457841", "493081", "934596", "548420", "335002", "165548", "377989", "227423", "124412", "377470", "621976", "246495", "835654", "752159", "883077", "770421", "993256", "608797", "177498", "583056", "648684", "772506", "219637", "258611", "809796", "208711", "323348", "728302", "522495", "129550", "453849", "941314", "458742", "895375", "185051", "462082", "882912", "234498", "176588", "842073", "743463", "242608", "231375", "718468", "394321", "444948", "238013", "271859", "211885", "776863", "722439", "359327", "710731", "776506", "645112", "482005", "502056", "572712", "462794", "292594", "501693", "216263", "773421", "245630", "572151", "547063", "230388", "708957", "515262", "756887", "676849", "357598", "934659", "257988", "430403", "569210", "307831", "339743", "745408", "539716", "101917", "820525", "307438", "170986", "357034", "914080", "974421", "220168", "959902", "795347", "732807", "227458", "890669", "819429", "639251", "158748", "772770", "152480", "859106", "581469", "956578", "183150", "784387", "891604", "409847", "636070", "817161", "943428", "806748", "562174", "934491", "436612", "728370", "762676", "734837", "752475", "257481", "695884", "187491", "868235", "840135", "136438", "188725", "281230", "291930", "111122", "676051", "849208", "778619", "472645", "598219", "204934", "958855", "687514", "538729", "696393", "274681", "415978", "160023", "184988", "218831", "596035", "168397", "423292", "828873", 
"204797", "897569", "799747", "464641", "696673", "976632", "732026", "117534", "872422", "899914", "384082", "896585", "152761", "544530", "805168", "709187", "409699", "416182", "590430", "528889", "261688", "115390", "398857", "760015", "903693", "769620", "104286", "146960", "233763", "285808", "646299", "760225", "979849", "219918", "625730", "522608", "165057", "159906", "378949", "662978", "558678", "730488", "229350", "817582", "874837", "774295", "760133", "772729", "239902", "904378", "314179", "303881", "517094", "107937", "917246", "257598", "644718", "747921", "442005", "659759", "213603", "838151", "267426", "571302", "698726", "282310", "539291", "375853", "240937", "919609", "476820", "159650", "615969", "100552", "651052", "748723", "673237", "110064", "703669", "519631", "329672", "294876", "820774", "252218", "865012", "849538", "981829", "770201", "826721", "396474", "880873", "565387", "332302", "460245", "911743", "267295", "872032", "613154", "976549", "179720", "265541", "850806", "852687", "315838", "203148", "130405", "131808", "134560", "505648", "775324", "811476", "182634", "427559", "437777", "902563", "367446", "271334", "612601", "316847", "150303", "394051", "542171", "171860", "400481", "816979", "727342", "445739", "404437", "609024", "755866", "112859", "962786", "496644", "535918", "463363", "226759", "607543", "425549", "327498", "396221", "690336", "136887", "635836", "515258", "651907", "536350", "986555", "201560", "508697", "111291", "609096", "822949", "602022", "705280", "680770", "107422", "856543", "619650", "572365", "595809", "257438", "823119", "617556", "764690", "864046", "193605", "140260", "630427", "667000", "534281", "733846", "441668", "346905", "471804", "924273", "782472", "923115", "460088", "435210", "465217", "632642", "946716", "889218", "404441", "444832", "963351", "933512", "266370", "558297", "906995", "109623", "785276", "184434", "193939", "500086", "639901", "323489", "937572", "756783", "433616", 
"826854", "331401", "259660", "518773", "123592", "475196", "143117", "984362", "134777", "222297", "265957", "918381", "796136", "845445", "226030", "271564", "924063", "103266", "700435", "659660", "227300", "374066", "123016", "420523", "776342", "776037", "804597", "608373", "595264", "407421", "352091", "803973", "270558", "652444", "827200", "850231", "498526", "717198", "448126", "184340", "638794", "407739", "184207", "657784", "814859", "811680", "979770", "252003", "226540", "672570", "155093", "974037", "979183", "233479", "755452", "250746", "537361", "915444", "931045", "901729", "768473", "252678", "681011", "951273", "789023", "240265", "588128", "972275", "168545", "197675", "999382", "692391", "672027", "264144", "770696", "187260", "873049", "593470", "963969", "555273", "615393", "490494", "697353", "421097", "950831", "766780", "204413", "544764", "131988", "129847", "409099", "928010", "466138", "609599", "587061", "230711", "287115", "776232", "745932", "325051", "187894", "195243", "713013", "253583", "781680", "542224", "104974", "723519", "493420", "334249", "527477", "359947", "746102", "949162", "696418", "194983", "680323", "839993", "541094", "239552", "770103", "165523", "344351", "942745", "947678", "672744", "792046", "387999", "908535", "358731", "677721", "941277", "483690", "857802", "740906", "919102", "304392", "375135", "221539", "371791", "982150", "338482", "286495", "366455", "398273", "912893", "966632", "914040", "882363", "546934", "816632", "589723", "430821", "627786", "736953", "533010", "381430", "759310", "444137", "903870", "624726", "495956", "113363", "357742", "279348", "324826", "215848", "962606", "887227", "565260", "328610", "944346", "644507", "729667", "429283", "581781", "755753", "142827", "341946", "270573", "536379", "171775", "708413", "954090", "549875", "781573", "669963", "973078", "587327", "575246", "943418", "578947", "186785", "636868", "396787", "960541", "112131", "188509", "413139", "962216", 
"486011", "807241", "633515", "834247", "784953", "468672", "164662", "136179", "703995", "236855", "574154", "909313", "867949", "988803", "176858", "175809", "742632", "385687", "177371", "112455", "336898", "204230", "830696", "909294", "782864", "843402", "869528", "524469", "169804", "347285", "567160", "654565", "510895", "987492", "586377", "337553", "354199", "981409", "206290", "337732", "645321", "975467", "395047", "642389", "224356", "653196", "333301", "927664", "126301", "363406", "554639", "537517", "985795", "384126", "487525", "922534", "760822", "623487", "872380", "524056", "222610", "198654", "990746", "225303", "925232", "800314", "550794", "350309", "580261", "348652", "889450", "399352", "580279", "943758", "430347", "170664", "978837", "275155", "194680", "695630", "215995", "238812", "968286", "218943", "317113", "451791", "652662", "400773", "665980", "158638", "292508", "464645", "935905", "406770", "590484", "812083", "406339", "590191", "872415", "100183", "388113", "401548", "470715", "867722", "901247", "640036", "241060", "628393", "446363", "260998", "690403", "393469", "264232", "743750", "907482", "921398", "963545", "384813", "116201", "206089", "202042", "870972", "263023", "444839", "554136", "763442", "929994", "531174", "557398", "404442", "506915", "373536", "844429", "283751", "346096", "636998", "244426", "114549", "662993", "221023", "657126", "579728", "342330", "522288", "147972", "413112", "769784", "442020", "323442", "998523", "241351", "254636", "599130", "993406", "129197", "387772", "188539", "966998", "749607", "535991", "193274", "754564", "749566", "847856", "330371", "497553", "113326", "903029", "222753", "681211", "297969", "428746", "803892", "438700", "398597", "164281", "803877", "482168", "755955", "965248", "244345", "936339", "897474", "391278", "957782", "251684", "641478", "474867", "490406", "915481", "841009", "840625", "635630", "177196", "861422", "552214", "238887", "493555", "132602", "845185", 
"164980", "496569", "148266", "147958", "163343", "329163", "583236", "461324", "220158", "328677", "488314", "310994", "713148", "489265", "659216", "279868", "495998", "568583", "677274", "852311", "822395", "253856", "551302", "496047", "374148", "792523", "900748", "445981", "660678", "150447", "364459", "887035", "971990", "951307", "281064", "753008", "502107", "462946", "643422", "120198", "202971", "759562", "799104", "558130", "886216", "951224", "216749", "581390", "502722", "146138", "532633", "927138", "673188", "736391", "604743", "692332", "427050", "988573", "940597", "239535", "347408", "538280", "551460", "749963", "161927", "150363", "107975", "346822", "541383", "776492", "794095", "975186", "913911", "411601", "151923", "819754", "154536", "761630", "620772", "294679", "624136", "721031", "927124", "304866", "346806", "653887", "949615", "536648", "633500", "622395", "633813", "225863", "226126", "772926", "147594", "125807", "363724", "996010", "986423", "253156", "729943", "391026", "412655", "647057", "779973", "689489", "990714", "706555", "129296", "327539", "118944", "144243", "437444", "670813", "575467", "699009", "671436", "413834", "239946", "990000", "826462", "316214", "492961", "442410", "896508", "870816", "677914", "176452", "970883", "725467", "770729", "766802", "588698", "298126", "122168", "892180", "948179", "337833", "554172", "669026", "264935", "505926", "465988", "271099", "556902", "485501", "599108", "358359", "278650", "609336", "572254", "385575", "522335", "834638", "635324", "570424", "942863", "470637", "996559", "301932", "632025", "229880", "109576", "788175", "914574", "569415", "821291", "634124", "185970", "436744", "704860", "101647", "850933", "368464", "104727", "225655", "615422", "206104", "709548", "226074", "518819", "652298", "353715", "319395", "400381", "298323", "845239", "813847", "964981", "885249", "923289", "126697", "132123", "840286", "682904", "257187", "560496", "746359", "190282", "104333", 
"820974", "482419", "300169", "653797", "497445", "504791", "974677", "765649", "494391", "456862", "242153", "947099", "297233", "874742", "360668", "435391", "171307", "239305", "930747", "625422", "130055", "844995", "459561", "462242", "146416", "570060", "536378", "213647", "920679", "926661", "558833", "974897", "207143", "913546", "870447", "552493", "510748", "938188", "532913", "167499", "765386", "210890", "262962", "616053", "784997", "861634", "109830", "266063", "457869", "741249", "763585", "949907", "191615", "818768", "194701", "910643", "637252", "370830", "677724", "850978", "375369", "446144", "487346", "523299", "339883", "804691", "579920", "993078", "109832", "905794", "346428", "972835", "442072", "974847", "455276", "838658", "486358", "562913", "541001", "465850", "589494", "907092", "515739", "809347", "137666", "874786", "559221", "247162", "253810", "192327", "571274", "724204", "147039", "193318", "615282", "680026", "594538", "781700", "984848", "610699", "158137", "493201", "482257", "486523", "288843", "376340", "591874", "635533", "459510", "967051", "413727", "568714", "873338", "784142", "663960", "537983", "228015", "122852", "469724", "324199", "199791", "773124", "182215", "377464", "647274", "637261", "396524", "161803", "498048", "137093", "409641", "381346", "746036", "625114", "417996", "435656", "139250", "743287", "597971", "335302", "236690", "614044", "613017", "768688", "774160", "670560", "657807", "783382", "187163", "557026", "479705", "194308", "368553", "496190", "547412", "173565", "471644", "383282", "693042", "105317", "816245", "803073", "988321", "925937", "730644", "482134", "125516", "182810", "689686", "745335", "357563", "797318", "123000", "792398", "320607", "641460", "182745", "420504", "653409", "355976", "884082", "102209", "215487", "471737", "927260", "702589", "402398", "875957", "893906", "598292", "355578", "156004", "250802", "738022", "861105", "960463", "538138", "942785", "224136", "393896", 
"946965", "931106", "449878", "378063", "210664", "911234", "213856", "203214", "588568", "554477", "321025", "951282", "562737", "938615", "194628", "874342", "734544", "963243", "403614", "108042", "409595", "165986", "336849", "245687", "547858", "906916", "668006", "518416", "643623", "926693", "395381", "953823", "174451", "498894", "751188", "572494", "706173", "720644", "322646", "654622", "464553", "670821", "570426", "362631", "247880", "437580", "252203", "376260", "966934", "472053", "155170", "383762", "311678", "392740", "338463", "742057", "769118", "599926", "422230", "301651", "383830", "163854", "662565", "660484", "617248", "825861", "231197", "916258", "161758", "887223", "383344", "459625", "693149", "351043", "844929", "774573", "660510", "112074", "982904", "507519", "532294", "835886", "161671", "223349", "320534", "776158", "606616", "812455", "555701", "963672", "137571", "330767", "412319", "240615", "118770", "767744", "167183", "849883", "310361", "650883", "876881", "161574", "593465", "413972", "430801", "542491", "141830", "913792", "899387", "922176", "508612", "453383", "988066", "447991", "993055", "322842", "521572", "631055", "321905", "452162", "908980", "890894", "888700", "918787", "457477", "471375", "471133", "354257", "966598", "924670", "371425", "629224", "540985", "721375", "713284", "999959", "662754", "924130", "931461", "257105", "223233", "978076", "705587", "735129", "224104", "986821", "513126", "825668", "748339", "253767", "829906", "244259", "686544", "275362", "504590", "918960", "124890", "745232", "551173", "625539", "182852", "325267", "928844", "564801", "400779", "580684", "592817", "734245", "166317", "238385", "100486", "104523", "499960", "509826", "262040", "257157", "551907", "473018", "336331", "176508", "884653", "884722", "605027", "903230", "476285", "948111", "504675", "671626", "707861", "895905", "507835", "428223", "367585", "830470", "272593", "989877", "699646", "399907", "211482", "554106", 
"590629", "992337", "415862", "846424", "942249", "270306", "525880", "216978", "847119", "435108", "397388", "345366", "223060", "347262", "900540", "513661", "905359", "108901", "318268", "807301", "786635", "812938", "346494", "385437", "825088", "170378", "389374", "129286", "907574", "735800", "115957", "421717", "112435", "110176", "836931", "189595", "990943", "736490", "531947", "611803", "760374", "388637", "218586", "762291", "988675", "541105", "170671", "240936", "626576", "879316", "710780", "745558", "317535", "207231", "439562", "824767", "860072", "206961", "868341", "680474", "428430", "215439", "481384", "924346", "361021", "655610", "934022", "405382", "573268", "377718", "154462", "935950", "484776", "355148", "204231", "758924", "702105", "609967", "916840", "723783", "931709", "328567", "433520", "421307", "616888", "742840", "961261", "312921", "996298", "889576", "336265", "867761", "447892", "616915", "888181", "549952", "745622", "674160", "291839", "587343", "410984", "444661", "268530", "824378", "831910", "160889", "852682", "318811", "356824", "635094", "299775", "241604", "574570", "663265", "876259", "339351", "522848", "410190", "955014", "691750", "487472", "553269", "754185", "319628", "163711", "201993", "200790", "246049", "948699", "350602", "481169", "702608", "536914", "368936", "551454", "132393", ...]
irb(main):017:0> data.map {|row| row['policyID'] }
=> []
irb(main):018:0> data
=> <#CSV io_type:File io_path:"FL_insurance_sample.csv" encoding:UTF-8 lineno:36635 col_sep:"," row_sep:"\r" quote_char:"\"" headers:["policyID", "statecode", "county", "eq_site_limit", "hu_site_limit", "fl_site_limit", "fr_site_limit", "tiv_2011", "tiv_2012", "eq_site_deductible", "hu_site_deductible", "fl_site_deductible", "fr_site_deductible", "point_latitude", "point_longitude", "line", "construction", "point_granularity"]>
irb(main):019:0> data.to_a
=> []
irb(main):020:0>
Tested on Ruby 2.6.2 and 2.5.1, with their default csv versions (3.0.4 and 1.0.0 respectively). I experienced the bug on both.
$ ruby -v
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin16]
$ ruby -r csv -e 'puts CSV::VERSION'
3.0.0
CSV.parse("foo")
#=> [["foo"]]
CSV.parse("foo\n")
#=> [["foo"]]
CSV.parse("foo\nbar\n")
#=> [["foo"], ["bar"]]
CSV.parse("foo\r\n")
#=> [["foo"]]
CSV.parse("foo\r\nbar\r\n")
#=> [["foo"], ["bar"]]
CSV.parse("foo\r")
#=> CSV::MalformedCSVError (Unquoted fields do not allow \r or \n in line 1.)
CSV.parse("foo\rbar\r")
#=> [["foo"], ["bar"]]
Is this intentional?
I expected the following:
CSV.parse("foo\r")
#=> [["foo"]]
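Until the auto-detection question is settled, one possible workaround is to pass row_sep explicitly. This is a minimal sketch, assuming the input is known to use a bare carriage return as its row separator:

```ruby
require "csv"

# Assumption: the input uses a bare "\r" as its row separator, so we
# state it explicitly instead of relying on row_sep auto-detection.
rows = CSV.parse("foo\r", row_sep: "\r")
p rows
```

This only helps when the separator is known in advance; it does not change how auto-detection behaves.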
We should improve performance as much as HoneyFormat: https://github.com/buren/honey_format#benchmark
Probably as of v3.0.3, the header of a CSV::Table is not refreshed when a new CSV::Row object or Hash is added.
require 'csv'
txt =<<TXT
"h1","h2","h3"
1,"test",3
2,"test",6
3,"test",9
TXT
table = CSV::parse(txt, headers: true)
table.each do |row|
row << {"h4" => "additional"}
end
puts table.to_csv
Actual output:
h1,h2,h3
1,test,3,additional
2,test,6,additional
3,test,9,additional
Expected output:
h1, h2, h3, h4
1, test, 3, additional
2, test, 6, additional
3, test, 9, additional
I'm not sure, but judging from a discussion on Stack Overflow and my own trials, this started with 3.0.1.
v3.0.0 works as expected.
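As a possible workaround while the table headers are not refreshed, the CSV text can be rebuilt from the rows themselves. The sketch below is an assumption about how one might do this, not the library's fix: it takes the header list from the first row (which already contains the appended field) and writes each row's fields by hand.

```ruby
require "csv"

txt = <<~TXT
  "h1","h2","h3"
  1,"test",3
  2,"test",6
  3,"test",9
TXT

table = CSV.parse(txt, headers: true)
table.each { |row| row << { "h4" => "additional" } }

# Workaround sketch: read the headers from the first (already extended)
# row instead of from the table, then write the fields of every row.
output = CSV.generate do |csv|
  csv << table.first.headers
  table.each { |row| csv << row.fields }
end
puts output
```

This assumes every row was extended with the same headers; rows with differing headers would need to be aligned first.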
When you call
CSV.generate_line(['a'], converters: lambda { |field| field.prepend '=' } )
the converter doesn't do anything.
Checking the code (and correct me if I'm wrong), the only options supported by generate_line are those listed at:
Line 1083 in 9ccdfe3
The "problem", in my opinion, is that the documentation says any option that new understands can be used:
Lines 563 to 567 in 9ccdfe3
I can try to make the changes if you point me to what the expected behaviour should be.
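Independently of how the option handling is resolved, the same result can be had by transforming the fields before they reach generate_line. This is only a sketch of that workaround, using plain Array#map rather than any converter option:

```ruby
require "csv"

# Workaround sketch: apply the transformation to each field first,
# then hand the already-converted fields to generate_line.
fields = ["a"].map { |field| "=#{field}" }
line = CSV.generate_line(fields)
p line
```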
require "csv"
p CSV.parse('"a\""')
Hi there, while filing a bug for this project I noticed that the code style of this project is not in great shape.
I can help with integrating RuboCop into the project and fixing the current code style issues if the authors agree.
Hey friends, I am unable to use this gem with any Ruby version lower than 2.5. I tried every version of the csv gem, but I still get the same issue.
The snippet below shows the result for the lowest version of csv.
Bundler could not find compatible versions for gem "ruby":
In Gemfile:
ruby
csv (~> 0.0.1) was resolved to 0.0.1, which depends on
ruby (>= 2.5.0dev)
Gem version as currently defined in the gemspec is 1.0.0.
But:
$ ruby -r csv -e "p CSV::VERSION"
"2.4.8"
Hi,
We tried to migrate one of our projects to ruby 2.6 and I am dealing with this difference in behavior that doesn't seem to be documented in changelogs.
ruby 2.5.3:
irb(main):006:0> CSV.parse("toto,tata", col_sep: nil)
=> [["t", "o", "t", "o", ",", "t", "a", "t", "a", nil]]
ruby 2.6.3:
irb(main):003:0> CSV.parse("toto,tata", col_sep: nil)
Traceback (most recent call last):
15: from /Users/beauraf/.asdf/installs/ruby/2.6.3/bin/irb:23:in `<main>'
14: from /Users/beauraf/.asdf/installs/ruby/2.6.3/bin/irb:23:in `load'
13: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
12: from (irb):3
11: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:685:in `parse'
10: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:1245:in `read'
9: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:1245:in `to_a'
8: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:1236:in `each'
7: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:1430:in `parser_enumerator'
6: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:1421:in `parser'
5: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/CSV.rb:1421:in `new'
4: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/csv/parser.rb:230:in `initialize'
3: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/csv/parser.rb:328:in `prepare'
2: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/csv/parser.rb:455:in `prepare_separators'
1: from /Users/beauraf/.asdf/installs/ruby/2.6.3/lib/ruby/2.6.0/csv/parser.rb:455:in `escape'
TypeError (no implicit conversion of nil into String)
Is this behaviour intended?
Thanks.
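If calling code may end up passing nil as col_sep, one defensive option is to substitute the default separator before calling CSV.parse. The helper below is hypothetical (the name parse_with_optional_sep is not part of the csv library) and assumes that falling back to "," is the desired behaviour:

```ruby
require "csv"

# Hypothetical helper (not part of the csv library): fall back to the
# default "," when the caller passes nil as the column separator.
def parse_with_optional_sep(text, col_sep = nil)
  CSV.parse(text, col_sep: col_sep || ",")
end

rows = parse_with_optional_sep("toto,tata", nil)
p rows
```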
It looks like there's a bug in CSV 3.0.0 where parse with encoding 'bom|utf-8' doesn't remove the BOM character (but it seems to be fine with the read method).
require 'csv'
require 'tempfile'
t = Tempfile.new("file.csv")
bom_character = 65_279
t << "name\nMy CSV".codepoints.unshift(bom_character).pack("U*")
t.rewind
csv = CSV.read(t, headers: true, encoding: 'bom|utf-8')
csv.first["name"]
=> "My CSV"
csv = CSV.parse(File.read(t), headers: true, encoding: 'bom|utf-8')
csv.first["name"]
=> nil
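When the data is already in a String, a possible workaround is to strip the BOM before parsing. This is a minimal sketch, assuming UTF-8 input and Ruby 2.5 or later (for String#delete_prefix):

```ruby
require "csv"

BOM = "\uFEFF"

# Same content as the example above: a BOM followed by a header and one row.
text = "#{BOM}name\nMy CSV"

# Workaround sketch: remove a leading BOM by hand, then parse as usual.
csv = CSV.parse(text.delete_prefix(BOM), headers: true)
name = csv.first["name"]
p name
```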
If you call #seek(n) or #read(n) on a StringIO object, newer versions of CSV.parse ignore the current offset position in the IO object.
Consider these examples:
Ruby 2.5.5
require 'csv'
io = StringIO.new("#meta\na,b\n1,2")
io.read(6) # => "#meta\n"
CSV.parse(io) # => [["a", "b"], ["1", "2"]]
Ruby 2.6.3
io = StringIO.new("#meta\na,b\n1,2")
io.read(6) # => "#meta\n"
CSV.parse(io) # => [["#meta"], ["a", "b"], ["1", "2"]] (contains skipped data)
It looks like the new logic always rewinds the StringIO, and there is no way to pass an IO at a specific offset position anymore.
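One way to sidestep the rewind is to read the remainder of the IO yourself and hand CSV.parse a String. This is a sketch of that workaround; it assumes the remaining data is small enough to hold in memory:

```ruby
require "csv"
require "stringio"

io = StringIO.new("#meta\na,b\n1,2")
io.read(6) # skip the "#meta\n" line

# Workaround sketch: io.read returns everything from the current offset,
# so CSV.parse never sees the skipped bytes.
rows = CSV.parse(io.read)
p rows
```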
In Ruby 2.5, CSV::MalformedCSVError simply inherited from RuntimeError. In 2.6, it defines its own #new, taking 2 arguments (as opposed to the single argument of RuntimeError).
Code explicitly raising new errors of this class outside of the CSV library implementation itself breaks since it doesn't include the second argument (a line number).
Is making the line number optional something you'd entertain?