author | adamroyjones <[email protected]> | 2021-11-18 21:20:09 +0000 |
---|---|---|
committer | Sutou Kouhei <[email protected]> | 2021-12-24 14:35:33 +0900 |
commit | c70dc3cafb29d89d0377677ead346495183db47e () | |
tree | 7acd1af5074bd2e78ee0cb54af78e9224034abca /lib/csv | |
parent | 47c53af16872d61576184b0d6935fcf531564cc4 (diff) |
[ruby/csv] Add handling for ambiguous parsing options (https://.com/ruby/csv/pull/226)
fix GH-225

With Ruby 3.0.2 and csv 3.2.1, the file

```ruby
require "csv"
File.open("example.tsv", "w") { |f| f.puts("foo\t\tbar") }
CSV.read("example.tsv", col_sep: "\t", strip: true)
```

produces the error

```
lib/csv/parser.rb:935:in `parse_quotable_robust': TODO: Meaningful message in line 1. (CSV::MalformedCSVError)
```

However, the CSV in this example is not malformed; instead, ambiguous options were provided to the parser. It is not obvious (to me) whether the string should be parsed as

- `["foo\t\tbar"]`,
- `["foo", "bar"]`,
- `["foo", "", "bar"]`, or
- `["foo", nil, "bar"]`.

This commit adds code that raises an exception when this situation is encountered. Specifically, it checks if the column separator either ends with or starts with the characters that would be stripped away.

This commit also adds unit tests and updates the documentation.

https://.com/ruby/csv/commit/cc317dd42d
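
The check described in the commit message can be sketched roughly as follows. This is an illustration only, not the code added to `lib/csv/parser.rb`: the helper names `ambiguous_strip?` and `check_options!`, the error message, and the use of `ArgumentError` are all hypothetical, and it assumes that `strip: true` strips spaces and tabs while a String value names the characters to strip.

```ruby
# Hypothetical sketch of the ambiguity check; not the commit's actual code.

# Returns true when col_sep starts or ends with a character that
# stripping would remove, so the two options conflict.
def ambiguous_strip?(col_sep, strip)
  return false unless strip

  # Assumption: `strip: true` removes spaces and tabs; a String value
  # lists the characters to strip.
  strip_chars = strip == true ? [" ", "\t"] : strip.to_s.chars
  strip_chars.any? { |c| col_sep.start_with?(c) || col_sep.end_with?(c) }
end

# Raises early, before any parsing, when the options are ambiguous.
# The exception class and message are placeholders.
def check_options!(col_sep:, strip:)
  return unless ambiguous_strip?(col_sep, strip)

  raise ArgumentError,
        "col_sep #{col_sep.inspect} overlaps with the characters removed by strip"
end

check_options!(col_sep: ",", strip: true) # passes silently

begin
  check_options!(col_sep: "\t", strip: true)
rescue ArgumentError => e
  puts e.message
end
```

Failing at option-validation time, rather than partway through parsing, points the user at the conflicting options instead of reporting the input as malformed.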
Notes: Merged: https://.com/ruby/ruby/pull/5336
-rw-r--r-- | lib/csv/parser.rb | 23 |
1 files changed, 23 insertions, 0 deletions
@@ -361,6 +361,7 @@ class CSV
       prepare_skip_lines
       prepare_strip
       prepare_separators
       prepare_quoted
       prepare_unquoted
       prepare_line
@@ -531,6 +532,28 @@ class CSV
       @not_line_end = Regexp.new("[^\r\n]+".encode(@encoding))
     end
     def prepare_quoted
       if @quote_character
         @quotes = Regexp.new(@escaped_quote_character +