Class CsvEscape
Utility class for performing CSV escape/unescape operations.
Features
Specific features of the CSV escape/unescape operations performed by means of this class:
- Works according to the rules specified in RFC4180 (there is no CSV standard as such).
- Encloses escaped values in double-quotes ("value") if they contain any non-alphanumeric characters.
- Escapes double-quote characters (") by writing them twice: "".
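The quoting rules listed above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class CsvEscapeSketch and its escape method are hypothetical names, and "alphanumeric" is assumed here to mean ASCII letters and digits only.

```java
// Illustrative sketch only; CsvEscapeSketch is a hypothetical class, not part of unbescape.
final class CsvEscapeSketch {

    private CsvEscapeSketch() {
    }

    // Assumption: "alphanumeric" means the ASCII ranges a-z, A-Z and 0-9.
    private static boolean isAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static String escape(String text) {
        if (text == null) {
            return null;
        }
        boolean needsQuoting = false;
        for (int i = 0; i < text.length(); i++) {
            if (!isAlphanumeric(text.charAt(i))) {
                needsQuoting = true;
                break;
            }
        }
        if (!needsQuoting) {
            // Nothing to change: hand back the very same String instance.
            return text;
        }
        StringBuilder result = new StringBuilder(text.length() + 8);
        result.append('"');
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                // A double-quote inside the value is written twice.
                result.append('"');
            }
            result.append(c);
        }
        result.append('"');
        return result.toString();
    }
}
```

Under these assumptions, a value such as abc123 comes back unchanged, a,b is enclosed in double-quotes, and every double-quote inside a value is doubled.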
There are four different input/output modes that can be used in escape/unescape operations:
- String input, String output: Input is specified as a String object and output is returned as another. In order to improve memory performance, all escape and unescape operations will return the exact same input object as output if no escape/unescape modifications are required.
- String input, java.io.Writer output: Input will be read from a String and output will be written into the specified java.io.Writer.
- java.io.Reader input, java.io.Writer output: Input will be read from a Reader and output will be written into the specified java.io.Writer.
- char[] input, java.io.Writer output: Input will be read from a char array (char[]) and output will be written into the specified java.io.Writer. Two int arguments called offset and len will be used for specifying the part of the char[] that should be escaped/unescaped. These methods should be called with offset = 0 and len = text.length in order to process the whole char[].
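As a rough illustration of the char[] input, java.io.Writer output mode, the following sketch shows how offset and len could select the region to escape. It is a sketch under assumptions, not the library's code: CsvWriterModeSketch, escape and escapeToString are hypothetical names, and the argument checks are this example's own choice.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Illustrative sketch only; CsvWriterModeSketch is a hypothetical class, not part of unbescape.
final class CsvWriterModeSketch {

    private CsvWriterModeSketch() {
    }

    // Assumption: "alphanumeric" means the ASCII ranges a-z, A-Z and 0-9.
    private static boolean isAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    static void escape(char[] text, int offset, int len, Writer writer) throws IOException {
        if (writer == null) {
            throw new IllegalArgumentException("Argument 'writer' cannot be null");
        }
        if (text == null) {
            // Null input: nothing at all is written to the writer.
            return;
        }
        if (offset < 0 || len < 0 || len > text.length - offset) {
            throw new IllegalArgumentException("Invalid offset/len for an array of length " + text.length);
        }
        int end = offset + len;
        boolean needsQuoting = false;
        for (int i = offset; i < end; i++) {
            if (!isAlphanumeric(text[i])) {
                needsQuoting = true;
                break;
            }
        }
        if (!needsQuoting) {
            // The selected region is copied through unchanged.
            writer.write(text, offset, len);
            return;
        }
        writer.write('"');
        for (int i = offset; i < end; i++) {
            if (text[i] == '"') {
                writer.write('"');
            }
            writer.write(text[i]);
        }
        writer.write('"');
    }

    // Convenience wrapper for the example: collects the output in a StringWriter.
    static String escapeToString(char[] text, int offset, int len) {
        StringWriter out = new StringWriter();
        try {
            escape(text, offset, len, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Calling escapeToString(buf, 0, buf.length) processes the whole array, while a narrower offset/len pair escapes only that slice as if it were a complete field value.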
In order for Microsoft Excel to correctly open a CSV file, including field values with line breaks, these rules should be followed:
- Separate fields with a comma (,) in English-language setups and a semicolon (;) in non-English-language setups (this depends on the language of the MS Excel installation in which you intend your files to be opened).
- Separate records with Windows-style line breaks (\r\n, U+000D + U+000A).
- Enclose field values in double-quotes (") if they contain any non-alphanumeric characters.
- Don't leave any whitespace between the field separator (;) and the enclosing quotes (").
- Escape double-quote characters (") inside field values with two double-quotes ("").
- Use \n (U+000A, Unix-style line breaks) for line breaks inside field values, even if records are separated with Windows-style line breaks (\r\n). This is needed for Excel 2003 compatibility.
- Open CSV files in Excel with File -> Open..., not with Data -> Import... The latter option will not correctly understand line breaks inside field values (up to Excel 2010).
Note that unbescape escapes field values only: it takes care of enclosing them in double-quotes, using Unix-style line breaks inside values, etc. Separating fields (e.g. with ;), delimiting records (e.g. with \r\n) and using the correct character encoding when writing CSV files remain the responsibility of the application calling unbescape.
The described format for Excel is also supported by OpenOffice.org Calc (File -> Open...) and Google Spreadsheets (File -> Import...).
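The division of work described above, with field values escaped one by one and the calling application joining them into records, might look like the following sketch. All names here (CsvRecordSketch, escapeField, record) are hypothetical, and escapeField is a stand-in written for this example rather than a call into unbescape. The \r\n to \n normalisation inside values is this sketch's own reading of the Excel rules.

```java
import java.util.List;

// Illustrative sketch only; CsvRecordSketch is a hypothetical class, not part of unbescape.
final class CsvRecordSketch {

    private CsvRecordSketch() {
    }

    // Assumption: "alphanumeric" means the ASCII ranges a-z, A-Z and 0-9.
    private static boolean isAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // Stand-in for the field-level escape step.
    static String escapeField(String value) {
        if (value == null) {
            return "";
        }
        // Use Unix-style line breaks inside the value.
        String normalized = value.replace("\r\n", "\n");
        boolean needsQuoting = false;
        for (int i = 0; i < normalized.length(); i++) {
            if (!isAlphanumeric(normalized.charAt(i))) {
                needsQuoting = true;
                break;
            }
        }
        if (!needsQuoting) {
            return normalized;
        }
        return '"' + normalized.replace("\"", "\"\"") + '"';
    }

    // The application's part: join escaped fields and terminate the record.
    static String record(List<String> fields, char separator) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                // No whitespace between the separator and the enclosing quotes.
                line.append(separator);
            }
            line.append(escapeField(fields.get(i)));
        }
        // Records end with a Windows-style line break.
        line.append("\r\n");
        return line.toString();
    }
}
```

The separator is passed in because it depends on the target Excel installation. Choosing the character encoding when the resulting text is written to a file is likewise left to the caller.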
References
The following references apply:
Since:
- 1.0.0
Author:
- Daniel Fernández
Method Summary
Modifier and Type, Method, Description:
- static void escapeCsv(char[] text, int offset, int len, Writer writer): Perform a CSV escape operation on a char[] input.
- static void escapeCsv(Reader reader, Writer writer): Perform a CSV escape operation on a Reader input, writing results to a Writer.
- static String escapeCsv(String text): Perform a CSV escape operation on a String input.
- static void escapeCsv(String text, Writer writer): Perform a CSV escape operation on a String input, writing results to a Writer.
- static void unescapeCsv(char[] text, int offset, int len, Writer writer): Perform a CSV unescape operation on a char[] input.
- static void unescapeCsv(Reader reader, Writer writer): Perform a CSV unescape operation on a Reader input, writing results to a Writer.
- static String unescapeCsv(String text): Perform a CSV unescape operation on a String input.
- static void unescapeCsv(String text, Writer writer): Perform a CSV unescape operation on a String input, writing results to a Writer.
Method Details
escapeCsv
Perform a CSV escape operation on a String input.
This method is thread-safe.
Parameters:
- text - the String to be escaped.
Returns:
- The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if input is null.
escapeCsv
Perform a CSV escape operation on a String input, writing results to a Writer.
This method is thread-safe.
Parameters:
- text - the String to be escaped.
- writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if input is null.
Throws:
- IOException - if an input/output exception occurs.
Since:
- 1.1.2
escapeCsv
Perform a CSV escape operation on a Reader input, writing results to a Writer.
This method is thread-safe.
Parameters:
- reader - the Reader reading the text to be escaped.
- writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if input is null.
Throws:
- IOException - if an input/output exception occurs.
Since:
- 1.1.2
escapeCsv
Perform a CSV escape operation on a char[] input.
This method is thread-safe.
Parameters:
- text - the char[] to be escaped.
- offset - the position in text at which the escape operation should start.
- len - the number of characters in text that should be escaped.
- writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if input is null.
Throws:
- IOException - if an input/output exception occurs.
unescapeCsv
Perform a CSV unescape operation on a String input.
This method is thread-safe.
Parameters:
- text - the String to be unescaped.
Returns:
- The unescaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no unescaping modifications were required (and no additional String objects will be created during processing). Will return null if input is null.
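The unescape contract described here (strip the enclosing double-quotes, turn each doubled double-quote back into one, return the same object when nothing changes) can be sketched as follows. This is a simplified illustration under assumptions, not the library's implementation: CsvUnescapeSketch and unescape are hypothetical names, and only values that both start and end with a double-quote are treated as quoted.

```java
// Illustrative sketch only; CsvUnescapeSketch is a hypothetical class, not part of unbescape.
final class CsvUnescapeSketch {

    private CsvUnescapeSketch() {
    }

    static String unescape(String text) {
        if (text == null) {
            return null;
        }
        int length = text.length();
        boolean quoted = length >= 2 && text.charAt(0) == '"' && text.charAt(length - 1) == '"';
        if (!quoted) {
            // Nothing to change: hand back the very same String instance.
            return text;
        }
        StringBuilder result = new StringBuilder(length - 2);
        // Walk the content between the enclosing quotes.
        for (int i = 1; i < length - 1; i++) {
            char c = text.charAt(i);
            if (c == '"' && i + 1 < length - 1 && text.charAt(i + 1) == '"') {
                // Skip the second quote of a doubled pair.
                i++;
            }
            result.append(c);
        }
        return result.toString();
    }
}
```

In this sketch an unquoted value such as abc is returned as the same instance, while a quoted value loses its enclosing quotes and has its doubled quotes collapsed.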
unescapeCsv
Perform a CSV unescape operation on a String input, writing results to a Writer.
This method is thread-safe.
Parameters:
- text - the String to be unescaped.
- writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if input is null.
Throws:
- IOException - if an input/output exception occurs.
Since:
- 1.1.2
unescapeCsv
Perform a CSV unescape operation on a Reader input, writing results to a Writer.
This method is thread-safe.
Parameters:
- reader - the Reader reading the text to be unescaped.
- writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if input is null.
Throws:
- IOException - if an input/output exception occurs.
Since:
- 1.1.2
unescapeCsv
Perform a CSV unescape operation on a char[] input.
This method is thread-safe.
Parameters:
- text - the char[] to be unescaped.
- offset - the position in text at which the unescape operation should start.
- len - the number of characters in text that should be unescaped.
- writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if input is null.
Throws:
- IOException - if an input/output exception occurs.