Why CSV-to-JSON is trickier than it looks
CSV has no type system: every value is text. A converter has to guess whether 42 should become the JSON number 42 or the string "42", and whether true is a boolean or just the text stored in a column called "Status". A wrong guess silently corrupts data: the zip code 01234, parsed as a number, becomes 1234. Phone numbers, ID codes, and version strings suffer the same problem (a version string such as 1.10, parsed as a number, becomes 1.1).
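To make the failure concrete, here is a minimal Python sketch of the kind of type guessing described above. The helper name naive_coerce and the sample row are hypothetical, written only for illustration; no particular converter is claimed to work exactly this way.

```python
import json

def naive_coerce(value: str):
    """Guess a JSON type from CSV text, the way an over-eager converter might."""
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value  # not a boolean or a number, so keep the text

# One CSV record, already split into fields (all values are text).
row = {"zip": "01234", "status": "true", "count": "42"}

guessed = {key: naive_coerce(text) for key, text in row.items()}
as_text = dict(row)  # the safe default: every value stays a string

print(json.dumps(guessed))  # {"zip": 1234, "status": true, "count": 42}
print(json.dumps(as_text))  # {"zip": "01234", "status": "true", "count": "42"}
```

The first output has lost the leading zero of the zip code, and nothing signalled an error. Keeping everything as text and converting only the columns you know to be numeric avoids that class of bug.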
Edge cases that silently corrupt your data
- Quoted fields with embedded commas: The CSV spec (RFC 4180) allows comma-containing values if wrapped in double quotes, such as "New York, NY". A naive splitter on "," breaks this into two fields. Always verify the converter handles quoted fields.
- Newlines inside quoted fields: A CSV field can contain a literal newline if quoted, such as "line1\nline2". Converters that split on line breaks first will produce a corrupt parse for multi-line values.
- Leading zeros: US zip codes, ISBNs, product codes, and phone numbers often have leading zeros. Auto-typed as numbers, the zeros are dropped. Treat all ID and code columns as strings, and check the output values carefully before using them in production.
- Inconsistent row length: Some CSV exports produce rows with fewer columns than the header. The converter should fill missing fields with null or omit the key entirely. Verify which behavior your downstream code expects.
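All four cases can be checked against a small sample. The sketch below uses Python's standard-library csv module as an example parser; the sample data is made up for illustration. csv.DictReader keeps every value as text and fills fields missing from a short row with None, which serializes to JSON null.

```python
import csv
import io
import json

# Sample input: a quoted comma, a quoted newline, leading zeros,
# and a second record that is one column short.
raw = (
    'city,zip,note\n'
    '"New York, NY",01234,"line1\nline2"\n'
    'Boston,02108\n'
)

reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)

print(len(rows))        # 2 records, even though the input spans 5 lines
print(rows[0]["city"])  # New York, NY
print(rows[0]["zip"])   # 01234 (still text, zero intact)
print(json.dumps(rows[1]))
# {"city": "Boston", "zip": "02108", "note": null}
```

Running a converter against a fixture like this before trusting it with real data is a cheap way to find out which of the four behaviors it gets wrong.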
When to use a parsing library instead
For one-off exploration or small files, this tool is the fastest option. For production code that ingests CSV (user uploads, data pipelines, ETL), use a proper parsing library: Papa Parse in the browser (handles all RFC 4180 edge cases, streams large files), csv-parse in Node.js, or pandas.read_csv() in Python. These handle quoted fields, multi-line values, and encoding issues that simple implementations miss.
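As a rough illustration of the library route, the sketch below streams CSV records to JSON Lines. It uses Python's standard-library csv module rather than the libraries named above, purely to stay dependency-free. The function name csv_to_json_lines and the file path in the comment are hypothetical.

```python
import csv
import io
import json
from typing import Iterator, TextIO

def csv_to_json_lines(source: TextIO) -> Iterator[str]:
    """Yield one JSON object per CSV record without loading the whole file."""
    for record in csv.DictReader(source):
        # Values stay strings; convert known numeric columns explicitly later.
        yield json.dumps(record, ensure_ascii=False)

# Usage with a real file (hypothetical path). newline="" lets the csv module
# handle line endings itself, and utf-8-sig strips a byte-order mark if present:
#   with open("upload.csv", newline="", encoding="utf-8-sig") as f:
#       for line in csv_to_json_lines(f):
#           print(line)

sample = io.StringIO('id,name\n007,"Bond, James"\n')
lines = list(csv_to_json_lines(sample))
print(lines[0])  # {"id": "007", "name": "Bond, James"}
```

Because the function is a generator, memory use stays flat regardless of file size, which matters for user uploads and pipeline inputs of unknown length.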
