Chapter 18 - CSV, JSON, and XML Files (JavaScript)

Here's a JavaScript-flavoured version of the same concepts, with small JS/Node examples for each idea.


Overview: CSV, JSON, XML

Same three data serialization formats. In JavaScript:

  • CSV: use the csv-parse and csv-stringify packages (or papaparse).
  • JSON: built-in JSON.parse() / JSON.stringify() — no package needed.
  • XML: use fast-xml-parser or xml2js.

CSV format

What CSV looks like and limits

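The format is the same: each line is a row, commas separate cells, and every value is text. The limitations compared with Excel are the same too: no data types, no fonts or colors, no formulas, and no multiple worksheets.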

Don't parse CSV with split(','): a quoted cell such as "Hello, world!" contains a comma that is not a separator. Use a proper parser.
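To make the split(',') problem concrete, the sketch below compares it with a small quote-aware splitter. parseCsvLine is a hypothetical helper written only for this illustration; it handles a single line and is not a substitute for papaparse, which also deals with newlines inside quoted cells, other delimiters, and malformed input.

```javascript
// Hypothetical helper for illustration only: splits ONE CSV line,
// honouring double-quoted cells and "" as an escaped quote.
function parseCsvLine(line) {
  const cells = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // "" inside quotes is a literal quote
        i++; // skip the second quote
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch; // commas in here are ordinary characters
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      cells.push(current); // separator outside quotes ends the cell
      current = "";
    } else {
      current += ch;
    }
  }
  cells.push(current); // the last cell has no trailing comma
  return cells;
}

const line = '"Hello, world!",eggs,bacon,ham';

console.log(line.split(",").length); // 5 (the quoted comma splits the first cell)
console.log(parseCsvLine(line)); // [ 'Hello, world!', 'eggs', 'bacon', 'ham' ]
```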

Reading CSV

Using papaparse (works in Node and browser):

npm install papaparse

const fs = require("fs");
const Papa = require("papaparse");

const csvText = fs.readFileSync("example3.csv", "utf-8");
const result = Papa.parse(csvText);
const data = result.data; // 2D array

console.log(data[0][0]); // '4/5/2035 13:34'
console.log(data[0][1]); // 'Apples'
console.log(data[6][1]); // 'Strawberries'

Iterating rows

for (let i = 0; i < data.length; i++) {
  console.log(`Row #${i + 1}`, data[i]);
}

Skip headers by starting at index 1.
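As an alternative to starting the loop at index 1, slice(1) gives a copy of the rows without the header. The sketch below uses a small hand-written array in place of result.data; the values are made up for the example.

```javascript
// Sample data standing in for result.data (values are made up for this sketch).
const data = [
  ["Timestamp", "Fruit", "Quantity"],
  ["4/5/2035 13:34", "Apples", "73"],
  ["4/5/2035 3:41", "Cherries", "85"],
];

const header = data[0]; // first row holds the column names
const body = data.slice(1); // every row after the header

let total = 0;
for (const row of body) {
  // CSV cells are strings, so convert before doing arithmetic
  total += Number(row[2]);
}

console.log(header); // [ 'Timestamp', 'Fruit', 'Quantity' ]
console.log(total); // 158
```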

Writing CSV

const fs = require("fs");
const Papa = require("papaparse");

const rows = [
  ["spam", "eggs", "bacon", "ham"],
  ["Hello, world!", "eggs", "bacon", "ham"],
  [1, 2, 3.141592, 4],
];

const csvText = Papa.unparse(rows);
fs.writeFileSync("output.csv", csvText, "utf-8");

Papa.unparse() handles quoting automatically (e.g., "Hello, world!").
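To show what "quoting automatically" involves, here is a minimal sketch of the usual CSV quoting convention: a cell is wrapped in double quotes only when it contains a comma, a quote, or a line break, and embedded quotes are doubled. quoteCell and toCsv are hypothetical helpers, not papaparse functions, and papaparse's exact rules may differ in edge cases.

```javascript
// Hypothetical helper: quote a cell only when it needs it.
function quoteCell(value) {
  const text = String(value);
  if (/[",\r\n]/.test(text)) {
    // Double any embedded quotes, then wrap the whole cell in quotes
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Hypothetical helper: join cells with commas and rows with CRLF.
function toCsv(rows) {
  return rows.map((row) => row.map(quoteCell).join(",")).join("\r\n");
}

const rows = [
  ["spam", "eggs", "bacon", "ham"],
  ["Hello, world!", "eggs", "bacon", "ham"],
  [1, 2, 3.141592, 4],
];

console.log(toCsv(rows));
// spam,eggs,bacon,ham
// "Hello, world!",eggs,bacon,ham
// 1,2,3.141592,4
```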

Custom delimiters (TSV)

const tsvText = Papa.unparse(rows, { delimiter: "\t" });
fs.writeFileSync("output.tsv", tsvText, "utf-8");

Reading with headers (like DictReader)

const csvText = fs.readFileSync("exampleWithHeader3.csv", "utf-8");
const result = Papa.parse(csvText, { header: true });

for (const row of result.data) {
  console.log(row.Timestamp, row.Fruit, row.Quantity);
}

With header: true, each row is an object keyed by column name — equivalent to Python's DictReader.
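The mapping from a header row to keyed objects can be sketched in plain JavaScript. rowsToObjects is a hypothetical helper that shows the idea only; it is not how papaparse is implemented, and the sample values are made up.

```javascript
// Hypothetical helper: pair each cell with the column name at the same index.
function rowsToObjects(rows) {
  const [header, ...body] = rows; // first row = column names
  return body.map((row) =>
    Object.fromEntries(header.map((name, i) => [name, row[i]]))
  );
}

// Sample rows, as a parser without header mode might return them.
const rows = [
  ["Timestamp", "Fruit", "Quantity"],
  ["4/5/2035 13:34", "Apples", "73"],
  ["4/5/2035 3:41", "Cherries", "85"],
];

const records = rowsToObjects(rows);
console.log(records[0].Fruit); // 'Apples'
console.log(records[1].Quantity); // '85' (still a string)
```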

Writing with headers (like DictWriter)

const data = [
  { Name: "Alice", Pet: "cat", Phone: "555-1234" },
  { Name: "Bob", Phone: "555-9999" },
  { Name: "Carol", Pet: "dog", Phone: "555-5555" },
];

const csvText = Papa.unparse(data, { columns: ["Name", "Pet", "Phone"] });
fs.writeFileSync("output.csv", csvText, "utf-8");

Missing keys produce empty cells, just like Python's DictWriter.
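The "missing key becomes an empty cell" behaviour amounts to looking up each column name on each record and substituting an empty string when the lookup finds nothing. The sketch below illustrates that with objectsToRows, a hypothetical helper that returns a 2D array rather than CSV text.

```javascript
// Hypothetical helper: build one row per record, in the given column order.
function objectsToRows(records, columns) {
  const body = records.map((record) =>
    // ?? falls back to "" when the key is missing (undefined) or null
    columns.map((column) => record[column] ?? "")
  );
  return [columns, ...body]; // header row first
}

const data = [
  { Name: "Alice", Pet: "cat", Phone: "555-1234" },
  { Name: "Bob", Phone: "555-9999" },
];

const rows = objectsToRows(data, ["Name", "Pet", "Phone"]);
console.log(rows[2]); // [ 'Bob', '', '555-9999' ]
```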

Project: Remove the header from CSV files

const fs = require("fs");
const path = require("path");
const Papa = require("papaparse");

fs.mkdirSync("headerRemoved", { recursive: true });

for (const filename of fs.readdirSync(".")) {
  if (!filename.endsWith(".csv")) continue;
  console.log(`Removing header from ${filename}...`);

  const csvText = fs.readFileSync(filename, "utf-8");
  const result = Papa.parse(csvText);
  const rows = result.data.slice(1); // skip header

  const output = Papa.unparse(rows);
  fs.writeFileSync(path.join("headerRemoved", filename), output, "utf-8");
}

JSON

What JSON looks like

JSON is native to JavaScript — it's literally JavaScript Object Notation.

{
  "name": "Alice Doe",
  "age": 30,
  "car": null,
  "programmer": true,
  "address": {
    "street": "100 Larkin St.",
    "city": "San Francisco",
    "zip": "94102"
  },
  "phone": [
    { "type": "mobile", "number": "415-555-7890" },
    { "type": "work", "number": "415-555-1234" }
  ]
}

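JavaScript has no type mismatch with JSON: null, true, and false are already JavaScript values. JSON syntax is stricter than object-literal syntax, though: strings and keys must use double quotes, and trailing commas and comments are not allowed.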
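A quick way to see where JSON is stricter than a JavaScript object literal is to feed a few near-misses to JSON.parse, which throws a SyntaxError on invalid input. isValidJson is a hypothetical helper written for this sketch.

```javascript
// Hypothetical helper: returns true when the text is valid JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false; // JSON.parse throws a SyntaxError on invalid input
  }
}

console.log(isValidJson('{"name": "Alice"}')); // true
console.log(isValidJson("{'name': 'Alice'}")); // false: single quotes
console.log(isValidJson('{name: "Alice"}')); // false: unquoted key
console.log(isValidJson('{"name": "Alice",}')); // false: trailing comma
console.log(isValidJson('{"name": "Alice"} // note')); // false: comment
```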

JSON.parse: JSON string → JavaScript

const jsonString =
  '{"name": "Alice Doe", "age": 30, "car": null, "programmer": true}';

const data = JSON.parse(jsonString);
console.log(data);
// { name: 'Alice Doe', age: 30, car: null, programmer: true }

JSON.stringify: JavaScript → JSON string

const data = {
  name: "Alice Doe",
  age: 30,
  car: null,
  programmer: true,
  address: {
    street: "100 Larkin St.",
    city: "San Francisco",
    zip: "94102",
  },
  phone: [
    { type: "mobile", number: "415-555-7890" },
    { type: "work", number: "415-555-1234" },
  ],
};

const jsonString = JSON.stringify(data);
console.log(jsonString); // single-line JSON

Pretty-printing:

const pretty = JSON.stringify(data, null, 2);
console.log(pretty);
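The third argument to JSON.stringify controls indentation and can be either a number of spaces or a string such as a tab. The sketch below compares the three forms on a small made-up object and checks that the indented text parses back to the same data.

```javascript
const data = { name: "Alice Doe", phone: ["415-555-7890"] };

const compact = JSON.stringify(data); // no whitespace at all
const spaces = JSON.stringify(data, null, 2); // two spaces per level
const tabs = JSON.stringify(data, null, "\t"); // one tab per level

console.log(compact); // {"name":"Alice Doe","phone":["415-555-7890"]}
console.log(spaces);
// {
//   "name": "Alice Doe",
//   "phone": [
//     "415-555-7890"
//   ]
// }

// Indentation is cosmetic: parsing any of the three gives the same data.
console.log(JSON.parse(tabs).phone[0]); // '415-555-7890'
```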

Reading/writing JSON files

const fs = require("fs");

// Write
fs.writeFileSync("data.json", JSON.stringify(data, null, 2));

// Read
const loaded = JSON.parse(fs.readFileSync("data.json", "utf-8"));
console.log(loaded.name); // 'Alice Doe'
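Reading a JSON file can fail in two common ways: the file is missing, or its contents are not valid JSON. The sketch below wraps both cases in loadJson, a hypothetical helper that returns a fallback value instead of throwing. The file names and the temporary directory are assumptions made so the example can run anywhere.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: returns the parsed value, or `fallback` when the
// file is missing or does not contain valid JSON.
function loadJson(filePath, fallback = null) {
  try {
    return JSON.parse(fs.readFileSync(filePath, "utf-8"));
  } catch (err) {
    if (err.code === "ENOENT" || err.name === "SyntaxError") {
      return fallback;
    }
    throw err; // anything else (e.g. permissions) is still an error
  }
}

// Work in a throwaway directory so no real files are touched.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "json-demo-"));
const goodFile = path.join(dir, "data.json");
const badFile = path.join(dir, "broken.json");

fs.writeFileSync(goodFile, JSON.stringify({ name: "Alice Doe" }, null, 2), "utf-8");
fs.writeFileSync(badFile, "{ not valid json", "utf-8");

console.log(loadJson(goodFile).name); // 'Alice Doe'
console.log(loadJson(badFile, {})); // {}
console.log(loadJson(path.join(dir, "missing.json"), {})); // {}
```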

XML

XML basics

Same tag-based format. In JavaScript, use fast-xml-parser:

npm install fast-xml-parser

Reading XML

const { XMLParser } = require("fast-xml-parser");
const fs = require("fs");

const xmlString = fs.readFileSync("my_data.xml", "utf-8");
const parser = new XMLParser();
const result = parser.parse(xmlString);

console.log(result.person.name); // 'Alice Doe'
console.log(result.person.age); // 30
console.log(result.person.address.street); // '100 Larkin St.'
console.log(result.person.phone.phoneEntry[0].number); // '415-555-7890'

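fast-xml-parser converts XML into a JavaScript object hierarchy: elements become properties, and an element that repeats under the same parent becomes an array. An element that appears only once is not wrapped in an array unless you configure the parser's isArray option, so the same path can yield an object for one document and an array for another.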
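Because a lone element and a repeated element can come back in different shapes, code that reads the result often normalizes the value first. The sketch below uses hand-written objects in place of real parser output (their shapes are an assumption based on the description above), and asArray and phoneNumbers are hypothetical helpers.

```javascript
// Hypothetical helper: always return an array, whether the parser produced
// nothing, a single value, or an array of values.
function asArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Hand-written objects standing in for parser output (assumed shapes).
const onePhone = {
  person: { phone: { phoneEntry: { number: "415-555-7890" } } },
};
const twoPhones = {
  person: {
    phone: {
      phoneEntry: [{ number: "415-555-7890" }, { number: "415-555-1234" }],
    },
  },
};

// Hypothetical helper: the same code path works for both shapes.
function phoneNumbers(result) {
  return asArray(result.person.phone.phoneEntry).map((entry) => entry.number);
}

console.log(phoneNumbers(onePhone)); // [ '415-555-7890' ]
console.log(phoneNumbers(twoPhones)); // [ '415-555-7890', '415-555-1234' ]
```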

Reading with attributes

const parser = new XMLParser({ ignoreAttributes: false });
const result = parser.parse(xmlString);
// Attributes appear as @_attributeName properties

Building/writing XML

const { XMLBuilder } = require("fast-xml-parser");

const data = {
  person: {
    name: "Alice Doe",
    age: 30,
    address: {
      street: "100 Larkin St.",
      city: "San Francisco",
    },
  },
};

const builder = new XMLBuilder({ format: true });
const xmlString = builder.build(data);
console.log(xmlString);

Alternative: xml2js

npm install xml2js

const xml2js = require("xml2js");

const xmlString = "<person><name>Alice Doe</name><age>30</age></person>";
// Top-level await is not available in a CommonJS script, so use the returned Promise.
xml2js.parseStringPromise(xmlString).then((result) => {
  console.log(result.person.name[0]); // 'Alice Doe'
});

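By default, xml2js wraps every child element in an array (since XML elements can repeat), which is less convenient than fast-xml-parser. Setting the explicitArray option to false turns this off, so an array is created only when an element actually repeats.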


Overall idea of the chapter

Chapter 18 in JavaScript: CSV parsing with papaparse (header mode for dict-like access), JSON with built-in JSON.parse/JSON.stringify (no package needed — JSON is native to JS), and XML with fast-xml-parser (converts to/from JS objects). CSV is simplest for tabular data, JSON is the natural choice in JavaScript, and XML is for legacy systems.