JSON (JavaScript Object Notation) is a text-based file format that represents nested key-value structures. It’s the right choice when the data isn’t naturally tabular — when records have nested structure, when fields differ from one record to the next, or when there are arrays of varying length within a record. JSON maps directly to Python dictionaries and lists, which is why it’s so common in Python tooling.

A typical JSON file is a list of records, each record an object:

[
  {"name": "Alice", "age": 31, "email": "[email protected]"},
  {"name": "Bob",   "age": 24, "email": "[email protected]"}
]

In Python with Pandas:

df = pd.read_json("my_data.json")

JSON supports four primitive types — string, number, boolean, null — and two structured types — object (key-value, like a Python dictionary) and array (ordered list, like a Python list). These compose: arrays of objects, objects whose values are arrays of objects, and so on, to arbitrary depth. The lack of a schema means JSON can represent almost anything; the same property means consumers have to be tolerant of variation.

JSON’s competitors in the same niche are XML (more verbose, more structured, common in older enterprise systems) and YAML (less rigid, easier for humans to write, more common in configuration files). For purely tabular data, CSV is simpler. For large scientific arrays, HDF5 is more efficient.