pandas: read_json(lines=True)

Load JSONL (NDJSON) efficiently.

goal

Read data that holds one JSON object per line into a DataFrame with pd.read_json(..., lines=True).

minimal code

import io
import pandas as pd

# Two JSON objects, one per line (JSONL / NDJSON)
s = '{"a":1}\n{"a":2}\n'
df = pd.read_json(io.StringIO(s), lines=True)
print(df["a"].sum())  # 3

usage

import io
import pandas as pd

# Each line becomes one row, so shape[0] is the number of records
s = '{"x": "a"}\n{"x": "b"}\n'
print(pd.read_json(io.StringIO(s), lines=True).shape[0])  # 2

useful variant(s)

import io
import pandas as pd

# A nested object is not flattened: it stays in a single column "b"
s = '{"a":1,"b":{"c":2}}\n'
print("b" in pd.read_json(io.StringIO(s), lines=True).columns)  # True
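The check above shows that a nested object stays in one column (here "b", holding dicts). If flat columns are wanted instead, one possible approach, which is not part of the original snippet, is to parse each line with the standard json module and hand the records to pd.json_normalize. The sample string and the variable names records and flat are illustrative assumptions.

```python
import json
import pandas as pd

s = '{"a":1,"b":{"c":2}}\n'

# Parse every non-empty line into a Python dict
records = [json.loads(line) for line in s.splitlines() if line.strip()]

# json_normalize expands nested dicts into dotted column names (default sep=".")
flat = pd.json_normalize(records)
print(list(flat.columns))
```

This trades the convenience of read_json for explicit parsing, so it is probably best kept for inputs small enough to hold as a list of dicts.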

notes

  • lines=True reads one record per line (JSON Lines / NDJSON).
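For inputs too large to load in one go, read_json also accepts chunksize together with lines=True and then returns an iterator of DataFrames rather than a single frame. The sketch below assumes a small in-memory string in place of a real file, and the chunk size of 2 is arbitrary.

```python
import io
import pandas as pd

s = '{"a":1}\n{"a":2}\n{"a":3}\n'

# With chunksize set, read_json returns an iterator that yields DataFrames
reader = pd.read_json(io.StringIO(s), lines=True, chunksize=2)

total = 0
rows = 0
for chunk in reader:
    # Each chunk holds at most 2 records; aggregate without keeping them all
    total += int(chunk["a"].sum())
    rows += len(chunk)

print(total, rows)
```

Aggregating per chunk keeps memory use bounded by the chunk size instead of the whole input.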