objective
Load JSONL (NDJSON) data efficiently with pandas.
minimal code
import io
import pandas as pd
# Two JSON records, one per line
s = '{"a":1}\n{"a":2}\n'
df = pd.read_json(io.StringIO(s), lines=True)
print(df["a"].sum())  # sum of column "a"
usage
import io
import pandas as pd
s = '{"x": "a"}\n{"x": "b"}\n'
# Each input line becomes one row, so shape[0] is the record count
print(pd.read_json(io.StringIO(s), lines=True).shape[0])
useful variant(s)
import io
import pandas as pd
s = '{"a":1,"b":{"c":2}}\n'
# A nested object is kept as a single column ("b") holding a dict
print("b" in pd.read_json(io.StringIO(s), lines=True).columns)
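If flat columns are preferred over a dict-valued "b" column, one possible approach (a sketch, assuming the data fits in memory) is to parse each line with json.loads and pass the records to pd.json_normalize, which names nested fields with a dot, e.g. "b.c". The sample string and the names records and flat are illustrative only.

```python
import json
import pandas as pd

# Illustrative sample: two records with a nested object under "b"
s = '{"a":1,"b":{"c":2}}\n{"a":3,"b":{"c":4}}\n'

# Parse one JSON object per non-empty line
records = [json.loads(line) for line in s.splitlines() if line.strip()]

# Flatten nested objects into dotted column names
flat = pd.json_normalize(records)
print(list(flat.columns))
```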
notes
- Pass lines=True when the input has one JSON record per line.
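For inputs too large to load at once, read_json also accepts chunksize together with lines=True and then returns an iterator of DataFrames instead of a single frame. The sketch below assumes a chunk size of 2 and an in-memory sample purely for illustration. A real file path would normally replace the StringIO object.

```python
import io
import pandas as pd

# Illustrative sample: three records, read two rows at a time
s = '{"a":1}\n{"a":2}\n{"a":3}\n'

total = 0
# chunksize requires lines=True; each iteration yields a DataFrame
reader = pd.read_json(io.StringIO(s), lines=True, chunksize=2)
for chunk in reader:
    total += int(chunk["a"].sum())

print(total)
```

Aggregating per chunk keeps only one chunk in memory at a time, at the cost of not having the full DataFrame available afterwards.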