objective
Query Parquet files and pandas DataFrames with SQL.
minimal code
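import duckdb  # .df() below also needs pandas to be installed
# connect() with no argument opens an in-memory database
con = duckdb.connect()
con.execute("CREATE TABLE t AS SELECT 1 AS id")
# .df() returns the query result as a pandas DataFrame
df = con.execute("SELECT * FROM t").df()
print(df.to_dict(orient="records"))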
usage
import duckdb
import pandas as pd
df = pd.DataFrame({"id": [1, 2, 3], "x": [10, 20, 30]})
# DuckDB resolves the table name "df" to the local DataFrame variable
res = duckdb.query("SELECT id, x*2 AS x2 FROM df").df()
print(res.x2.tolist())  # [20, 40, 60]
useful variant(s)
# Query every Parquet file matching a glob pattern (uncomment once files exist under data/)
# duckdb.query("SELECT * FROM 'data/*.parquet'").df()
print("ok")
notes
- Very handy for prototyping analytical queries.