Parquet
Example Parquet data source configuration:
from yummy import ParquetSource

my_stats_csv = ParquetSource(
    path="/home/jovyan/notebooks/dataset/all_data.parquet",
    timestamp_field="datetime",
)
Additionally, you can set the optional parameter s3_endpoint_override: str
You can read:
- single file:
path="/home/jovyan/notebooks/dataset/all_data.parquet"
- directory:
path="/home/jovyan/notebooks/dataset/"
- selected files:
path="/home/jovyan/notebooks/dataset/2022-*.parquet"
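To make the "selected files" form concrete, here is a minimal sketch of which file names a pattern such as 2022-*.parquet would pick up. It assumes the path is matched with shell-style wildcards, which the text above suggests but does not state explicitly. The file names are hypothetical, and the matching is done with Python's standard fnmatch module rather than with yummy itself.

```python
from fnmatch import fnmatchcase

# Hypothetical contents of /home/jovyan/notebooks/dataset/
files = [
    "2021-12.parquet",
    "2022-01.parquet",
    "2022-02.parquet",
    "all_data.parquet",
]

# Same wildcard as in the "selected files" example above.
pattern = "2022-*.parquet"

# Keep only the names that match the shell-style pattern.
selected = [name for name in files if fnmatchcase(name, pattern)]
print(selected)  # ['2022-01.parquet', '2022-02.parquet']
```

Under that assumption, only the two 2022 files would be read, while the directory form would read all four.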
You can also read from:
- local filesystem
- s3 object store (you can use s3_endpoint_override to point to a custom S3-compatible store such as MinIO)
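As an illustration of the s3 case, the sketch below shows how a source pointing at a MinIO instance might be configured. Only ParquetSource, path, timestamp_field, and s3_endpoint_override come from this page. The bucket name, the object key, the s3:// URI form, and the endpoint address are assumptions made for the example, so adjust them to your setup.

```python
# Hypothetical settings: bucket, key, and endpoint are placeholders.
minio_source_kwargs = dict(
    path="s3://my-bucket/dataset/all_data.parquet",  # assumed s3:// URI form
    timestamp_field="datetime",
    s3_endpoint_override="http://localhost:9000",  # assumed local MinIO address
)


def build_minio_source():
    """Create the Parquet source from the settings above."""
    # Imported lazily so the settings can be inspected without yummy installed.
    from yummy import ParquetSource

    return ParquetSource(**minio_source_kwargs)
```

Calling build_minio_source() in an environment where yummy is installed should give a source equivalent to the local example, with reads directed to the custom endpoint instead of the default AWS one.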