Parquet
Example parquet datasource configuration:
from yummy import ParquetSource
my_stats_csv = ParquetSource(
path="/home/jovyan/notebooks/dataset/all_data.parquet",
timestamp_field="datetime",
)
additionally you can setup: s3_endpoint_override: str
You can read:
- single file:
path="/home/jovyan/notebooks/dataset/all_data.parquet" - directory:
path="/home/jovyan/notebooks/dataset/" - selected files:
path="/home/jovyan/notebooks/dataset/2022-*.parquet"
You can also read from:
local filesystems3store (you can uses3_endpoint_overrideto use custom s3 like minio