With DeltaSource you can read features directly from a Delta Lake table.

Example Delta data source configuration:

```python
from yummy import DeltaSource

# Parameter names below are illustrative; check the yummy docs for the exact signature.
my_stats_delta = DeltaSource(
    name="my_stats",
    path="dataset/all",
    timestamp_field="event_timestamp",
)
```

Additionally, you can set `s3_endpoint_override: str` to point the source at a custom S3-compatible endpoint.

You can also read from:

  • local filesystem
  • S3 store (you can use `s3_endpoint_override` to use a custom S3-compatible store such as MinIO)
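For example, here is a sketch of pointing DeltaSource at a MinIO bucket; apart from `s3_endpoint_override`, the parameter names, bucket, and endpoint are illustrative assumptions rather than the confirmed yummy signature:

```python
from yummy import DeltaSource

driver_stats = DeltaSource(
    name="driver_stats",                        # illustrative name
    path="s3://feast/driver_stats",             # illustrative bucket/path
    s3_endpoint_override="http://minio:9000",   # custom S3-compatible endpoint (MinIO)
)
```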

DeltaSource is handled differently depending on the yummy backend. For the polars, dask, and ray backends, the Rust delta-rs implementation is used.

For Spark, the Spark Delta reader is used. To use it you will need to include additional Spark packages (e.g. `io.delta:delta-core`) and the Delta Lake Spark configuration.

For example, to run pyspark with the Jupyter driver:


```shell
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="lab --notebook-dir=/home/jovyan --ip='' --port=8888 --no-browser --allow-root --NotebookApp.password='' --NotebookApp.token=''"

# Pick the delta-core version matching your Spark and Scala versions.
pyspark \
    --packages io.delta:delta-core_2.12:2.1.0 \
    --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
    --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
    --conf "spark.driver.memory=5g" \
    --conf "spark.executor.memory=5g"
```
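Equivalently, the same settings can be applied programmatically when building the SparkSession; this is a sketch assuming the `delta-core` package version matches your Spark build, and the table path is illustrative:

```python
from pyspark.sql import SparkSession

# The two spark.sql.* settings are the standard Delta Lake integration
# options; spark.jars.packages pulls the Delta connector at startup.
spark = (
    SparkSession.builder
    .appName("yummy-delta")
    .config("spark.jars.packages", "io.delta:delta-core_2.12:2.1.0")  # match your Spark version
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .config("spark.driver.memory", "5g")
    .getOrCreate()
)

# Read a Delta table through the Spark Delta reader (illustrative path).
df = spark.read.format("delta").load("dataset/all")
```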