Version: Next

Configuration and Config Merging

This chapter covers the pipeline configuration section and the rules for merging multiple Teckel documents.

Configuration

The config top-level section controls pipeline-wide settings.

Schema

config:
  backend: <string>
  cache:
    autoCacheThreshold: <integer>
    defaultStorageLevel: <string>
  notifications:
    onSuccess: <List[NotificationTarget]>
    onFailure: <List[NotificationTarget]>
  components:
    readers: <List[ComponentDef]>
    transformers: <List[ComponentDef]>
    writers: <List[ComponentDef]>

All config fields are optional. Implementations define their own defaults.

Backend Hint

Field	Type	Required	Default	Description
`backend`	string	No	impl-defined	Hint for the execution backend

The backend field suggests which engine should execute the pipeline. It is advisory: a runtime that supports only one backend may ignore it.

config:
  backend: spark       # or "datafusion", "duckdb", etc.

Cache Configuration

Field	Type	Required	Default	Description
`autoCacheThreshold`	integer	No	impl-defined	Auto-cache assets consumed by N or more downstream assets
`defaultStorageLevel`	string	No	impl-defined	Default storage level for caching

When an intermediate asset is referenced by multiple downstream transformations, caching avoids redundant computation. The autoCacheThreshold tells the runtime to automatically cache any asset consumed by at least that many downstream assets.

config:
  cache:
    autoCacheThreshold: 2                   # cache if 2+ downstream consumers
    defaultStorageLevel: "MEMORY_AND_DISK"  # spill to disk if memory is full
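
As an illustration of the threshold rule, the following Python sketch picks the assets that would qualify for auto-caching. It is not taken from any Teckel runtime: the `upstreams` mapping (asset name to the assets it reads from) and the function name are hypothetical, and counting each downstream asset once even if it reads the same source twice is an assumption.

```python
from collections import Counter


def assets_to_auto_cache(upstreams: dict[str, list[str]], threshold: int) -> set[str]:
    """Return assets consumed by at least `threshold` distinct downstream assets.

    `upstreams` is a hypothetical in-memory form of the pipeline graph:
    it maps each asset name to the names of the assets it reads from.
    """
    consumers = Counter()
    for sources in upstreams.values():
        # Assumption: a downstream asset that reads the same source twice
        # (for example a self-join) counts as a single consumer.
        for source in set(sources):
            consumers[source] += 1
    return {name for name, count in consumers.items() if count >= threshold}


# Hypothetical pipeline: "cleaned" feeds two downstream assets.
graph = {
    "orders": [],
    "cleaned": ["orders"],
    "daily_totals": ["cleaned"],
    "top_customers": ["cleaned"],
    "report": ["daily_totals", "top_customers"],
}

print(assets_to_auto_cache(graph, threshold=2))  # {'cleaned'}
```

With `autoCacheThreshold: 2`, only `cleaned` is selected, since every other asset has at most one downstream consumer.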

Notifications

Notification targets are triggered when the pipeline succeeds (onSuccess) or fails (onFailure). Each NotificationTarget has the following fields:

Field	Type	Required	Default	Description
`channel`	string	Yes	—	`"log"`, `"webhook"`, or `"file"`
`url`	string	Conditional	—	Required for `"webhook"`
`path`	string	Conditional	—	Required for `"file"`

config:
  notifications:
    onSuccess:
      - channel: log
      - channel: webhook
        url: "https://hooks.slack.com/services/T00/B00/xxx"
    onFailure:
      - channel: webhook
        url: "https://hooks.slack.com/services/T00/B00/xxx"
      - channel: file
        path: "logs/failures.log"
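
The conditional requirements in the table above can be checked mechanically. The Python sketch below is one possible validator, assuming a notification target has already been parsed into a dict; the function name and the error message wording are hypothetical and not part of Teckel.

```python
def validate_notification_target(target: dict) -> list[str]:
    """Return a list of problems found in one parsed NotificationTarget."""
    # Field that each channel requires, per the table above (None = no extra field).
    required_by_channel = {"log": None, "webhook": "url", "file": "path"}

    channel = target.get("channel")
    if channel not in required_by_channel:
        return [f"unknown or missing channel: {channel!r}"]

    needed = required_by_channel[channel]
    if needed is not None and not target.get(needed):
        return [f"channel {channel!r} requires {needed!r}"]
    return []


print(validate_notification_target({"channel": "log"}))      # []
print(validate_notification_target({"channel": "webhook"}))  # ["channel 'webhook' requires 'url'"]
```

Returning a list of messages rather than raising lets a caller collect problems across all onSuccess and onFailure entries before reporting them.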

Components

The components section registers custom readers, transformers, and writers that can be referenced by name from the custom transformation. Each entry pairs a name with the class that implements it.

config:
  components:
    readers:
      - name: myCustomReader
        class: "com.example.CustomReader"
    transformers:
      - name: deduplicate
        class: "com.example.DeduplicateTransformer"
    writers:
      - name: deltaWriter
        class: "com.example.DeltaWriter"

Complete Configuration Example

version: "2.0"

config:
  backend: spark
  cache:
    autoCacheThreshold: 2
    defaultStorageLevel: "MEMORY_AND_DISK"
  notifications:
    onSuccess:
      - channel: log
    onFailure:
      - channel: webhook
        url: "https://alerts.example.com/pipeline-failure"
  components:
    transformers:
      - name: mlScore
        class: "com.example.MLScoringTransformer"

input:
  - name: data
    format: parquet
    path: "data/input/"

transformation:
  - name: scored
    custom:
      from: data
      component: mlScore
      options:
        model_path: "models/v3.2"

output:
  - name: scored
    format: parquet
    mode: overwrite
    path: "data/output/scored"
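
In the example above, the custom transformation refers to mlScore, which is registered under config.components.transformers. A minimal Python sketch of a consistency check for that link follows. It assumes the document is already parsed into a dict and that custom transformations resolve only against the transformers list; both the function name and that lookup rule are assumptions, not documented Teckel behavior.

```python
def unregistered_components(doc: dict) -> list[str]:
    """Names of `custom` transformations whose component is not registered.

    Assumption: a custom transformation is resolved against
    config.components.transformers only.
    """
    components = doc.get("config", {}).get("components", {})
    registered = {entry["name"] for entry in components.get("transformers", [])}

    missing = []
    for step in doc.get("transformation", []):
        custom = step.get("custom")
        if custom is not None and custom.get("component") not in registered:
            missing.append(step["name"])
    return missing


# Abbreviated dict form of the complete example above.
doc = {
    "config": {
        "components": {
            "transformers": [
                {"name": "mlScore", "class": "com.example.MLScoringTransformer"}
            ]
        }
    },
    "transformation": [
        {"name": "scored", "custom": {"from": "data", "component": "mlScore"}}
    ],
}

print(unregistered_components(doc))  # []
```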

Config Merging

Multiple Teckel documents can be merged into a single pipeline. This enables environment-specific overrides (for example, a base pipeline definition with a development overlay).

Merge Semantics

Value Type	Merge Behavior
Object (YAML mapping)	Deep merge: overlay fields override base fields recursively
Array (YAML sequence)	Replaced entirely by the overlay
Scalar	Replaced by the overlay

The key distinction is between objects and arrays. Objects are merged field by field (recursively), while arrays are replaced wholesale.

Merge Order

Files are merged left to right. The rightmost file has the highest precedence.

base.yaml + dev.yaml  -->  merged result

If three files are provided:

base.yaml + common.yaml + dev.yaml  -->  merged result

Each subsequent file overlays the accumulated result.
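
The merge semantics and the left-to-right order can be sketched together in Python. This is an illustration under stated assumptions, not the reference implementation: documents are taken to be already parsed into dicts and lists, the names `deep_merge` and `merge_documents` are hypothetical, and the handling of mismatched types (for example a mapping overlaid by a scalar) is an assumption because the rules above do not cover it.

```python
from copy import deepcopy
from functools import reduce


def deep_merge(base, overlay):
    """Overlay one parsed document (or sub-value) onto another."""
    if isinstance(base, dict) and isinstance(overlay, dict):
        merged = deepcopy(base)
        for key, value in overlay.items():
            # Recurse only where both sides define the key; otherwise take the overlay value.
            merged[key] = deep_merge(merged[key], value) if key in merged else deepcopy(value)
        return merged
    # Arrays and scalars are replaced wholesale by the overlay.
    # Assumption: mismatched types are replaced by the overlay as well.
    return deepcopy(overlay)


def merge_documents(*documents):
    """Fold documents left to right, so the rightmost has the highest precedence."""
    return reduce(deep_merge, documents, {})


# Hypothetical parsed contents of base.yaml, common.yaml, and dev.yaml.
base = {
    "config": {"cache": {"autoCacheThreshold": 3, "defaultStorageLevel": "MEMORY_AND_DISK"}},
    "input": [{"name": "source", "format": "parquet"}],
}
common = {"config": {"backend": "spark"}}
dev = {
    "config": {"cache": {"autoCacheThreshold": 2}},
    "input": [{"name": "source", "format": "csv"}],
}

merged = merge_documents(base, common, dev)
print(merged["config"])
print(merged["input"])
```

Here the config mappings are combined field by field (backend comes from the second document, autoCacheThreshold from the third, defaultStorageLevel from the first), while the input array from the last document replaces the earlier one entirely. Copying values keeps the input documents unmodified, which makes it safe to reuse a base document with several overlays.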

Example

base.yaml — the production pipeline definition:

version: "2.0"

config:
  cache:
    autoCacheThreshold: 3
    defaultStorageLevel: "MEMORY_AND_DISK"

input:
  - name: source
    format: parquet
    path: "${INPUT_PATH}"

output:
  - name: source
    format: parquet
    mode: overwrite
    path: "${OUTPUT_PATH}"

dev.yaml — the development overlay:

input:
  - name: source
    format: csv
    path: "data/local/input.csv"
    options:
      header: true

output:
  - name: source
    path: "data/local/output"

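Merged result:

The input and output arrays from dev.yaml completely replace those from base.yaml, because arrays are replaced rather than merged element by element. As a consequence, the merged output entry keeps only name and path: the format and mode fields from base.yaml are dropped, so an overlay must restate every field of an array element it overrides. The config section is preserved from base.yaml because dev.yaml does not declare one. If dev.yaml had a config key, it would be deep-merged with the base config object.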

The effective merged document is:

version: "2.0"

config:
  cache:
    autoCacheThreshold: 3
    defaultStorageLevel: "MEMORY_AND_DISK"

input:
  - name: source
    format: csv
    path: "data/local/input.csv"
    options:
      header: true

output:
  - name: source
    path: "data/local/output"

Deep Merge for Objects

When both base and overlay have the same object key, fields are merged recursively:

base.yaml:

config:
  cache:
    autoCacheThreshold: 3
    defaultStorageLevel: "MEMORY_AND_DISK"
  notifications:
    onFailure:
      - channel: webhook
        url: "https://alerts.example.com"

overlay.yaml:

config:
  cache:
    autoCacheThreshold: 2

Result:

config:
  cache:
    autoCacheThreshold: 2                   # overridden by overlay
    defaultStorageLevel: "MEMORY_AND_DISK"  # preserved from base
  notifications:                             # preserved from base
    onFailure:
      - channel: webhook
        url: "https://alerts.example.com"

The autoCacheThreshold scalar is replaced, defaultStorageLevel is preserved because the overlay does not mention it, and the entire notifications object is preserved.
