Version: 2.0

Transformations Overview

Teckel provides 31 transformation types for building data pipelines declaratively in YAML. Each transformation is an asset with a unique name and exactly one operation key.

Formal reference: Section 8 — Transformations in the Teckel Specification.

transformation:
  - name: myTransformation
    <operation_key>:
      ...

Every transformation entry must have exactly one operation key. Unless otherwise stated, transformations consume one upstream asset via a from field and produce one output dataset.
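
The one-operation-key rule can be checked mechanically. The following is a minimal Python sketch, not Teckel's own validator. It assumes each entry has already been parsed into a dict and that `name` is the only key that is not an operation key; the helper name `operation_key` is hypothetical. The key set is copied from the category tables on this page.

```python
# Operation keys listed in the category tables on this page (31 in total).
OPERATION_KEYS = frozenset({
    "select", "where", "distinct", "limit", "sample",
    "group", "orderBy", "rollup", "cube",
    "join",
    "union", "intersect", "except",
    "addColumns", "dropColumns", "renameColumns", "castColumns",
    "window",
    "pivot", "unpivot", "flatten", "conditional", "split",
    "sql", "scd2", "enrich", "schemaEnforce", "assertion",
    "repartition", "coalesce", "custom",
})


def operation_key(entry: dict) -> str:
    """Return the single operation key of a parsed transformation entry.

    Assumption: `name` is the only key that is not an operation key.
    """
    candidates = [key for key in entry if key != "name"]
    unknown = [key for key in candidates if key not in OPERATION_KEYS]
    if unknown:
        raise ValueError(f"unknown operation key(s): {unknown}")
    if len(candidates) != 1:
        raise ValueError(
            f"expected exactly one operation key, found {len(candidates)}"
        )
    return candidates[0]
```

For example, `operation_key({"name": "adults", "where": {}})` returns `"where"`, while an entry with no operation key, two operation keys, or an unlisted key raises `ValueError`.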


Transformation Categories

Filtering and Row Selection

| YAML Key | Section | Description |
| --- | --- | --- |
| select | 8.1 | Project specific columns or expressions |
| where | 8.2 | Filter rows by a boolean condition |
| distinct | 8.9 | Remove duplicate rows |
| limit | 8.10 | Return at most N rows |
| sample | 8.19 | Return a random sample of rows |

Aggregation and Sorting

| YAML Key | Section | Description |
| --- | --- | --- |
| group | 8.3 | Group rows and apply aggregate functions |
| orderBy | 8.4 | Sort rows by one or more columns |
| rollup | 8.23 | Hierarchical aggregation with subtotals |
| cube | 8.24 | Multi-dimensional aggregation (all combos) |

Joins

| YAML Key | Section | Description |
| --- | --- | --- |
| join | 8.5 | Join datasets (7 types including cross) |

Set Operations

| YAML Key | Section | Description |
| --- | --- | --- |
| union | 8.6 | Combine rows from multiple datasets |
| intersect | 8.7 | Rows present in all source datasets |
| except | 8.8 | Rows in the first source but not the second |

Column Operations

| YAML Key | Section | Description |
| --- | --- | --- |
| addColumns | 8.11 | Add computed columns |
| dropColumns | 8.12 | Remove columns |
| renameColumns | 8.13 | Rename columns via a mapping |
| castColumns | 8.14 | Change column data types |

Window Functions

| YAML Key | Section | Description |
| --- | --- | --- |
| window | 8.15 | Apply window functions over partitions |

Reshaping

| YAML Key | Section | Description |
| --- | --- | --- |
| pivot | 8.16 | Rotate rows into columns (long to wide) |
| unpivot | 8.17 | Rotate columns into rows (wide to long) |
| flatten | 8.18 | Flatten nested structures |
| conditional | 8.20 | Add a column with CASE WHEN logic |
| split | 8.21 | Split dataset into two based on condition |

Advanced

| YAML Key | Section | Description |
| --- | --- | --- |
| sql | 8.22 | Execute a raw SQL query |
| scd2 | 8.25 | Slowly Changing Dimension Type 2 |
| enrich | 8.26 | Enrich records via external HTTP API |
| schemaEnforce | 8.27 | Validate or evolve dataset schema |
| assertion | 8.28 | Validate data quality rules |
| repartition | 8.29 | Change partition count (with shuffle) |
| coalesce | 8.30 | Reduce partitions (no full shuffle) |
| custom | 8.31 | Invoke a user-registered component |

General Rules

  • Each transformation entry must have exactly one operation key.
  • Unknown operation keys produce an error.
  • The pipeline forms a directed acyclic graph (DAG): execution order is determined by data dependencies, not by the order of entries in the YAML file.
  • Asset names must be unique across all inputs and transformations.
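
The last two rules can be illustrated with a small Python sketch. It is not Teckel's scheduler: the function name `execution_order` and the input shape (a list of asset names in YAML order, plus a mapping from each asset to the assets it reads) are assumptions made for the example. It rejects duplicate names and derives an order from dependencies using the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def execution_order(names: list[str], upstream: dict[str, set[str]]) -> list[str]:
    """Order assets so that every asset comes after the assets it reads.

    names:    asset names (inputs and transformations) in YAML order.
    upstream: hypothetical mapping of asset name -> names it depends on.
    """
    duplicates = {n for n in names if names.count(n) > 1}
    if duplicates:
        raise ValueError(f"duplicate asset names: {sorted(duplicates)}")
    graph = {name: upstream.get(name, set()) for name in names}
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    return list(TopologicalSorter(graph).static_order())


# The YAML order lists the consumer first, but dependencies decide the order.
order = execution_order(
    ["report", "filtered", "orders"],
    {"report": {"filtered"}, "filtered": {"orders"}},
)
print(order)  # ['orders', 'filtered', 'report']
```

Because the example is a simple chain, only one valid order exists; with independent branches, any order that respects the dependencies would be acceptable.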