Data Ingestion
Data ingestion, in essence, involves transferring data from a source to a designated target.
Its primary aim is to usher data into an environment primed for staging, processing, analysis, and artificial intelligence/machine learning (AI/ML). While massive organizations may focus on moving data internally (among teams), for most of us, data ingestion emphasizes pulling data from external sources and directing it to in-house targets.
[Figure: ETL steps]
Data ingestion: then vs now
Old world: traditional ETL

- Extract → Transform → Load
- You heavily clean/transform the data before loading it into a warehouse (see the sketch below).
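To make the "transform before load" ordering concrete, here is a minimal ETL sketch. The CSV source (`users.csv`), the column names, and SQLite standing in for the warehouse are all illustrative assumptions, not a prescribed toolchain.

```python
# Minimal ETL sketch: transform happens BEFORE the load step.
# users.csv, the columns, and SQLite-as-warehouse are illustrative assumptions.
import csv
import sqlite3

def extract(path):
    """Read raw rows from a CSV source file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Clean/shape rows up front: drop bad records, normalize types."""
    cleaned = []
    for row in rows:
        if not row.get("user_id"):          # drop records missing the key
            continue
        cleaned.append({
            "user_id": int(row["user_id"]),
            "email": row["email"].strip().lower(),
        })
    return cleaned

def load(rows, db_path="warehouse.db"):
    """Load only the already-cleaned rows into the warehouse table."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS users (user_id INTEGER, email TEXT)")
    con.executemany("INSERT INTO users VALUES (:user_id, :email)", rows)
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract("users.csv")))   # E -> T -> L
```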
New world: mostly ELT

- Extract → Load everything into cheap cloud storage → Transform later
- Storage is cheap and compute is flexible, so people prefer "store first, think later" (see the sketch below).
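By contrast, an ELT sketch lands the raw file first and defers the cleanup to SQL running inside the engine. The local `landing/` directory standing in for cheap object storage and DuckDB standing in for the warehouse/lakehouse are assumptions made purely for illustration.

```python
# Minimal ELT sketch: land the raw file as-is, transform later inside the engine.
# The landing/ directory (stand-in for object storage) and DuckDB (stand-in for a
# warehouse/lakehouse) are illustrative assumptions.
import os
import shutil
import duckdb

def load_raw(source_path, landing_dir="landing"):
    """Step 1 (E + L): copy the raw file untouched into the landing zone; no cleaning yet."""
    os.makedirs(landing_dir, exist_ok=True)
    landing_path = os.path.join(landing_dir, "users_raw.csv")
    shutil.copy(source_path, landing_path)
    return landing_path

def transform_later(landing_path):
    """Step 2 (T): transform on demand with SQL, after the raw data is already stored."""
    con = duckdb.connect("lakehouse.duckdb")
    con.execute(f"""
        CREATE OR REPLACE TABLE users_clean AS
        SELECT CAST(user_id AS INTEGER) AS user_id,
               lower(trim(email))       AS email
        FROM read_csv_auto('{landing_path}')
        WHERE user_id IS NOT NULL
    """)
    con.close()

if __name__ == "__main__":
    transform_later(load_raw("users.csv"))   # Extract → Load raw → Transform later
```

The point of the sketch is the ordering: the raw copy is stored before anyone decides how to clean it, so the transform can be rerun or rewritten later without re-extracting from the source.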
Big trends that changed ingestion:

- Cloud + warehouses + lakehouses
- Streaming/real-time data, not just nightly batches
So: we still do “ETL”, but the order + tools changed.