What makes a good ETL pipeline vs. a bad one?
Sigiloso
Great question. Whether an ETL pipeline is good or bad depends on the answers to several questions: -> What are the data sources? -> Is it batch or real-time processing? -> Where will the processed data be stored (a data lake or a warehouse)? -> Is the pipeline decoupled? -> Is the pipeline scalable, reliable, performant, and so on? Getting these answers right is only part of it: consistently monitoring the pipeline, defining SLAs, and identifying and optimizing bottlenecks is just as crucial. Together, these determine whether a pipeline is good or bad.
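To make the decoupling and monitoring points a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the stage functions (`extract`, `transform`, `load`), the sample rows, and the helper names (`run_pipeline`, `find_bottleneck`, `sla_breached`) are all hypothetical, and a real pipeline would read from actual sources and write to a lake or warehouse. The idea is that each stage is a separate function that can be swapped out, each stage is timed, and the timings are checked against an SLA.

```python
import time
from dataclasses import dataclass


@dataclass
class StageResult:
    name: str       # stage name, e.g. "extract"
    seconds: float  # wall-clock time the stage took
    rows: int       # number of rows the stage produced


def run_pipeline(stages, data=None, clock=time.perf_counter):
    """Run decoupled stages in order, recording time and row count for each."""
    results = []
    for name, fn in stages:
        start = clock()
        data = fn(data)
        results.append(StageResult(name, clock() - start, len(data)))
    return data, results


def find_bottleneck(results):
    """Return the slowest stage, the first candidate for optimization."""
    return max(results, key=lambda r: r.seconds)


def sla_breached(results, sla_seconds):
    """True if the total run time exceeded the agreed SLA."""
    return sum(r.seconds for r in results) > sla_seconds


# Hypothetical stages: stand-ins for a real source, cleanup step, and sink.
def extract(_):
    return [
        {"id": 1, "amount": "10.5"},
        {"id": 2, "amount": "bad"},  # malformed record
        {"id": 3, "amount": "4"},
    ]


def transform(rows):
    clean = []
    for row in rows:
        try:
            clean.append({"id": row["id"], "amount": float(row["amount"])})
        except ValueError:
            continue  # drop rows that fail validation
    return clean


def load(rows):
    return rows  # placeholder for a write to a data lake or warehouse


if __name__ == "__main__":
    stages = [("extract", extract), ("transform", transform), ("load", load)]
    output, results = run_pipeline(stages)
    for r in results:
        print(f"{r.name}: {r.seconds:.6f}s, {r.rows} rows")
    print("bottleneck:", find_bottleneck(results).name)
    print("SLA breached:", sla_breached(results, sla_seconds=60.0))
```

Because the stages only agree on the shape of the data passed between them, any one of them can be replaced or scaled independently, and the per-stage timings show where to look first when the SLA is at risk.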