I’m currently a third-year student at the University of Virginia, and for my DS 3002 class we were tasked with creating an ETL data processing pipeline. ETL stands for Extract, Transform, and Load, which summarizes exactly what an ETL pipeline is responsible for: it runs a series of processes that extract data from a given source, transform it in some manner, and then load the output into some destination.
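To make the three stages concrete, here is a minimal sketch in Python. It is not the code from the project; the sample data and the function names (`extract`, `transform`, `load`, `run_pipeline`) are assumptions made up purely for illustration, and the "destination" is just a JSON string standing in for a real file or database.

```python
import csv
import io
import json

# Hypothetical sample data standing in for a real source file or API response.
RAW_CSV = "name,score\nAda,91\nLin,78\nSam,85\n"


def extract(source: str) -> list[dict]:
    """Extract: read rows from CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows: list[dict]) -> list[dict]:
    """Transform: convert scores to integers and keep rows scoring 80 or above."""
    converted = [{"name": r["name"], "score": int(r["score"])} for r in rows]
    return [r for r in converted if r["score"] >= 80]


def load(rows: list[dict]) -> str:
    """Load: serialize the rows as JSON (a stand-in for writing to a destination)."""
    return json.dumps(rows)


def run_pipeline(source: str) -> str:
    """Chain the three stages in order: extract, then transform, then load."""
    return load(transform(extract(source)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Keeping each stage in its own function means the source or destination can be swapped out later without touching the other two steps.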
Although I could have taken the typical route for creating an ETL pipeline (using a script), the first thing the instructions for the project…
Student at the University of Virginia