Start Data Engineering
@startdataeng
I write about data engineering | SQL | Python | Distributed systems. Get my free data engineering course at https://t.co/sZTEcV0Q9W
ID:1249353981106798595
https://www.startdataengineering.com/ 12-04-2020 15:11:13
1,4K Tweets
7,5K Followers
30 Following
An orchestration tool that I've been impressed with is Dagster. Easy setup, powerful features, and great docs.
Use 👇🏽 to play around with a pipeline on Dagster
startdataengineering.com/post/data-engi…
#data #dataengineering #Python #Database #DataAnalytics
Exercise project for anyone starting in data engineering startdataengineering.com/post/data-engi…
#dataengineering #bigdata #ETL #ApacheAirflow #AWS #ApacheSpark
When the data to process is larger than memory, try streaming it with Python generators before jumping to distributed systems!
#data #dataengineering #Python #pythonlearning #Generator
E.g., stream a file (note the () and not [], which gives a lazy generator instead of an in-memory list) and get the difference between two date columns
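A minimal sketch of that idea, not the original snippet from the tweet. The column names (start_date, end_date), the date format, and the helper names days_between and stream_date_diffs are all assumptions made up for illustration.

```python
import csv
from datetime import datetime
from typing import Iterable, Iterator

DATE_FORMAT = "%Y-%m-%d"  # assumed date format, adjust to your data


def days_between(
    rows: Iterable[dict],
    start_col: str = "start_date",  # hypothetical column name
    end_col: str = "end_date",      # hypothetical column name
) -> Iterator[int]:
    """Lazily yield the day difference between two date columns."""
    # Parentheses make this a generator expression: one row is processed
    # at a time. Square brackets would build the whole list in memory.
    return (
        (
            datetime.strptime(row[end_col], DATE_FORMAT)
            - datetime.strptime(row[start_col], DATE_FORMAT)
        ).days
        for row in rows
    )


def stream_date_diffs(path: str) -> Iterator[int]:
    """Stream a CSV file from disk and yield one date diff per row."""
    with open(path, newline="") as f:
        # csv.DictReader reads lazily too, so the file is never fully loaded.
        yield from days_between(csv.DictReader(f))


if __name__ == "__main__":
    # In-memory stand-in for a file; csv accepts any iterable of lines.
    sample = ["start_date,end_date", "2020-01-01,2020-01-11"]
    for diff in days_between(csv.DictReader(sample)):
        print(diff)
```

For a real file, loop over stream_date_diffs("your_file.csv") the same way. Memory use stays roughly constant no matter how large the file is, because only one row is held at a time.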