Sam Gupta (@databysam)'s Twitter Profile
Sam Gupta

@databysam

Lots about data engineering and cloud solution architecture, and everything about being a part of the AI revolution!

ID: 1794271617419255808

25-05-2024 07:37:51

23 Tweets

2 Followers

22 Following

There's been a major pivot in the data analytics world recently. With both SnowflakeDB and Databricks making significant investments in open source formats like Iceberg and Delta, more companies will happily embrace vendor-neutral formats. #dataengineering #solutionarchitecture

Most data teams are siloed from business teams - how do you protect yourself as a data engineer? 1. Demand the business purpose of your pipelines and talk to consumers. 2. Profile your data and add data quality checks at the start. 3. Parameterize your variables. #dataengineering
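
Points 2 and 3 could look something like the rough sketch below. It is plain Python rather than any particular data quality library, and every name in it (`QualityParams`, `profile`, `check_quality`, the `order_id` and `amount` columns, the 10% threshold) is a hypothetical placeholder, not something from a real pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityParams:
    # Hypothetical parameters: passed in per pipeline instead of hard-coded.
    required_columns: tuple = ("order_id", "amount")
    max_null_fraction: float = 0.1


def profile(rows, columns):
    """Return the fraction of missing (None) values for each column."""
    total = len(rows)
    return {
        col: (sum(1 for r in rows if r.get(col) is None) / total if total else 0.0)
        for col in columns
    }


def check_quality(rows, params):
    """Return a list of failure messages; an empty list means the batch passes."""
    if not rows:
        return ["batch is empty"]
    failures = []
    for col, frac in profile(rows, params.required_columns).items():
        if frac > params.max_null_fraction:
            failures.append(
                f"{col}: {frac:.0%} nulls exceeds {params.max_null_fraction:.0%}"
            )
    return failures


# Run the check at the start of the pipeline, before any transformation.
sample = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 7.5},
    {"order_id": 4, "amount": 3.0},
]
failures = check_quality(sample, QualityParams())
```

Keeping the thresholds in one parameter object means the same check can run with different limits per environment or per dataset without touching the pipeline code.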

Are you looking to enter the world of data? Data Engineering is an amazing choice. Even with all the AI advancements, a crucial aspect will always be good data. And DE pipelines are the way you maintain and output good data. Start with SQL if you want to begin!

Not sure whether to learn SnowflakeDB or Databricks as a Data Engineer? Short answer is - it doesn’t matter, learn either. When you’re talking about day-to-day work, 2 skills are core: 1. SQL 2. Knowledge of how Distributed Systems work. That’s all. Anything else is a bonus.

Here's an easy visual to understand the execution model of Spark streaming. Think of input as an unbounded table & each processing run consumes only a small part, with specific schema and specific data. Everything else is identical to batch #dataengineering #solutionarchitecture
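
The "unbounded table" idea can be mimicked in a few lines of plain Python. This is only a toy simulation of the mental model, not the Spark API: `MicroBatchRunner`, `transform`, and `merge` are made-up names, and the list `input_table` stands in for the ever-growing input. Each trigger consumes only the rows that arrived since the last run, and the transformation is the same function you would apply in a batch job.

```python
def transform(rows):
    """Shared logic for batch and streaming: sum values per key."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def merge(state, delta):
    """Fold the result of one small run into the running result."""
    for key, value in delta.items():
        state[key] = state.get(key, 0) + value
    return state


class MicroBatchRunner:
    """Toy stand-in for a streaming engine (hypothetical, not Spark)."""

    def __init__(self):
        self.offset = 0   # how many input rows have been processed so far
        self.state = {}   # running result across all runs

    def trigger(self, input_table):
        # Consume only the rows appended since the previous trigger.
        new_rows = input_table[self.offset:]
        self.offset = len(input_table)
        merge(self.state, transform(new_rows))
        return new_rows


input_table = [("a", 1), ("b", 2)]        # the "unbounded table" so far
runner = MicroBatchRunner()
first = runner.trigger(input_table)        # processes both initial rows
input_table += [("a", 3)]                  # new data arrives
second = runner.trigger(input_table)       # processes only the new row
batch_result = transform(input_table)      # same logic run as one batch
```

After both triggers, the running state matches what a single batch run over the whole table produces, which is the point of the model.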