Mark Lyons (@mcl5tech)'s Twitter Profile
Mark Lyons

@mcl5tech

product @cloudera | prev product @aws @dremio @verticaunified • #data #analytics #design #tech for 🌍

ID: 907193988

Link: http://www.markclyons.com
Joined: 27-10-2012 02:22:36

1.1K Tweets

910 Followers

4.4K Following

Dremio (@dremio) 's Twitter Profile Photo

Subsurface LIVE is back! Coming in the Spring of 2023 🎉 We’re accepting proposals now for key topics. See details + submit your proposal now 🎤 lnkd.in/gMXtSTSJ #CallForSpeakers #Data #ApacheIceberg #DataLakehouse

Mark Lyons (@mcl5tech) 's Twitter Profile Photo

Always great to catch up with people who have depth in the data space and share stories, from academic papers to how companies have been created. Thanks, Juan Sequeda and Tim Gasper

Dremio (@dremio) 's Twitter Profile Photo

Don't miss your chance to take the stage at Subsurface LIVE, coming in the Spring of 2023 🎉 We’re accepting proposals now for key topics. See details + submit your proposal now 🎤 lnkd.in/gMXtSTSJ #CallForSpeakers #Data #ApacheIceberg #DataLakehouse

Dipankar Mazumdar (@dipankartnt) 's Twitter Profile Photo

Merge-On-Read (MOR) vs. Copy-On-Write (COW) in Apache Iceberg. Both approaches are used to handle deletes & updates of data files in the data lake. Let’s break it down @IcebergDevs 👇 #DataEngineering #data

Dremio (@dremio) 's Twitter Profile Photo

With all the recent news about #ApacheIceberg we thought we'd share this video from last year's Subsurface Conference. We're looking for speakers for our event happening in spring 2023 🎤 submit your talk today! sessionize.com/subsurface-liv… #CallForSpeakers

Dipankar Mazumdar (@dipankartnt) 's Twitter Profile Photo

How do we migrate from one catalog to another for Apache Iceberg tables? If you are already using a catalog (say HDFS) & want to change it to something else (say AWS Glue), how is that possible? A 🧵 for @IcebergDevs #dataengineering

Alex Merced | Open Data Lakehouse Advocate (@amdatalakehouse) 's Twitter Profile Photo

If you find what you see interesting, here is a tutorial I wrote with a step-by-step guide to getting set up and working through an example exercise -> dremio.com/blog/managing-…

Dremio (@dremio) 's Twitter Profile Photo

Are you heading to AWS re:Invent later this month? Check out this link for all the details on how you can: ➡️ Schedule a meeting with us ➡️ Enter our Dremio Cloud data challenge (for a chance to win a PS5!) ➡️ RSVP to our cocktail reception awsreinventdremio2022.splashthat.com #AWSreInvent

Dremio (@dremio) 's Twitter Profile Photo

We're thrilled to announce that we've been named to CNBC’s ‘Top Startups for the Enterprise’ Inaugural List 🎉 Read more about our open data lakehouse and this inaugural list here: bwnews.pr/3UehuIN #CNBC #TopStartup #Tech

Dipankar Mazumdar (@dipankartnt) 's Twitter Profile Photo

Manage data as code? Just like Git but for data? That's right! projectnessie is an open-source project that brings Git-like branching to the world of data & specifically to data lake table formats like #ApacheIceberg #dataengineering

Dipankar Mazumdar (@dipankartnt) 's Twitter Profile Photo

The ApacheArrow project has grown in all axes 🚀 In fact, more & more tools/libraries in the #dataanalytics space have started using Arrow. In this blog post, we go through the evolution of Apache Arrow from usage, capability & community angles. dremio.com/blog/apache-ar…

Dipankar Mazumdar (@dipankartnt) 's Twitter Profile Photo

Query planning in Apache Iceberg: Being able to plan queries efficiently is critical for faster execution of the queries run by analysts 🧑🏻‍💻 This is especially critical when dealing with large-scale data such as data in data lakes. Read @IcebergDevs 👇 #dataengineering

Alex Merced | Open Data Lakehouse Advocate (@amdatalakehouse) 's Twitter Profile Photo

Reminder, if you want to learn more about Apache Iceberg I have loads of resources plus a video series all curated in this article. -> dremio.com/subsurface/apa… #BigData #DataLake #DataLakehouse

Dipankar Mazumdar (@dipankartnt) 's Twitter Profile Photo

Join Dremio’s Tech Advocacy & Eng team for the very first installment of the Apache Iceberg Office Hours 📆 🚀 We will kick off with a brief presentation on Copy-on-Write vs. Merge-on-Read strategies, followed by Q&A on anything Iceberg-related. When: December 7th, 12 PM

Mim (@mim_djo) 's Twitter Profile Photo

TPCH-SF30; 180 million rows
#AZURE D16DS_V5; 16 cores, 64 GB RAM
#Databricks Photon: 41 seconds
#DuckDB: 43 seconds
Queried Parquet files from the VM SSD, no Azure storage involved
Databricks software cost (not hardware): 4.4 $/hour
github.com/djouallah/Test…
Mark Lyons (@mcl5tech) 's Twitter Profile Photo

Anyone looking for a new SA opportunity, DM me and I can intro you to Roger Frey! (Great team & Roger is fantastic!!) lnkd.in/edKZsu-b

Marc Brooker (@marcjbrooker) 's Twitter Profile Photo

Microsecond-accurate time is now available in EC2 US East. So many cool things this makes possible: aws.amazon.com/about-aws/what…