#ApacheDoris is a benchmarking machine 💪 I saw the Doris vs #Trino benchmark this week so I was curious and I found many more... If you want some 🍿 on a Fri afternoon, I linked the ones I found in 🧵below
#apachespark #elasticsearch #clickhouse #duckdb #apachepinot #starrocks
Retrieving all rows from a large dataset into memory can cause out-of-memory errors. An #ApacheSpark DataFrame delays computation until an action such as collect() is called, so you can reduce the number of rows first by filtering or aggregating.
This results in more efficient memory usage.
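A rough way to picture the idea, using plain Python generators as a stand-in rather than Spark itself: each step only describes work, and nothing runs until the final aggregation asks for a result. The names `read_rows` and `lazy_total` and the row layout are made up for illustration.

```python
from typing import Iterator


def read_rows(n: int) -> Iterator[dict]:
    """Yield rows one at a time (hypothetical data source, not a Spark API)."""
    for i in range(n):
        yield {"id": i, "amount": i % 100}


def lazy_total(n: int, min_amount: int) -> int:
    # These generator expressions only describe the pipeline; no row is read yet.
    rows = read_rows(n)
    filtered = (r for r in rows if r["amount"] >= min_amount)
    amounts = (r["amount"] for r in filtered)
    # sum() plays the role of the action: rows stream through one by one,
    # so the full dataset is never held in memory at once.
    return sum(amounts)


total = lazy_total(100_000, 90)
print(total)
```

In Spark the equivalent would be chaining a filter and an aggregation before calling collect(), so only the reduced result reaches the driver.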
🚀 New Blog Post: 'Building an Apache Spark Performance Lab: Tools and Techniques for Spark Optimization.' Get tips, tools, and short demos to boost your Spark performance. Ideal for developers! 🛠️ #ApacheSpark #Performance
Read more: db-blog.web.cern.ch/node/195
New🔥 Ep#11: Sarah Guo (Conviction) & Elad Gil talk to Matei Zaharia, co-founder of Databricks, creator of Apache Spark, and Stanford University CS professor:
- Dolly, betting on small models
- scaling asymptotes
- LLMs in the enterprise
- academic -> founder/CTO of $1B+ revenue co
🎙no-priors.com
Amazon revealed the data arch of their package delivery platform. Since working from home, I witness a steady stream of packages on my porch and I'm starting to wonder how many GBs of data my spouse has contributed to this dataset... 💸
#apachehudi #apachespark
🧵link below👇
Scaling AI/ML Infrastructure at Uber.
#apachekafka
#apacheflink
#apachespark
uber.com/en-IE/blog/sca…
#ApacheSpark 3.5 added new array helper functions that simplify the process of working with array data. Below are a few examples showcasing these new array functions.
🚀 View other array functions: bit.ly/4c0txD1
⭐️ Bookmark this post: bit.ly/3TnNCM3
New blog post: a deep dive into dataframes and table abstractions featuring polars data, DuckDB, pandas, dbt, Apache Spark, Dask, Ponder, Fugue Project, ... — when to use which framework and how do they compare or integrate with each other
Apache Spark vs. Jupyter: The Ultimate Data Science Battle!
tinyurl.com/bdwas6we
#ApacheSparkVsJupyter #BestDataScienceTool #DataScienceTool #ApacheSpark #JupyterNotebook #AINews #AnalyticsInsight #AnalyticsInsightMagazine
Early bird catches the worm 🐦
Save $400 by registering for the Databricks #DataAISummit before April 30! You’ll explore the latest advances in #ApacheSpark , #DeltaLake , #MLflow , #LangChain , #PyTorch , #dbt , and more! #DAIS sprou.tt/1OcnUCtPPN1
Found a low-cap DeSci project backed by PolygonDAO and partnered with TensorFlow, Apache Spark, Cerebras & many more
Sounds interesting?
Unlocking Insights with Databricks: Technavik Solutions Review.
To read our full review, click the link below:
linkedin.com/feed/update/ur…
#technavi_productshowcase #techcurator #DataScienceEngineering #DataAnalytics #ApacheSpark #CollaborativeWorking
Today’s online lecture of my #BigData class is on introducing #PySpark for data science #MachineLearning #orms #python #DataScience #dataanalytics #ApacheSpark #SQL
Big Data Visualization Tools
1) Apache Superset
2) Jupyter Notebook
3) Apache Zeppelin
4) Metabase
Course Link: buff.ly/3AkZjK5
#bigdata #apachespark #hadoop #programming #programmer #developer #code #codinglife #100DaysOfCode #100daysofcodechallenge #100DaysOfMLCode
The new #MicrosoftFabric runtime 1.2 is available
📣 Apache Spark 3.4.1
📣 Delta Lake 2.4.0
📣 R: 4.2.2
....
read more in the documentation: learn.microsoft.com/en-us/fabric/d…
Picture powered by DALL-E3 (chatGPT plus)
#ApacheSpark #PowerBI
🚀 Just dropped a fresh blog post! Dive into the world of Apache Spark optimization with flame graphs, featuring a hands-on example with Grafana Pyroscope. 🔥📈 🔗 db-blog.web.cern.ch/node/193
#ApacheSpark #FlameGraphs #Pyroscope