barisozmen.github.io

Spark Resources

Quick start

Spark 101 (14 min read) (blogpost)

Getting Started with Apache Spark 2.x (4 hour read) (e-book)

Advanced Data Science On Spark (2 hour watch) (Spark Summit Talk) (slides)

Longer reads

Mastering Apache Spark (e-book)

Tuning & Debugging

Configuration parameters (Spark docs)

7 tips to debug … (Databricks blog)

Spark UI (Databricks blog)

Practical optimization (Databricks training)

Spark: The Definitive Guide (book) (Ch. 18: Monitoring and Debugging) (Ch. 19: Performance Tuning)

Debugging and logging best practices (blogpost)

RDD, DataFrame, Dataset APIs

A tale of three Apache Spark APIs:… (Databricks blog) (Spark Summit video)
