Learning Spark Streaming: Best Practices for Scaling and Optimizing Apache Spark

To build analytics tools that deliver faster insights, you need to know how to process data in real time, which means moving from batch processing to stream processing. Fortunately, Spark's in-memory data processing framework includes an extension devoted to fault-tolerant stream processing: Spark Streaming.

If you're familiar with Apache Spark and want to learn how to implement it for streaming jobs, this practical book is a must.

  • Understand how Spark Streaming fits in the big picture
  • Learn core concepts such as Spark RDDs, Spark Streaming clusters, and the fundamentals of a DStream (a brief sketch follows this list)
  • Discover how to create a robust deployment
  • Dive into streaming algorithms
  • Learn how to tune, measure, and monitor Spark Streaming
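
To give a flavor of the DStream fundamentals mentioned above, here is a minimal Scala sketch of a Spark Streaming word count; it is not taken from the book, and the application name, host, and port are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingWordCount {
      def main(args: Array[String]): Unit = {
        // Local Spark context with a 5-second micro-batch interval
        val conf = new SparkConf().setAppName("StreamingWordCount").setMaster("local[2]")
        val ssc  = new StreamingContext(conf, Seconds(5))

        // A DStream is a sequence of RDDs, one per batch interval;
        // here each RDD holds the text lines received over a socket
        val lines  = ssc.socketTextStream("localhost", 9999)
        val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
        counts.print()

        ssc.start()             // begin receiving and processing data
        ssc.awaitTermination()  // block until the streaming job is stopped
      }
    }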

About the Author

François Garillot worked on Scala's type system in 2006, earned his PhD from the French École Polytechnique in 2011, and, after a brief stint in Internet advertising, worked at Typesafe. He worked on interactive interfaces to the Scala compiler while nurturing a strong enthusiasm for data analytics in his spare time, until Apache Spark let him fulfill this passion as his main job. He received the first Spark Certification in November 2014,...
