Clairvoyant Blog

Clairvoyant is a data and decision engineering company. We design, implement and operate data…

Productionalizing Spark Streaming Applications

Robert Sanders
Published in Clairvoyant Blog
15 min read · Feb 13, 2019

The Apache Spark project has become an essential tool in a Big Data Engineer’s toolkit. Its capabilities range from a highly performant batch processing engine to a near-real-time streaming engine.

Spark Streaming

At Clairvoyant, we’ve been working with clients who want to build highly performant, real-time Big Data systems for their business. Use cases have included alert engines, IoT data processing, and more. We’ve worked with several technologies in this space, including Apache NiFi, Apache Flume, and Apache Flink, but one of our favorites is Spark Streaming.

Spark Streaming is an extension of core Apache Spark that enables scalable, high-throughput, fault-tolerant processing of live data streams. Source data can come from any of the systems shown in the image below, among others.

Spark Streaming — source-link
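
To make the processing model concrete: Spark Streaming's DStream API treats a live stream as a sequence of small batches, one per batch interval, and runs ordinary batch logic on each. The sketch below illustrates that idea in plain Python, without Spark itself. The function names (`micro_batches`, `word_count`) and the sample events are hypothetical and exist only for illustration; this is not how Spark implements it internally.

```python
from collections import Counter, defaultdict


def micro_batches(events, batch_seconds):
    """Group (timestamp, line) events into fixed-width time batches.

    Hypothetical helper: a stand-in for the discretization a streaming
    engine performs, not a Spark API.
    """
    batches = defaultdict(list)
    for timestamp, line in events:
        # Align each event to the start of the interval it falls in.
        batch_start = int(timestamp // batch_seconds) * batch_seconds
        batches[batch_start].append(line)
    return [(start, batches[start]) for start in sorted(batches)]


def word_count(lines):
    """Ordinary batch logic, applied independently to every batch."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)


# Made-up sample events: (arrival time in seconds, payload).
events = [
    (0.2, "alert sensor"),
    (0.9, "sensor"),
    (1.4, "alert"),
    (2.7, "sensor sensor"),
]

# One result per 1-second batch interval.
results = [(start, word_count(lines)) for start, lines in micro_batches(events, 1)]
```

In a real application, the batch interval is set when the streaming context is created, and the per-batch logic runs distributed across the cluster rather than in a local loop.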
