Third part of article series is about Spark Streaming unit testing. This series pretend to fill the gap between code and documentation inside Spark Unit Testing domain. Spark has a huge Framework that allow to developers to test their code in any various cases.
There is one package
spark streaming. Tests are placed here.
Streaming Unit Testing
This is the base trait for Spark Streaming test suites. This provides basic functionality to run user-defined set of input on user-defined stream operations, and verify the output. Extends
Additional service classes:
This is a input stream just for the test suites. This is equivalent to a checkpointable, replayable, reliable message queue like Kafka. It requires a sequence as input, and returns the
ith element at the
ith batch under manual clock.
This is a output stream just for the testsuites. All the output is collected into a
ConcurrentLinkedQueue. This queue is wiped clean on being restored from checkpoint. The buffer contains a sequence of RDD’s, each containing a sequence of items.
This is a output stream just for the test suites. All the output is collected into
ConcurrentLinkedQueue. This queue is wiped clean on being restored from checkpoint. The queue contains a sequence of RDD’s, each containing a sequence of partitions, each containing a sequence of items.
An object that counts the number of started / completed batches. This is implemented using a
StreamingListener. Constructing a new instance automatically registers a
StreamingListener on the given
Manages a local
StreamingContext variable, correctly stopping it after each test. Note that it also stops active
stopSparkContext is set to
true (default). In most cases you may want to leave it, to isolate environment for
SparkContext in each test.
Apache Spark Unit Testing Part 1 — Core Components
This article is about how to use own Spark repository classes for Unit Testing and pretend to fill the gap between code…
Apache Spark Unit Testing Part 2 — Spark SQL
Second part of article series about how to use Spark repository classes for Unit Testing. Spark SQL package has four…