Spark

The Spark task type lets you run Apache Spark data processing workloads as part of Data Governor jobs.

Spark tasks currently run over a Livy connection. The task type supports all available Spark processing methods, including pre-compiled JAR files and PySpark-enabled Python scripts.
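For orientation, the sketch below shows the kind of request body that Apache Livy's batch endpoint (POST /batches) accepts for a JAR job and for a PySpark script. It illustrates Livy itself, not how Data Governor talks to Livy internally, which this page does not describe. The helper name `build_livy_batch`, the file paths, the class name, and the argument values are all hypothetical.

```python
import json


def build_livy_batch(file, class_name=None, args=None, conf=None):
    """Build a JSON-serialisable body for Livy's POST /batches endpoint.

    Only a few of Livy's documented batch fields are covered here:
    file, className, args and conf.
    """
    payload = {"file": file}
    if class_name is not None:
        # className is only needed for JAR jobs (the main class to run).
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    if conf:
        payload["conf"] = dict(conf)
    return payload


# Hypothetical pre-compiled JAR job.
jar_job = build_livy_batch(
    "hdfs:///jobs/example-etl.jar",
    class_name="com.example.EtlMain",
    args=["2024-01-01"],
)

# Hypothetical PySpark script: no main class, extra Spark configuration.
pyspark_job = build_livy_batch(
    "hdfs:///jobs/example_etl.py",
    args=["2024-01-01"],
    conf={"spark.executor.memory": "2g"},
)

print(json.dumps(jar_job, indent=2))
print(json.dumps(pyspark_job, indent=2))
```

The two payloads differ mainly in the `className` field, which reflects the difference between the JAR and PySpark processing methods mentioned above.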

Selecting the Spark Parameters

The Spark task is similar to the Databricks task, both in the type of work it can do and in how you use it. For Spark tasks, you use the task builder to select the parameters you wish to use, and then set values for those parameters when you sequence the task in the job.
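The two-step model described above (select parameters in the task builder, then set them when sequencing) can be pictured with a small sketch. This is a conceptual illustration only and not Data Governor's actual data model or API: the class `SparkTaskTemplate`, its `sequence` method, and the parameter names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SparkTaskTemplate:
    """Hypothetical stand-in for a task defined in the task builder."""

    name: str
    selected_parameters: frozenset  # parameter names chosen in the builder

    def sequence(self, values):
        """Return a job step with values set for the selected parameters.

        Assumption for this sketch: values may only be supplied for
        parameters that were selected when the task was built.
        """
        unknown = set(values) - self.selected_parameters
        if unknown:
            raise ValueError(f"Parameters not selected: {sorted(unknown)}")
        return {"task": self.name, "parameters": dict(values)}


# Step 1: select the parameters the task will expose (hypothetical names).
template = SparkTaskTemplate(
    name="nightly-etl",
    selected_parameters=frozenset({"file", "args"}),
)

# Step 2: set those parameters when the task is sequenced in a job.
step = template.sequence(
    {"file": "hdfs:///jobs/example_etl.py", "args": ["2024-01-01"]}
)
print(step)
```

Separating selection from value-setting means the same task definition can be sequenced in several jobs with different parameter values.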