Cluster Definition Concepts

What makes up a Cluster Definition?

Cluster Definitions are designed to be flexible and dynamic. Even so, every Cluster Definition shares the same common components, whatever its requirements.

Operating System/Software

Cluster Definitions specify the software to be installed on the cluster. This often determines the workload the cluster performs, as well as its capabilities.

Hosting Tier

The Hosting Tier for the cluster depends on the Cluster Definition type. It maps directly to the hosting tier offered by the cloud provider that hosts the cluster.

For example, an Azure Databricks instance requires an “Azure Virtual Machine Type” as its hosting tier, which is simply the Azure Virtual Machine series used to host the Databricks Cluster.

It is important to factor the Hosting Tier into your Cluster Definition, as it is the main driver of what the cluster costs to run.

Workers

Workers define the processing units your cluster has. If the Cluster Definition type supports auto-scaling, Data Governor can submit a minimum and maximum number of workers for the cluster to scale between.
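
To make the three components concrete, the sketch below models a Cluster Definition as a small Python data structure. It is illustrative only and is not Data Governor's API: the class names (ClusterDefinition, WorkerRange), the field names, and the "example-runtime" software value are all assumptions made for this example. It shows one way a minimum and maximum worker count might be validated, and how a requested worker count would be kept inside that range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerRange:
    """Hypothetical worker settings: the bounds a cluster may scale between."""
    min_workers: int
    max_workers: int

    def __post_init__(self):
        # Reject ranges that cannot describe a real cluster.
        if self.min_workers < 0:
            raise ValueError("min_workers must not be negative")
        if self.max_workers < self.min_workers:
            raise ValueError("max_workers must be >= min_workers")

    @property
    def autoscaling(self) -> bool:
        # A range with distinct bounds implies the cluster can scale.
        return self.max_workers > self.min_workers

    def clamp(self, requested: int) -> int:
        # Keep a requested worker count inside the allowed range.
        return max(self.min_workers, min(requested, self.max_workers))


@dataclass(frozen=True)
class ClusterDefinition:
    """Hypothetical container for the three common components."""
    software: str        # what gets installed on the cluster
    hosting_tier: str    # provider tier, e.g. an Azure VM size
    workers: WorkerRange


# Example values are placeholders, not recommendations.
definition = ClusterDefinition(
    software="example-runtime",
    hosting_tier="Standard_DS3_v2",
    workers=WorkerRange(min_workers=2, max_workers=8),
)
```

With this definition, definition.workers.clamp(20) yields 8 and definition.workers.clamp(1) yields 2, which mirrors the idea of the cluster scaling only between the submitted minimum and maximum.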