Azure Key Vault with Databricks

This blog will help you if you are looking for a managed big data platform on the Azure cloud. The Azure Databricks platform helps you spin up a Spark cluster without any management overhead, and it provides reporting as well.

We now have a Spark cluster, but the biggest issue is how to keep it secure when sensitive data such as credentials has to be passed in the codebase. Azure has a Key Vault service that stores sensitive data with versioning.

Let’s integrate Azure Key Vault with Databricks. First, we need to create a Key Vault.

After creating the Key Vault, we need to define a Secret Scope for it in Azure Databricks. To do that, open the URL below:

https://<databricks-instance>#secrets/createScope

After successful authentication, the URL with your own Databricks instance will look like this:

https://xxxxxxxxxxxx.x.azuredatabricks.net/#secrets/createScope

Define the Scope Name. We will use this scope name in the codebase.
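If you manage several workspaces, the URL pattern above can be built with a small helper. This is only a sketch: the function name create_scope_url and the sample hostname are hypothetical, and the only thing taken from the steps above is the #secrets/createScope suffix, which should be kept in exactly that casing.

```python
def create_scope_url(workspace_host: str) -> str:
    """Build the 'Create Secret Scope' page URL for a Databricks workspace host.

    workspace_host may be pasted with or without a scheme or trailing slash.
    """
    host = workspace_host.strip()
    # Drop a leading scheme if the host was copied from the browser address bar
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    # Drop any trailing slash so the fragment is appended exactly once
    host = host.rstrip("/")
    return f"https://{host}/#secrets/createScope"


if __name__ == "__main__":
    # Hypothetical workspace hostname, used only for illustration
    print(create_scope_url("adb-1234567890123456.7.azuredatabricks.net"))
```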

The form also asks for the Key Vault's DNS Name and Resource ID. Both are available on the Key Vault page in the Azure portal.
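To double-check the values you copy from the portal, it helps to know their usual shape. The sketch below assumes the Azure public cloud (other Azure clouds use a different DNS suffix); the function name and the sample subscription, resource group, and vault names are hypothetical.

```python
def key_vault_identifiers(subscription_id: str, resource_group: str, vault_name: str):
    """Return the (dns_name, resource_id) pair in the shape used by the Azure public cloud."""
    # DNS Name (also shown as "Vault URI" in the portal)
    dns_name = f"https://{vault_name}.vault.azure.net/"
    # Resource ID of the vault as an Azure Resource Manager path
    resource_id = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.KeyVault/vaults/{vault_name}"
    )
    return dns_name, resource_id


if __name__ == "__main__":
    # All three values are placeholders, not real resources
    dns, rid = key_vault_identifiers(
        "00000000-0000-0000-0000-000000000000", "my-rg", "my-vault"
    )
    print(dns)
    print(rid)
```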

Now that we have defined the Secret Scope and created the secrets in the Key Vault, let’s access them from Databricks.

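import mysql.connector

# Read the MySQL credentials from the Key Vault-backed secret scope
Username = dbutils.secrets.get(scope="scope_name", key="user")
Password = dbutils.secrets.get(scope="scope_name", key="password")

# Connect to the MySQL server with the retrieved credentials
mydb = mysql.connector.connect(
    host="mysql.xxxx.xxx",
    user=Username,
    password=Password
)

# List the databases available on the server
mycursor = mydb.cursor()
mycursor.execute("SHOW DATABASES")
for x in mycursor:
    print(x)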

In the above code, we use the dbutils.secrets.get() function to fetch the secret values from the Key Vault.
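One way to keep the secret lookups in a single place is to wrap them in a small function that receives the getter as an argument. This is a sketch rather than a required pattern: load_mysql_credentials is a hypothetical helper, and the scope name "scope_name" and the keys "user" and "password" are the placeholder names used in this post. Passing the getter in also lets you try the function outside Databricks, where dbutils is not defined.

```python
from typing import Callable, Dict


def load_mysql_credentials(
    get_secret: Callable[[str, str], str],
    scope: str = "scope_name",
) -> Dict[str, str]:
    """Fetch the MySQL user and password from a secret scope.

    get_secret is any callable taking (scope, key) and returning the secret value.
    """
    return {
        "user": get_secret(scope, "user"),
        "password": get_secret(scope, "password"),
    }


# In a Databricks notebook, where dbutils is predefined, the call could look like:
# creds = load_mysql_credentials(lambda s, k: dbutils.secrets.get(scope=s, key=k))

if __name__ == "__main__":
    # Stand-in for the secret store so the sketch runs anywhere; values are dummies
    fake_store = {("scope_name", "user"): "demo_user", ("scope_name", "password"): "demo_pass"}
    creds = load_mysql_credentials(lambda s, k: fake_store[(s, k)])
    print(sorted(creds.keys()))
```

Note that Databricks redacts secret values when you try to print them in notebook output, so check that a lookup worked by using the value rather than by displaying it.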

DevOps Engineer with 10+ years of experience in the IT Industry. In-depth experience in building highly complex, scalable, secure and distributed systems.