Creating a mount point in Databricks allows you to access external storage (like Azure Blob Storage, ADLS, or AWS S3) as if it were part of your Databricks file system (DBFS). Here’s a step-by-step guide on how to create a mount point in Databricks:
For Azure Data Lake Storage Gen2 or Blob Storage
1. Get your storage account information and credentials
You will need:
- Storage account name
- Container name
- Access key, SAS token, or OAuth credentials
2. Mount the storage using Databricks CLI or notebook code
You typically do this in a Databricks notebook with the dbutils.fs.mount() function.
Here is an example mounting an Azure Blob Storage container using an access key:
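A minimal sketch of an access-key mount follows. The placeholder names (storage account, container, secret scope, mount name) are assumptions to replace with your own values; storing the key in a Databricks secret scope rather than pasting it into the notebook is the recommended practice.

```python
# Hypothetical placeholder values -- substitute your own.
storage_account = "<storage-account-name>"
container = "<container-name>"

# Fetch the access key from a Databricks secret scope instead of hard-coding it.
access_key = dbutils.secrets.get(scope="<scope-name>", key="<access-key-name>")

dbutils.fs.mount(
    source = f"wasbs://{container}@{storage_account}.blob.core.windows.net",
    mount_point = "/mnt/<mount-name>",
    extra_configs = {
        f"fs.azure.account.key.{storage_account}.blob.core.windows.net": access_key
    }
)
```

The wasbs:// scheme targets Blob Storage; for ADLS Gen2 you would instead use the abfss:// scheme shown in the OAuth example below.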
For Azure Data Lake Storage Gen2 with OAuth (more secure)
For this approach you register a service principal (an Azure AD application), grant it access to the storage account (for example, the Storage Blob Data Contributor role), and mount using OAuth client credentials.
Example code snippet:
```python
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>", key="<secret-key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-id>/oauth2/token"
}

dbutils.fs.mount(
    source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/",
    mount_point = "/mnt/<mount-name>",
    extra_configs = configs
)
```
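Once the mount succeeds, the container's contents appear under the mount path in DBFS. A short sketch of how you might verify and later remove a mount, using the standard dbutils.fs calls (the mount name is a placeholder):

```python
# List the contents of the new mount to confirm it works.
display(dbutils.fs.ls("/mnt/<mount-name>"))

# Show all mounts currently defined in the workspace.
display(dbutils.fs.mounts())

# Unmount when the mount is no longer needed.
dbutils.fs.unmount("/mnt/<mount-name>")
```

Mounts are workspace-wide: once created, they are visible to all clusters and users in the workspace, so unmount or restrict access accordingly.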