This page describes how to connect to an Azure Data Lake Storage Gen2 (ADLS Gen2) container as an external location. After you create this connection, you can use Unity Catalog to govern access to the data in that storage location.
To connect to an ADLS Gen2 container path, you need two Unity Catalog securable objects. The first is a storage credential, which specifies an Azure managed identity that allows access to the ADLS Gen2 container. You need this storage credential for the second required object: an external location, which defines the path to your ADLS Gen2 storage location and the credentials required to access that location.
Requirements
In Azure Databricks:
- Azure Databricks workspace enabled for Unity Catalog.
- CREATE STORAGE CREDENTIAL privilege on the Unity Catalog metastore attached to the workspace. Account admins and metastore admins have this privilege by default.
- CREATE EXTERNAL LOCATION privilege on both the Unity Catalog metastore and the storage credential referenced by the external location. Metastore admins and workspace admins have this privilege by default.
In your Azure account:
- An ADLS Gen2 storage account and container. To avoid egress charges, the storage account should be in the same region as the workspace from which you want to access the data.
- External location paths must contain only standard ASCII characters (letters A–Z and a–z, digits 0–9, and common symbols such as /, _, and -).
- Azure Data Lake Storage accounts that you use as external locations must have a hierarchical namespace.
- You cannot use Azure storage containers with immutability (WORM, write once read many) policies enabled as external locations. Unity Catalog requires DELETE permissions on storage containers, which immutability policies prevent. For more information about immutability policies, see Configure immutability policies for containers.
- If public network access is disabled on the storage account, you must enable the Allow Azure trusted services option to allow Azure Databricks to connect to the storage account. You can configure this setting using the Azure CLI:

# Check the current network rule set
az storage account show \
  --name <storage_account_name> \
  --resource-group <resource_group_name> \
  --query "networkRuleSet"

# Allow trusted Azure services to bypass the network rules
az storage account update \
  --name <storage_account_name> \
  --resource-group <resource_group_name> \
  --bypass AzureServices
- Permission to create an access connector for Azure Databricks in your Azure subscription. To do this, you must be a Contributor or Owner of an Azure resource group.
- Permission to modify the access policy for the storage account. To do this, you must be the owner or a user with the User Access Administrator Azure RBAC role on your storage account.
Create a storage credential that accesses ADLS Gen2
To create a storage credential for access to an ADLS Gen2 container, you first create an access connector for Azure Databricks with a managed identity, then grant the managed identity access to your storage account.
Step 1: Create an access connector for Azure Databricks
An access connector for Azure Databricks is a first-party Azure resource that lets you connect managed identities to an Azure Databricks account. Each access connector for Azure Databricks can include a system-assigned managed identity, one or more user-assigned managed identities, or both.
Log in to the Azure Portal as a Contributor or Owner of a resource group.
Click + Create or Create a resource.
Search for Access Connector for Azure Databricks and select it.
Click Create.
On the Basics tab, accept, select, or enter values for the following fields:
- Subscription: This is the Azure subscription that the access connector will be created in. The default is the Azure subscription you are currently using. It can be any subscription in the tenant.
- Resource group: This is the Azure resource group that the access connector will be created in.
- Name: Enter a name that indicates the purpose of the connector.
- Region: This should be the same region as the storage account that you will connect to.
Click Next, enter tags, and click Next.
On the Managed Identity tab, create the managed identities as follows:
- To use a system-assigned managed identity, set Status to On.
- To add user-assigned managed identities, click + Add and select one or more user-assigned managed identities.

Click Review + create.
Review your configuration settings, then click Create.
When the deployment is complete, click Go to resource.
Make a note of the Resource ID.
The resource ID is in the format:
/subscriptions/12f34567-8ace-9c10-111c-aea8eba12345c/resourceGroups/<resource-group>/providers/Microsoft.Databricks/accessConnectors/<connector-name>
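If you prefer to script this step, the Azure CLI offers an access-connector command group through its databricks extension. The following is a minimal sketch, assuming the extension is installed (az extension add --name databricks); all names are placeholders, and flag names can vary between extension versions:

# Create an access connector with a system-assigned managed identity
az databricks access-connector create \
  --resource-group <resource-group> \
  --name <connector-name> \
  --location <region> \
  --identity-type SystemAssigned

# Print the resource ID to use as the Access Connector ID in Step 5
az databricks access-connector show \
  --resource-group <resource-group> \
  --name <connector-name> \
  --query id --output tsv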
Step 2: Grant the managed identity access to the storage account
To grant the permissions in this step, you must have the Owner or User Access Administrator Azure RBAC role on your storage account.
You have the following options when you grant the managed identity access to the storage account and container:
- Grant read and write access to the entire storage account by assigning the Storage Blob Data Contributor role on the storage account.
- Grant more limited access by assigning the Storage Blob Delegator role on the storage account and the Storage Blob Data Contributor role on a specific container.
The instructions that follow assume that you are granting the Storage Blob Data Contributor role on the storage account, but you can substitute the other options as needed (a scripted alternative appears after these steps):
- Log in to your Azure Data Lake Storage account.
- Go to Access Control (IAM), click + Add, and select Add role assignment.
- Select the Storage Blob Data Contributor role and click Next.
- Under Assign access to, select Managed identity.
- Click + Select members, and select either Access connector for Azure Databricks or User-assigned managed identity.
- Search for your connector name or user-assigned identity, select it, and click Review and Assign.
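You can also script this role assignment. A minimal sketch with the Azure CLI, assuming a system-assigned managed identity and placeholder resource names (identity.principalId is where Azure exposes the system-assigned identity's object ID on the connector resource):

# Look up the object (principal) ID of the connector's system-assigned identity
PRINCIPAL_ID=$(az databricks access-connector show \
  --resource-group <resource-group> \
  --name <connector-name> \
  --query identity.principalId --output tsv)

# Assign Storage Blob Data Contributor on the storage account
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"

The same pattern applies to the Storage Queue Data Contributor role in the next step.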
Step 3: Grant the managed identity access to file events
Granting your managed identity access to file events allows Azure Databricks to subscribe to file event notifications emitted by cloud providers. This makes file processing more efficient. For more information, see Set up file events for an external location.
To grant the permissions in this step, you must have the Owner or User Access Administrator Azure RBAC role on your storage account.
- Log in to your Azure Data Lake Storage account.
- Go to Access Control (IAM), click + Add, and select Add role assignment.
- Select the Storage Queue Data Contributor role, and click Next.
- Under Assign access to, select Managed identity.
- Click + Select members, and select either Access connector for Azure Databricks or User-assigned managed identity.
- Search for your connector name or user-assigned identity, select it, and click Review and Assign.
Step 4: Grant Azure Databricks access to configure file events on your behalf
Note
This step is optional but highly recommended. If you do not grant Azure Databricks access to configure file events on your behalf, you must configure file events manually for each location, and you will also have limited access to critical features that Databricks might release in the future. For more information about file events, see Set up file events for an external location.
This step allows Azure Databricks to set up file events automatically. To grant the permissions in this step, you must have the Owner or User Access Administrator Azure RBAC roles on your managed identity and the resource group that your Azure Data Lake Storage account is in.
Follow the instructions in Step 3: Grant the managed identity access to file events and assign the Storage Account Contributor role to your managed identity.
This role does not replace the roles granted in steps 2 or 3 on this page, but is in addition to them.
Navigate to the Azure resource group that your Azure Data Lake Storage account is in.
Go to Access Control (IAM), click + Add, and select Add role assignment.
Select the EventGrid EventSubscription Contributor role and click Next.
Under Assign access to, select Managed identity.
Click + Select members, and select either Access connector for Azure Databricks or User-assigned managed identity.
Search for your connector name or user-assigned identity, select it, and click Review and Assign.
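These assignments can be scripted the same way as in Step 2. A sketch under the same assumptions (placeholder names, the PRINCIPAL_ID variable from the earlier snippet), covering both the Storage Account Contributor role noted above and the EventGrid role at resource-group scope:

# Assign Storage Account Contributor on the storage account
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Account Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"

# Assign EventGrid EventSubscription Contributor on the resource group
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "EventGrid EventSubscription Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"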
Step 5: Create the storage credential in Databricks
Now that you have created an access connector with a managed identity and granted it permissions to your storage account, you can create the storage credential in Azure Databricks.
Log in to your Unity Catalog-enabled Azure Databricks workspace as a user who has the CREATE STORAGE CREDENTIAL privilege on the metastore.
In the sidebar, click Catalog.
Click +, then click Create a credential.
Select a Credential Type of Azure Managed Identity.
Enter a Storage credential name and an optional comment.
Enter the Access Connector ID (the resource ID you noted in Step 1). The resource ID is in the format:
/subscriptions/12f34567-8ace-9c10-111c-aea8eba12345c/resourceGroups/<resource-group>/providers/Microsoft.Databricks/accessConnectors/<connector-name>
(Optional) If you created the access connector using a user-assigned managed identity, enter the Managed Identity ID (the resource ID of the managed identity).
(Optional) If you want users to have read-only access to the external locations that use this storage credential, click Advanced Options and select Limit to read-only use. For more information, see Mark a storage credential as read-only.
Click Create.
(Optional) Bind the storage credential to specific workspaces.
By default, any privileged user can use the storage credential on any workspace attached to the metastore. If you want to allow access only from specific workspaces, go to the Workspaces tab and assign workspaces. See Assign a storage credential to specific workspaces.
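If you script your Unity Catalog setup, you can also perform this step outside the UI. A minimal sketch using the unified Databricks CLI, assuming it is installed and authenticated to your workspace; the JSON body mirrors the fields above, all names are placeholders, and the exact command shape can vary between CLI versions (check databricks storage-credentials create --help):

databricks storage-credentials create --json '{
  "name": "adls_managed_identity_cred",
  "comment": "Credential for ADLS Gen2 external locations",
  "azure_managed_identity": {
    "access_connector_id": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/accessConnectors/<connector-name>"
  }
}'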
You can now create an external location that references your storage credential.
Create an external location for an ADLS Gen2 container
This section describes how to create an external location using either Catalog Explorer or SQL. It assumes that you already have a storage credential that allows access to your ADLS Gen2 container. If you don't have a storage credential, follow the steps in Create a storage credential that accesses ADLS Gen2.
Option 1: Create an external location manually using Catalog Explorer
You can create an external location manually using Catalog Explorer.
To create the external location:
Log in to a workspace that is attached to the metastore.
In the sidebar, click Catalog.
Click +, then click Create an external location.
Enter an External location name.
Under Storage type, select Azure.
Under URL, enter the ADLS Gen2 container path. For example, abfss://mycontainer@mystorageaccount.dfs.core.windows.net/<path>.
Under Storage credential, select the storage credential that grants access to the external location.
(Optional) If you want users to have read-only access to the external location, click Advanced Options and select Limit to read-only use. For more information, see Mark an external location as read-only.
(Optional) If the external location is intended for a Hive metastore federated catalog, click Advanced Options and enable Fallback mode.
(Optional) To enable the ability to subscribe to change notifications on the external location, click Advanced Options and select Enable file events.
For details, see Set up file events for an external location.
Click Create.
(Optional) Bind the external location to specific workspaces.
By default, any privileged user can use the external location on any workspace attached to the metastore. If you want to allow access only from specific workspaces, go to the Workspaces tab and assign workspaces. See Assign an external location to specific workspaces.
Go to the Permissions tab to grant permission to use the external location.
Before anyone can use the external location, you must grant permissions:
- To use the external location to add a managed storage location to a metastore, catalog, or schema, grant the CREATE MANAGED LOCATION privilege.
- To create external tables or volumes, grant CREATE EXTERNAL TABLE or CREATE EXTERNAL VOLUME.
Follow these steps:
- Click Grant.
- On the Grant on <external location> dialog, select users, groups, or service principals in the Principals field, and select the privilege you want to grant.
- Click Grant.
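You can also grant these privileges programmatically. A hedged sketch using the unified Databricks CLI, assuming placeholder names for the external location and principal; the JSON shape follows the Unity Catalog grants API, and the exact command shape can vary between CLI versions:

databricks grants update external-location <location-name> --json '{
  "changes": [
    {
      "principal": "data-engineers",
      "add": ["CREATE EXTERNAL TABLE", "CREATE EXTERNAL VOLUME"]
    }
  ]
}'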
Option 2: Create an external location using SQL
To create an external location using SQL, run the following command in a notebook or the SQL query editor. Replace the placeholder values. For required permissions and prerequisites, see Requirements.
- <location-name>: A name for the external location. If the location name includes special characters, such as hyphens (-), it must be surrounded by backticks (` `). See Names.
- <container-path>: The path in your cloud tenant that this external location grants access to. For example, abfss://mycontainer@mystorageaccount.dfs.core.windows.net/.
- <storage-credential-name>: The name of the storage credential that authorizes reading from and writing to the container. If the storage credential name includes special characters, such as hyphens (-), it must be surrounded by backticks (` `).
CREATE EXTERNAL LOCATION [IF NOT EXISTS] `<location-name>`
URL '<container-path>'
WITH ([STORAGE] CREDENTIAL `<storage-credential-name>`)
[COMMENT '<comment-string>'];
If you want to limit external location access to specific workspaces in your account, also known as workspace binding or external location isolation, see Assign an external location to specific workspaces.
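If you manage Unity Catalog objects from the command line rather than SQL, the unified Databricks CLI exposes the same API. A minimal sketch, assuming placeholder names and that the positional argument order matches your CLI version (check databricks external-locations create --help):

# Create the external location: name, URL, then storage credential name
databricks external-locations create <location-name> \
  'abfss://mycontainer@mystorageaccount.dfs.core.windows.net/' \
  <storage-credential-name> \
  --comment "ADLS Gen2 external location"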
Verify the connection
To verify that you've successfully created the external location, try to read a file from the external location. For example, suppose that you have an external location abfss://mycontainer@mystorageaccount.dfs.core.windows.net/ containing a CSV file named example.csv. To read from the abfss://mycontainer@mystorageaccount.dfs.core.windows.net/example.csv file, follow these steps:
In the sidebar, click Workspace.
Click Create, then select Notebook.
Run the following Python code snippet:
display(dbutils.fs.ls('abfss://mycontainer@mystorageaccount.dfs.core.windows.net/'))
This displays a list of file paths in the external location. In this example, the abfss://mycontainer@mystorageaccount.dfs.core.windows.net/example.csv file appears in the output.
To read a specific file in the external location, run the following Python code snippet:
spark.read.format("csv") \
  .option("header", "true") \
  .option("delimiter", ";") \
  .load('abfss://mycontainer@mystorageaccount.dfs.core.windows.net/example.csv') \
  .display()
This displays the data in the abfss://mycontainer@mystorageaccount.dfs.core.windows.net/example.csv file.
Next steps
- Grant other users permission to use external locations. See Manage external locations.
- Define managed storage locations using external locations. See Specify a managed storage location in Unity Catalog.
- Define external tables using external locations. See Work with external tables.
- Define external volumes using external locations. See What are Unity Catalog volumes?.