# Azure Blob Storage

<Note>
  First time creating a connector? [Read this first](/api-reference/workflow/connector-first-time-reqs).
</Note>

Ingest your files into Unstructured from Azure Blob Storage.

## Requirements

The following video shows how to fulfill the minimum set of Azure Storage account requirements:

<iframe width="560" height="315" src="https://www.youtube.com/embed/Vl3KCphlh9Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

<Note>
  If you are generating an SAS token as shown in the preceding video, be sure to set the following permissions:

  * **Read** and **List** for reading from the container only.
  * **Write** and **List** for writing to the container only.
  * **Read**, **Write**, and **List** for both reading from and writing to the container.
</Note>
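As a quick sanity check before you paste a token into a connector, the following minimal sketch inspects the `sp` (signed permissions) field of a SAS token and reports which of the permissions listed above are missing. It assumes the token is a standard SAS query string in which `r`, `w`, and `l` stand for Read, Write, and List. The function name and the `usage` labels are hypothetical and are not part of any Unstructured or Azure API.

```python
from urllib.parse import parse_qs

# Permission letters as they appear in the `sp` field of a SAS token.
REQUIRED_PERMISSIONS = {
    "source": {"r", "l"},        # Read and List
    "destination": {"w", "l"},   # Write and List
    "both": {"r", "w", "l"},     # Read, Write, and List
}


def missing_sas_permissions(sas_token: str, usage: str) -> set[str]:
    """Return the permission letters that `usage` needs but the token lacks."""
    fields = parse_qs(sas_token.lstrip("?"))
    granted = set(fields.get("sp", [""])[0])
    return REQUIRED_PERMISSIONS[usage] - granted


# Illustrative token with a fake signature; only the `sp` field is inspected.
example_token = "sv=2022-11-02&sp=rl&se=2030-01-01T00:00:00Z&sig=FAKE"
print(missing_sas_permissions(example_token, "source"))
print(missing_sas_permissions(example_token, "both"))
```

An empty set means the token carries every permission needed for that usage. The check does not verify the signature, the expiry time, or the resource scope.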

Here are some more details about these requirements:

* An Azure account. To create one, [learn how](https://azure.microsoft.com/pricing/purchase-options/azure-account).

  <iframe width="560" height="315" src="https://www.youtube.com/embed/2bQ6WiJ1ncA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

* An Azure Storage account, and a container within that account. [Create a storage account](https://learn.microsoft.com/azure/storage/common/storage-account-create). [Create a container](https://learn.microsoft.com/azure/storage/blobs/blob-containers-portal).

  <iframe width="560" height="315" src="https://www.youtube.com/embed/AhuNgBafmUo" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

  <iframe width="560" height="315" src="https://www.youtube.com/embed/xmndjYnGvcs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

* The Azure Storage remote URL, using the format `az://<container-name>/<path/to/file/or/folder/in/container/as/needed>`

  For example, if your container is named `my-container`, and there is a folder in the container named `my-folder`, the
  Azure Storage remote URL would be `az://my-container/my-folder/`.

* An SAS token (recommended), access key, or connection string for the Azure Storage account. [Create an SAS token (recommended)](https://learn.microsoft.com/azure/ai-services/translator/document-translation/how-to-guides/create-sas-tokens). [Get an access key](https://learn.microsoft.com/azure/storage/common/storage-account-keys-manage#view-account-access-keys). [Get a connection string](https://learn.microsoft.com/azure/storage/common/storage-configure-connection-string#configure-a-connection-string-for-an-azure-storage-account).

  Create an SAS token (recommended):

  <iframe width="560" height="315" src="https://www.youtube.com/embed/X6cmJ2IbVzo?start=240&end=370" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

  Get an access key or connection string:

  <iframe width="560" height="315" src="https://www.youtube.com/embed/muMmcwVfFqs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />
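The remote URL format described in the list above can be assembled programmatically. The following is a minimal sketch, not part of the Unstructured SDK: `build_remote_url` is a hypothetical helper that joins a container name and an optional path into the `az://<container-name>/<path>` form.

```python
def build_remote_url(container: str, path: str = "") -> str:
    """Join a container name and an optional path into an az:// URL."""
    container = container.strip("/")
    if not container:
        raise ValueError("container name must not be empty")
    # Drop leading slashes so the result never contains "//" after the container.
    return f"az://{container}/{path.lstrip('/')}"


print(build_remote_url("my-container", "my-folder/"))
```

With the container and folder names from the example above, this prints `az://my-container/my-folder/`.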

## Examples

To create an Azure Blob Storage source connector, see the following examples.

For more information on working with source connectors using the Unstructured API, see [Source endpoints](/api-reference/api/source/source-apis).

<CodeGroup>
  ```python Python SDK theme={null}
  import os

  from unstructured_client import UnstructuredClient
  from unstructured_client.models.operations import CreateSourceRequest
  from unstructured_client.models.shared import CreateSourceConnector

  with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as client:
      response = client.sources.create_source(
          request=CreateSourceRequest(
              create_source_connector=CreateSourceConnector(
                  name="<name>",
                  type="azure",
                  config={
                      "remote_url": "az://<container-name>/<path/to/file/or/folder>",
                      "recursive": <True|False>,

                      # For anonymous authentication, omit the following auth keys.
                      
                      # For SAS token authentication:
                      # "account_name": "<account-name>",
                      # "sas_token": "<sas-token>",

                      # For account key authentication:
                      # "account_name": "<account-name>",
                      # "account_key": "<account-key>",

                      # For connection string authentication:
                      # "connection_string": "<connection-string>"
                  }
              )
          )
      )

      print(response.source_connector_information)
  ```

  ```bash curl theme={null}
  curl --request 'POST' --location \
  "$UNSTRUCTURED_API_URL/sources" \
  --header 'accept: application/json' \
  --header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
  --header 'content-type: application/json' \
  --data \
  '{
      "name": "<name>",
      "type": "azure",
      "config": {
          "remote_url": "az://<container-name>/<path/to/file/or/folder>",
          "recursive": <true|false>,
      
          # For anonymous authentication, do not set any of the 
          # following fields.

          # For SAS token authentication:
          "account_name": "<account-name>",
          "sas_token": "<sas-token>"

          # For account key authentication:
          "account_name": "<account-name>",
          "account_key": "<account-key>"

          # For connection string authentication:
          "connection_string": "<connection-string>"
      }
  }'
  ```
</CodeGroup>

## Configuration settings

Replace the preceding placeholders as follows:

<ParamField body="name" type="string" required>
  A unique name for this connector.
</ParamField>

<ParamField body="remote_url" type="string" required>
  The Azure Storage remote URL, with the format `az://<container-name>/<path/to/file/or/folder/in/container/as/needed>`. For example, if your container is named `my-container`, and there is a folder in the container named `my-folder`, the Azure Storage remote URL would be `az://my-container/my-folder/`.
</ParamField>

<ParamField body="account_name" type="string">
  The Azure Storage account name. Required for SAS token authentication and account key authentication.
</ParamField>

<ParamField body="sas_token" type="string">
  The SAS token for the Azure Storage account. Required for SAS token authentication only.
</ParamField>

<ParamField body="account_key" type="string">
  The key for the Azure Storage account. Required for account key authentication only.
</ParamField>

<ParamField body="connection_string" type="string">
  The connection string for the Azure Storage account. Required for connection string authentication only.
</ParamField>

<ParamField body="recursive" type="boolean" default="false">
  Source connector only. Set to `true` to recursively access files from subfolders within the container.
</ParamField>
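The settings above combine differently depending on the authentication method. The following minimal sketch builds the `config` dictionary for one method at a time. It assumes that the three credentials are mutually exclusive, as the examples suggest, and that `account_name` accompanies `sas_token` or `account_key`. The helper `build_azure_source_config` is hypothetical and is not part of the Unstructured SDK.

```python
from typing import Optional


def build_azure_source_config(
    remote_url: str,
    recursive: bool = False,
    account_name: Optional[str] = None,
    sas_token: Optional[str] = None,
    account_key: Optional[str] = None,
    connection_string: Optional[str] = None,
) -> dict:
    """Build a connector config dict, allowing at most one credential type."""
    credentials = {
        "sas_token": sas_token,
        "account_key": account_key,
        "connection_string": connection_string,
    }
    chosen = {key: value for key, value in credentials.items() if value}
    if len(chosen) > 1:
        raise ValueError(f"set only one of: {', '.join(credentials)}")
    if ("sas_token" in chosen or "account_key" in chosen) and not account_name:
        raise ValueError("account_name is required with sas_token or account_key")

    config = {"remote_url": remote_url, "recursive": recursive}
    if account_name:
        config["account_name"] = account_name
    config.update(chosen)  # Empty for anonymous authentication.
    return config


config = build_azure_source_config(
    "az://my-container/my-folder/",
    recursive=True,
    account_name="<account-name>",
    sas_token="<sas-token>",
)
print(config)
```

The resulting dictionary can be passed as the `config` argument in the Python SDK example shown earlier.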

<h2 id="set-up-enterprise-connect-authentication">
  Set up Enterprise Connect authentication
</h2>

<Note>
  Enterprise Connect is available for [dedicated instance](/business/dedicated-instances/overview) customers only, and must be enabled on your instance before use. Contact your Unstructured account team or [Unstructured Support](https://support.unstructured.io/) to request access and have it enabled.
</Note>

Enterprise Connect is an authentication method for Azure connectors. It uses a federated identity credential to authenticate Unstructured as a customer-configured App Registration. During a workflow run, Unstructured uses this credential to receive a short-lived access token. Tokens expire automatically and no secrets are stored. For an overview, see [Enterprise Connect for Azure](/business/azure/enterprise-connect).

To configure an Azure Blob Storage connector to use Enterprise Connect, first complete the following setup in your Azure subscription:

1. Create an App Registration for Unstructured in Microsoft Entra ID.

   In your Azure subscription, follow the instructions in [How to register an app in Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app) in the Microsoft Entra documentation. Enter a meaningful name for your App Registration (for example, `unstructured-connector`). For **Supported account types**, select **Single tenant only**.

   You are registering this app for a third-party service (Unstructured) accessing resources in your own tenant. This is the [single-tenant scenario](https://learn.microsoft.com/en-us/entra/identity-platform/single-and-multi-tenant-apps) as defined by Microsoft.

2. Add a federated identity credential to the App Registration.

   Follow the instructions in [Configure an app to trust an external identity provider](https://learn.microsoft.com/en-us/entra/workload-id/workload-identity-federation-create-trust) in the Microsoft Entra documentation. Navigate to your App Registration, select **Certificates & secrets** in the left navigation pane, select the **Federated credentials** tab, and select **Add credential**.

   For **Federated credential scenario**, select **Other issuer**.

   Set the following values:

   | Field        | Value                                                                                                                                                                                                                                                                                                            |
   | ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
   | **Name**     | A unique name for this credential (for example, `unstructured-federated-credential`). This cannot be changed after creation.                                                                                                                                                                                     |
   | **Issuer**   | The OIDC issuer URL for your Unstructured instance. Get this value from your Unstructured account team. (Example: `https://oidc.prod-aks.example.com/...`)                                                                                                                                                       |
   | **Subject**  | The service account identity for your Unstructured instance. Get this value from your Unstructured account team. (Example: `system:serviceaccount:etl:etl-job-runner`) This value must exactly match what Unstructured provides. If it does not match, the token exchange will fail without displaying an error. |
   | **Audience** | Set this to `api://AzureADTokenExchange`.                                                                                                                                                                                                                                                                        |

   Your Unstructured instance may require more than one federated identity credential. The platform uses separate identities for different operations, such as connection testing and running workflows. If your account team provides more than one Subject value, repeat these steps for each one.

3. Add a role assignment to grant your App Registration access to your Azure Blob Storage account.

   See [Assign Azure roles using the Azure portal](https://learn.microsoft.com/en-us/azure/role-based-access-control/role-assignments-portal) in the Azure documentation. Use the following values:

   * **Scope**: the Azure Blob Storage account that contains the data you want the connector to access.
   * **Role**: select **Storage Blob Data Reader** for a source, or **Storage Blob Data Contributor** for a destination.
   * **Members**: select **User, group, or service principal**, then search for and select the App Registration you created in Step 1.

   When you reach the **Review + assign** tab, click **Review + assign** to complete the assignment.

4. Note the following values from your App Registration. You will need them when configuring the connector in Unstructured. Both values are available on the **Overview** page of your App Registration in the [Microsoft Entra admin center](https://entra.microsoft.com).

   * The **Tenant ID** (also called Directory ID) for your Azure subscription.
   * The **Client ID** of your App Registration.
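If you script the federated credential setup from Step 2 instead of using the portal, each credential can be described as a small JSON document. The following minimal sketch builds one definition per Subject value. It assumes the field names of the Microsoft Graph federated identity credential resource (`name`, `issuer`, `subject`, `audiences`). The helper name and the numbered naming scheme are hypothetical, and the issuer shown is a placeholder for the value your Unstructured account team provides.

```python
import json

AUDIENCE = "api://AzureADTokenExchange"


def federated_credential_bodies(
    issuer: str,
    subjects: list[str],
    name_prefix: str = "unstructured-federated-credential",
) -> list[dict]:
    """Build one credential definition per Subject value."""
    return [
        {
            "name": f"{name_prefix}-{index}",  # Cannot be changed after creation.
            "issuer": issuer,
            "subject": subject,  # Must exactly match the value Unstructured provides.
            "audiences": [AUDIENCE],
        }
        for index, subject in enumerate(subjects, start=1)
    ]


bodies = federated_credential_bodies(
    issuer="<issuer-url-from-your-account-team>",
    subjects=["system:serviceaccount:etl:etl-job-runner"],  # Example value only.
)
print(json.dumps(bodies, indent=2))
```

Each definition could then be supplied to whatever tooling you use to create the credential, for example the Azure CLI's `az ad app federated-credential create` command.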

Next, see the **Create the source connector with Enterprise Connect** section below for examples.

### Create the source connector with Enterprise Connect

The following examples create an Azure Blob Storage source connector using Enterprise Connect authentication.

For more information on working with source connectors using the Unstructured API, see [Source endpoints](/api-reference/api/source/source-apis).

<CodeGroup>
  ```python Python SDK theme={null}
  import os

  from unstructured_client import UnstructuredClient
  from unstructured_client.models.operations import CreateSourceRequest
  from unstructured_client.models.shared import CreateSourceConnector

  with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as client:
      response = client.sources.create_source(
          request=CreateSourceRequest(
              create_source_connector=CreateSourceConnector(
                  name="<name>",
                  type="azure",
                  config={
                      "remote_url": "az://<container-name>/<path/to/file/or/folder>",
                      "account_name": "<account-name>",
                      "tenant_id": "<tenant-id>",
                      "client_id": "<client-id>",
                      "recursive": <True|False>  # Boolean: True or False, no quotes
                  }
              )
          )
      )

      print(response.source_connector_information)
  ```

  ```bash curl theme={null}
  curl --request 'POST' --location \
  "$UNSTRUCTURED_API_URL/sources" \
  --header 'accept: application/json' \
  --header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
  --header 'content-type: application/json' \
  --data \
  '{
      "name": "<name>",
      "type": "azure",
      "config": {
          "remote_url": "az://<container-name>/<path/to/file/or/folder>",
          "account_name": "<account-name>",
          "tenant_id": "<tenant-id>",
          "client_id": "<client-id>",
          "recursive": <true|false>
      }
  }'
  ```
</CodeGroup>

Replace the preceding placeholders as follows:

<ParamField body="name" type="string" required>
  A unique name for this connector.
</ParamField>

<ParamField body="remote_url" type="string" required>
  The Azure Storage remote URL, with the format `az://<container-name>/<path/to/file/or/folder/in/container/as/needed>`. For example, if your container is named `my-container`, and there is a folder in the container named `my-folder`, the Azure Storage remote URL would be `az://my-container/my-folder/`.
</ParamField>

<ParamField body="account_name" type="string" required>
  The Azure Storage account name.
</ParamField>

<ParamField body="tenant_id" type="string" required>
  The Tenant ID (also called Directory ID) for your Azure subscription.
</ParamField>

<ParamField body="client_id" type="string" required>
  The Client ID of your App Registration.
</ParamField>

<ParamField body="recursive" type="boolean" default="false">
  Source connector only. Set to `true` to recursively access files from subfolders within the container.
</ParamField>
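Because a mistyped Tenant ID or Client ID is easy to miss, it can help to validate both values before creating the connector. The following minimal sketch builds the Enterprise Connect `config` dictionary and checks that both IDs parse as GUIDs, which is the form the Microsoft Entra admin center displays. The helper `build_enterprise_connect_config` is hypothetical and is not part of the Unstructured SDK.

```python
import uuid


def build_enterprise_connect_config(
    remote_url: str,
    account_name: str,
    tenant_id: str,
    client_id: str,
    recursive: bool = False,
) -> dict:
    """Build an Enterprise Connect config dict after checking the ID formats."""
    for label, value in (("tenant_id", tenant_id), ("client_id", client_id)):
        try:
            uuid.UUID(value)
        except ValueError as exc:
            raise ValueError(f"{label} does not look like a GUID: {value!r}") from exc
    return {
        "remote_url": remote_url,
        "account_name": account_name,
        "tenant_id": tenant_id,
        "client_id": client_id,
        "recursive": recursive,
    }


config = build_enterprise_connect_config(
    "az://my-container/my-folder/",
    account_name="<account-name>",
    tenant_id="00000000-0000-0000-0000-000000000000",  # Placeholder GUID.
    client_id="11111111-1111-1111-1111-111111111111",  # Placeholder GUID.
)
print(config)
```

The check only confirms the format of each ID. It does not confirm that the App Registration exists or that its role assignment is in place.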
