Introduction

Follow these steps to deploy the Unstructured API service into your Azure account.

1

Log in to the Azure Portal

2

Access Azure Marketplace

Go to the Unstructured Data Prepocessing - Customer Hosted API offering in the Azure Marketplace.

Azure Marketplace

3

Start the deployment process

  • Click Get It Now and fill out the form.

  • Read the terms and click Continue.

  • Click Create.

Deployment Process

4

Configure the deployment options

On the Create a virtual machine page, click the Basics tab and follow these steps:

  • Project details section

    • Select the Subscription and Resource group from the dropdown menus.

    • Or, you can create a new resource group by clicking Create new.

project details

  • Instance details section

    • Provide a name in the Virtual machine name field.

    • Select a Region from the dropdown menu.

    • Image: Select Unstructured Customer Hosted API Hourly - x64 Gen2 (default).

    • Size: Select a VM size from the dropdown menu. To learn more, see Azure VM comparisons.

instance details

  • Administrator account section

    • Authentication type: Select Password or SSH public key.

    • Enter the credential settings, depending on the authentication type.

administrator account

5

Set up the load balancer

Before you click Review + create, click the Networking tab and follow these steps:

  • Networking interface (required fields)

  • Load balancing section

    • Load balancing options: Select Azure load balancer.

    • Select a load balancer: Click Create a load balancer and fill out the following fields in the pop-up window, or select an existing load balancer from the dropdown menu:

      • Enter a Load balancer name.

      • Type: Select Public or Internal.

      • Protococl: Select TCP or UDP.

      • Port and Backend Port: Set both to 80.

load balancer

6

Finalize and deploy

  • Click Review + create.

  • Wait for validation.

  • Click Create.

deployment

7

Post-deployment steps

  • Go to the Virtual machine in the Azure console.

  • Get the Public IP address for the Load balancer.

  • The deployed endpoint is http://<load-balancer-public-IP-address>/general/v0/general

retrieve public ip

8

Verification and Testing

Perform API testing with curl. For example, run the following command, replacing:

  • <load-balancer-public-IP-address> with your load balancer’s public IP address.
  • <path/to/input-file> with the path on your local machine to a file to process. If you do not have any input files available, you can download any of the ones from the example-docs folder in GitHub.
  • <path/to/output-file> with the path on your local machine to the processed output in JSON format.
curl -q -X POST http://<load-balancer-public-IP-address>/general/v0/general \
     -H 'accept: application/json' \
     -H 'Content-Type: multipart/form-data' \
     -F files=@<path/to/input-file> \
     -o <path/to/output-file>.json

testing

Unstructured does not recommend POST to process multiple files at a time. Instead, use the Unstructured CLI or the Unstructured Python SDK with their provided source connectors and destination connectors.