You can use Amazon S3 events, such as adding new files to or updating existing files within S3 buckets, to automatically run Unstructured ETL+ workflows that rely on those buckets as sources. This gives you a no-touch way to have Unstructured process files as soon as they are added or updated. This example shows how to automate the process by adding an AWS Lambda function to your AWS account. The function runs whenever a new file is added to, or an existing file is updated within, the specified S3 bucket, and it then calls the Unstructured Workflow Endpoint to run the corresponding Unstructured ETL+ workflow in your Unstructured account.
This example uses a custom AWS Lambda function that you create and maintain. Any issues with file detection, timing, or function invocation are more likely to be related to your custom function than to Unstructured. If you are getting unexpected results or no results at all, check your custom function's Amazon CloudWatch logs first for informational and error messages.
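If you prefer to inspect those logs from a script rather than the CloudWatch console, the following is a minimal sketch that prints the function's recent log events with the AWS SDK for JavaScript (v3). The function name RunUnstructuredWorkflow (used later in this example) and the one-hour lookback window are assumptions; adjust them for your setup, and make sure your AWS credentials and region are configured in your environment.
    // view-logs.mjs: a minimal sketch that prints the Lambda function's recent CloudWatch log events.
    import { CloudWatchLogsClient, FilterLogEventsCommand } from '@aws-sdk/client-cloudwatch-logs';

    const logs = new CloudWatchLogsClient({});

    const response = await logs.send(new FilterLogEventsCommand({
      logGroupName: '/aws/lambda/RunUnstructuredWorkflow', // Lambda's default log group for the function
      startTime: Date.now() - 60 * 60 * 1000               // only events from the last hour
    }));

    for (const event of response.events ?? []) {
      console.log(new Date(event.timestamp).toISOString(), event.message.trim());
    }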

Requirements

To use this example, you will need the following:
  • An Unstructured account, and an Unstructured API key for your account, as follows:
    1. If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Starter account, at https://platform.unstructured.io.
      To sign up for a Team or Enterprise account instead, contact Unstructured Sales, or learn more.
    2. If you have an Unstructured Starter or Team account and are not already signed in, sign in to your account at https://platform.unstructured.io.
      For an Enterprise account, see your Unstructured account administrator for instructions, or email Unstructured Support at support@unstructured.io.
    3. Get your Unstructured API key:
      a. After you sign in to your Unstructured Starter account, click API Keys on the sidebar.
      For a Team or Enterprise account, before you click API Keys, make sure you have selected the organizational workspace you want to create an API key for. Each API key works with one and only one organizational workspace. Learn more.
      b. Click Generate API Key.
      c. Follow the on-screen instructions to finish generating the key.
      d. Click the Copy icon next to your new key to add the key to your system’s clipboard. If you lose this key, return to the API Keys page and click the Copy icon again.
  • The Unstructured Workflow Endpoint URL for your account, as follows:
    1. In the Unstructured UI, click API Keys on the sidebar.
    2. Note the value of the Unstructured Workflow Endpoint field.
  • An S3 source connector in your Unstructured account. Learn how.
  • Any available destination connector in your Unstructured account.
  • A workflow that uses the preceding source and destination connectors. Learn how.
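
Before you create the Lambda function, you can optionally confirm that your Unstructured Workflow Endpoint URL, API key, and workflow ID work together by running the workflow once from your own machine. The following is a minimal sketch that does this with Node.js 18 or later (which provides a built-in fetch); the endpoint and workflow ID shown are placeholders, and the API key is read from an environment variable.
    // run-workflow-once.mjs: a minimal sketch that runs an Unstructured workflow by its ID.
    const endpoint = 'https://platform.unstructuredapp.io/api/v1';   // your Unstructured Workflow Endpoint
    const workflowId = '11111111-1111-1111-1111-111111111111';       // your workflow ID
    const apiKey = process.env.UNSTRUCTURED_API_KEY;                 // your Unstructured API key

    const response = await fetch(`${endpoint}/workflows/${workflowId}/run`, {
      method: 'POST',
      headers: {
        'accept': 'application/json',
        'unstructured-api-key': apiKey
      }
    });

    console.log(response.status, await response.text());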

Step 1: Create the Lambda function

  1. Sign in to the AWS Management Console for your account.
  2. Browse to and open the Lambda console.
  3. On the sidebar, click Functions.
  4. Click Create function.
  5. Select Author from scratch.
  6. For Function name, enter a name for your function, such as RunUnstructuredWorkflow.
  7. For Runtime, select Node.js 22.x.
  8. For Architecture, select x86_64.
  9. Under Permissions, expand Change default execution role, and make sure Create a new role with basic Lambda permissions is selected.
  10. Click Create function. After the function is created, the function’s code and configuration settings page appears.
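
If you prefer to script this step instead of clicking through the console, the following is a minimal sketch that creates an equivalent function with the AWS SDK for JavaScript (v3). It assumes you have already zipped index.mjs (added in Step 2) into function.zip and that you supply an existing execution role ARN in place of the placeholder; the console steps above create that role for you automatically.
    // create-function.mjs: a minimal sketch of creating the same Lambda function programmatically.
    import { readFileSync } from 'node:fs';
    import { LambdaClient, CreateFunctionCommand } from '@aws-sdk/client-lambda';

    const lambda = new LambdaClient({});

    await lambda.send(new CreateFunctionCommand({
      FunctionName: 'RunUnstructuredWorkflow',
      Runtime: 'nodejs22.x',
      Architectures: ['x86_64'],
      Handler: 'index.handler',                                       // index.mjs exports "handler"
      Role: 'arn:aws:iam::111111111111:role/lambda-basic-execution',  // placeholder execution role ARN
      Code: { ZipFile: readFileSync('function.zip') }                 // zip archive containing index.mjs
    }));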

Step 2: Add code to the function

  1. With the function’s code and configuration settings page open from the previous step, click the Code tab.
  2. In the Code source tile, replace the contents of the index.mjs file with the following code. If the index.mjs file is not visible, do the following:
    1. Show the Explorer: on the sidebar, click Explorer.
    2. In the Explorer pane, expand the function name.
    3. Click to open the index.mjs file.
    Here is the code for the index.mjs file:
    import https from 'https';

    export const handler = async (event) => {
      // The run URL and API key come from the function's environment variables,
      // which you configure later in this step.
      const apiUrl = process.env.UNSTRUCTURED_API_URL;
      const apiKey = process.env.UNSTRUCTURED_API_KEY;

      if (!apiUrl || !apiKey) {
        throw new Error('Missing UNSTRUCTURED_API_URL or UNSTRUCTURED_API_KEY environment variable or both.');
      }

      const url = new URL(apiUrl);

      // Build the POST request to the workflow's run endpoint, authenticating with
      // the unstructured-api-key header.
      const options = {
        hostname: url.hostname,
        path: url.pathname,
        method: 'POST',
        headers: {
          'accept': 'application/json',
          'unstructured-api-key': apiKey
        }
      };

      // Wrap the https request in a Promise so that it can be awaited.
      const postRequest = () => new Promise((resolve, reject) => {
        const req = https.request(options, (res) => {
          let responseBody = '';
          res.on('data', (chunk) => { responseBody += chunk; });
          res.on('end', () => {
            resolve({ statusCode: res.statusCode, body: responseBody });
          });
        });
        req.on('error', reject);
        req.end();
      });

      try {
        // Log the endpoint's response so that it appears in the function's CloudWatch logs.
        const response = await postRequest();
        console.log(`POST status: ${response.statusCode}, body: ${response.body}`);
      } catch (error) {
        console.error('Error posting to endpoint:', error);
      }

      // Report success to the caller even if the POST failed; the POST's outcome is
      // visible only in the logs above.
      return {
        statusCode: 200,
        body: JSON.stringify('Lambda executed successfully')
      };
    };
    
  3. In Explorer, expand Deploy (Undeployed Changes).
  4. Click Deploy.
  5. Click the Configuration tab.
  6. On the sidebar, click Environment variables.
  7. Click Edit.
  8. Click Add environment variable.
  9. For Key, enter UNSTRUCTURED_API_URL.
  10. For Value, enter <unstructured-api-url>/workflows/<workflow-id>/run. Replace the following placeholders:
    • Replace <unstructured-api-url> with your Unstructured Workflow Endpoint value.
    • Replace <workflow-id> with the ID of your Unstructured workflow.
    The Value should now look similar to the following:
    https://platform.unstructuredapp.io/api/v1/workflows/11111111-1111-1111-1111-111111111111/run
    
  11. Click Add environment variable again.
  12. For Key, enter UNSTRUCTURED_API_KEY.
  13. For Value, enter your Unstructured API key value.
  14. Click Save.
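
With the code deployed and both environment variables saved, the function is ready to be wired up to S3. If you want to sanity-check the handler locally first, the following is a minimal sketch that sets the same two environment variables and invokes the handler directly with Node.js; the URL and key values shown are placeholders. Run it from the folder that contains index.mjs, for example with node test-handler.mjs.
    // test-handler.mjs: a minimal local sketch that invokes the handler the same way Lambda would.
    process.env.UNSTRUCTURED_API_URL = 'https://platform.unstructuredapp.io/api/v1/workflows/11111111-1111-1111-1111-111111111111/run';
    process.env.UNSTRUCTURED_API_KEY = 'your-unstructured-api-key';

    const { handler } = await import('./index.mjs');

    // The handler does not read the event payload, so an empty object is enough for a local test.
    console.log(await handler({}));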

Step 3: Create the function trigger

  1. Browse to and open the S3 console.
  2. Browse to and open the S3 bucket that corresponds to your S3 source connector. The bucket’s settings page appears.
  3. Click the Properties tab.
  4. In the Event notifications tile, click Create event notification.
  5. In the General configuration tile, enter a name for your event notification, such as UnstructuredWorkflowNotification.
  6. (Optional) For Prefix, enter any prefix to limit the Lambda function’s scope to only the specified prefix. For example, to limit the scope to only the input/ folder within the S3 bucket, enter input/.
    AWS does not recommend reading from and writing to the same S3 bucket, because doing so can accidentally trigger the Lambda function in a loop. However, if you must read from and write to the same S3 bucket, AWS strongly recommends specifying a Prefix value. Learn more.
  7. (Optional) For Suffix, enter any file extensions to limit the Lambda function’s scope to only the specified file extensions. For example, to limit the scope to only files with the .pdf extension, enter .pdf.
  8. In the Event types tile, check the box titled All object create events (s3:ObjectCreated:*).
  9. In the Destination tile, select Lambda function and Choose from your Lambda functions.
  10. In the Lambda function tile, select the Lambda function that you created earlier in Step 1.
  11. Click Save changes.
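
In this step, the console does two things for you: it grants Amazon S3 permission to invoke the function, and it writes the bucket's notification configuration. If you ever need to script the same trigger, the following is a minimal sketch of the equivalent notification configuration with the AWS SDK for JavaScript (v3). The bucket name, function ARN, prefix, and suffix are placeholders. Note that this call replaces the bucket's entire notification configuration, and that you would also need to grant S3 permission to invoke the function separately (for example, with the Lambda AddPermission API), which the console handles automatically.
    // create-trigger.mjs: a minimal sketch of configuring the same S3 event notification programmatically.
    import { S3Client, PutBucketNotificationConfigurationCommand } from '@aws-sdk/client-s3';

    const s3 = new S3Client({});

    await s3.send(new PutBucketNotificationConfigurationCommand({
      Bucket: 'your-source-bucket', // placeholder bucket name
      NotificationConfiguration: {
        LambdaFunctionConfigurations: [{
          Id: 'UnstructuredWorkflowNotification',
          LambdaFunctionArn: 'arn:aws:lambda:us-east-1:111111111111:function:RunUnstructuredWorkflow', // placeholder ARN
          Events: ['s3:ObjectCreated:*'],
          Filter: {
            Key: {
              FilterRules: [
                { Name: 'prefix', Value: 'input/' }, // optional: matches the Prefix setting above
                { Name: 'suffix', Value: '.pdf' }    // optional: matches the Suffix setting above
              ]
            }
          }
        }]
      }
    }));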

Step 4: Trigger the function

  1. With the S3 bucket’s settings page open from the previous step, click the Objects tab.
  2. If you specified a Prefix value earlier in Step 3, then click to open the folder that corresponds to your Prefix value.
  3. Click Upload, and then follow the on-screen instructions to upload a file to the bucket’s root. If, however, you opened the folder that corresponds to your Prefix value, upload the file to that folder instead.
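
Uploading through the console is just one way to trigger the function; any object creation that matches the notification's scope will do it. For example, the following is a minimal sketch that uploads a local PDF with the AWS SDK for JavaScript (v3); the bucket name, object key, and file path are placeholders, and the input/ prefix in the key applies only if you set a Prefix value in Step 3.
    // upload-test-file.mjs: a minimal sketch that uploads a file to the bucket to fire the trigger.
    import { readFileSync } from 'node:fs';
    import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

    const s3 = new S3Client({});

    await s3.send(new PutObjectCommand({
      Bucket: 'your-source-bucket',       // placeholder bucket name
      Key: 'input/sample.pdf',            // include your Prefix value, if you set one
      Body: readFileSync('./sample.pdf')  // placeholder local file
    }));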

Step 5: View the trigger results

  1. In the Unstructured user interface for your account, click Jobs on the sidebar.
  2. In the list of jobs, click the newly running job for your workflow.
  3. After the job status shows Finished, go to your destination location to see the results.
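
If you would rather check progress from a script than watch the Jobs list, the Unstructured Workflow Endpoint also exposes job information. The following sketch lists jobs for the workflow; the GET <endpoint>/jobs path and its workflow_id query parameter are assumptions here, so confirm them against the Unstructured API documentation for your account before relying on them.
    // list-jobs.mjs: a sketch that lists recent jobs for a workflow.
    // The jobs path and query parameter below are assumptions; verify them in the Unstructured API documentation.
    const endpoint = 'https://platform.unstructuredapp.io/api/v1';   // your Unstructured Workflow Endpoint
    const workflowId = '11111111-1111-1111-1111-111111111111';       // your workflow ID
    const apiKey = process.env.UNSTRUCTURED_API_KEY;                 // your Unstructured API key

    const response = await fetch(`${endpoint}/jobs?workflow_id=${workflowId}`, {
      headers: {
        'accept': 'application/json',
        'unstructured-api-key': apiKey
      }
    });

    // Print whatever job records come back, including each job's reported status.
    console.log(JSON.stringify(await response.json(), null, 2));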

Step 6 (Optional): Delete the trigger

  1. To stop the function from automatically being triggered whenever you add new files to—or update existing files within—the S3 bucket, browse to and open the S3 console.
  2. Browse to and open the bucket that corresponds to your S3 source connector. The bucket’s settings page appears.
  3. Click the Properties tab.
  4. In the Event notifications tile, check the box next to the name of the event notification that you added earlier in Step 3.
  5. Click Delete, and then click Confirm.
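
As with creating the trigger, you can also remove it from a script. The following is a minimal sketch, assuming the notification added in Step 3 is the only Lambda configuration you want to drop: it reads the bucket's current notification configuration, filters out the entry by its Id, and writes the result back. The bucket name and notification Id are placeholders that should match what you used in Step 3.
    // delete-trigger.mjs: a minimal sketch that removes the event notification added in Step 3.
    import {
      S3Client,
      GetBucketNotificationConfigurationCommand,
      PutBucketNotificationConfigurationCommand
    } from '@aws-sdk/client-s3';

    const s3 = new S3Client({});
    const bucket = 'your-source-bucket'; // placeholder bucket name

    // Read the bucket's current notification configuration.
    const current = await s3.send(new GetBucketNotificationConfigurationCommand({ Bucket: bucket }));

    // Keep everything except the notification created in Step 3, then write the result back.
    const remaining = (current.LambdaFunctionConfigurations ?? [])
      .filter((config) => config.Id !== 'UnstructuredWorkflowNotification');

    await s3.send(new PutBucketNotificationConfigurationCommand({
      Bucket: bucket,
      NotificationConfiguration: {
        TopicConfigurations: current.TopicConfigurations,
        QueueConfigurations: current.QueueConfigurations,
        EventBridgeConfiguration: current.EventBridgeConfiguration,
        LambdaFunctionConfigurations: remaining
      }
    }));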