Connect Notion to your preprocessing pipeline, and use unstructured-ingest to batch process all your documents and store the structured outputs locally on your filesystem.

First, install the Notion dependencies as shown here:

pip install "unstructured-ingest[notion]"

Make sure to provide notion-api-key. To get the credentials for your Notion workspace, follow the steps described in the Notion documentation.

Optionally, specify the following parameters:

  • page-ids: Notion page IDs to extract text from.
  • database-ids: Notion database IDs to extract text from.
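The parameters above can be combined into a single invocation. The following is a minimal sketch, not a definitive command. It assumes the connector is invoked as the notion subcommand of unstructured-ingest, that the parameters are passed as --notion-api-key, --page-ids, and --database-ids, that multiple IDs are comma-separated, and that --output-dir selects the local output folder. The ID values and the folder name are placeholders. The script only assembles and prints the command so that you can review it before running it.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration); replace with your own.
NOTION_API_KEY="${NOTION_API_KEY:-<notion-api-key>}"
PAGE_IDS="<page-id-1>,<page-id-2>"   # assumed to be a comma-separated list
DATABASE_IDS="<database-id>"
OUTPUT_DIR="notion-ingest-output"    # hypothetical local output folder

# Store the command in the positional parameters so that each value
# stays a single argument, even if it contains spaces.
set -- unstructured-ingest notion \
  --notion-api-key "$NOTION_API_KEY" \
  --page-ids "$PAGE_IDS" \
  --database-ids "$DATABASE_IDS" \
  --output-dir "$OUTPUT_DIR"

# Print the assembled command for review instead of executing it.
printf '%s\n' "$*"

# To execute it once the values are real, uncomment the next line:
# "$@"
```

Keeping the key in an environment variable rather than typing it inline avoids leaving it in your shell history.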

To partition your documents through the Unstructured API, set the --partition-by-api flag and pass in your Unstructured API key with --api-key.

If you’re using the Unstructured Serverless API, a locally deployed Unstructured API, or an Unstructured API deployed on Azure or AWS, you also need to specify the API URL via the --partition-endpoint argument.
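As a rough illustration of the API-related flags described above, the sketch below adds --partition-by-api and --api-key to a Notion run and appends --partition-endpoint only when a URL is configured. The environment variable names UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL, the notion subcommand, and the --output-dir value are assumptions made for this example. As written, the script prints the command rather than executing it.

```shell
#!/bin/sh
# Hypothetical environment variable names; adjust to your setup.
NOTION_API_KEY="${NOTION_API_KEY:-<notion-api-key>}"
UNSTRUCTURED_API_KEY="${UNSTRUCTURED_API_KEY:-<unstructured-api-key>}"
UNSTRUCTURED_API_URL="${UNSTRUCTURED_API_URL:-}"   # empty means no custom endpoint

# Base command: Notion source plus partitioning through the Unstructured API.
set -- unstructured-ingest notion \
  --notion-api-key "$NOTION_API_KEY" \
  --output-dir "notion-ingest-output" \
  --partition-by-api \
  --api-key "$UNSTRUCTURED_API_KEY"

# Append the endpoint only when an API URL has been provided.
if [ -n "$UNSTRUCTURED_API_URL" ]; then
  set -- "$@" --partition-endpoint "$UNSTRUCTURED_API_URL"
fi

# Print the assembled command for review instead of executing it.
printf '%s\n' "$*"

# To execute it once the values are real, uncomment the next line:
# "$@"
```

Note that --api-key here carries the Unstructured API key, which is separate from the Notion credential passed as notion-api-key.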