# Confluence

<Note>
  First time creating a connector? [Read this first](/pipelines/connector-first-time-reqs).
</Note>

Ingest your files into Unstructured from Confluence.

The requirements are as follows.

* A [Confluence Cloud account](https://www.atlassian.com/software/confluence/pricing) or
  [Confluence Data Center installation](https://confluence.atlassian.com/doc/installing-confluence-data-center-203603.html).

* The site URL for your [Confluence Cloud account](https://community.atlassian.com/t5/Confluence-questions/confluence-cloud-url/qaq-p/1157148) or
  [Confluence Data Center installation](https://confluence.atlassian.com/confkb/how-to-find-your-site-url-to-set-up-the-confluence-data-center-and-server-mobile-app-938025792.html).

* A user in your [Confluence Cloud account](https://confluence.atlassian.com/cloud/invite-edit-and-remove-users-744721624.html) or
  [Confluence Data Center installation](https://confluence.atlassian.com/doc/add-and-invite-users-138313.html).

* The user must have the correct permissions in your
  [Confluence Cloud account](https://support.atlassian.com/confluence-cloud/docs/what-are-confluence-cloud-permissions-and-restrictions/) or
  [Confluence Data Center installation](https://confluence.atlassian.com/doc/permissions-and-restrictions-139557.html) to
  access the target spaces and pages.

* One of the following:

  * For Confluence Cloud or Confluence Data Center, the target user's name or email address, and password.
    [Change a Confluence Cloud user's password](https://support.atlassian.com/confluence-cloud/docs/change-your-confluence-password/).
    [Change a Confluence Data Center user's password](https://confluence.atlassian.com/doc/change-your-password-139416.html).
  * For Confluence Cloud only, the target user's name or email address, and API token.
    [Create an API token](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/).
  * For Confluence Data Center only, the target user's personal access token (PAT).
    [Create a PAT](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html).

* Optionally, the keys (not display names) of the specific [spaces](https://support.atlassian.com/confluence-cloud/docs/navigate-spaces/) in the Confluence instance to access. To get a space's key,
  which is different from a space's display name, open the space in your web browser and look at the URL. The space's key is the part of the URL between `spaces/` and the next `/`.
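As a rough illustration of the space key rule above, the following Python sketch pulls the path segment that follows `spaces/` out of a URL. The function name `space_key_from_url` and the example URL are hypothetical and are not part of Unstructured or Confluence.

```python
from typing import Optional
from urllib.parse import urlparse


def space_key_from_url(url: str) -> Optional[str]:
    """Return the path segment that follows 'spaces/' in a URL, if any."""
    # Split the URL path into its non-empty segments.
    segments = [s for s in urlparse(url).path.split("/") if s]
    # The key is whatever segment comes directly after 'spaces'.
    for i, segment in enumerate(segments[:-1]):
        if segment == "spaces":
            return segments[i + 1]
    return None


# Hypothetical URL, shown only to illustrate the rule.
print(space_key_from_url("https://example.atlassian.net/wiki/spaces/ENG/pages/12345/Title"))
```

If the URL has no `spaces/` segment, the function returns `None` instead of guessing.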

The following video provides related setup information for Confluence Cloud:

<iframe width="560" height="315" src="https://www.youtube.com/embed/3PsFJkcIotI" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

## Document permissions metadata

The source connector outputs any permissions information that it can find in the source location about the processed source documents and associates that information with each
corresponding element that is generated. This permissions information is output into the `permissions_data` field, which is within the
`data_source` field under the element's `metadata` field. This information lists the users or groups, if any, that have
permissions to read, update, or delete the element's associated source document.

<Warning>
  The permissions metadata Unstructured outputs should not be used for runtime authorization
  or access control enforcement.

  Unstructured outputs document permissions metadata that is accurate only
  as of the time when Unstructured ingested the corresponding document.
  Because this metadata is a point-in-time copy of the permissions in the
  source location, the metadata that is sent to your destination location is
  not guaranteed to match the current permissions in the source location.

  Also, be aware that Unstructured updates permission metadata for a document only when the document's content has changed.

  This is because Unstructured performs incremental processing of documents only when the documents' *content* has changed, not when only the documents'
  *permissions* have changed. Whenever Unstructured performs incremental processing of documents for a
  workflow (in other words, if **Reprocess All Files** is turned off or set
  to false for a workflow), that workflow will not output metadata for any
  document permissions that have been added, changed, or removed since the
  previous workflow run, *unless* the corresponding documents' content
  has also been changed since the previous workflow run.
</Warning>

The following example shows what the output looks like. Ellipses indicate content that has been omitted from this example for brevity.

```json theme={null}
[
    {
        "...": "...",
        "metadata": {
            "...": "...",
            "data_source": {
                "...": "...",
                "permissions_data": [
                    {
                        "read": {
                            "users": [
                                "11111:11111111-1111-1111-1111-111111111111"
                            ],
                            "groups": [
                                "22222222-2222-2222-2222-22222222",
                                "33333333-3333-3333-3333-33333333"
                            ]
                        }
                    },
                    {
                        "update": {
                            "users": [
                                "44444:44444444-4444-4444-4444-44444444",
                                "55555:55555555-5555-5555-5555-55555555"
                            ],
                            "groups": [
                                "66666666-6666-6666-6666-66666666",
                            ]
                        }
                    },
                    {
                        "delete": {
                            "users": [
                                "77777:77777777-7777-7777-7777-77777777"
                            ],
                            "groups": [
                                "88888888-8888-8888-8888-88888888"
                            ]
                        }
                    }
                ],
                "...": "..."
            }
        }
    }
]
```
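As a minimal sketch of how this output could be consumed, the following Python function gathers the user and group IDs for each operation across a list of elements. It assumes the elements have already been loaded from the output JSON into Python dictionaries shaped like the preceding example. The name `collect_permissions` is hypothetical and is not part of any Unstructured library.

```python
from collections import defaultdict


def collect_permissions(elements):
    """Group user and group IDs by operation (read, update, delete)."""
    summary = defaultdict(lambda: {"users": set(), "groups": set()})
    for element in elements:
        data_source = element.get("metadata", {}).get("data_source", {})
        # permissions_data may be missing or empty if nothing was found.
        for entry in data_source.get("permissions_data") or []:
            for operation, principals in entry.items():
                summary[operation]["users"].update(principals.get("users", []))
                summary[operation]["groups"].update(principals.get("groups", []))
    return dict(summary)


# Sample data with the same shape as the preceding example (IDs are placeholders).
sample = [{"metadata": {"data_source": {"permissions_data": [
    {"read": {"users": ["u1"], "groups": ["g1", "g2"]}},
    {"update": {"users": ["u2"], "groups": []}},
]}}}]
print(collect_permissions(sample))
```

Because this metadata is a point-in-time copy, a summary like this is suited to auditing or filtering, not to runtime access control.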

To look up information about a particular Confluence user, use the user's ID (also known as their *account ID*) from the preceding output to call the
[GET /wiki/rest/api/user](https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-users/#api-wiki-rest-api-user-get)
operation in the Confluence REST API.

To look up information about a particular Confluence group, use the group's ID from the preceding output to call the
[GET /wiki/rest/api/group/by-id](https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-group/#api-wiki-rest-api-group-by-id-get)
operation in the Confluence REST API.
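The following Python sketch builds the request URLs for these two lookups. It assumes a Confluence Cloud site and assumes that the user operation takes an `accountId` query parameter and the group operation takes an `id` query parameter; confirm both against the linked REST API reference. The function names and the site URL are hypothetical. Sending the request, including authentication, is left out.

```python
from urllib.parse import urlencode


def user_lookup_url(site_url: str, account_id: str) -> str:
    """Build the URL for GET /wiki/rest/api/user (assumes an 'accountId' parameter)."""
    query = urlencode({"accountId": account_id})
    return f"{site_url.rstrip('/')}/wiki/rest/api/user?{query}"


def group_lookup_url(site_url: str, group_id: str) -> str:
    """Build the URL for GET /wiki/rest/api/group/by-id (assumes an 'id' parameter)."""
    query = urlencode({"id": group_id})
    return f"{site_url.rstrip('/')}/wiki/rest/api/group/by-id?{query}"


# Hypothetical site URL and placeholder IDs.
print(user_lookup_url("https://example.atlassian.net", "11111:11111111-1111-1111-1111-111111111111"))
print(group_lookup_url("https://example.atlassian.net", "22222222-2222-2222-2222-22222222"))
```

`urlencode` percent-encodes the colon in the account ID, so the ID can be passed exactly as it appears in the `permissions_data` output.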

To create the source connector:

1. On the sidebar, click **Connectors**.
2. Click **Sources**.
3. Click **New** or **Create Connector**.
4. Give the connector a unique **Name**.
5. In the **Provider** area, click **Confluence**.
6. Click **Continue**.
7. Follow the on-screen instructions to fill in the fields as described later on this page.
8. Click **Save and Test**.

Fill in the following fields:

* **Name** (*required*): A unique name for this connector.
* **URL** (*required*): The target Confluence site's URL.
* For username and password authentication: for **Authentication Method**, select **Username and Password**. Then enter the username into the **Username** field and the password into the **Password** field.
* For API token authentication: for **Authentication Method**, select **Username and API Token**. Then enter the username into the **Username** field and the API token into the **API Token** field.
* For personal access token (PAT) authentication: for **Authentication Method**, select **Personal Access Token**. Then enter the PAT into the **Personal Access Token** field.
* **Cloud**: Check this box if you are using Confluence Cloud. By default this box is unchecked.
* **Max number of spaces**: The maximum number of Confluence spaces to access within the Confluence instance.
  The default is 500 unless otherwise specified.
* **Max number of docs per space**: The maximum number of documents to access within each space.
  The default is 150 unless otherwise specified.
* **List of spaces**: A comma-separated string that lists the keys (not display names) of all of the spaces to access, for example: `luke,paul`.
  If no space keys are specified and the **Max number of spaces** limit is reached for the instance, be aware that you might get
  unexpected results.
* **Extract inline images**: Check this box to download images and replace the HTML content with Base64-encoded images. By default, this box is unchecked.
* **Extract files**: Check this box to download any embedded files in pages. By default, this box is unchecked.
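
As a small illustration of the **List of spaces** format, the following Python sketch turns a collection of space keys into the comma-separated string that the field expects. The helper name `format_space_list` is hypothetical. Trimming whitespace and dropping duplicates are choices made for this sketch, not documented connector behavior.

```python
def format_space_list(keys):
    """Join space keys into a comma-separated string, skipping blanks and repeats."""
    cleaned = []
    for key in keys:
        key = key.strip()
        # Keep the first occurrence of each non-empty key, preserving order.
        if key and key not in cleaned:
            cleaned.append(key)
    return ",".join(cleaned)


print(format_space_list([" luke", "paul", "luke", ""]))
```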
