Confluence - Unstructured

If you’re new to Unstructured, read this note first.

Before you can create a source connector, you must first sign in to your Unstructured account:

If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
If you already have an Unstructured account, sign in by using the URL of the sign in page that Unstructured provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

After you sign in, the Unstructured user interface (UI) appears, which you use to create your source connector.

After you create the source connector, add it along with a destination connector to a workflow. Then run the workflow as a job. To learn how, try out the hands-on UI quickstart or watch the 4-minute video tutorial.

You can also create source connectors with the Unstructured API. Learn how.

If you need help, reach out to the community on Slack, or contact us directly.

You are now ready to start creating a source connector! Keep reading to learn how.

Ingest your files into Unstructured from Confluence.

The requirements are as follows.

A Confluence Cloud account or Confluence Data Center installation.
The site URL for your Confluence Cloud account or Confluence Data Center installation.
A user in your Confluence Cloud account or Confluence Data Center installation.
The user must have the correct permissions in your Confluence Cloud account or Confluence Data Center installation to access the target spaces and pages.
One of the following:
- For Confluence Cloud or Confluence Data Center, the target user’s name or email address, and password. Change a Confluence Cloud user’s password. Change a Confluence Data Center user’s password.
- For Confluence Cloud only, the target user’s name or email address, and API token. Create an API token.
- For Confluence Data Center only, the target user’s personal access token (PAT). Create a PAT.
Optionally, the names of the specific spaces in the Confluence instance to access.
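The three credential options above map onto two common HTTP authentication schemes. The following is a minimal sketch, assuming that a username with a password or API token is sent as HTTP Basic authentication and that a PAT is sent as a Bearer token, which is how the Confluence REST API generally accepts them. The function name build_auth_headers and its parameters are hypothetical and are not part of Unstructured or Confluence.

```python
import base64


def build_auth_headers(username=None, secret=None, pat=None):
    """Return HTTP headers for one of the credential options listed above.

    - username + secret (password or API token): HTTP Basic authentication
    - pat (Confluence Data Center personal access token): Bearer authentication
    """
    if pat is not None:
        if username is not None or secret is not None:
            raise ValueError("Provide either a PAT or a username/secret pair, not both.")
        return {"Authorization": f"Bearer {pat}"}
    if not username or not secret:
        raise ValueError("A username and a password or API token are both required.")
    # Basic authentication is the Base64 encoding of "username:secret".
    raw = f"{username}:{secret}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

Headers built this way could be used to check the credentials against the Confluence site before you enter them into the connector.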

Document permissions metadata

The source connector outputs any permissions information that it can find in the source location about the processed source documents and associates that information with each corresponding element that is generated. This permissions information is output into the permissions_data field, which is within the data_source field under the element’s metadata field. This information lists the users or groups, if any, that have permissions to read, update, or delete the element’s associated source document.

The following example shows what the output looks like. Ellipses indicate content that has been omitted from this example for brevity.

[
    {
        "...": "...",
        "metadata": {
            "...": "...",
            "data_source": {
                "...": "...",
                "permissions_data": [
                    {
                        "read": {
                            "users": [
                                "11111:11111111-1111-1111-1111-111111111111"
                            ],
                            "groups": [
                                "22222222-2222-2222-2222-22222222",
                                "33333333-3333-3333-3333-33333333"
                            ]
                        }
                    },
                    {
                        "update": {
                            "users": [
                                "44444:44444444-4444-4444-4444-44444444",
                                "55555:55555555-5555-5555-5555-55555555"
                            ],
                            "groups": [
                                "66666666-6666-6666-6666-66666666"
                            ]
                        }
                    },
                    {
                        "delete": {
                            "users": [
                                "77777:77777777-7777-7777-7777-77777777"
                            ],
                            "groups": [
                                "88888888-8888-8888-8888-88888888"
                            ]
                        }
                    }
                ],
                "...": "..."
            }
        }
    }
]
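To work with this output in code, you can walk each element's metadata and gather the IDs per operation. The following is a minimal sketch, assuming the element list has already been loaded into Python (for example with json.load) and follows the shape shown above. The function name summarize_permissions and the shortened sample data are illustrative only.

```python
from collections import defaultdict


def summarize_permissions(elements):
    """Map each operation (read, update, delete) to the user and group IDs allowed to perform it."""
    summary = defaultdict(lambda: {"users": set(), "groups": set()})
    for element in elements:
        data_source = element.get("metadata", {}).get("data_source", {})
        # Treat a missing or null permissions_data field as "no permissions found".
        for entry in data_source.get("permissions_data") or []:
            # Each entry is a single-key object such as {"read": {...}}.
            for operation, principals in entry.items():
                summary[operation]["users"].update(principals.get("users", []))
                summary[operation]["groups"].update(principals.get("groups", []))
    return dict(summary)


# Shortened sample in the same shape as the example output above.
sample_elements = [
    {
        "metadata": {
            "data_source": {
                "permissions_data": [
                    {"read": {
                        "users": ["11111:11111111-1111-1111-1111-111111111111"],
                        "groups": ["22222222-2222-2222-2222-22222222"],
                    }},
                    {"delete": {
                        "users": ["77777:77777777-7777-7777-7777-77777777"],
                        "groups": ["88888888-8888-8888-8888-88888888"],
                    }},
                ]
            }
        }
    },
    {"metadata": {}},  # An element with no permissions information is skipped.
]

summary = summarize_permissions(sample_elements)
print(sorted(summary))
```

Sets are used so that an ID that appears on many elements of the same source document is counted once.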

To look up information about a particular Confluence user, use the user’s ID (also known as their account ID) from the preceding output to call the GET /wiki/rest/api/user operation in the Confluence REST API.

To look up information about a particular Confluence group, use the group’s ID from the preceding output to call the GET /wiki/rest/api/group/by-id operation in the Confluence REST API.
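The two lookups above can be sketched as URL builders. This is a minimal sketch, assuming that the user operation takes the ID in an accountId query parameter and the group operation takes it in an id query parameter, as in the Confluence Cloud REST API. The function names and the example site URL are placeholders, and the request itself (with your own credentials) is left out.

```python
from urllib.parse import urlencode


def user_lookup_url(site_url, account_id):
    """Build the URL for GET /wiki/rest/api/user for one account ID."""
    query = urlencode({"accountId": account_id})
    return f"{site_url.rstrip('/')}/wiki/rest/api/user?{query}"


def group_lookup_url(site_url, group_id):
    """Build the URL for GET /wiki/rest/api/group/by-id for one group ID."""
    query = urlencode({"id": group_id})
    return f"{site_url.rstrip('/')}/wiki/rest/api/group/by-id?{query}"


# Placeholder site URL; IDs are taken from the example output above.
print(user_lookup_url("https://example.atlassian.net", "11111:11111111-1111-1111-1111-111111111111"))
print(group_lookup_url("https://example.atlassian.net", "22222222-2222-2222-2222-22222222"))
```

The colon in an account ID is percent-encoded by urlencode, so the ID can be passed exactly as it appears in permissions_data.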

To create the source connector:

On the sidebar, click Connectors.
Click Sources.
Click New or Create Connector.
Enter a unique Name for the connector.
In the Provider area, click Confluence.
Click Continue.
Follow the on-screen instructions to fill in the fields as described later on this page.
Click Save and Test.

Fill in the following fields:

Name (required): A unique name for this connector.
URL (required): The target Confluence site’s URL.
For username and password authentication: for Authentication Method, select Username and Password. Then enter the username into the Username field and the password into the Password field.
For API token authentication: for Authentication Method, select Username and API Token. Then enter the username into the Username field and the API token into the API Token field.
For personal access token (PAT) authentication: for Authentication Method, select Personal Access Token. Then enter the PAT into the Personal Access Token field.
Cloud: Check this box if you are using Confluence Cloud. By default, this box is unchecked.
Max number of spaces: The maximum number of Confluence spaces to access within the Confluence Cloud instance. The default is 500 unless otherwise specified.
Max number of docs per space: The maximum number of documents to access within each space. The default is 150 unless otherwise specified.
List of spaces: A comma-separated string that lists the names of all of the spaces to access, for example: luke,paul. If you do not specify any space names and the Max number of spaces limit is reached for the instance, you might get unexpected results.
Extract inline images: Check this box to download images and replace the HTML content with Base64-encoded images. By default, this box is unchecked.
Extract files: Check this box to download any embedded files in pages. By default, this box is unchecked.
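The fields above, with their defaults and authentication rules, can be modeled as a small settings object for checking values before you enter them. This is only a sketch. The class name ConfluenceConnectorSettings and its attribute names are hypothetical and are not the Unstructured API's field names. The defaults (500 spaces, 150 documents per space, unchecked boxes) and the Cloud-only and Data Center-only rules come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConfluenceConnectorSettings:
    """Illustrative model of the connector fields; attribute names are hypothetical."""

    name: str
    url: str
    username: Optional[str] = None
    password: Optional[str] = None
    api_token: Optional[str] = None
    personal_access_token: Optional[str] = None
    cloud: bool = False                      # Unchecked by default.
    max_num_of_spaces: int = 500             # Default from this page.
    max_num_of_docs_per_space: int = 150     # Default from this page.
    spaces: List[str] = field(default_factory=list)
    extract_inline_images: bool = False      # Unchecked by default.
    extract_files: bool = False              # Unchecked by default.

    def validate(self):
        """Raise ValueError unless exactly one authentication method is filled in correctly."""
        methods = [
            bool(self.username and self.password),
            bool(self.username and self.api_token),
            bool(self.personal_access_token),
        ]
        if sum(methods) != 1:
            raise ValueError("Choose exactly one authentication method.")
        if self.api_token and not self.cloud:
            raise ValueError("API token authentication is for Confluence Cloud only.")
        if self.personal_access_token and self.cloud:
            raise ValueError("PAT authentication is for Confluence Data Center only.")

    def spaces_field(self):
        """Return the comma-separated string expected by the List of spaces field."""
        return ",".join(space.strip() for space in self.spaces)


settings = ConfluenceConnectorSettings(
    name="my-confluence-source",
    url="https://example.atlassian.net",  # Placeholder site URL.
    username="user@example.com",
    api_token="<api-token>",
    cloud=True,
    spaces=["luke", "paul"],
)
settings.validate()
print(settings.spaces_field())
```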
