> ## Documentation Index
> Fetch the complete documentation index at: https://docs.unstructured.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Neo4j

Batch process all your records to store structured outputs in a Neo4j account.

The requirements are as follows.

* A [Neo4j deployment](https://neo4j.com/deployment-center/).

  * For the [Unstructured Pipelines](/pipelines/overview) or the [Unstructured API](/api-reference/overview), local Neo4j deployments are not supported.
  * For [Unstructured Ingest](/open-source/ingestion/overview), local and non-local Neo4j deployments are supported.

  The following video shows how to set up a Neo4j Aura deployment:

  <iframe width="560" height="315" src="https://www.youtube.com/embed/fo8uDIm1zCE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

* The username and password for the user who has access to the Neo4j deployment. The default user is typically `neo4j`.

  * For a Neo4j Aura instance, the defaut user's is typically set when the instance is created.
  * For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, the default user is typically set during the deployment process.
  * For a local Neo4j deployment, you can [set the default user's initial password](https://neo4j.com/docs/operations-manual/current/configuration/set-initial-password/) or [recover an admin user and its password](https://neo4j.com/docs/operations-manual/current/authentication-authorization/password-and-user-recovery/).

* The connection URI for the Neo4j deployment, which starts with `neo4j://`, `neo4j+s://`, `bolt://`, or `bolt+s://`; followed by `localhost` or the host name; and sometimes ending with a colon and the port number (such as `:7687`). For example:

  * For a Neo4j Aura deployment, browse to the target Neo4j instance in the Neo4j Aura account and click **Connect > Drivers** to get the connection URI, which follows the format `neo4j+s://<host-name>`. A port number is not used or needed.
  * For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, see
    [Neo4j on AWS](https://neo4j.com/docs/operations-manual/current/cloud-deployments/neo4j-aws/),
    [Neo4j on Azure](https://neo4j.com/docs/operations-manual/current/cloud-deployments/neo4j-azure/), or
    [Neo4j on GCP](https://neo4j.com/docs/operations-manual/current/cloud-deployments/neo4j-gcp/)
    for details about how to get the connection URI.
  * For a local Neo4j deployment, the URI is typically `bolt://localhost:7687`
  * For other Neo4j deployment types, see the deployment provider's documentation.

  [Learn more](https://neo4j.com/docs/browser-manual/current/operations/dbms-connection).

* The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains two standard databases: one named `neo4j` for user data and another
  named `system` for system data and metadata. Some Neo4j deployment types support more than these two databases per deployment;
  Neo4j Aura instances do not.

  * [Create additional databases](https://neo4j.com/docs/operations-manual/current/database-administration/standard-databases/create-databases/)
    for a local Neo4j deployment that uses Enterprise Edition; or for Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP deployments.
  * [Get a list of additional available databases](https://neo4j.com/docs/operations-manual/current/database-administration/standard-databases/listing-databases/)
    for a local Neo4j deployment that uses Enterprise Edition; or for Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP deployments.

The Neo4j connector dependencies:

```bash CLI, Python theme={null}
pip install "unstructured-ingest[neo4j]"
```

You might also need to install additional dependencies, depending on your needs. [Learn more](/open-source/ingestion/ingest-dependencies).

The following environment variables:

* `NEO4J_USERNAME` - The name of the target user with access to the target Neo4j deployment, represented by `--username` (CLI) or `username` (Python).
* `NEO4J_PASSWORD` - The user's password, represented by `--password` (CLI) or `password` (Python).
* `NEO4J_URI` - The connection URI for the deployment, represented by `--uri` (CLI) or `uri` (Python).
* `NEO4J_DATABASE` - The name of the database in the deployment, represented by `--database` (CLI) or `database` (Python).

Now call the Unstructured CLI or Python. The source connector can be any of the ones supported. This example uses the local source connector.

This example sends files to Unstructured for processing by default. To process files locally instead, see the instructions at the end of this page.

<CodeGroup>
  ```bash CLI theme={null}
  #!/usr/bin/env bash

  # Chunking and embedding are optional.

  unstructured-ingest \
    local \
      --input-path $LOCAL_FILE_INPUT_DIR \
      --chunking-strategy by_title \
      --embedding-provider huggingface \
      --partition-by-api \
      --api-key $UNSTRUCTURED_API_KEY \
      --partition-endpoint $UNSTRUCTURED_API_URL \
      --strategy hi_res \
      --additional-partition-args="{\"split_pdf_page\":\"true\", \"split_pdf_allow_failed\":\"true\", \"split_pdf_concurrency_level\": 15}" \
    neo4j \
      --username $NEO4J_USERNAME \
      --password $NEO4J_PASSWORD \
      --uri $NEO4J_URI \ # <scheme>://<host>:<port>
      --database $NEO4J_DATABASE \
      --batch-size 100
  ```

  ```python Python Ingest theme={null}
  import os

  from unstructured_ingest.pipeline.pipeline import Pipeline
  from unstructured_ingest.interfaces import ProcessorConfig

  from unstructured_ingest.processes.connectors.neo4j import (
      Neo4jAccessConfig,
      Neo4jConnectionConfig,
      Neo4jUploadStagerConfig,
      Neo4jUploaderConfig
  )
  from unstructured_ingest.processes.connectors.local import (
      LocalIndexerConfig,
      LocalConnectionConfig,
      LocalDownloaderConfig
  )
  from unstructured_ingest.processes.partitioner import PartitionerConfig
  from unstructured_ingest.processes.chunker import ChunkerConfig
  from unstructured_ingest.processes.embedder import EmbedderConfig

  # Chunking and embedding are optional.

  if __name__ == "__main__":
      Pipeline.from_configs(
          context=ProcessorConfig(),
          indexer_config=LocalIndexerConfig(input_path=os.getenv("LOCAL_FILE_INPUT_DIR")),
          downloader_config=LocalDownloaderConfig(),
          source_connection_config=LocalConnectionConfig(),
          partitioner_config=PartitionerConfig(
              partition_by_api=True,
              api_key=os.getenv("UNSTRUCTURED_API_KEY"),
              partition_endpoint=os.getenv("UNSTRUCTURED_API_URL"),
              additional_partition_args={
                  "split_pdf_page": True,
                  "split_pdf_allow_failed": True,
                  "split_pdf_concurrency_level": 15
              }
          ),
          chunker_config=ChunkerConfig(chunking_strategy="by_title"),
          embedder_config=EmbedderConfig(embedding_provider="huggingface"),
          destination_connection_config=Neo4jConnectionConfig(
              access_config=Neo4jAccessConfig(password=os.getenv("NEO4J_PASSWORD")),
              username=os.getenv("NEO4J_USERNAME"),
              uri=os.getenv("NEO4J_URI"),
              database=os.getenv("NEO4J_DATABASE"),
          ),
          stager_config=Neo4jUploadStagerConfig(),
          uploader_config=Neo4jUploaderConfig(batch_size=100)
      ).run()
  ```
</CodeGroup>

For the Unstructured Ingest CLI and the Unstructured Ingest Python library, you can use the `--partition-by-api` option (CLI) or `partition_by_api` (Python) parameter to specify where files are processed:

* To do local file processing, omit `--partition-by-api` (CLI) or `partition_by_api` (Python), or explicitly specify `partition_by_api=False` (Python).

  Local file processing does not use an Unstructured API key or API URL, so you can also omit the following, if they appear:

  * `--api-key $UNSTRUCTURED_API_KEY` (CLI) or `api_key=os.getenv("UNSTRUCTURED_API_KEY")` (Python)
  * `--partition-endpoint $UNSTRUCTURED_API_URL` (CLI) or `partition_endpoint=os.getenv("UNSTRUCTURED_API_URL")` (Python)
  * The environment variables `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL`

* To send files to the legacy [Unstructured Partition Endpoint](/api-reference/legacy-api/partition/overview) for processing, specify `--partition-by-api` (CLI) or `partition_by_api=True` (Python).

  Unstructured also requires an Unstructured API key and API URL, by adding the following:

  * `--api-key $UNSTRUCTURED_API_KEY` (CLI) or `api_key=os.getenv("UNSTRUCTURED_API_KEY")` (Python)
  * `--partition-endpoint $UNSTRUCTURED_API_URL` (CLI) or `partition_endpoint=os.getenv("UNSTRUCTURED_API_URL")` (Python)
  * The environment variables `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL`, representing your API key and API URL, respectively.

  <Note>
    You must specify the API URL only if you are not using the default API URL for Unstructured Ingest, which applies to **Let's Go**, **Pay-As-You-Go**, and **Business SaaS** accounts.

    The default API URL for Unstructured Ingest is `https://api.unstructuredapp.io/general/v0/general`, which is the API URL for the legacy [Unstructured Partition Endpoint](/api-reference/legacy-api/partition/overview). However, you should always use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, email Unstructured Support at [support@unstructured.io](mailto:support@unstructured.io).

    If you do not have an API key, [get one now](/api-reference/legacy-api/partition/overview).

    If you are using a **Business** account, the process
    for generating Unstructured API keys, and the Unstructured API URL that you use, are different.
    For instructions, see your Unstructured account administrator, or email Unstructured Support at [support@unstructured.io](mailto:support@unstructured.io).
  </Note>

## Graph Output

The graph output of the Neo4j destination connector is represented in the following diagram:

```mermaid theme={null}
graph BT
    subgraph dn [Document Node]
    D[Document]
    end
    style dn stroke-dasharray: 5
    
    subgraph en [Element Nodes]
    UE1[UnstructuredElement]
    UE2[UnstructuredElement]
    UE3[UnstructuredElement]
    UE4[UnstructuredElement]
    UE5[UnstructuredElement]
    UE6[UnstructuredElement]
    end
    style en stroke-dasharray: 5
    
    UE1 -->|PART_OF_DOCUMENT| D
    UE2 -->|PART_OF_DOCUMENT| D
    UE3 -->|PART_OF_DOCUMENT| D
    UE4 -->|PART_OF_DOCUMENT| D
    UE5 -->|PART_OF_DOCUMENT| D
    UE6 -->|PART_OF_DOCUMENT| D

    subgraph cn [Chunk Nodes]
    C1[Chunk]
    C2[Chunk]
    C3[Chunk]
    C4[Chunk]
    end
    style cn stroke-dasharray: 5
    
    C1 -->|NEXT_CHUNK| C2
    C2 -->|NEXT_CHUNK| C3
    C3 -->|NEXT_CHUNK| C4

    C1 -->|PART_OF_DOCUMENT| D
    C2 -->|PART_OF_DOCUMENT| D
    C3 -->|PART_OF_DOCUMENT| D
    C4 -->|PART_OF_DOCUMENT| D

    UE1 -.->|PART_OF_CHUNK| C1
    UE2 -.->|PART_OF_CHUNK| C1
    UE3 -.->|PART_OF_CHUNK| C2
    UE4 -.->|PART_OF_CHUNK| C3
    UE5 -.->|PART_OF_CHUNK| C4
    UE6 -.->|PART_OF_CHUNK| C4
```

[View the preceding diagram in full-screen mode](https://mermaid.live/view#pako:eNqFlN9vgjAQx_-Vps-6REEfeFiyFZYli7hskCyTxXS0ihFaU9oHo_7vq_IjgIzyxN330157d70TjDmh0IFbgQ8JeA4iBvSXq9_CQRhYuTxWGWUS-Br9KQC39pYOyki5VB5Tel2XS8H3dExwnmAh8NEBs4LohKA6hJfSOkJe7hh6k1XI9C4qlkpQUjK1Oh1UrUHVHlRng-p8QO1kgRqzoC8JxuPH8_vTR7BevqzdJQoXnh-cgVvf0wRYJsA2ATMTMP8f6FQz1tVEiWL7Vi3RpHBW5rRtWm3TbpmdnMbGnKIipb73FazRa-i_nXXAKvC9ZFWHuJfs6nrIUCVkKBIy1AjZpgTfGuWhwVRnnDT6ZFC3-vVpo0v6dKvRJH263eiRXh2OYEZFhndEj5nTlY6gTPSriaCjfwndYJXKCEbsolGsJP88shg6-onRERRcbRPobHCaa0sdCJbU3WHdbFmFHDD75jyrIUp2kotFMddu4-3yB3k-fcg).

In the preceding diagram:

* The `Document` node represents the source file.
* The `UnstructuredElement` nodes represent the source file's Unstructured `Element` objects, before chunking.
* The `Chunk` nodes represent the source file's Unstructured `Element` objects, after chunking.
* Each `UnstructuredElement` node has a `PART_OF_DOCUMENT` relationship with the `Document` node.
* Each `Chunk` node also has a `PART_OF_DOCUMENT` relationship with the `Document` node.
* Each `UnstructuredElement` node has a `PART_OF_CHUNK` relationship with a `Chunk` element.
* Each `Chunk` node, except for the "last" `Chunk` node, has a `NEXT_CHUNK` relationship with its "next" `Chunk` node.

Learn more about [document elements](/concepts/document-elements) and [chunking](/concepts/chunking).

Some related example Neo4j graph queries include the following.

Query for all available nodes and relationships:

```text theme={null}
MATCH path=(source)-[relationship]->(target)
RETURN path
```

Query for `Chunk` to `Document` relationships:

```text theme={null}
MATCH (chunk:Chunk)-[relationship:PART_OF_DOCUMENT]->(doc:Document)
RETURN chunk, relationship, doc
```

Query for `UnstructuredElement` to `Document` relationships:

```text theme={null}
MATCH (element:UnstructuredElement)-[relationship:PART_OF_DOCUMENT]->(doc:Document)
RETURN element, relationship, doc
```

Query for `UnstructuredElement` to `Chunk` relationships:

```text theme={null}
MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)
RETURN element, relationship, chunk
```

Query for `Chunk` to `Chunk` relationships:

```text theme={null}
MATCH (this:Chunk)-[relationship:NEXT_CHUNK]->(previous:Chunk)
RETURN this, relationship, previous
```

Query for `UnstructuredElement` to `Chunk` to `Document` relationships:

```text theme={null}
MATCH (element:UnstructuredElement)-[ecrelationship:PART_OF_CHUNK]-(chunk:Chunk)-[cdrelationship:PART_OF_DOCUMENT]->(doc:Document)
RETURN element, ecrelationship, chunk, cdrelationship, doc
```

Query for `UnstructuredElements` containing the text `jury`, and show their `Chunk` relationships:

```text theme={null}
MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)
WHERE element.text =~ '(?i).*jury.*'
RETURN element, relationship, chunk
```

Query for the `Chunk` with the specified `id`, and show its `UnstructuredElement` relationships:

```text theme={null}
MATCH (element:UnstructuredElement)-[relationship:PART_OF_CHUNK]->(chunk:Chunk)
WHERE chunk.id = '731508bf53637ce4431fe93f6028ebdf'
RETURN element, relationship, chunk
```

Additionally, for the [Unstructured Pipelines](/pipelines/overview) and [Unstructured API's workflow operations](/api-reference/workflow/),
when a [Named entity recognition (NER)](/concepts/enriching/ner) DAG node is added to a custom workflow,
any recognized entities are output as `Entity` nodes in the graph.

This additional graph output of the Neo4j destination connector is represented in the following diagram:

```mermaid theme={null}
graph TD
    Chunk -->|HAS_ENTITY| Entity
    Entity -->|ENTITY_TYPE| Entity
```

In the preceding diagram:

* The `Chunk` node represents one of the source file's Unstructured `Element` objects, after chunking.
* The `Entity` node represents a recognized entity.
* A `Chunk` node can have `HAS_ENTITY` relationships with `Entity` nodes.
* An `Entity` node can have `ENTITY_TYPE` relationships with other `Entity` nodes.

Some related example Neo4j graph queries include the following.

Query for all available nodes and relationships:

```text theme={null}
MATCH path=(source)-[relationship]->(target)
RETURN path
```

Query for `Entity` to `Entity` relationships:

```text theme={null}
MATCH (child:Entity)-[relationship:ENTITY_TYPE]->(parent:Entity)
RETURN child, relationship, parent
```

Query for `Entity` nodes containing the text `PERSON`, and show their `Entity` relationships:

```text theme={null}
MATCH (child:Entity)-[relationship:ENTITY_TYPE]->(parent:Entity)
WHERE parent.id = 'PERSON'
RETURN child, relationship, parent
```

Query for `Entity` nodes containing the text `amendment`, and show their `Chunk` relationships:

```text theme={null}
MATCH (element:Chunk)-[relationship:HAS_ENTITY]->(entity:Entity)
WHERE entity.id =~ '(?i).*amendment.*'
RETURN element, relationship, entity
```

QUERY FOR `Entity` nodes containing the text `PERSON`, and show their `Entity` to `Entity` to `Chunk` relationships:

```text theme={null}
MATCH (chunk:Chunk)-[ccrelationship:HAS_ENTITY]-(child:Entity)-[cprelationship:ENTITY_TYPE]->(parent:Entity)
WHERE parent.id =~ 'PERSON'
RETURN chunk, ccrelationship, child, cprelationship, parent
```