Send processed data from Unstructured to Neo4j.

The requirements are as follows.

  • A Neo4j deployment.

  • The username and password for the user who has access to the Neo4j deployment. The default user is typically neo4j.

  • The connection URI for the Neo4j deployment, which starts with neo4j://, neo4j+s://, bolt://, or bolt+s://; followed by localhost or the host name; and sometimes ending with a colon and the port number (such as :7687). For example:

    • For a Neo4j Aura deployment, browse to the target Neo4j instance in the Neo4j Aura account and click Connect > Drivers to get the connection URI, which follows the format neo4j+s://<host-name>. A port number is not used or needed.
    • For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, see Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP for details about how to get the connection URI.
    • For a local Neo4j deployment, the URI is typically bolt://localhost:7687.
    • For other Neo4j deployment types, see the deployment provider’s documentation.

  • The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains two standard databases: one named neo4j for user data and another named system for system data and metadata. Some Neo4j deployment types support more than these two databases per deployment; Neo4j Aura instances do not.
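
As a quick sanity check before creating the connector, the four required values can be collected and the URI shape verified locally. The following is a minimal sketch that uses only the Python standard library; the names Neo4jSettings and is_valid_uri are hypothetical and are not part of Unstructured or Neo4j, and the check covers only the URI shape described above (scheme, host name, optional port), not whether the deployment is reachable.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# URI schemes listed in the requirements above.
ALLOWED_SCHEMES = {"neo4j", "neo4j+s", "bolt", "bolt+s"}


@dataclass(frozen=True)
class Neo4jSettings:
    """Hypothetical container for the required connection values."""
    uri: str
    username: str
    password: str
    database: str = "neo4j"  # the standard database for user data


def is_valid_uri(uri: str) -> bool:
    """Return True if the URI has an allowed scheme, a host, and a numeric port (if any)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    if not parsed.hostname:
        return False
    try:
        # Reading .port raises ValueError when the port is not a valid number.
        _ = parsed.port
    except ValueError:
        return False
    return True


# Placeholder values for illustration only.
settings = Neo4jSettings(
    uri="bolt://localhost:7687",
    username="neo4j",
    password="<password>",
)
```

For example, is_valid_uri(settings.uri) accepts the local URI above, and a URI without a port, such as neo4j+s://<host-name> for Neo4j Aura, also passes because the port is optional.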

Graph Output

The graph output of the Neo4j destination connector is represented in the following diagram:

In the preceding diagram:

  • The Document node represents the source file.
  • The UnstructuredElement nodes represent the source file’s Unstructured Element objects, before chunking.
  • The Chunk nodes represent the source file’s Unstructured Element objects, after chunking.
  • Each UnstructuredElement node has a PART_OF_DOCUMENT relationship with the Document node.
  • Each Chunk node also has a PART_OF_DOCUMENT relationship with the Document node.
  • Each UnstructuredElement node has a PART_OF_CHUNK relationship with a Chunk node.
  • Each Chunk node, except for the “last” Chunk node, has a NEXT_CHUNK relationship with its “next” Chunk node.
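
The relationships listed above can be sketched in plain Python to make the shape of the graph concrete. This is an illustrative model only, not the connector's implementation: the function name build_relationships, the tuple layout, and the sample ids are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# A relationship is modeled here as (start node id, relationship type, end node id).
Relationship = Tuple[str, str, str]


def build_relationships(
    document_id: str,
    chunks: Sequence[Tuple[str, Sequence[str]]],
) -> List[Relationship]:
    """Build the relationships for one document.

    `chunks` is an ordered sequence of (chunk id, element ids in that chunk).
    """
    relationships: List[Relationship] = []
    for index, (chunk_id, element_ids) in enumerate(chunks):
        # Every Chunk node belongs to the Document node.
        relationships.append((chunk_id, "PART_OF_DOCUMENT", document_id))
        for element_id in element_ids:
            # Every UnstructuredElement node belongs to the Document node
            # and to the Chunk node it was merged into.
            relationships.append((element_id, "PART_OF_DOCUMENT", document_id))
            relationships.append((element_id, "PART_OF_CHUNK", chunk_id))
        # Every Chunk node except the last one points to its next Chunk node.
        if index + 1 < len(chunks):
            next_chunk_id = chunks[index + 1][0]
            relationships.append((chunk_id, "NEXT_CHUNK", next_chunk_id))
    return relationships


# Hypothetical ids: one document, two chunks, three elements.
example = build_relationships(
    "doc-1",
    [("chunk-1", ["el-1", "el-2"]), ("chunk-2", ["el-3"])],
)
```

In this example, chunk-1 has a NEXT_CHUNK relationship with chunk-2, while chunk-2, being the last chunk, has none.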

Learn more about document elements and chunking.

The following example Neo4j graph queries explore this output.

Query for all nodes:

MATCH (n)
RETURN n

Query for Chunk to Document relationships:

MATCH (chunk:Chunk)-[:PART_OF_DOCUMENT]->(doc:Document)
RETURN chunk, doc

Query for UnstructuredElement to Document relationships:

MATCH (element:UnstructuredElement)-[:PART_OF_DOCUMENT]->(doc:Document)
RETURN element, doc

Query for UnstructuredElement to Chunk relationships:

MATCH (element:UnstructuredElement)-[:PART_OF_CHUNK]->(chunk:Chunk)
RETURN element, chunk

Query for Chunk to Chunk relationships:

MATCH (chunk:Chunk)-[:NEXT_CHUNK]->(nextChunk:Chunk)
RETURN chunk, nextChunk

Query for UnstructuredElement to Chunk to Document relationships:

MATCH (element:UnstructuredElement)-[:PART_OF_CHUNK]->(chunk:Chunk)-[:PART_OF_DOCUMENT]->(doc:Document)
RETURN element, chunk, doc

Query for UnstructuredElements containing the text jury, and show their Chunk relationships:

MATCH (element:UnstructuredElement)-[:PART_OF_CHUNK]->(chunk:Chunk)
WHERE element.text =~ '(?i).*jury.*'
RETURN element, chunk

Query for the Chunk with the specified id, and show its UnstructuredElement relationships:

MATCH (element:UnstructuredElement)-[:PART_OF_CHUNK]->(chunk:Chunk)
WHERE chunk.id = '731508bf53637ce4431fe93f6028ebdf'
RETURN element, chunk
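
When a query such as the text search above is issued from application code, the search term is usually passed as a query parameter rather than pasted into the query string. The following is a minimal sketch of that idea; the helper name element_text_query is hypothetical, and the sketch assumes that escaping the term with Python's re.escape is acceptable to the regular expression engine that Neo4j uses.

```python
import re
from typing import Dict, Tuple


def element_text_query(term: str) -> Tuple[str, Dict[str, str]]:
    """Return a Cypher query and its parameters for a case-insensitive text search."""
    query = (
        "MATCH (element:UnstructuredElement)-[:PART_OF_CHUNK]->(chunk:Chunk)\n"
        "WHERE element.text =~ $pattern\n"
        "RETURN element, chunk"
    )
    # Escape the term so that characters such as "." are matched literally,
    # then wrap it so that it can appear anywhere in the text.
    pattern = "(?i).*" + re.escape(term) + ".*"
    return query, {"pattern": pattern}


query, parameters = element_text_query("jury")
```

The resulting query and parameters can then be handed to a Neo4j client, for example as the two arguments to session.run in the Neo4j Python driver.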

To create the destination connector:

  1. On the sidebar, click Connectors.
  2. Click Destinations.
  3. Click New or Create Connector.
  4. Give the connector a unique Name.
  5. In the Provider area, click Neo4j.
  6. Click Continue.
  7. Follow the on-screen instructions to fill in the fields as described later on this page.
  8. Click Save and Test.

Fill in the following fields:

  • Name (required): A unique name for this connector.
  • URI (required): The connection URI for the Neo4j deployment, which typically starts with neo4j://, neo4j+s://, bolt://, or bolt+s://; is followed by the host name; and sometimes ends with a colon and the port number (such as :7687).
  • Database (required): The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains a standard database named neo4j for user data.
  • Username (required): The name of the user who has access to the Neo4j deployment. A default Neo4j deployment typically contains a default user named neo4j.
  • Password (required): The password for the user.
  • Batch Size: The maximum number of nodes or relationships to be transmitted per batch. The default is 100 if not otherwise specified.
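
To illustrate what a batch size limit means in practice, the following sketch splits a list of items into batches of at most 100. It shows the general technique only and makes no claim about how the connector batches internally; the function name batched is chosen for this example.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive batches containing at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 250 placeholder items split with the default batch size of 100.
batch_lengths = [len(batch) for batch in batched(list(range(250)))]
```

With 250 items and the default of 100, this produces two full batches and a final batch of 50.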