- A Neo4j deployment.
  - For the Unstructured UI or the Unstructured API, local Neo4j deployments are not supported.
  - For Unstructured Ingest, local and non-local Neo4j deployments are supported.
- The username and password for the user who has access to the Neo4j deployment. The default user is typically `neo4j`.
  - For a Neo4j Aura instance, the default user’s password is typically set when the instance is created.
  - For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, the default user is typically set during the deployment process.
  - For a local Neo4j deployment, you can set the default user’s initial password or recover an admin user and its password.
- The connection URI for the Neo4j deployment, which starts with `neo4j://`, `neo4j+s://`, `bolt://`, or `bolt+s://`; followed by `localhost` or the host name; and sometimes ending with a colon and the port number (such as `:7687`). For example:
  - For a Neo4j Aura deployment, browse to the target Neo4j instance in the Neo4j Aura account and click Connect > Drivers to get the connection URI, which follows the format `neo4j+s://<host-name>`. A port number is not used or needed.
  - For an AWS Marketplace, Microsoft Azure Marketplace, or Google Cloud Marketplace deployment of Neo4j, see Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP for details about how to get the connection URI.
  - For a local Neo4j deployment, the URI is typically `bolt://localhost:7687`.
  - For other Neo4j deployment types, see the deployment provider’s documentation.
- The name of the target database in the Neo4j deployment. A default Neo4j deployment typically contains two standard databases: one named `neo4j` for user data and another named `system` for system data and metadata. Some Neo4j deployment types support more than these two databases per deployment; Neo4j Aura instances do not.
  - Create additional databases for a local Neo4j deployment that uses Enterprise Edition; or for Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP deployments.
  - Get a list of additional available databases for a local Neo4j deployment that uses Enterprise Edition; or for Neo4j on AWS, Neo4j on Azure, or Neo4j on GCP deployments.
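The connection URI format described above can be checked before it is handed to a connector. The following is a minimal sketch using only the Python standard library; the helper name `parse_neo4j_uri` and the Aura-style host name in the example are hypothetical and are not part of Unstructured or Neo4j.

```python
from urllib.parse import urlparse

# Schemes named in the connection URI description above.
ALLOWED_SCHEMES = {"neo4j", "neo4j+s", "bolt", "bolt+s"}


def parse_neo4j_uri(uri: str) -> dict:
    """Split a connection URI into scheme, host, and optional port (hypothetical helper)."""
    parts = urlparse(uri)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host name")
    # parts.port is None when the URI carries no port, as with Aura URIs.
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}


# A local deployment URI, and a made-up host in the Aura format.
print(parse_neo4j_uri("bolt://localhost:7687"))
print(parse_neo4j_uri("neo4j+s://example.databases.neo4j.io"))
```

A check like this only validates the shape of the URI; it does not confirm that the deployment is reachable.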
CLI, Python
- `NEO4J_USERNAME` - The name of the target user with access to the target Neo4j deployment, represented by `--username` (CLI) or `username` (Python).
- `NEO4J_PASSWORD` - The user’s password, represented by `--password` (CLI) or `password` (Python).
- `NEO4J_URI` - The connection URI for the deployment, represented by `--uri` (CLI) or `uri` (Python).
- `NEO4J_DATABASE` - The name of the database in the deployment, represented by `--database` (CLI) or `database` (Python).
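As a rough illustration of how the four environment variables above map to the Python parameter names, the following sketch collects them into a dictionary and fails early if any are missing. The helper `load_neo4j_settings` is hypothetical; the values in the example are placeholders.

```python
import os

# Environment variable names from the list above, mapped to the Python parameter names.
ENV_TO_PARAM = {
    "NEO4J_USERNAME": "username",
    "NEO4J_PASSWORD": "password",
    "NEO4J_URI": "uri",
    "NEO4J_DATABASE": "database",
}


def load_neo4j_settings(env=None) -> dict:
    """Return {parameter: value}; raise if any variable is unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {param: env[name] for name, param in ENV_TO_PARAM.items()}


# Placeholder values standing in for a real environment.
example_env = {
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "<password>",
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_DATABASE": "neo4j",
}
print(sorted(load_neo4j_settings(example_env)))
```

Calling `load_neo4j_settings()` with no argument reads from the real process environment instead.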
Use the `--partition-by-api` option (CLI) or `partition_by_api` (Python) parameter to specify where files are processed:
- To do local file processing, omit `--partition-by-api` (CLI) or `partition_by_api` (Python), or explicitly specify `partition_by_api=False` (Python). Local file processing does not use an Unstructured API key or API URL, so you can also omit the following, if they appear:
  - `--api-key $UNSTRUCTURED_API_KEY` (CLI) or `api_key=os.getenv("UNSTRUCTURED_API_KEY")` (Python)
  - `--partition-endpoint $UNSTRUCTURED_API_URL` (CLI) or `partition_endpoint=os.getenv("UNSTRUCTURED_API_URL")` (Python)
  - The environment variables `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL`
- To send files to the Unstructured Partition Endpoint for processing, specify `--partition-by-api` (CLI) or `partition_by_api=True` (Python). This also requires an Unstructured API key and API URL, which you provide by adding the following:
  - `--api-key $UNSTRUCTURED_API_KEY` (CLI) or `api_key=os.getenv("UNSTRUCTURED_API_KEY")` (Python)
  - `--partition-endpoint $UNSTRUCTURED_API_URL` (CLI) or `partition_endpoint=os.getenv("UNSTRUCTURED_API_URL")` (Python)
  - The environment variables `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL`, representing your API key and API URL, respectively.
You must specify the API URL only if you are not using the default API URL for Unstructured Ingest, which applies to Starter and Team accounts.

The default API URL for Unstructured Ingest is `https://api.unstructuredapp.io/general/v0/general`, which is the API URL for the Unstructured Partition Endpoint. However, you should always use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, email Unstructured Support at support@unstructured.io.

If you do not have an API key, get one now.

If you are using an Enterprise account, the process for generating Unstructured API keys, and the Unstructured API URL that you use, are different. For instructions, see your Unstructured account administrator, or email Unstructured Support at support@unstructured.io.
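The choice between local processing and the Unstructured Partition Endpoint can be summarized as a small decision function. This is only a sketch: `partition_settings` is a hypothetical helper, not part of the Unstructured Ingest library, and it simply assembles the `partition_by_api`, `api_key`, and `partition_endpoint` values described above.

```python
import os


def partition_settings(by_api: bool, env=None) -> dict:
    """Assemble partitioning keyword arguments (hypothetical helper)."""
    env = os.environ if env is None else env
    if not by_api:
        # Local processing: no API key or API URL is used.
        return {"partition_by_api": False}
    api_key = env.get("UNSTRUCTURED_API_KEY")
    if not api_key:
        raise ValueError("UNSTRUCTURED_API_KEY is required when partition_by_api=True")
    settings = {"partition_by_api": True, "api_key": api_key}
    api_url = env.get("UNSTRUCTURED_API_URL")
    if api_url:
        # Set only when an API URL is supplied; otherwise the default applies.
        settings["partition_endpoint"] = api_url
    return settings


# Placeholder key; the URL is the default quoted in the text above.
example_env = {
    "UNSTRUCTURED_API_KEY": "<api-key>",
    "UNSTRUCTURED_API_URL": "https://api.unstructuredapp.io/general/v0/general",
}
print(partition_settings(False, example_env))
print(sorted(partition_settings(True, example_env)))
```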
Graph Output
The graph output of the Neo4j destination connector is represented in the following diagram:

In the preceding diagram:
- The `Document` node represents the source file.
- The `UnstructuredElement` nodes represent the source file’s `UnstructuredElement` objects, before chunking.
- The `Chunk` nodes represent the source file’s `UnstructuredElement` objects, after chunking.
- Each `UnstructuredElement` node has a `PART_OF_DOCUMENT` relationship with the `Document` node.
- Each `Chunk` node also has a `PART_OF_DOCUMENT` relationship with the `Document` node.
- Each `UnstructuredElement` node has a `PART_OF_CHUNK` relationship with a `Chunk` node.
- Each `Chunk` node, except for the “last” `Chunk` node, has a `NEXT_CHUNK` relationship with its “next” `Chunk` node.
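To make the relationship rules above concrete, the following sketch builds the same structure in memory as a list of `(start, relationship, end)` triples. It is an illustration of the graph shape only, not the connector's implementation; the function name and the node identifiers (`doc`, `chunk-1`, `el-1`, and so on) are made up.

```python
def build_graph(document: str, chunks: dict) -> list:
    """Return (start, relationship, end) triples for one document.

    `chunks` maps each chunk id to the ids of the elements it contains,
    in document order (dicts preserve insertion order).
    """
    edges = []
    chunk_ids = list(chunks)
    for chunk_id, element_ids in chunks.items():
        edges.append((chunk_id, "PART_OF_DOCUMENT", document))
        for element_id in element_ids:
            edges.append((element_id, "PART_OF_DOCUMENT", document))
            edges.append((element_id, "PART_OF_CHUNK", chunk_id))
    # Every chunk except the last is linked to the chunk that follows it.
    for current, following in zip(chunk_ids, chunk_ids[1:]):
        edges.append((current, "NEXT_CHUNK", following))
    return edges


edges = build_graph("doc", {"chunk-1": ["el-1", "el-2"], "chunk-2": ["el-3"]})
for edge in edges:
    print(edge)
```

With two chunks there is exactly one `NEXT_CHUNK` triple, because the last chunk has no successor.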
`Chunk` to `Document` relationships:

`UnstructuredElement` to `Document` relationships:

`UnstructuredElement` to `Chunk` relationships:

`Chunk` to `Chunk` relationships:

`UnstructuredElement` to `Chunk` to `Document` relationships:

`UnstructuredElements` containing the text `jury`, and show their `Chunk` relationships:

`Chunk` with the specified `id`, and show its `UnstructuredElement` relationships:
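The relationship patterns listed above could be expressed in Cypher roughly as follows, shown here as Python strings so they can be passed to a driver. This is a sketch under assumptions: the node labels and relationship types come from the graph description, but the property names `text` and `id` are guesses about how the connector stores element text and chunk identifiers, and patterns are left undirected wherever the description does not state a direction.

```python
# Cypher text for each pattern. Only NEXT_CHUNK is written with a direction,
# because the description says it points from a chunk to its "next" chunk.
QUERIES = {
    "chunk_to_document":
        "MATCH (c:Chunk)-[r:PART_OF_DOCUMENT]-(d:Document) RETURN c, r, d",
    "element_to_document":
        "MATCH (e:UnstructuredElement)-[r:PART_OF_DOCUMENT]-(d:Document) RETURN e, r, d",
    "element_to_chunk":
        "MATCH (e:UnstructuredElement)-[r:PART_OF_CHUNK]-(c:Chunk) RETURN e, r, c",
    "chunk_to_chunk":
        "MATCH (c1:Chunk)-[r:NEXT_CHUNK]->(c2:Chunk) RETURN c1, r, c2",
    "element_to_chunk_to_document":
        "MATCH (e:UnstructuredElement)-[r1:PART_OF_CHUNK]-(c:Chunk)"
        "-[r2:PART_OF_DOCUMENT]-(d:Document) RETURN e, r1, c, r2, d",
    # Assumes element text is stored in a `text` property.
    "elements_containing_text":
        "MATCH (e:UnstructuredElement)-[r:PART_OF_CHUNK]-(c:Chunk) "
        "WHERE e.text CONTAINS $text RETURN e, r, c",
    # Assumes chunks carry an `id` property.
    "chunk_by_id":
        "MATCH (c:Chunk)-[r:PART_OF_CHUNK]-(e:UnstructuredElement) "
        "WHERE c.id = $chunk_id RETURN c, r, e",
}

# Query parameters for the two filtered queries; the chunk id is a placeholder.
PARAMS = {"text": "jury", "chunk_id": "<chunk-id>"}

for name, query in QUERIES.items():
    print(f"{name}: {query}")
```

With the official Neo4j Python driver, each string could be passed to `session.run(query, parameters)`; check the property names against your own graph before relying on the filtered queries.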
`Entity` nodes in the graph.
This additional graph output of the Neo4j destination connector is represented in the following diagram:
In the preceding diagram:
- The `Chunk` node represents one of the source file’s `UnstructuredElement` objects, after chunking.
- The `Entity` node represents a recognized entity.
- A `Chunk` node can have `HAS_ENTITY` relationships with `Entity` nodes.
- An `Entity` node can have `ENTITY_TYPE` relationships with other `Entity` nodes.
`Entity` to `Entity` relationships:

`Entity` nodes containing the text `PERSON`, and show their `Entity` relationships:

`Entity` nodes containing the text `amendment`, and show their `Chunk` relationships:

`Entity` nodes containing the text `PERSON`, and show their `Entity` to `Entity` to `Chunk` relationships:
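The entity patterns above might be queried along these lines. As before, this is a sketch rather than the connector's documented queries: the labels and relationship types come from the description, but the property that holds an entity's text is not stated, so the hypothetical helper below takes it as an argument (defaulting to `id` purely as an assumption). All patterns are undirected because the description does not give directions.

```python
def entity_queries(prop: str = "id") -> dict:
    """Build Cypher strings that filter Entity nodes on `prop` (hypothetical helper)."""
    # Property names cannot be passed as Cypher parameters, so validate before formatting.
    if not prop.isidentifier():
        raise ValueError(f"not a simple property name: {prop!r}")
    return {
        # Undirected, so each related pair is returned once in each order.
        "entity_to_entity":
            "MATCH (a:Entity)-[r:ENTITY_TYPE]-(b:Entity) RETURN a, r, b",
        "matching_entities_with_entities":
            f"MATCH (a:Entity)-[r:ENTITY_TYPE]-(b:Entity) "
            f"WHERE a.{prop} CONTAINS $text RETURN a, r, b",
        "matching_entities_with_chunks":
            f"MATCH (c:Chunk)-[r:HAS_ENTITY]-(e:Entity) "
            f"WHERE e.{prop} CONTAINS $text RETURN c, r, e",
        "matching_entities_with_entities_and_chunks":
            f"MATCH (a:Entity)-[r1:ENTITY_TYPE]-(b:Entity)-[r2:HAS_ENTITY]-(c:Chunk) "
            f"WHERE a.{prop} CONTAINS $text RETURN a, r1, b, r2, c",
    }


for name, query in entity_queries().items():
    print(f"{name}: {query}")
```

The `$text` parameter would be set to a value such as `PERSON` or `amendment` when the query is run.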