Ingest your files into Unstructured from Kafka.

The requirements are as follows.

To create or change a Kafka source connector, see the following examples.

# ...
from unstructured_client.models.shared import <Create|Update>SourceConnector
# ...
source_connector = <Create|Update>SourceConnector(
    name="<name>", # Create only.
    type="kafka-cloud", # Create only.
    config={
        "bootstrap_server": "<bootstrap-server>",
        "port": <port>,
        "group_id": "<group-id>",
        "kafka_api_key": "<kafka-api-key>",
        "secret": "<secret>",
        "topic": "<topic>",
        "num_messages_to_consume": <num-messages-to-consume>
    }
)

Replace the preceding placeholders as follows:

  • <name> (required) - A unique name for this connector.
  • <bootstrap-server> - The hostname of the bootstrap Kafka cluster to connect to.
  • <port> - The port number of the bootstrap Kafka cluster to connect to. The default is 9092 if not otherwise specified.
  • <group-id> - The ID of the consumer group. A consumer group lets a pool of consumers divide the consumption of data across topics and partitions. The default is default_group_id if not otherwise specified.
  • <kafka-api-key> - For authentication, the API key for access to the cluster.
  • <secret> - For authentication, the secret for access to the cluster.
  • <topic> - The name of the topic on the cluster to read messages from.
  • <num-messages-to-consume> - The maximum number of messages that the consumer will try to consume. The default is 100 if not otherwise specified.
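
As an illustration of how the placeholders and their documented defaults fit together, the following is a minimal sketch of a helper that assembles the config dictionary. The helper name build_kafka_source_config, the sample values, and the range checks are assumptions made for this example; they are not part of the Unstructured client. Only the key names and the three default values (9092, default_group_id, 100) come from the list above.

```python
from typing import Any, Dict

# Defaults as stated in the placeholder descriptions above.
DEFAULT_PORT = 9092
DEFAULT_GROUP_ID = "default_group_id"
DEFAULT_NUM_MESSAGES = 100


def build_kafka_source_config(
    bootstrap_server: str,
    kafka_api_key: str,
    secret: str,
    topic: str,
    port: int = DEFAULT_PORT,
    group_id: str = DEFAULT_GROUP_ID,
    num_messages_to_consume: int = DEFAULT_NUM_MESSAGES,
) -> Dict[str, Any]:
    """Hypothetical helper: return a dict shaped like the `config` argument.

    The range checks below are local sanity checks added for this sketch,
    not rules documented for the connector.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    if num_messages_to_consume < 1:
        raise ValueError("num_messages_to_consume must be at least 1")
    return {
        "bootstrap_server": bootstrap_server,
        "port": port,
        "group_id": group_id,
        "kafka_api_key": kafka_api_key,
        "secret": secret,
        "topic": topic,
        "num_messages_to_consume": num_messages_to_consume,
    }


# Sample values are placeholders for illustration only.
config = build_kafka_source_config(
    bootstrap_server="broker.example.com",
    kafka_api_key="my-api-key",
    secret="my-secret",
    topic="my-topic",
)
```

The resulting dictionary could then be passed as the config argument in the connector example. Writing the defaults out explicitly makes it easier to see which values a connector will use, at the cost of duplicating defaults that apply anyway when the settings are omitted.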

To change a connector, replace <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.