# Kafka

<Note>
  First time creating a connector? [Read this first](/pipelines/connector-first-time-reqs).
</Note>

Ingest your files into Unstructured from Kafka.

The requirements are as follows.

* A Kafka cluster in [Confluent Cloud](https://www.confluent.io/confluent-cloud).
  ([Create a cluster](https://docs.confluent.io/cloud/current/clusters/create-cluster.html#create-ak-clusters).)

  The following video shows how to set up a Kafka cluster in Confluent Cloud:

  <iframe width="560" height="315" src="https://www.youtube.com/embed/zcKJ96J4Xvk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

* The [hostname and port number](https://docs.confluent.io/cloud/current/clusters/create-cluster.html#view-a-ak-cluster) of the bootstrap Kafka cluster to connect to.

* The name of the topic to read messages from or write messages to on the cluster.
  [Create a topic](https://docs.confluent.io/cloud/current/client-apps/topics/index.html#create-topics).
  [Access available topics](https://docs.confluent.io/cloud/current/client-apps/topics/index.html#create-topics).

* For authentication, an [API key and secret](https://docs.confluent.io/cloud/current/security/authenticate/workload-identities/service-accounts/api-keys/manage-api-keys.html#add-an-api-key).
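Before entering these values into Unstructured, it can help to see how they fit together in an ordinary Kafka client configuration. The following is a minimal sketch, not part of the Unstructured product: it assumes the `confluent-kafka` Python client's configuration property names (`bootstrap.servers`, `security.protocol`, `sasl.mechanisms`, and so on), and the function name `build_consumer_config`, the hostname, and the credentials are hypothetical placeholders.

```python
def build_consumer_config(host, port, api_key, secret, group_id="default_group_id"):
    """Assemble client settings from the values listed in the requirements.

    The keys are configuration properties as used by the confluent-kafka
    client (an assumption about your tooling, not something Unstructured requires).
    """
    return {
        # Hostname and port number of the bootstrap Kafka cluster.
        "bootstrap.servers": f"{host}:{port}",
        # Confluent Cloud API keys are typically presented over SASL/PLAIN with TLS.
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,  # the API key
        "sasl.password": secret,   # the secret for the API key
        # Consumer group that this client joins.
        "group.id": group_id,
    }


# Placeholder values for illustration only; substitute your own.
config = build_consumer_config(
    host="broker.example.com",
    port=9092,
    api_key="MY_API_KEY",
    secret="MY_SECRET",
)
print(config["bootstrap.servers"])

# If confluent-kafka is installed, the dictionary could be passed to a consumer:
#   from confluent_kafka import Consumer
#   consumer = Consumer(config)
```

If a client built this way can connect and list the topic, the same hostname, port, API key, and secret should be the ones to supply to the connector.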

To create the source connector:

1. On the sidebar, click **Connectors**.
2. Click **Sources**.
3. Click **New** or **Create Connector**.
4. Give the connector a unique **Name**.
5. In the **Provider** area, click **Kafka**.
6. Click **Continue**.
7. Follow the on-screen instructions to fill in the fields as described later on this page.
8. Click **Save and Test**.

Fill in the following fields:

* **Name** (*required*): A unique name for this connector.
* **Bootstrap Server** (*required*): The hostname of the bootstrap Kafka cluster to connect to.
* **Port**: The port number of the cluster.
* **Group ID**: The ID of the consumer group, if any, that is associated with the target Kafka cluster.
  (A consumer group is a way to allow a pool of consumers to divide the consumption of data
  over topics and partitions.) The default is `default_group_id` if not otherwise specified.
* **Topic** (*required*): The unique name of the topic to read messages from or write messages to on the cluster.
* **Number of messages to consume**: The maximum number of messages to get from the topic. The default is `100` if not otherwise specified.
* **Batch Size**: The maximum number of messages to send in a single batch. The default is `100` if not otherwise specified.
* **API Key** (*required*): The Kafka API key value.
* **Secret** (*required*): The secret value for the Kafka API key.
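The required fields and defaults listed above can be summarized in a short sketch. This is an illustration only, not Unstructured's implementation: the key names (`bootstrap_server`, `num_messages_to_consume`, and so on) and the helper `resolve_connector_fields` are hypothetical and simply mirror the field labels on this page.

```python
# Fields marked as required on this page (key names are hypothetical).
REQUIRED_FIELDS = ("name", "bootstrap_server", "topic", "api_key", "secret")

# Defaults stated on this page for the optional fields.
DEFAULTS = {
    "group_id": "default_group_id",
    "num_messages_to_consume": 100,
    "batch_size": 100,
}


def resolve_connector_fields(fields):
    """Check the required fields and fill in the documented defaults."""
    missing = [key for key in REQUIRED_FIELDS if not fields.get(key)]
    if missing:
        raise ValueError("Missing required fields: " + ", ".join(missing))
    # Ignore values explicitly left as None so that the defaults apply.
    provided = {key: value for key, value in fields.items() if value is not None}
    return {**DEFAULTS, **provided}


# Placeholder values for illustration only.
resolved = resolve_connector_fields(
    {
        "name": "my-kafka-source",
        "bootstrap_server": "broker.example.com",
        "port": 9092,
        "topic": "my-topic",
        "api_key": "MY_API_KEY",
        "secret": "MY_SECRET",
    }
)
print(resolved["group_id"], resolved["num_messages_to_consume"], resolved["batch_size"])
```

Any optional field that is supplied explicitly takes precedence over the default in this sketch.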

## Learn more

* <Icon icon="blog" />  [Unstructured Platform Now Integrates with Apache Kafka in Confluent Cloud](https://unstructured.io/blog/unstructured-platform-now-integrates-with-apache-kafka-in-confluent-cloud)
