Connect Discord to your preprocessing pipeline, and use unstructured-ingest to batch process your documents and store the structured outputs locally on your filesystem.

Make sure to have the Discord dependencies installed:

Shell
pip install "unstructured-ingest[discord]"

To ingest the contents of Discord channels, you need to supply the following information:

  • token: an authentication token used to access the Discord API
  • channels: a list of Discord channel IDs to ingest from

Optionally, you can use the period argument to set how many days of channel history to ingest.
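
As a rough illustration of how the token, channels, and period settings might fit together on the command line, the sketch below assembles a command and prints it as a dry run. It assumes the settings map to flags named --token, --channels, and --period, which this page does not confirm. The channel ID, the placeholder token, and the RUN_INGEST switch are hypothetical values introduced only for this example. Other options, such as where outputs are written and how to pass more than one channel ID, are left out and should be checked against the CLI help.

```shell
#!/bin/sh
# Sketch only: the flag names below are assumed from the argument names
# on this page; confirm them with `unstructured-ingest discord --help`.

# Placeholder values; replace with your own bot token and channel ID.
DISCORD_TOKEN="${DISCORD_TOKEN:-your-bot-token}"
CHANNEL_ID="12345678"   # hypothetical channel ID
PERIOD_DAYS="7"         # optional; leave empty to skip the period setting

# Build the argument list, adding the period only when it is set.
set -- discord --token "$DISCORD_TOKEN" --channels "$CHANNEL_ID"
if [ -n "$PERIOD_DAYS" ]; then
  set -- "$@" --period "$PERIOD_DAYS"
fi

# RUN_INGEST is a hypothetical switch for this sketch, not a CLI feature.
if [ "${RUN_INGEST:-0}" = "1" ]; then
  unstructured-ingest "$@"
else
  # Dry run: show the command that would run (note: this prints the token).
  DRY_RUN_CMD="unstructured-ingest $*"
  echo "$DRY_RUN_CMD"
fi
```

Reading the token from an environment variable keeps it out of the script itself and out of version control. Set RUN_INGEST=1 only after confirming the flags and adding any required output options.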

For a full list of the options the Unstructured Ingest CLI accepts, run unstructured-ingest discord --help.