- The MLK Archive Research Assistant website, hosted by Unstructured, which lets you ask natural-language questions about these documents and get natural-language answers in a chatbot-style format. Each answer is backed by source citations, including confidence scores and page references.
- A notebook, developed by Unstructured and hosted in Google Colab, which demonstrates how to use a Jupyter notebook to access the same resources as the preceding MLK Archive Research Assistant website.
- A GitHub repository that contains additional details and source code for the preceding resources.
- A raw dataset website, also hosted by Unstructured, that contains links to all 6,301 released PDF source files (243,496 pages of content) and 1 MP3 audio source file, for your own use. Unstructured has also processed these PDF source files and made the output available through this website as a single, downloadable 12.9 GB JSON Lines file.
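
Because the processed output in the last item is a single 12.9 GB JSON Lines file, you will likely want to read it one line at a time instead of loading it into memory all at once. The following is a minimal sketch of that approach in Python, using only the standard library. The function names (`iter_jsonl`, `count_by_key`), the file name in the usage comment, and the `type` key are illustrative assumptions and are not taken from the dataset's documentation; check a few records from the actual file to see which keys it contains.

```python
import json
from collections import Counter
from typing import Iterator


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line.

    The file is streamed line by line, so memory use stays small
    even when the file itself is many gigabytes.
    """
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def count_by_key(path: str, key: str = "type") -> Counter:
    """Count how often each value of `key` appears across all records.

    The default key "type" is an assumption for illustration only;
    records without the key are counted under "<missing>".
    """
    counts: Counter = Counter()
    for record in iter_jsonl(path):
        if isinstance(record, dict):
            counts[str(record.get(key, "<missing>"))] += 1
    return counts


# Example usage (hypothetical file name; replace with your downloaded file):
# print(count_by_key("mlk-archive-output.jsonl").most_common(10))
```

Streaming with a generator keeps the same code usable for a quick look at the first few records (for example, by wrapping `iter_jsonl` in `itertools.islice`) and for a full pass over the file.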