The legacy of the Reverend Dr. Martin Luther King, Jr. is one of courage, justice, and transformation. Historical records surrounding the assassination of Dr. King on April 4, 1968, were declassified on July 21, 2025, by the U.S. National Archives and Records Administration. These documents provide important insights into a pivotal moment in American history and the civil rights movement.
Unstructured has published several resources that allow you to explore these documents with techniques such as retrieval-augmented generation (RAG) and to use them in your own applications. These resources include:
  • The MLK Archive Research Assistant website, hosted by Unstructured, which allows you to ask natural-language questions about these documents and get back natural-language answers in a chatbot-style format. Each answer is supported by source citations, including confidence scores and page references.
  • A notebook, developed by Unstructured and hosted in Google Colab, which demonstrates how to access, from a Jupyter notebook, the same resources that the MLK Archive Research Assistant website uses.
  • A GitHub repository that contains additional details and source code for the preceding resources.
  • A raw dataset website, also hosted by Unstructured, that contains links to all 6,301 released PDF source files, spanning 243,496 pages of content, and one MP3 audio source file, for your own use. Additionally, Unstructured has processed these PDF source files and made the processed output available through this website as a single, downloadable 12.9 GB JSON Lines file.
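Because the processed output is a single 12.9 GB JSON Lines file, loading it into memory all at once is likely to be impractical on most machines. The following is a minimal sketch, not an official Unstructured example, of reading such a file one line at a time in Python. It assumes that each non-blank line is a standalone JSON object and that records may carry a `type` key; both the key name and the function names `iter_records` and `count_by_key` are illustrative assumptions, so check them against the actual file before relying on them.

```python
import json
from collections import Counter
from typing import Any, Dict, Iterator


def iter_records(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-blank line.

    The file is read lazily, so memory use stays roughly constant
    regardless of the file's total size.
    """
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"Invalid JSON on line {line_number}: {exc}"
                ) from exc
            if isinstance(record, dict):
                yield record


def count_by_key(path: str, key: str = "type") -> Counter:
    """Count records by the value of `key`.

    The default key name "type" is an assumption about the record
    layout, not a documented guarantee for this dataset.
    """
    counts: Counter = Counter()
    for record in iter_records(path):
        counts[str(record.get(key, "<missing>"))] += 1
    return counts
```

For example, after downloading the file to a local path of your choosing, calling `count_by_key` with that path returns a `Counter` that summarizes how many records carry each value of the chosen key. Streaming line by line trades some speed for a small, predictable memory footprint, which is usually the right choice for a file of this size.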
For questions, feedback, or bug reports about any of the Unstructured-provided resources, post an issue on GitHub. You can also view and track open issues. For questions or feedback about the historical records themselves, contact the National Archives.