Wikipedia
Connect Wikipedia to your preprocessing pipeline and use unstructured-ingest to batch process your documents, storing the structured outputs locally on your filesystem.
First, install the Wikipedia dependencies as shown here.
Provide the page-title to ingest the text from.
For a full list of the options the Unstructured Ingest CLI accepts, run unstructured-ingest wikipedia --help.
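As a rough illustration of the steps above, the following Python sketch assembles a Wikipedia ingest command as an argument list and prints it without running anything. The helper name build_wikipedia_command is hypothetical, and the example page title and output directory are placeholders. The --page-title option comes from the text above; the --output-dir flag is an assumption that may differ between unstructured-ingest versions, so confirm the exact option names with unstructured-ingest wikipedia --help before relying on it.

```python
import shlex
from typing import List


def build_wikipedia_command(page_title: str, output_dir: str) -> List[str]:
    """Return the argument list for a Wikipedia ingest run (nothing is executed).

    Hypothetical helper for illustration; it is not part of unstructured-ingest.
    """
    if not page_title.strip():
        raise ValueError("page_title must not be empty")
    return [
        "unstructured-ingest",
        "wikipedia",
        "--page-title", page_title,  # the page to ingest text from
        "--output-dir", output_dir,  # assumed flag name; verify with --help
    ]


if __name__ == "__main__":
    # Placeholder values, chosen only for the example.
    cmd = build_wikipedia_command("Open Source Software", "wikipedia-ingest-output")
    # Print a shell-safe rendering of the command for copy and paste.
    print(shlex.join(cmd))
```

Building the command as a list keeps a page title that contains spaces as a single argument. To actually run it, the same list could be passed to subprocess.run(cmd, check=True) once the Wikipedia dependencies are installed.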