Connect Wikipedia to your preprocessing pipeline, and use unstructured-ingest to batch process your documents and store the structured outputs locally on your filesystem.

First, install the Wikipedia dependencies as shown here.

pip install "unstructured-ingest[wikipedia]"

Then pass the title of the Wikipedia page to ingest with the --page-title option.

#!/usr/bin/env bash

unstructured-ingest \
  wikipedia \
    --page-title "Open Source Software" \
    --output-dir "$LOCAL_FILE_OUTPUT_DIR" \
    --num-processes 2 \
    --verbose \
    --strategy hi_res
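
The command above reads its output location from the LOCAL_FILE_OUTPUT_DIR environment variable, so that variable needs a value before the run. The following is a minimal sketch of one way to prepare it and to check the results afterward. The fallback folder name ./wikipedia-ingest-output and the helper count_json_outputs are hypothetical, and the sketch assumes the structured outputs are written as .json files under the output directory.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: fall back to a local folder when the variable is not already set.
export LOCAL_FILE_OUTPUT_DIR="${LOCAL_FILE_OUTPUT_DIR:-./wikipedia-ingest-output}"

# Create the output directory up front so the path is known to exist.
mkdir -p "$LOCAL_FILE_OUTPUT_DIR"

# Hypothetical helper: count the .json files found anywhere under a directory.
count_json_outputs() {
  find "$1" -type f -name '*.json' | wc -l | tr -d ' '
}

# Run this part again after the ingest command finishes to see what was written.
echo "Output directory: $LOCAL_FILE_OUTPUT_DIR"
echo "JSON files found: $(count_json_outputs "$LOCAL_FILE_OUTPUT_DIR")"
```

Run the first half before the ingest command and the final two lines after it. A count of zero after the run suggests that nothing was written to the directory you expected.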

For a full list of the options that the Unstructured Ingest CLI accepts, run unstructured-ingest wikipedia --help.