Biomed
Connect Biomed to your preprocessing pipeline, and batch process all your documents using unstructured-ingest to store structured outputs locally on your filesystem.
This connector lets you extract biomedical documents from the supported FTP directories.
Make sure the Biomed dependencies are installed before running the connector.
You need to provide the path from which the documents should be downloaded. For example, to download the documents in the path https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/07/, set the path parameter to oa_pdf/07/.
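To make the URL-to-path mapping concrete, here is a minimal sketch that derives the path value from a full FTP URL. The helper biomed_path_from_url and the variable PMC_BASE_URL are hypothetical names introduced for illustration and are not part of unstructured-ingest. The sketch also assumes that everything before oa_pdf/ in the example URL is a fixed base prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of unstructured-ingest): derive the value for
# the path parameter from a full FTP URL by stripping the base prefix.

# Assumption: the part of the example URL before "oa_pdf/..." is a fixed base.
PMC_BASE_URL="https://ftp.ncbi.nlm.nih.gov/pub/pmc/"

biomed_path_from_url() {
  local url="$1"
  case "$url" in
    "$PMC_BASE_URL"*)
      # Strip the base prefix and keep the relative directory.
      printf '%s\n' "${url#"$PMC_BASE_URL"}"
      ;;
    *)
      # Anything outside the assumed base is rejected.
      echo "error: URL does not start with $PMC_BASE_URL" >&2
      return 1
      ;;
  esac
}

biomed_path="$(biomed_path_from_url "https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/07/")"
echo "$biomed_path"   # prints oa_pdf/07/
```

The resulting value is what you would supply as the path parameter. The exact option name the CLI expects is not shown here, so confirm it with unstructured-ingest biomed --help before using it.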
For a full list of the options the Unstructured Ingest CLI accepts, check unstructured-ingest biomed --help.