Biomed
Connect Biomed to your preprocessing pipeline, and batch process all your documents using unstructured-ingest to store structured outputs locally on your filesystem.
This connector allows you to extract Biomedical documents from the supported FTP directories:
Make sure to have the Biomed dependencies installed:
You need to provide the path from which the documents should be downloaded. For example, to download the documents in the path https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/07/, set the path parameter to oa_pdf/07/
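As a small illustration of how the path value relates to the full URL, the sketch below strips the base URL from the directory URL. It assumes the base is https://ftp.ncbi.nlm.nih.gov/pub/pmc/, which is inferred from the single example above; the variable names are hypothetical.

```shell
# Base URL of the FTP tree (assumed from the example above).
base_url="https://ftp.ncbi.nlm.nih.gov/pub/pmc/"
# Full URL of the directory you want to download.
full_url="https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/07/"

# Strip the base URL prefix; the remainder is the value for the path parameter.
biomed_path="${full_url#"$base_url"}"
echo "$biomed_path"   # prints: oa_pdf/07/
```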
Make sure to set the --partition-by-api flag and pass in your API key with --api-key:
Additionally, if you’re using the Unstructured Serverless API, your locally deployed Unstructured API, or an Unstructured API deployed on Azure or AWS, you also need to specify the API URL via the --partition-endpoint argument.
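The flags described above might be combined roughly as follows. This is a dry-run sketch that only prints the command instead of executing it. The biomed subcommand name and the --path spelling are assumptions, as are the environment variable names UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL; check the help output of your installed unstructured-ingest version for the exact syntax.

```shell
# Hypothetical environment variables; replace with your own values.
api_key="${UNSTRUCTURED_API_KEY:-YOUR_API_KEY}"
api_url="${UNSTRUCTURED_API_URL:-}"

# Start with the API-related flags named in the text above.
# The "biomed" subcommand and "--path" spelling are assumptions.
set -- unstructured-ingest biomed \
  --path "oa_pdf/07/" \
  --partition-by-api \
  --api-key "$api_key"

# Add the endpoint only when a custom API URL is configured.
if [ -n "$api_url" ]; then
  set -- "$@" --partition-endpoint "$api_url"
fi

# Dry run: print the assembled command rather than running it.
ingest_cmd="$*"
echo "$ingest_cmd"
```

Leaving UNSTRUCTURED_API_URL unset omits --partition-endpoint, which matches the case where no custom API URL is required.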