For those interested in extracting information from unstructured PDF documents, HTML pages, plain text, and even document images (TIFF, PNG, scanned PDF):
Check out our REST API documentation: https://github.com/quantxt/api-docs
Try out the API on RapidAPI: https://rapidapi.com/quantxt-inc-theia/api/document-parser-and-extraction