DocWire DocToText - Powered by Silvercoders 5.0.5
A multifaceted data extraction software development toolkit that converts many kinds of files to plain text and HTML. Written in C++, it includes a parser able to convert PST and OST files, along with a brand new API for better file processing. DocToText can be integrated with other data mining and data analytics applications. It comes equipped with a high-grade, scriptable and trainable OCR engine that uses LSTM neural networks for character recognition. The document parser extracts metadata along with annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
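The format list above can be turned into a quick pre-check before a file is handed to the toolkit. The sketch below is a minimal illustration using only the C++ standard library; `format_family` is a hypothetical helper and is not part of the DocToText API. It assumes that formats are recognized by file extension alone, which may not match how the library itself detects them.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper, NOT part of the DocToText API: maps a file name's
// extension to the format family named in the description above.
// Returns "unknown" when the extension is missing or not in the list.
std::string format_family(const std::string& file_name)
{
    static const std::map<std::string, std::string> families = {
        {"doc", "DOC"}, {"xls", "XLS"}, {"xlsb", "XLSB"}, {"ppt", "PPT"},
        {"rtf", "RTF"},
        {"odt", "ODF"}, {"ods", "ODF"}, {"odp", "ODF"},
        {"docx", "OOXML"}, {"xlsx", "OOXML"}, {"pptx", "OOXML"},
        {"pages", "iWork"}, {"numbers", "iWork"}, {"keynote", "iWork"},
        {"fodp", "ODFXML"}, {"fods", "ODFXML"}, {"fodt", "ODFXML"},
        {"pdf", "PDF"}, {"eml", "EML"}, {"html", "HTML"},
        {"pst", "Outlook"}, {"ost", "Outlook"},
        {"jpg", "Image"}, {"jpeg", "Image"}, {"jfif", "Image"},
        {"bmp", "Image"}, {"pnm", "Image"}, {"png", "Image"},
        {"tiff", "Image"}, {"webp", "Image"},
        {"dcm", "DICOM"}
    };

    // Take everything after the last dot as the extension.
    const std::size_t dot = file_name.find_last_of('.');
    if (dot == std::string::npos || dot + 1 == file_name.size())
        return "unknown";

    // Compare case-insensitively by lowercasing the extension.
    std::string ext = file_name.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const auto it = families.find(ext);
    return it == families.end() ? "unknown" : it->second;
}
```

For example, `format_family("report.DOCX")` yields `"OOXML"` and `format_family("notes.txt")` yields `"unknown"`, so a caller could skip unsupported files before attempting extraction.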
File List
Here is a list of all documented files with brief descriptions:
  examples
 example_1.cpp
 example_2.cpp
 example_3.cpp
 example_4.cpp
 example_5.cpp
 example_6.cpp
 example_7.cpp
 example_8.cpp
 example_9.cpp
 doctotext_c_api.h  Contains the C API for the DocToText software
 exception.h
 exporter.h
 formatting_style.h
 html_writer.h
 importer.h
 meta_data_writer.h
 metadata.h
 parser.h
 parser_builder.h
 parser_manager.h
 parser_parameters.h
 parser_provider.h
 parser_wrapper.h
 parsing_chain.h
 plain_text_writer.h
 simple_extractor.h
 standard_filter.h
 transformer.h