DocWire DocToText - Powered by Silvercoders 5.0.5
A multifaceted data extraction software development toolkit that converts many kinds of files to plain text and HTML. Written in C++, it includes a parser that can convert PST and OST files, along with a new API for better file processing. DocToText can be integrated with other data mining and data analytics applications. It comes with a high-grade, scriptable and trainable OCR engine that uses LSTM neural networks for character recognition. The document parser can extract metadata and annotations, and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
Class Hierarchy

This inheritance list is sorted roughly, but not completely, alphabetically:
 std::exception
   doctotext::Exception
     doctotext::EncryptedFileException
 doctotext::Exporter: Exports the parsed data from an importer or transformer to an output stream
   doctotext::HtmlExporter: Exporter class for HTML output
   doctotext::MetaDataExporter: Exporter class for metadata. Important: exports only the metadata, as plain text
   doctotext::PlainTextExporter: Exporter class for plain text output
 doctotext::FormattingStyle
 doctotext::Importer: The Importer class. Imports a file and parses it using the available parsers
 doctotext::Info
 doctotext::ListStyle
 doctotext::Metadata
 doctotext::Parser: Abstract class for all parsers
   CustomParser
   doctotext::ParserWrapper< ParserType >
 doctotext::parser_creator< ParserType >
 doctotext::ParserBuilder
   CustomParserBuilder
   doctotext::ParserBuilderWrapper< ParserCreator >: Provides the basic mechanism to build any parser
 doctotext::ParserManager: Parser manager class. Loads all available parsers and provides access to them
 doctotext::ParserParameters: Stores a list of parser parameters. Every parser can query ParserParameters for a specific parameter. For example, OCRParser queries ParserParameters for a language. Every parser holds a ParserParameters object and recursively passes it on to other parsers
 doctotext::ParserProvider: The ParserProvider class
   CustomParserProvider
 doctotext::ParsingChain: Wrapper for all defined steps of the parsing process
 doctotext::SimpleExtractor: Basic functionality for extracting text from a document
 doctotext::StandardFilter: Set of standard filters to use in parsers
 doctotext::StandardTag: Contains a set of basic tags used in parsers
 doctotext::Transformer: Transforms data from an Importer or from another Transformer
   doctotext::TransformerFunc: Wraps a single function (doctotext::NewNodeCallback) in a Transformer object
 doctotext::wrapper_parser_creator< ParserType >
 Writer
   doctotext::HtmlWriter: The HtmlWriter class
   doctotext::MetaDataWriter: Writes the metadata of the document as plain text to an output stream
   doctotext::PlainTextWriter