DocWire DocToText - Powered by Silvercoders 5.0.5
A data extraction software development toolkit that converts a wide range of file formats to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. DocToText can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine that uses LSTM neural networks for character recognition. The parser extracts metadata and annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::ParserManager Class Reference

Parser manager class. Loads all available parsers and provides access to them.

#include <parser_manager.h>

Public Member Functions

 ParserManager (const std::string &plugins_directory)
 
std::optional< ParserBuilder * > findParserByExtension (const std::string &file_name) const
 Returns parser builder for given extension type or nullopt if no parser is found.
 
std::optional< ParserBuilder * > findParserByData (const std::vector< char > &buffer) const
 Returns parser builder for given raw data or nullopt if no parser is found.
 
std::set< std::string > getAvailableExtensions () const
 Returns the file extensions handled by all available parsers.
 

Detailed Description

Parser manager class. Loads all available parsers and provides access to them.

Examples
example_5.cpp.

Definition at line 52 of file parser_manager.h.

Constructor & Destructor Documentation

◆ ParserManager()

doctotext::ParserManager::ParserManager ( const std::string &  plugins_directory)
explicit
Parameters
plugins_directory    location of the plugins to be loaded

Member Function Documentation

◆ findParserByData()

std::optional< ParserBuilder * > doctotext::ParserManager::findParserByData ( const std::vector< char > &  buffer) const

Returns parser builder for given raw data or nullopt if no parser is found.

Parameters
buffer    buffer of raw data
Returns
specific parser builder or nullopt if no parser is found
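
Because findParserByData() takes a std::vector< char >, the caller has to load the raw bytes first. The following is a minimal sketch of one way to do that with the standard library only; readFileToBuffer is a hypothetical helper name and is not part of DocToText.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of DocToText): reads a whole file into a
// byte buffer of the type that findParserByData() accepts.
// Returns an empty vector if the file cannot be opened.
std::vector<char> readFileToBuffer(const std::string& path)
{
    // Binary mode avoids newline translation on platforms that perform it.
    std::ifstream input(path, std::ios::binary);
    if (!input)
        return {};

    // Copy every byte from the stream into the vector.
    return std::vector<char>(std::istreambuf_iterator<char>(input),
                             std::istreambuf_iterator<char>());
}
```

The resulting buffer can then be passed to findParserByData(), and the returned std::optional should be checked before it is dereferenced, since nullopt is returned when no parser matches.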

◆ findParserByExtension()

std::optional< ParserBuilder * > doctotext::ParserManager::findParserByExtension ( const std::string &  file_name) const

Returns parser builder for given extension type or nullopt if no parser is found.

Parameters
file_name    file name with extension (e.g. ".txt", ".docx")
Returns
specific parser builder or nullopt if no parser is found
Examples
example_5.cpp.
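
The return type std::optional< ParserBuilder * > means the caller must handle the "no parser found" case explicitly. The sketch below illustrates that pattern with stand-in types in a sketch namespace: sketch::ParserBuilder and sketch::ParserManager are hypothetical mock-ups that copy only the documented signature, and the lookup logic inside them is an assumption made for the example, not the library's implementation.

```cpp
#include <map>
#include <optional>
#include <string>

namespace sketch {

// Hypothetical stand-in for doctotext::ParserBuilder.
struct ParserBuilder {
    std::string name;
};

// Hypothetical stand-in that mirrors only the documented signature of
// doctotext::ParserManager::findParserByExtension().
class ParserManager {
public:
    ParserManager()
    {
        m_builders[".txt"] = &m_txt;
        m_builders[".docx"] = &m_docx;
    }
    ParserManager(const ParserManager&) = delete;
    ParserManager& operator=(const ParserManager&) = delete;

    std::optional<ParserBuilder*> findParserByExtension(const std::string& file_name) const
    {
        // Assumed behavior for this mock: match on the text after the last dot.
        const auto dot = file_name.rfind('.');
        if (dot == std::string::npos)
            return std::nullopt;
        const auto it = m_builders.find(file_name.substr(dot));
        if (it == m_builders.end())
            return std::nullopt;
        return it->second;
    }

private:
    ParserBuilder m_txt{"txt"};
    ParserBuilder m_docx{"docx"};
    std::map<std::string, ParserBuilder*> m_builders;
};

// Caller-side pattern: test the optional (and the pointer) before use.
std::string describeParser(const ParserManager& manager, const std::string& file_name)
{
    const std::optional<ParserBuilder*> builder = manager.findParserByExtension(file_name);
    if (!builder || *builder == nullptr)
        return "no parser for " + file_name;
    return (*builder)->name + " parser";
}

} // namespace sketch
```

With the real library, the same check-then-dereference pattern would apply to the value returned by doctotext::ParserManager::findParserByExtension().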

◆ getAvailableExtensions()

std::set< std::string > doctotext::ParserManager::getAvailableExtensions ( ) const

Returns the file extensions handled by all available parsers.

Returns
set of extensions handled by the available parsers
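
Since the result is a std::set< std::string >, it can be iterated in sorted order or queried for membership. A minimal sketch of formatting such a set for display follows; joinExtensions is a hypothetical helper, and the sample values (including the leading dot) are assumptions rather than actual library output.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not part of DocToText): joins a set of extensions,
// such as the one returned by getAvailableExtensions(), into one string.
std::string joinExtensions(const std::set<std::string>& extensions)
{
    std::string joined;
    for (const std::string& extension : extensions) {
        if (!joined.empty())
            joined += ", ";
        joined += extension;
    }
    return joined;
}

// Membership test on the same kind of set.
bool isSupported(const std::set<std::string>& extensions, const std::string& extension)
{
    return extensions.count(extension) > 0;
}
```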

The documentation for this class was generated from the following file: