DocWire DocToText - Powered by Silvercoders 5.0.5
A multifaceted data extraction software development toolkit that converts many kinds of files to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. DocToText can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine that uses LSTM neural networks for character recognition. The document parser extracts metadata and annotations, and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::ParserWrapper< ParserType > Class Template Reference

Public Member Functions

 ParserWrapper (const std::string &file_name, const std::shared_ptr< doctotext::ParserManager > &inParserManager=nullptr)
 
 ParserWrapper (const char *buffer, size_t size, const std::shared_ptr< doctotext::ParserManager > &inParserManager=nullptr)
 
void parse () const override
 Executes text parsing.
 
Parser & withParameters (const ParserParameters &parameters) override
 
void setParserManager (const std::shared_ptr< doctotext::ParserManager > &inParserManager)
 
- Public Member Functions inherited from doctotext::Parser
 Parser (const std::shared_ptr< doctotext::ParserManager > &inParserManager=nullptr)
 
virtual void parse () const =0
 Executes text parsing.
 
virtual Parser & addOnNewNodeCallback (NewNodeCallback callback)
 Registers a function to be called each time a new node is created. A node is a part of the parsed text; what it represents depends on the kind of parser, for example an email from a PST file or a page from a PDF file.
 
virtual Parser & withParameters (const ParserParameters &parameters)
 

Additional Inherited Members

- Protected Member Functions inherited from doctotext::Parser
FormattingStyle getFormattingStyle () const
 Loads FormattingStyle from ParserParameters.
 
std::ostream & getLogOutStream () const
 
bool isVerboseLogging () const
 
Info sendTag (const std::string &tag_name, const std::string &text="", const std::map< std::string, std::any > &attributes={}) const
 
Info sendTag (const Info &info) const
 
- Protected Attributes inherited from doctotext::Parser
std::shared_ptr< doctotext::ParserManager > m_parser_manager
 
ParserParameters m_parameters
 

Detailed Description

template<typename ParserType>
class doctotext::ParserWrapper< ParserType >

Definition at line 48 of file parser_wrapper.h.
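
The page gives no prose description of the template, so the following is only a minimal sketch of how a wrapper with this shape could work. It assumes the wrapper stores its input and builds a ParserType when parse() is called. Everything in namespace wrapper_sketch is hypothetical: ParserManager is an empty stand-in, NewNodeCallback is simplified to a function taking a string, and EchoParser with its plainText() method is invented for illustration. The actual implementation in parser_wrapper.h may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace wrapper_sketch {

// Empty stand-in for doctotext::ParserManager (assumption).
struct ParserManager {};

// Simplified callback type; the real NewNodeCallback may have another signature.
using NewNodeCallback = std::function<void(const std::string &)>;

// Reduced version of the Parser interface listed on this page.
class Parser {
public:
    explicit Parser(const std::shared_ptr<ParserManager> &manager = nullptr)
        : m_parser_manager(manager) {}
    virtual ~Parser() = default;

    virtual void parse() const = 0;

    // Returns *this so that registrations can be chained.
    virtual Parser &addOnNewNodeCallback(NewNodeCallback callback) {
        m_callbacks.push_back(std::move(callback));
        return *this;
    }

protected:
    // Hypothetical helper: passes one node's text to every registered callback.
    void notify(const std::string &text) const {
        for (const auto &callback : m_callbacks) callback(text);
    }
    std::shared_ptr<ParserManager> m_parser_manager;

private:
    std::vector<NewNodeCallback> m_callbacks;
};

// Wrapper that defers construction of ParserType until parse() is called.
template <typename ParserType>
class ParserWrapper : public Parser {
public:
    explicit ParserWrapper(const std::string &file_name,
                           const std::shared_ptr<ParserManager> &manager = nullptr)
        : Parser(manager), m_file_name(file_name), m_from_buffer(false) {}

    // Copies the buffer (an assumption; the real class may only keep the pointer).
    ParserWrapper(const char *buffer, std::size_t size,
                  const std::shared_ptr<ParserManager> &manager = nullptr)
        : Parser(manager), m_from_buffer(true) {
        assert(buffer != nullptr);
        m_buffer.assign(buffer, size);
    }

    void parse() const override {
        if (m_from_buffer) {
            notify(ParserType(m_buffer.data(), m_buffer.size()).plainText());
        } else {
            notify(ParserType(m_file_name).plainText());
        }
    }

    void setParserManager(const std::shared_ptr<ParserManager> &manager) {
        m_parser_manager = manager;
    }

private:
    std::string m_file_name;
    std::string m_buffer;
    bool m_from_buffer;
};

// Hypothetical concrete parser used only to exercise the wrapper.
class EchoParser {
public:
    explicit EchoParser(const std::string &file_name) : m_text("file:" + file_name) {}
    EchoParser(const char *buffer, std::size_t size) : m_text(buffer, size) {}
    std::string plainText() const { return m_text; }

private:
    std::string m_text;
};

}  // namespace wrapper_sketch
```

In this sketch, a caller would construct `wrapper_sketch::ParserWrapper<wrapper_sketch::EchoParser>` from either a file name or a buffer, register a callback with addOnNewNodeCallback(), and then call parse() to receive the text through that callback.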

Constructor & Destructor Documentation

◆ ParserWrapper() [1/2]

template<typename ParserType >
doctotext::ParserWrapper< ParserType >::ParserWrapper ( const std::string &  file_name,
const std::shared_ptr< doctotext::ParserManager > &  inParserManager = nullptr 
)
inline explicit

Definition at line 51 of file parser_wrapper.h.

◆ ParserWrapper() [2/2]

template<typename ParserType >
doctotext::ParserWrapper< ParserType >::ParserWrapper ( const char *  buffer,
size_t  size,
const std::shared_ptr< doctotext::ParserManager > &  inParserManager = nullptr 
)
inline

Definition at line 56 of file parser_wrapper.h.

Member Function Documentation

◆ parse()

template<typename ParserType >
void doctotext::ParserWrapper< ParserType >::parse ( ) const
inline override virtual

Executes text parsing.

Implements doctotext::Parser.

Definition at line 61 of file parser_wrapper.h.

◆ setParserManager()

template<typename ParserType >
void doctotext::ParserWrapper< ParserType >::setParserManager ( const std::shared_ptr< doctotext::ParserManager > &  inParserManager)
inline

Definition at line 75 of file parser_wrapper.h.

◆ withParameters()

template<typename ParserType >
Parser & doctotext::ParserWrapper< ParserType >::withParameters ( const ParserParameters &  parameters)
inline override virtual

Reimplemented from doctotext::Parser.

Definition at line 67 of file parser_wrapper.h.
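
Because withParameters() returns a Parser reference, calls can presumably be chained before parse(). The sketch below illustrates that pattern under stated assumptions: ParserParameters is modelled as a plain string map, successive calls are merged rather than replaced, and RecordingParser is a hypothetical subclass. None of this is taken from parser_wrapper.h.

```cpp
#include <map>
#include <string>

namespace params_sketch {

// Hypothetical stand-in for doctotext::ParserParameters.
using ParserParameters = std::map<std::string, std::string>;

class Parser {
public:
    virtual ~Parser() = default;
    virtual void parse() const = 0;

    // Merges the given parameters (assumed behaviour) and returns *this for chaining.
    virtual Parser &withParameters(const ParserParameters &parameters) {
        for (const auto &entry : parameters) m_parameters[entry.first] = entry.second;
        return *this;
    }

protected:
    ParserParameters m_parameters;
};

// Hypothetical parser that records which parameters were visible at parse time.
class RecordingParser : public Parser {
public:
    void parse() const override { m_seen = m_parameters; }

    // Overrides the base version, as ParserWrapper does, and forwards to it.
    Parser &withParameters(const ParserParameters &parameters) override {
        Parser::withParameters(parameters);
        return *this;
    }

    const ParserParameters &seen() const { return m_seen; }

private:
    mutable ParserParameters m_seen;  // mutable because parse() is const
};

}  // namespace params_sketch
```

With these stand-ins, `parser.withParameters(a).withParameters(b).parse()` runs the parse with the entries of both maps applied.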


The documentation for this class was generated from the following file: