DocWire DocToText - Powered by Silvercoders 5.0.5
A data extraction software development toolkit that converts many kinds of files to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. DocToText can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine whose character recognition is based on LSTM neural networks. The document parser can extract metadata and annotations, and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::ParserBuilder Class Reference (abstract)

#include <parser_builder.h>

Public Member Functions

virtual std::unique_ptr< Parser > build (const std::string &inFileName) const =0
 Builds new parser object.
 
virtual std::unique_ptr< Parser > build (const char *buffer, size_t size) const =0
 Builds new parser object.
 
virtual ParserBuilder & withLogStream (std::ostream *log_stream)=0
 Sets log stream for parser.
 
virtual ParserBuilder & withVerboseLogging (bool verbose)=0
 Turns on/off verbose logging.
 
virtual ParserBuilder & withOnNewNodeCallbacks (const std::vector< NewNodeCallback > &callbacks)=0
 Adds callback function.
 
virtual ParserBuilder & withParserManager (const std::shared_ptr< ParserManager > &inParserManager)=0
 Sets parser manager.
 
virtual ParserBuilder & withParameters (const ParserParameters &inParameters)=0
 Sets parser parameters.
 

Detailed Description

Abstract class for building parsers. A parser can be built either from a path to a file or from an in-memory data buffer.

Examples
example_9.cpp.

Definition at line 50 of file parser_builder.h.
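
Because every with...() method returns a reference to the builder, configuration calls can be chained before build(). The sketch below illustrates that pattern with stand-in types: the sketch namespace, EchoParser, EchoParserBuilder, source(), and the log message are all hypothetical and only mirror the signatures listed on this page. It does not include or call the real library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Stand-in types for illustration only. They mirror part of the interface
// documented on this page and are NOT the real doctotext declarations.
namespace sketch {

class Parser {
public:
    virtual ~Parser() = default;
    virtual std::string source() const = 0;  // hypothetical accessor
};

class ParserBuilder {
public:
    virtual ~ParserBuilder() = default;
    virtual std::unique_ptr<Parser> build(const std::string& inFileName) const = 0;
    virtual std::unique_ptr<Parser> build(const char* buffer, std::size_t size) const = 0;
    virtual ParserBuilder& withLogStream(std::ostream* log_stream) = 0;
    virtual ParserBuilder& withVerboseLogging(bool verbose) = 0;
};

// Hypothetical parser that only remembers where its input came from.
class EchoParser : public Parser {
public:
    explicit EchoParser(std::string source) : m_source(std::move(source)) {}
    std::string source() const override { return m_source; }
private:
    std::string m_source;
};

// Hypothetical concrete builder; each setter returns *this to allow chaining.
class EchoParserBuilder : public ParserBuilder {
public:
    std::unique_ptr<Parser> build(const std::string& inFileName) const override {
        trace("building parser for " + inFileName);
        return std::make_unique<EchoParser>("file:" + inFileName);
    }
    std::unique_ptr<Parser> build(const char* buffer, std::size_t size) const override {
        assert(buffer != nullptr || size == 0);
        trace("building parser for " + std::to_string(size) + " bytes");
        return std::make_unique<EchoParser>("buffer:" + std::to_string(size));
    }
    ParserBuilder& withLogStream(std::ostream* log_stream) override {
        m_log = log_stream;
        return *this;
    }
    ParserBuilder& withVerboseLogging(bool verbose) override {
        m_verbose = verbose;
        return *this;
    }
private:
    // Writes a message only when verbose logging is on and a stream is set.
    void trace(const std::string& message) const {
        if (m_verbose && m_log != nullptr) *m_log << message << '\n';
    }
    std::ostream* m_log = nullptr;
    bool m_verbose = false;
};

}  // namespace sketch

// Configures the builder through chained setters, then builds from a path.
std::string buildAndDescribe(sketch::ParserBuilder& builder,
                             const std::string& path, std::ostream& log) {
    std::unique_ptr<sketch::Parser> parser =
        builder.withLogStream(&log).withVerboseLogging(true).build(path);
    return parser->source();
}

// Usage: capture the log in a string stream and build from a made-up path.
std::string demoLog() {
    std::ostringstream log;
    sketch::EchoParserBuilder builder;
    buildAndDescribe(builder, "report.pdf", log);
    return log.str();
}
```

Code written against the abstract ParserBuilder reference, as in buildAndDescribe, does not need to know which concrete builder it was handed.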

Member Function Documentation

◆ build() [1/2]

virtual std::unique_ptr< Parser > doctotext::ParserBuilder::build ( const char *  buffer, size_t  size ) const
pure virtual

Builds new parser object.

Parameters
buffer    raw data of file to be parsed
size      file size
Returns
pointer to new parser object

Implemented in doctotext::ParserBuilderWrapper< ParserCreator >.
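
The buffer overload takes a raw pointer and a byte count, so a caller holding the data in a standard container can pass data() and size() directly. The following is a minimal sketch under stated assumptions: the sketch namespace, the byteCount field, and CountingParserBuilder are hypothetical stand-ins, not the library's types. This page does not say whether the parser copies the buffer, so the sketch only assumes the buffer is valid for the duration of the build() call.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in types for illustration only; NOT the real doctotext declarations.
namespace sketch {

struct Parser {
    std::size_t byteCount = 0;  // hypothetical field: how many bytes were handed over
};

class ParserBuilder {
public:
    virtual ~ParserBuilder() = default;
    virtual std::unique_ptr<Parser> build(const char* buffer, std::size_t size) const = 0;
};

// Hypothetical builder that records the size of the buffer it received.
class CountingParserBuilder : public ParserBuilder {
public:
    std::unique_ptr<Parser> build(const char* buffer, std::size_t size) const override {
        // A null pointer is only acceptable for an empty buffer.
        assert(buffer != nullptr || size == 0);
        auto parser = std::make_unique<Parser>();
        parser->byteCount = size;
        return parser;
    }
};

}  // namespace sketch

// Passes the contents of an in-memory byte vector to the buffer overload.
std::unique_ptr<sketch::Parser> parseBytes(const sketch::ParserBuilder& builder,
                                           const std::vector<char>& bytes) {
    return builder.build(bytes.data(), bytes.size());
}
```

Returning std::unique_ptr makes ownership explicit: the caller owns the new parser and it is destroyed automatically when the pointer goes out of scope.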

◆ build() [2/2]

virtual std::unique_ptr< Parser > doctotext::ParserBuilder::build ( const std::string &  inFileName) const
pure virtual

Builds new parser object.

Parameters
inFileName    path to file
Returns
pointer to new parser object

Implemented in doctotext::ParserBuilderWrapper< ParserCreator >.

◆ withLogStream()

virtual ParserBuilder & doctotext::ParserBuilder::withLogStream ( std::ostream *  log_stream)
pure virtual

Sets log stream for parser.

Parameters
log_stream

Implemented in doctotext::ParserBuilderWrapper< ParserCreator >.

◆ withOnNewNodeCallbacks()

virtual ParserBuilder & doctotext::ParserBuilder::withOnNewNodeCallbacks ( const std::vector< NewNodeCallback > &  callbacks)
pure virtual

Adds callback function.

Parameters
callbacks

◆ withParameters()

virtual ParserBuilder & doctotext::ParserBuilder::withParameters ( const ParserParameters &  inParameters)
pure virtual

Sets parser parameters.

Parameters
inParameters

Implemented in doctotext::ParserBuilderWrapper< ParserCreator >.

◆ withParserManager()

virtual ParserBuilder & doctotext::ParserBuilder::withParserManager ( const std::shared_ptr< ParserManager > &  inParserManager)
pure virtual

Sets parser manager.

Parameters
inParserManager

◆ withVerboseLogging()

virtual ParserBuilder & doctotext::ParserBuilder::withVerboseLogging ( bool  verbose)
pure virtual

Turns on/off verbose logging.

Parameters
verbose

Implemented in doctotext::ParserBuilderWrapper< ParserCreator >.


The documentation for this class was generated from the following file:

parser_builder.h