DocWire DocToText - Powered by Silvercoders 5.0.5
DocToText is a data extraction software development toolkit that converts many file formats to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. It can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine that uses LSTM neural networks for character recognition. The document parser extracts metadata and annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::Importer Class Reference

The Importer class. This class is used to import a file and parse it using available parsers.

#include <importer.h>

Public Member Functions

 Importer (const ParserParameters &parameters=ParserParameters(), const std::shared_ptr< ParserManager > &parser_manager=std::make_shared< ParserManager >())
 
 Importer (const std::string &file_name, const ParserParameters &parameters=ParserParameters(), const std::shared_ptr< ParserManager > &parser_manager=std::make_shared< ParserManager >())
 
 Importer (std::istream &input_stream, const ParserParameters &parameters=ParserParameters(), const std::shared_ptr< ParserManager > &parser_manager=std::make_shared< ParserManager >())
 
 Importer (const Importer &other)
 
Importer & operator= (const Importer &other)
 
void set_input_stream (std::istream &input_stream)
 Sets a new input stream to parse.
 
bool is_valid () const
 Checks whether the Importer contains valid input data (a path to a file or a stream).
 
void add_callback (const NewNodeCallback &callback)
 Adds a callback. Callbacks execute when the parser returns a new node.
 
void add_parameters (const ParserParameters &parameters)
 Adds parser parameters.
 
void process () const
 Starts parsing process.
 
void disconnect_all ()
 Disconnects all listeners.
 

Detailed Description

The Importer class. This class is used to import a file and parse it using available parsers.

Importer("file.pdf", ParserParameters(), parser_manager) | HtmlExporter() | std::cout; // Imports file.pdf and exports it to std::cout as HTML
See also
Parser
Examples
example_1.cpp, example_2.cpp, example_3.cpp, and example_4.cpp.

Definition at line 56 of file importer.h.
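
The pipe-style chaining used in the example above can be illustrated with a minimal, self-contained sketch. It does not use or reproduce the doctotext implementation: `ToyImporter`, `ToyHtmlExporter`, `Rendered`, and `render_to_string` are hypothetical stand-ins that only mimic the shape of `importer | exporter | stream` by overloading `operator|` twice.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins, NOT the doctotext classes. They only mimic the
// shape of `Importer(...) | HtmlExporter() | std::cout`.
namespace sketch {

// Holds the text that a real importer would obtain by parsing a file.
struct ToyImporter {
    std::string text;
};

// Marker type that selects HTML output.
struct ToyHtmlExporter {};

// Result of `importer | exporter`: the text already rendered as HTML.
struct Rendered {
    std::string html;
};

// First pipe stage: wrap the imported text in a minimal HTML body.
inline Rendered operator|(const ToyImporter &importer, const ToyHtmlExporter &) {
    return Rendered{"<html><body><p>" + importer.text + "</p></body></html>"};
}

// Second pipe stage: write the rendered document to any output stream.
inline std::ostream &operator|(const Rendered &rendered, std::ostream &out) {
    return out << rendered.html;
}

// Helper for testing: run the whole chain into a string instead of std::cout.
inline std::string render_to_string(const std::string &text) {
    std::ostringstream out;
    ToyImporter{text} | ToyHtmlExporter{} | out;
    return out.str();
}

} // namespace sketch
```

Because `operator|` is left-associative, `sketch::ToyImporter{"hello"} | sketch::ToyHtmlExporter{} | std::cout;` first combines the importer with the exporter and then sends the result to the stream, which is the reading order the chained expression suggests.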

Constructor & Destructor Documentation

◆ Importer() [1/3]

doctotext::Importer::Importer ( const ParserParameters &  parameters = ParserParameters(),
const std::shared_ptr< ParserManager > &  parser_manager = std::make_shared< ParserManager >() 
)
explicit
Parameters
parameters  parser parameters
parser_manager  pointer to the parser manager

◆ Importer() [2/3]

doctotext::Importer::Importer ( const std::string &  file_name,
const ParserParameters &  parameters = ParserParameters(),
const std::shared_ptr< ParserManager > &  parser_manager = std::make_shared< ParserManager >() 
)
Parameters
file_name  name of the file to parse
parameters  parser parameters
parser_manager  pointer to the parser manager

◆ Importer() [3/3]

doctotext::Importer::Importer ( std::istream &  input_stream,
const ParserParameters &  parameters = ParserParameters(),
const std::shared_ptr< ParserManager > &  parser_manager = std::make_shared< ParserManager >() 
)
Parameters
input_stream  input stream to parse
parameters  parser parameters
parser_manager  pointer to the parser manager

Member Function Documentation

◆ add_callback()

void doctotext::Importer::add_callback ( const NewNodeCallback &  callback)

Adds callback. Callbacks will execute when parser returns new node.

Parameters
callback  callback to execute when the parser returns a new node
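
The register, notify, and disconnect behaviour described for callbacks can be sketched with a small stand-in class. This is an illustration under stated assumptions, not the library's code: the real `NewNodeCallback` signature is not shown in this reference, so the hypothetical `ToyNodeCallback` models a node as a plain string, and `ToyImporter` and `collect_nodes` are invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical callback type. The real NewNodeCallback signature is not
// documented here, so a node is modelled as a plain string.
using ToyNodeCallback = std::function<void(const std::string &node)>;

// Hypothetical stand-in for an importer that reports nodes to callbacks.
class ToyImporter {
public:
    explicit ToyImporter(std::vector<std::string> nodes) : nodes_(std::move(nodes)) {}

    // Registers one more callback; earlier callbacks stay registered.
    void add_callback(const ToyNodeCallback &callback) { callbacks_.push_back(callback); }

    // Drops every registered callback.
    void disconnect_all() { callbacks_.clear(); }

    // "Parses" by walking the stored nodes and notifying every callback per node.
    void process() const {
        for (const auto &node : nodes_)
            for (const auto &callback : callbacks_)
                callback(node);
    }

private:
    std::vector<std::string> nodes_;
    std::vector<ToyNodeCallback> callbacks_;
};

// Collects the nodes reported during one process() call, then disconnects
// the temporary callback so it cannot outlive the local vector it writes to.
inline std::vector<std::string> collect_nodes(ToyImporter &importer) {
    std::vector<std::string> seen;
    importer.add_callback([&seen](const std::string &node) { seen.push_back(node); });
    importer.process();
    importer.disconnect_all();
    return seen;
}

} // namespace sketch
```

The sketch disconnects the callback before returning because the lambda captures a local variable by reference; any callback that refers to short-lived state needs the same care.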

◆ add_parameters()

void doctotext::Importer::add_parameters ( const ParserParameters &  parameters)

Adds parser parameters.

Parameters
parameters  parser parameters

◆ is_valid()

bool doctotext::Importer::is_valid ( ) const

Check if Importer contains valid input data (path to file or stream).

Returns
true if valid

◆ set_input_stream()

void doctotext::Importer::set_input_stream ( std::istream &  input_stream)

Sets new input stream to parse.

Parameters
input_stream  new input stream to parse
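
How an object might track "a path to a file or a stream" as its input, and report validity accordingly, can be sketched as follows. The class `ToyInput`, its `read_all` helper, and the validity rule are all assumptions made for illustration; this reference does not state how `Importer` stores its input or what `is_valid()` checks internally.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical stand-in that holds either a file name or a stream.
class ToyInput {
public:
    ToyInput() = default;
    explicit ToyInput(const std::string &file_name) : file_name_(file_name) {}
    explicit ToyInput(std::istream &input_stream) : stream_(&input_stream) {}

    // Replaces the current source with a stream. Only the address is stored,
    // so the caller must keep the stream alive while it is in use.
    void set_input_stream(std::istream &input_stream) {
        stream_ = &input_stream;
        file_name_.clear();
    }

    // Assumed rule: valid when a non-empty file name or a stream is present.
    bool is_valid() const { return !file_name_.empty() || stream_ != nullptr; }

    // Reads the whole stream; returns an empty string if no stream is set.
    std::string read_all() const {
        if (stream_ == nullptr) return {};
        std::ostringstream buffer;
        buffer << stream_->rdbuf();
        return buffer.str();
    }

private:
    std::string file_name_;
    std::istream *stream_ = nullptr;
};

} // namespace sketch
```

Taking the stream by reference, as the documented signature does, suggests the caller keeps ownership of it; the sketch makes that explicit by storing a non-owning pointer.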

The documentation for this class was generated from the following file: