DocWire DocToText - Powered by Silvercoders 5.0.5
A data extraction software development toolkit that converts many file types to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. DocToText can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine that uses LSTM neural networks for character recognition. The document parser extracts metadata and annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
The Importer class. This class is used to import a file and parse it using available parsers.
#include <importer.h>
Public Member Functions
| Importer (const ParserParameters &parameters=ParserParameters(), const std::shared_ptr< ParserManager > &parser_manager=std::make_shared< ParserManager >()) | |
| Importer (const std::string &file_name, const ParserParameters &parameters=ParserParameters(), const std::shared_ptr< ParserManager > &parser_manager=std::make_shared< ParserManager >()) | |
| Importer (std::istream &input_stream, const ParserParameters &parameters=ParserParameters(), const std::shared_ptr< ParserManager > &parser_manager=std::make_shared< ParserManager >()) | |
| Importer (const Importer &other) | |
| Importer & | operator= (const Importer &other) |
| void | set_input_stream (std::istream &input_stream) |
| Sets new input stream to parse. | |
| bool | is_valid () const |
| Checks whether the Importer contains valid input data (path to file or stream). | |
| void | add_callback (const NewNodeCallback &callback) |
| Adds a callback. Callbacks execute when the parser returns a new node. | |
| void | add_parameters (const ParserParameters &parameters) |
| Adds parser parameters. | |
| void | process () const |
| Starts parsing process. | |
| void | disconnect_all () |
| Disconnects all listeners. | |
Definition at line 56 of file importer.h.
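The member functions listed above suggest a call order of construct, validate, register callbacks, then process. The following is a minimal sketch of that order, not the library's implementation: `sketch::Importer` is a hypothetical stand-in that mirrors only the documented member names, its `NewNodeCallback` is assumed to take a string (the real type is defined by the library), and it "parses" by emitting one node per input line.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical callback type; the library defines the real NewNodeCallback.
using NewNodeCallback = std::function<void(const std::string &node)>;

// Stand-in that mirrors the documented member names only.
// It treats every input line as one "node", which is an assumption.
class Importer {
public:
    Importer() = default;
    explicit Importer(std::istream &input_stream) : stream_(&input_stream) {}

    void set_input_stream(std::istream &input_stream) { stream_ = &input_stream; }
    bool is_valid() const { return stream_ != nullptr; }
    void add_callback(const NewNodeCallback &callback) { callbacks_.push_back(callback); }
    void disconnect_all() { callbacks_.clear(); }

    // Emits each line to every registered callback.
    void process() const {
        assert(is_valid());
        std::string line;
        while (std::getline(*stream_, line))
            for (const auto &callback : callbacks_) callback(line);
    }

private:
    std::istream *stream_ = nullptr;  // not owned; must outlive the stand-in
    std::vector<NewNodeCallback> callbacks_;
};

} // namespace sketch

// Construct, validate, register a callback, then process.
std::vector<std::string> collect_nodes(std::istream &input) {
    std::vector<std::string> nodes;
    sketch::Importer importer(input);
    if (!importer.is_valid()) return nodes;
    importer.add_callback([&nodes](const std::string &node) { nodes.push_back(node); });
    importer.process();
    return nodes;
}
```

Calling `collect_nodes` with a `std::istringstream` returns the collected lines. With the real class, the constructor would additionally accept `ParserParameters` and a `ParserManager`, as shown in the signatures.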
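| doctotext::Importer::Importer | ( | const ParserParameters & | parameters = ParserParameters(), |
| | | const std::shared_ptr< ParserManager > & | parser_manager = std::make_shared< ParserManager >() |
| | ) | | explicit |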
| parameters | parser parameters |
| parser_manager | pointer to the parser manager |
| doctotext::Importer::Importer | ( | const std::string & | file_name, |
| | | const ParserParameters & | parameters = ParserParameters(), |
| | | const std::shared_ptr< ParserManager > & | parser_manager = std::make_shared< ParserManager >() |
| | ) | | |
| file_name | name of the file to parse |
| parameters | parser parameters |
| parser_manager | pointer to the parser manager |
| doctotext::Importer::Importer | ( | std::istream & | input_stream, |
| | | const ParserParameters & | parameters = ParserParameters(), |
| | | const std::shared_ptr< ParserManager > & | parser_manager = std::make_shared< ParserManager >() |
| | ) | | |
| input_stream | input stream to parse |
| parameters | parser parameters |
| parser_manager | pointer to the parser manager |
| void doctotext::Importer::add_callback | ( | const NewNodeCallback & | callback | ) |
Adds callback. Callbacks will execute when parser returns new node.
| callback | callback to execute when the parser returns a new node |
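To illustrate how several callbacks might observe the same nodes, here is a small sketch of the bookkeeping only. `sketch::CallbackList`, its `emit` function, and the string-based `NewNodeCallback` are hypothetical. Invoking callbacks in registration order is also an assumption, since the documentation does not state the order.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical callback type; the library defines the real NewNodeCallback.
using NewNodeCallback = std::function<void(const std::string &node)>;

// Stand-in for the callback bookkeeping behind add_callback/disconnect_all.
class CallbackList {
public:
    void add_callback(const NewNodeCallback &callback) {
        assert(callback);  // an empty std::function would throw when invoked
        callbacks_.push_back(callback);
    }
    void disconnect_all() { callbacks_.clear(); }

    // Hypothetical helper: called once per node a parser would return.
    void emit(const std::string &node) const {
        for (const auto &callback : callbacks_) callback(node);
    }

private:
    std::vector<NewNodeCallback> callbacks_;
};

} // namespace sketch

struct NodeStats {
    int count = 0;
    std::size_t characters = 0;
};

// Registers two independent callbacks, feeds them nodes, then disconnects.
NodeStats count_nodes(const std::vector<std::string> &nodes) {
    NodeStats stats;
    sketch::CallbackList list;
    list.add_callback([&stats](const std::string &) { ++stats.count; });
    list.add_callback([&stats](const std::string &node) { stats.characters += node.size(); });
    for (const auto &node : nodes) list.emit(node);
    list.disconnect_all();
    list.emit("ignored");  // no callbacks remain, so stats are unchanged
    return stats;
}
```

Each callback keeps its own concern (counting nodes, counting characters), which is the main benefit of registering several callbacks rather than one large one.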
| void doctotext::Importer::add_parameters | ( | const ParserParameters & | parameters | ) |
Adds parser parameters.
| parameters | parser parameters |
| bool doctotext::Importer::is_valid | ( | ) | const |
Check if Importer contains valid input data (path to file or stream).
| void doctotext::Importer::set_input_stream | ( | std::istream & | input_stream | ) |
Sets new input stream to parse.
| input_stream | new input stream to parse |
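As a sketch of how `is_valid` and `set_input_stream` might combine, the example below reuses one importer for two streams. It relies on a hypothetical stand-in, `sketch::Importer`, with a string-based `NewNodeCallback` and line-per-node parsing. Whether the real class can be processed repeatedly after `set_input_stream` is an assumption, not something the documentation confirms.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical callback type; the library defines the real NewNodeCallback.
using NewNodeCallback = std::function<void(const std::string &node)>;

// Stand-in mirroring the documented member names; one line equals one node.
class Importer {
public:
    void set_input_stream(std::istream &input_stream) { stream_ = &input_stream; }
    bool is_valid() const { return stream_ != nullptr; }
    void add_callback(const NewNodeCallback &callback) { callbacks_.push_back(callback); }

    void process() const {
        assert(is_valid());
        std::string line;
        while (std::getline(*stream_, line))
            for (const auto &callback : callbacks_) callback(line);
    }

private:
    std::istream *stream_ = nullptr;  // not owned; must outlive the stand-in
    std::vector<NewNodeCallback> callbacks_;
};

} // namespace sketch

// Parses two streams with the same importer and the same callback.
std::vector<std::string> parse_both(std::istream &first, std::istream &second) {
    std::vector<std::string> nodes;
    sketch::Importer importer;  // no input yet, so is_valid() is false here
    importer.add_callback([&nodes](const std::string &node) { nodes.push_back(node); });

    importer.set_input_stream(first);
    if (importer.is_valid()) importer.process();

    importer.set_input_stream(second);  // swap the input, keep the callbacks
    if (importer.is_valid()) importer.process();

    return nodes;
}
```

Because the stand-in stores only a pointer to the stream, the caller must keep each stream alive until `process` returns. The documentation does not say whether the library has the same requirement, so treating it that way is the conservative choice.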