DocWire DocToText - Powered by Silvercoders 5.0.5
A multifaceted data extraction software development toolkit that converts many kinds of files to plain text and HTML. Written in C++, it includes a parser able to convert PST and OST files and a new API for better file processing. DocToText can be integrated with other data mining and data analytics applications. It comes equipped with a high-grade, scriptable and trainable OCR engine that uses LSTM neural network based character recognition. The document parser can extract metadata and annotations, and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::Transformer Class Reference (abstract)

The Transformer transforms data from Importer or from another Transformer.

#include <transformer.h>

Public Member Functions

virtual Transformer * clone () const =0
 Creates clone of the transformer.
 
virtual void transform (doctotext::Info &info) const =0
 Transforms document from importer.
 

Detailed Description

The Transformer transforms data from Importer or from another Transformer.

#include <algorithm> // std::reverse

// Callback that reverses the text held in the Info structure
auto reverse_text = [](doctotext::Info &info) {
  std::reverse(info.plain_text.begin(), info.plain_text.end());
};
TransformerFunc transformer(reverse_text); // wrap the callback in a Transformer
Importer(parser_manager, "test.pdf") | transformer | PlainTextExporter() | std::cout; // print the reversed text of the PDF file

Classes used in the example:

Importer: imports a file and parses it using the available parsers. Definition: importer.h:57
PlainTextExporter: exporter class for plain text output. Definition: exporter.h:137
TransformerFunc: wraps a single function (doctotext::NewNodeCallback) into a Transformer object. Definition: transformer.h:87

Definition at line 58 of file transformer.h.
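
Besides wrapping a callback, the interface can presumably be implemented directly by deriving from it. The following is a minimal sketch of that idea, not the library's own code. To keep it self-contained it declares stand-ins for doctotext::Info and doctotext::Transformer in a hypothetical namespace sketch_subclass; the Info stand-in models only the plain_text member used in the example above, and UpperCaseTransformer and apply() are illustrative names that do not exist in DocToText.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

namespace sketch_subclass {

// Stand-in for doctotext::Info. Assumption: only the plain_text member
// used by the example above is modelled here.
struct Info {
    std::string plain_text;
};

// Stand-in that mirrors the two pure virtual members documented on this page.
class Transformer {
public:
    virtual ~Transformer() = default;
    virtual Transformer* clone() const = 0;
    virtual void transform(Info& info) const = 0;
};

// Hypothetical concrete transformer: upper-cases the text in place.
class UpperCaseTransformer : public Transformer {
public:
    Transformer* clone() const override {
        return new UpperCaseTransformer(*this);
    }

    void transform(Info& info) const override {
        std::transform(info.plain_text.begin(), info.plain_text.end(),
                       info.plain_text.begin(), [](unsigned char c) {
                           return static_cast<char>(std::toupper(c));
                       });
    }
};

// Helper for trying a transformer on a single string.
inline std::string apply(const Transformer& t, std::string text) {
    Info info{std::move(text)};
    t.transform(info);
    return info.plain_text;
}

}  // namespace sketch_subclass
```

In real code the class would derive from doctotext::Transformer and take doctotext::Info instead of the stand-ins, and would be placed in the pipeline where the example above uses its TransformerFunc object.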

Member Function Documentation

◆ clone()

virtual Transformer * doctotext::Transformer::clone ( ) const
pure virtual

Creates clone of the transformer.

Returns
new transformer

Implemented in doctotext::TransformerFunc.
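
A clone() member of this shape is commonly used to copy an object through a base-class reference without knowing its concrete type. The sketch below illustrates that pattern under two assumptions that this page does not state: that the caller owns the pointer returned by clone(), and that a clone carries the same state as the original. The namespace sketch_clone, the stand-in Info and Transformer types, SuffixTransformer and Chain are all hypothetical names introduced for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch_clone {

// Stand-ins for the library types, declared so the sketch compiles alone.
struct Info {
    std::string plain_text;
};

class Transformer {
public:
    virtual ~Transformer() = default;
    virtual Transformer* clone() const = 0;
    virtual void transform(Info& info) const = 0;
};

// Hypothetical stateful transformer: appends a fixed suffix to the text.
class SuffixTransformer : public Transformer {
public:
    explicit SuffixTransformer(std::string suffix) : suffix_(std::move(suffix)) {}

    Transformer* clone() const override { return new SuffixTransformer(*this); }

    void transform(Info& info) const override { info.plain_text += suffix_; }

private:
    std::string suffix_;
};

// Keeps its own copies of the transformers it is given.
class Chain {
public:
    void add(const Transformer& t) {
        // Assumption: the caller owns the pointer returned by clone(),
        // so it is wrapped in a unique_ptr immediately.
        steps_.push_back(std::unique_ptr<Transformer>(t.clone()));
    }

    std::string run(std::string text) const {
        Info info{std::move(text)};
        for (const auto& step : steps_) {
            step->transform(info);
        }
        return info.plain_text;
    }

private:
    std::vector<std::unique_ptr<Transformer>> steps_;
};

// Builds a chain from temporaries; the clones outlive the originals.
inline std::string demo() {
    Chain chain;
    chain.add(SuffixTransformer("-a"));
    chain.add(SuffixTransformer("-b"));
    return chain.run("text");
}

}  // namespace sketch_clone
```

Because Chain stores clones rather than references, the temporaries passed to add() can be destroyed safely before run() is called.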

◆ transform()

virtual void doctotext::Transformer::transform ( doctotext::Info & info) const
pure virtual

Transforms document from importer.

Parameters
info  structure from callback function

Implemented in doctotext::TransformerFunc.
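
One way an implementation of transform() can work is to forward the call to a stored callable, which is roughly what a function-wrapping transformer has to do. The sketch below shows that idea with stand-in types; it is not the source of doctotext::TransformerFunc. The namespace sketch_func, FunctionTransformer and reverse_demo() are hypothetical names, and the Info stand-in models only the plain_text member.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <utility>

namespace sketch_func {

// Stand-ins for the library types, declared so the sketch compiles alone.
struct Info {
    std::string plain_text;
};

class Transformer {
public:
    virtual ~Transformer() = default;
    virtual Transformer* clone() const = 0;
    virtual void transform(Info& info) const = 0;
};

// Hypothetical wrapper: transform() forwards to the stored callable.
class FunctionTransformer : public Transformer {
public:
    using Callback = std::function<void(Info&)>;

    explicit FunctionTransformer(Callback callback)
        : callback_(std::move(callback)) {}

    Transformer* clone() const override {
        return new FunctionTransformer(*this);
    }

    void transform(Info& info) const override { callback_(info); }

private:
    Callback callback_;
};

// Reverses a string by passing it through a wrapped lambda.
inline std::string reverse_demo(std::string text) {
    FunctionTransformer reverse([](Info& info) {
        std::reverse(info.plain_text.begin(), info.plain_text.end());
    });
    Info info{std::move(text)};
    reverse.transform(info);
    return info.plain_text;
}

}  // namespace sketch_func
```

The info argument is taken by non-const reference, so the callable modifies the structure in place and nothing is returned.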


The documentation for this class was generated from the following file: transformer.h