DocWire DocToText - Powered by Silvercoders 5.0.5
A data extraction software development toolkit that converts many file types to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for better file processing. DocToText can be integrated with other data mining and data analytics applications. It comes with a scriptable, trainable OCR engine that uses LSTM neural network based character recognition. The document parser extracts metadata and annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::TransformerFunc Class Reference

Wraps a single function (doctotext::NewNodeCallback) into a Transformer object.

#include <transformer.h>


Public Member Functions

 TransformerFunc (doctotext::NewNodeCallback transformer_function)
 
 TransformerFunc (const TransformerFunc &other)
 
void transform (doctotext::Info &info) const override
 Executes the transform operation for the given node data.
 
TransformerFunc * clone () const override
 Creates a clone of the transformer.
 
Public Member Functions inherited from doctotext::Transformer

virtual Transformer * clone () const =0
 Creates a clone of the transformer.
 
virtual void transform (doctotext::Info &info) const =0
 Transforms a document from the importer.
 

Detailed Description

Wraps a single function (doctotext::NewNodeCallback) into a Transformer object.

auto reverse_text = [](doctotext::Info &info) {
  std::reverse(info.plain_text.begin(), info.plain_text.end()); // reverse the node text in place
}; // callback that reverses text
TransformerFunc transformer(reverse_text); // wrap the callback in a transformer
Importer(parser_manager, "test.pdf") | transformer | PlainTextExporter | std::cout; // reverse text in pdf file
Examples
example_3.cpp and example_4.cpp.

Definition at line 86 of file transformer.h.
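
The wrapping pattern described above can be sketched without the library. This is a minimal illustration, not the code in transformer.h: the `sketch` namespace, its reduced `Info` struct (only `plain_text`), and the helper `reverse_through_transformer` are hypothetical stand-ins, and `NewNodeCallback` is assumed to behave like a `std::function` taking `Info &`.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-ins for the library types. The real doctotext
// definitions have more members and may differ in detail.
namespace sketch {

struct Info {
    std::string plain_text;  // text of the node being processed
};

// Assumption: the callback type is a std::function over Info&.
using NewNodeCallback = std::function<void(Info&)>;

class Transformer {
public:
    virtual ~Transformer() = default;
    virtual void transform(Info& info) const = 0;
    virtual Transformer* clone() const = 0;
};

// Adapts a plain callback to the Transformer interface.
class TransformerFunc : public Transformer {
public:
    TransformerFunc(NewNodeCallback transformer_function)
        : callback_(std::move(transformer_function)) {}

    void transform(Info& info) const override { callback_(info); }

    TransformerFunc* clone() const override {
        return new TransformerFunc(*this);
    }

private:
    NewNodeCallback callback_;
};

}  // namespace sketch

// Hypothetical helper: runs one node's text through a reversing callback.
std::string reverse_through_transformer(const std::string& text) {
    auto reverse_text = [](sketch::Info& info) {
        std::reverse(info.plain_text.begin(), info.plain_text.end());
    };
    sketch::TransformerFunc transformer(reverse_text);
    sketch::Info info{text};
    transformer.transform(info);  // the callback edits info in place
    return info.plain_text;
}
```

Calling `reverse_through_transformer("abc")` hands the callback an `Info` holding "abc" and returns the text after the callback has modified it.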

Constructor & Destructor Documentation

◆ TransformerFunc()

doctotext::TransformerFunc::TransformerFunc ( doctotext::NewNodeCallback  transformer_function)
Parameters
transformer_function  callback function that will be called in transform(). It should modify the info structure.
See also
doctotext::Info
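
As a hedged illustration of what can be passed to the constructor: if `NewNodeCallback` is a `std::function`-style type (an assumption, not confirmed on this page), both free functions and capturing lambdas would be accepted. The `sketch` types and the `to_upper` and `apply_callbacks` helpers below are hypothetical, reduced stand-ins.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <utility>

// Hypothetical, reduced stand-ins for the library types.
namespace sketch {

struct Info {
    std::string plain_text;
};

// Assumption: the callback type is a std::function over Info&.
using NewNodeCallback = std::function<void(Info&)>;

class TransformerFunc {
public:
    TransformerFunc(NewNodeCallback transformer_function)
        : callback_(std::move(transformer_function)) {}

    void transform(Info& info) const { callback_(info); }

private:
    NewNodeCallback callback_;
};

}  // namespace sketch

// A free function used as a callback: upper-cases the node text.
void to_upper(sketch::Info& info) {
    for (char& c : info.plain_text) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
}

// Hypothetical helper: applies two differently built transformers in turn.
std::string apply_callbacks(std::string text) {
    sketch::Info info{std::move(text)};

    sketch::TransformerFunc upper(to_upper);  // free function

    const std::string suffix = "!";
    sketch::TransformerFunc append(           // lambda with a capture
        [suffix](sketch::Info& i) { i.plain_text += suffix; });

    upper.transform(info);
    append.transform(info);
    return info.plain_text;
}
```

In both cases the callback receives the `Info` by reference, so its changes are visible to whatever runs after it.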

Member Function Documentation

◆ clone()

TransformerFunc * doctotext::TransformerFunc::clone ( ) const
override virtual

Creates a clone of the transformer.

Returns
new transformer

Implements doctotext::Transformer.
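
The following sketch shows how a polymorphic `clone()` of this shape is commonly used: copying a transformer through a base-class reference without knowing its concrete type. This page does not say who owns the returned pointer; the sketch assumes the caller does and stores it in a `std::unique_ptr`. The `sketch` types and `clone_and_apply` are hypothetical stand-ins, not the library's definitions.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the library types.
namespace sketch {

struct Info {
    std::string plain_text;
};

using NewNodeCallback = std::function<void(Info&)>;

class Transformer {
public:
    virtual ~Transformer() = default;
    virtual void transform(Info& info) const = 0;
    virtual Transformer* clone() const = 0;
};

class TransformerFunc : public Transformer {
public:
    TransformerFunc(NewNodeCallback transformer_function)
        : callback_(std::move(transformer_function)) {}

    void transform(Info& info) const override { callback_(info); }

    // Covariant return type: narrower than Transformer*, still an override.
    TransformerFunc* clone() const override {
        return new TransformerFunc(*this);  // copies the stored callback
    }

private:
    NewNodeCallback callback_;
};

}  // namespace sketch

// Hypothetical helper: clones through the base interface, then uses the copy.
std::string clone_and_apply(const std::string& text) {
    sketch::TransformerFunc original(
        [](sketch::Info& info) { info.plain_text += "?"; });

    const sketch::Transformer& base = original;

    // Assumption: the caller owns the clone, so wrap it immediately.
    std::unique_ptr<sketch::Transformer> copy(base.clone());

    sketch::Info info{text};
    copy->transform(info);  // the copy carries the same callback
    return info.plain_text;
}
```

The copy behaves like the original because the copy constructor duplicates the stored callback.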

◆ transform()

void doctotext::TransformerFunc::transform ( doctotext::Info & info) const
override virtual

Executes the transform operation for the given node data.

See also
doctotext::Info
Parameters
info

Implements doctotext::Transformer.


The documentation for this class was generated from the following file: