DocWire DocToText - Powered by Silvercoders 5.0.5
A data extraction software development toolkit that converts many file formats to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. DocToText can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine that uses LSTM neural networks for character recognition. The document parser extracts metadata and annotations and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
exporter.h
/***************************************************************************************************************************************************/
/* DocToText - A multifaceted, data extraction software development toolkit that converts all sorts of files to plain text and html. */
/* Written in C++, this data extraction tool has a parser able to convert PST & OST files along with a brand new API for better file processing. */
/* To enhance its utility, DocToText, as a data extraction tool, can be integrated with other data mining and data analytics applications. */
/* It comes equipped with a high grade, scriptable and trainable OCR that has LSTM neural networks based character recognition. */
/* */
/* This document parser is able to extract metadata along with annotations and supports a list of formats that include: */
/* DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), */
/* PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM) */
/* */
/* Copyright (c) SILVERCODERS Ltd */
/* http://silvercoders.com */
/* */
/* Project homepage: */
/* http://silvercoders.com/en/products/doctotext */
/* https://www.docwire.io/ */
/* */
/* The GNU General Public License version 2 as published by the Free Software Foundation and found in the file COPYING.GPL permits */
/* the distribution and/or modification of this application. */
/* */
/* Please keep in mind that any attempt to circumvent the terms of the GNU General Public License by employing wrappers, pipelines, */
/* client/server protocols, etc. is illegal. You must purchase a commercial license if your program, which is distributed under a license */
/* other than the GNU General Public License version 2, directly or indirectly calls any portion of this code. */
/* Simply stop using the product if you disagree with this viewpoint. */
/* */
/* According to the terms of the license provided by SILVERCODERS and included in the file COPYING.COM, licensees in possession of */
/* a current commercial license for this product may use this file. */
/* */
/* This program is provided WITHOUT ANY WARRANTY, not even the implicit warranty of merchantability or fitness for a particular purpose. */
/* It is supplied in the hope that it will be useful. */
/***************************************************************************************************************************************************/

#ifndef EXPORTER_H
#define EXPORTER_H

#include <algorithm>
#include <memory>

#include "parser.h"
#include "parser_builder.h"
#include "parser_manager.h"
#include "parser_parameters.h"
#include "writer.h"
#include "defines.h"

namespace doctotext
{

class Importer;
class Transformer;

class DllExport Exporter
{
public:
    Exporter(std::unique_ptr<Writer> writer);

    Exporter(std::unique_ptr<Writer> writer, std::ostream &out_stream);

    Exporter(const Exporter &other);

    Exporter(const Exporter &&other);

    virtual ~Exporter();

    virtual Exporter* clone() const;

    void set_out_stream(std::ostream &out_stream);

    bool is_valid() const;

    void export_to(doctotext::Info &info) const;

    void begin() const;

    void end() const;

protected:
    std::ostream& get_output() const;

private:
    class Implementation;
    std::unique_ptr<Implementation> impl;
};

class DllExport HtmlExporter: public Exporter
{
public:
    HtmlExporter();
    HtmlExporter(std::ostream &out_stream);
};

class DllExport PlainTextExporter: public Exporter
{
public:
    PlainTextExporter(std::ostream &out_stream);

};

class DllExport MetaDataExporter: public Exporter
{
public:
    MetaDataExporter(std::ostream &out_stream);

};

} // namespace doctotext

#endif //EXPORTER_H
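The header hides the exporter's state behind `std::unique_ptr<Implementation>` and declares both a copy constructor and a virtual `clone()`. The sketch below shows one way a pimpl class can support deep copies under those constraints. `PimplExporter`, its `label` field, and `clone_is_independent` are hypothetical names made up for illustration; the real `Exporter` implementation is not part of this listing and may work differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a class with a pimpl and a virtual clone().
// It is NOT the DocToText Exporter.
class PimplExporter
{
public:
    explicit PimplExporter(std::string label);
    PimplExporter(const PimplExporter &other);   // deep-copies the hidden state
    virtual ~PimplExporter();                    // defined where Implementation is complete
    virtual PimplExporter* clone() const;        // caller takes ownership of the result
    const std::string& label() const;
    void set_label(const std::string &label);

private:
    class Implementation;                        // only declared here
    std::unique_ptr<Implementation> impl;
};

// In a real project everything below would usually live in the .cpp file.
class PimplExporter::Implementation
{
public:
    std::string label;
};

PimplExporter::PimplExporter(std::string label)
    : impl(new Implementation{std::move(label)}) {}

// std::unique_ptr is not copyable, so the copy constructor allocates a new
// Implementation and copies the pointed-to state.
PimplExporter::PimplExporter(const PimplExporter &other)
    : impl(new Implementation(*other.impl)) {}

PimplExporter::~PimplExporter() = default;

PimplExporter* PimplExporter::clone() const { return new PimplExporter(*this); }

const std::string& PimplExporter::label() const { return impl->label; }

void PimplExporter::set_label(const std::string &label) { impl->label = label; }

// Returns true when changing a clone leaves the original untouched.
bool clone_is_independent()
{
    PimplExporter original("html");
    std::unique_ptr<PimplExporter> copy(original.clone());
    copy->set_label("text");
    return original.label() == "html" && copy->label() == "text";
}
```

Because `clone()` returns a raw owning pointer, wrapping the result in a `std::unique_ptr` straight away, as `clone_is_independent` does, avoids leaks.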
Exporter class is responsible for exporting the parsed data from importer or transformer to an output...
Definition: exporter.h:59
virtual Exporter * clone() const
Creates a clone of this exporter.
bool is_valid() const
Checks whether the exporter has a valid output.
void export_to(doctotext::Info &info) const
Exports data from the Info structure to the output stream.
void end() const
Ends writing.
Exporter(std::unique_ptr< Writer > writer, std::ostream &out_stream)
void begin() const
Begins writing.
void set_out_stream(std::ostream &out_stream)
Sets output stream.
Exporter(std::unique_ptr< Writer > writer)
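The member briefs above suggest a call order: attach an output stream, call `begin()`, call `export_to()` once per piece of parsed data, then call `end()`. The sketch below mimics that order with stand-in types so that it compiles without the library. `StreamInfo`, `StreamWriter`, `LineWriter`, `StreamExporter`, the `[begin]`/`[end]` markers, and the reading of `is_valid()` as "a stream is attached" are all assumptions made for illustration, not DocToText behavior.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are the DocToText types.
struct StreamInfo { std::string text; };

class StreamWriter
{
public:
    virtual ~StreamWriter() = default;
    virtual void write(const StreamInfo &info, std::ostream &out) const = 0;
};

// Writes each piece of data on its own line.
class LineWriter : public StreamWriter
{
public:
    void write(const StreamInfo &info, std::ostream &out) const override
    {
        out << info.text << '\n';
    }
};

class StreamExporter
{
public:
    explicit StreamExporter(std::unique_ptr<StreamWriter> writer)
        : writer_(std::move(writer)), out_(nullptr) {}

    void set_out_stream(std::ostream &out) { out_ = &out; }

    // Assumed meaning for this sketch: valid once a stream is attached.
    bool is_valid() const { return out_ != nullptr; }

    void begin() const { *out_ << "[begin]\n"; }
    void export_to(const StreamInfo &info) const { writer_->write(info, *out_); }
    void end() const { *out_ << "[end]\n"; }

private:
    std::unique_ptr<StreamWriter> writer_;
    std::ostream *out_;
};

// Runs the full begin / export_to / end sequence and returns what was written.
std::string run_export_sketch()
{
    std::ostringstream out;
    StreamExporter exporter(std::unique_ptr<StreamWriter>(new LineWriter));
    if (!exporter.is_valid())
        exporter.set_out_stream(out);   // the stream is attached after construction
    exporter.begin();
    exporter.export_to(StreamInfo{"first chunk"});
    exporter.export_to(StreamInfo{"second chunk"});
    exporter.end();
    return out.str();
}
```

Keeping the stream separate from the writer is what lets the one-argument constructor create an exporter first and bind its output later through `set_out_stream()`.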
Exporter class for HTML output.
Definition: exporter.h:124
HtmlExporter(std::ostream &out_stream)
Exporter class for metadata. Important: exports only metadata, as plain text.
Definition: exporter.h:152
MetaDataExporter(std::ostream &out_stream)
Exporter class for plain text output.
Definition: exporter.h:137
PlainTextExporter(std::ostream &out_stream)
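In the header, `HtmlExporter`, `PlainTextExporter`, and `MetaDataExporter` add nothing to `Exporter` except constructors. One plausible reading is that each subclass simply picks the writer it hands to the base class. The sketch below illustrates that pattern with stand-in types. `DemoWriter`, `DemoHtmlWriter`, `DemoPlainWriter`, `DemoExporter`, and `export_text` are hypothetical names, and the `<p>` wrapping (without any HTML escaping) is invented for the example; the writers the library actually uses are not shown in this listing.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are the DocToText types.
class DemoWriter
{
public:
    virtual ~DemoWriter() = default;
    virtual std::string format(const std::string &text) const = 0;
};

class DemoHtmlWriter : public DemoWriter
{
public:
    std::string format(const std::string &text) const override
    {
        return "<p>" + text + "</p>";   // no escaping; illustration only
    }
};

class DemoPlainWriter : public DemoWriter
{
public:
    std::string format(const std::string &text) const override { return text; }
};

// Base class: owns a writer and a reference to the output stream.
class DemoExporter
{
public:
    DemoExporter(std::unique_ptr<DemoWriter> writer, std::ostream &out)
        : writer_(std::move(writer)), out_(out) {}
    virtual ~DemoExporter() = default;

    void export_text(const std::string &text) const { out_ << writer_->format(text); }

private:
    std::unique_ptr<DemoWriter> writer_;
    std::ostream &out_;
};

// Each subclass only chooses which writer the base class receives.
class DemoHtmlExporter : public DemoExporter
{
public:
    explicit DemoHtmlExporter(std::ostream &out)
        : DemoExporter(std::unique_ptr<DemoWriter>(new DemoHtmlWriter), out) {}
};

class DemoPlainTextExporter : public DemoExporter
{
public:
    explicit DemoPlainTextExporter(std::ostream &out)
        : DemoExporter(std::unique_ptr<DemoWriter>(new DemoPlainWriter), out) {}
};

// Sends the same text through both exporters and joins the results with '|'.
std::string render_both(const std::string &text)
{
    std::ostringstream html_out;
    std::ostringstream plain_out;
    DemoHtmlExporter html_exporter(html_out);
    DemoPlainTextExporter plain_exporter(plain_out);
    html_exporter.export_text(text);
    plain_exporter.export_text(text);
    return html_out.str() + "|" + plain_out.str();
}
```

With this layout, supporting another output format means writing one new writer and a thin exporter subclass, while the export logic in the base class stays unchanged.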