DocWire DocToText - Powered by Silvercoders 5.0.5
A data extraction software development toolkit that converts many file types to plain text and HTML. Written in C++, it includes a parser for PST and OST files and a new API for file processing. DocToText can be integrated with other data mining and data analytics applications. It ships with a scriptable, trainable OCR engine whose character recognition is based on LSTM neural networks. The parser extracts metadata and annotations, and supports the following formats: DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML, HTML, Outlook (PST, OST), Image (JPG, JPEG, JFIF, BMP, PNM, PNG, TIFF, WEBP) and DICOM (DCM).
doctotext::Info Struct Reference

Public Member Functions

 Info (const std::string &tagName="", const std::string &plainText="", const std::map< std::string, std::any > &attrs={})
 
template<typename T >
std::optional< T > getAttributeValue (const std::string &name) const
 

Public Attributes

std::string tag_name
 tag name
 
std::map< std::string, std::any > attributes
 tag attributes
 
bool cancel = false
 Cancel flag. If set to true, the parsing process is stopped.
 
bool skip = false
 Skip flag. If set to true, the tag is skipped.
 
std::string plain_text
 Stores text from the last parsed node.
 

Detailed Description

Examples
example_3.cpp, example_4.cpp, example_5.cpp, example_6.cpp, and example_8.cpp.

Definition at line 98 of file parser.h.

Constructor & Destructor Documentation

◆ Info()

doctotext::Info::Info ( const std::string &  tagName = "",
const std::string &  plainText = "",
const std::map< std::string, std::any > &  attrs = {} 
)
inline explicit

Definition at line 106 of file parser.h.

Member Function Documentation

◆ getAttributeValue()

template<typename T >
std::optional< T > doctotext::Info::getAttributeValue ( const std::string &  name) const
inline

Definition at line 113 of file parser.h.
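
The signature above suggests a typed lookup into the attributes map. The following is a minimal sketch of how such a lookup might be used. It relies on a stand-in struct that only mirrors the members listed on this page, because parser.h itself is not reproduced here. The body of getAttributeValue (an empty optional when the key is missing or the stored type differs from T), the attribute names "url" and "level", and the helper describeLink are assumptions made for illustration, not the library's confirmed behaviour.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <optional>
#include <string>

namespace attr_sketch {

// Stand-in mirroring the members documented on this page.
// This is NOT the library's parser.h.
struct Info {
    std::string tag_name;
    std::map<std::string, std::any> attributes;
    bool cancel = false;
    bool skip = false;
    std::string plain_text;

    explicit Info(const std::string& tagName = "",
                  const std::string& plainText = "",
                  const std::map<std::string, std::any>& attrs = {})
        : tag_name(tagName), attributes(attrs), plain_text(plainText) {}

    // Assumed behaviour: returns an empty optional when the key is
    // missing or the stored value is not exactly of type T.
    template <typename T>
    std::optional<T> getAttributeValue(const std::string& name) const {
        auto it = attributes.find(name);
        if (it == attributes.end()) return std::nullopt;
        if (const T* value = std::any_cast<T>(&it->second)) return *value;
        return std::nullopt;
    }
};

// Hypothetical helper: appends the "url" attribute to the node text
// when it is present and stored as std::string.
inline std::string describeLink(const Info& info) {
    auto url = info.getAttributeValue<std::string>("url");
    return url ? info.plain_text + " -> " + *url : info.plain_text;
}

} // namespace attr_sketch
```

Because std::any matches on the exact stored type, a string literal placed in the map is held as const char* and would not be found by a lookup for std::string. Wrapping the value in std::string when building the map avoids that mismatch.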

Member Data Documentation

◆ attributes

std::map<std::string, std::any> doctotext::Info::attributes

tag attributes

Definition at line 101 of file parser.h.

◆ cancel

bool doctotext::Info::cancel = false

Cancel flag. If set to true, the parsing process is stopped.

Definition at line 102 of file parser.h.

◆ plain_text

std::string doctotext::Info::plain_text

Stores text from the last parsed node.

Definition at line 104 of file parser.h.

◆ skip

bool doctotext::Info::skip = false

Skip flag. If set to true, the tag is skipped.

Definition at line 103 of file parser.h.
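
The cancel and skip flags are described as controls that a consumer sets on a node. The sketch below shows one way a driver loop could honour them. Everything in it is hypothetical: the reduced Info stand-in, the Callback alias, and the walk function are illustration only and do not reproduce the library's parser loop. In particular, it assumes that a cancelled node is not emitted and that a skipped node contributes no text.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

namespace flags_sketch {

// Reduced stand-in: only the members this sketch needs.
struct Info {
    std::string tag_name;
    std::string plain_text;
    bool cancel = false;
    bool skip = false;
};

using Callback = std::function<void(Info&)>;

// Hypothetical driver: hands each node to the callback, then checks
// the flags the callback may have set on it.
inline std::string walk(const std::vector<Info>& nodes, const Callback& onNode) {
    std::string out;
    for (Info node : nodes) {      // copy, so the input stays unmodified
        onNode(node);
        if (node.cancel) break;    // stop the whole traversal
        if (node.skip) continue;   // drop this node only
        out += node.plain_text;
    }
    return out;
}

} // namespace flags_sketch
```

A callback that sets skip for one tag name and cancel for another then filters and truncates the output without the driver knowing anything about the tags themselves.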

◆ tag_name

std::string doctotext::Info::tag_name

tag name

Examples
example_3.cpp, example_4.cpp, example_6.cpp, and example_8.cpp.

Definition at line 100 of file parser.h.


The documentation for this struct was generated from the following file: