Class TextExtractorOptions

Represents text extraction options for the Documentize.TextExtractor plugin.

public sealed class TextExtractorOptions : PdfExtractorOptions, IPluginOptions

Inheritance

object → PdfExtractorOptions → TextExtractorOptions

Implements

IPluginOptions

Examples

This example demonstrates how to extract the text content of a PDF document.

// create TextExtractor object to extract PDF contents
using (TextExtractor extractor = new TextExtractor())
{
    // create a TextExtractorOptions object and set the TextFormattingMode (Pure, or Raw, which is the default)
    TextExtractorOptions extractorOptions = new TextExtractorOptions(TextExtractorOptions.TextFormattingMode.Pure);

    // add input file path to data sources
    extractorOptions.AddInput(new FileDataSource(inputPath));

    // perform extraction process
    ResultContainer resultContainer = extractor.Process(extractorOptions);

    // get the extracted text from the ResultContainer object
    string textExtracted = resultContainer.ResultCollection[0].ToString();
}

Remarks

The Documentize.TextExtractorOptions object is used to set the Documentize.TextExtractorOptions.TextFormattingMode and other options for the text extraction operation. It also inherits methods for adding data sources (files, streams) that represent the input PDF documents.

Constructors

TextExtractorOptions(TextFormattingMode)

Initializes a new instance of the Documentize.TextExtractorOptions class with the specified text formatting mode.

public TextExtractorOptions(TextExtractorOptions.TextFormattingMode formattingMode)

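Parameters

formattingMode TextExtractorOptions.TextFormattingMode

The text formatting mode to use for extraction.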

TextExtractorOptions()

Initializes a new instance of the Documentize.TextExtractorOptions class with the ‘Raw’ (default) text formatting mode.

public TextExtractorOptions()

Properties

FormattingMode

Gets the text formatting mode.

public TextExtractorOptions.TextFormattingMode FormattingMode { get; }

Property Value

TextExtractorOptions.TextFormattingMode
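
The two constructors and the read-only FormattingMode property can be compared side by side. The following is a minimal sketch, not an official sample: it assumes the Documentize assembly is referenced by the project, and that TextFormattingMode is a nested enumeration, as the example above suggests. The class name FormattingModeDemo is hypothetical.

```csharp
using System;
using Documentize; // assumption: the Documentize assembly is referenced

// Hypothetical demo class, used only for illustration.
class FormattingModeDemo
{
    static void Main()
    {
        // Parameterless constructor: per this reference, it selects the default (Raw) mode.
        var defaultOptions = new TextExtractorOptions();

        // One-argument constructor: the mode is chosen explicitly.
        var pureOptions = new TextExtractorOptions(TextExtractorOptions.TextFormattingMode.Pure);

        // FormattingMode exposes only a getter, so the mode is fixed at construction time.
        Console.WriteLine(defaultOptions.FormattingMode);
        Console.WriteLine(pureOptions.FormattingMode);
    }
}
```

Because the property has no setter, switching modes means constructing a new TextExtractorOptions instance rather than mutating an existing one.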

OperationName

Gets the name of the operation.

public override string OperationName { get; }

Property Value

string

Namespace: Documentize
Assembly: Documentize.dll