Class TextExtractor

Info

Represents the Documentize.TextExtractor plugin, which is used to extract text from PDF documents.

public class TextExtractor : PdfExtractor, IDisposable

Inheritance

object → PdfExtractor → TextExtractor

Implements

IDisposable

Inherited Members

Examples

The example demonstrates how to extract the text content of a PDF document.

// create a TextExtractor object to extract text from the PDF
using (var plugin = new TextExtractor())
{
    // create TextExtractorOptions
    var opt = new TextExtractorOptions();

    // add input file path (inputPath is the path to the source PDF)
    opt.AddInput(new FileDataSource(inputPath));

    // perform extraction process
    var resultContainer = plugin.Process(opt);

    // get the extracted text from the ResultContainer object
    var textExtracted = resultContainer.ResultCollection[0].ToString();
}

Constructors

TextExtractor()

public TextExtractor()

Namespace: Documentize
Assembly: Documentize.dll
