
Extract Token Number

Synopsis

Extracts the number of tokens in a document and adds it to the document's metadata.

Description

This operator counts the tokens in the document and adds the count to the document's metadata. The count is stored under the key specified by the Metadata key parameter. If the key already exists, its value is overwritten. Keep in mind that, depending on the parameters of the Process Documents operator, the metadata might be added as an attribute after the documents have been processed.
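The behavior described above can be illustrated with a minimal Python sketch. It is not the operator's actual implementation: the dictionary layout of the document, the function name `extract_token_number`, and the default key `token_number` are all assumptions made for illustration.

```python
def extract_token_number(document, metadata_key="token_number"):
    """Store the token count of `document` under `metadata_key`.

    `document` is a hypothetical dict with a "tokens" list and a
    "metadata" dict; the real operator works on document objects.
    """
    # An existing value under the same key is overwritten.
    document["metadata"][metadata_key] = len(document["tokens"])
    return document


# The document already carries a value under the key; it gets replaced.
doc = {
    "tokens": ["the", "quick", "brown", "fox"],
    "metadata": {"token_number": 99},
}
result = extract_token_number(doc)
```

After the call, `result["metadata"]["token_number"]` holds the number of tokens in the document rather than the previous value.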

Input

document

The document port.

Output

document

The document port.

Parameters

Metadata key

The number of tokens will be added under this key. The key will become the name of the attribute after document processing.

Condition

The condition a document must fulfill to be kept.

String

The string to compare against.

Regular expression

The regular expression that should match.

Case sensitive

Specifies whether the comparison should be case-sensitive.

Invert condition

Specifies whether the outcome of the comparison should be inverted.
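How the comparison parameters might interact can be sketched in Python. This is only an illustration under stated assumptions: the function `keep_document`, its `use_regex` switch, and the choice of a substring test for the string comparison are hypothetical, since the available conditions are not listed here.

```python
import re


def keep_document(text, pattern, use_regex=False,
                  case_sensitive=True, invert=False):
    """Return True if a document with this text should be kept."""
    if use_regex:
        # Case-insensitive matching is handled by a regex flag.
        flags = 0 if case_sensitive else re.IGNORECASE
        matched = re.search(pattern, text, flags) is not None
    else:
        # Assumed string condition: plain substring containment.
        if not case_sensitive:
            text, pattern = text.lower(), pattern.lower()
        matched = pattern in text
    # Inverting the condition flips the outcome of the comparison.
    return matched != invert
```

For example, `keep_document("Hello World", "hello", case_sensitive=False)` keeps the document, while the same call with `invert=True` discards it.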