Extract Tokens

Synopsis

This operator extracts the tokens of a document.

Description

Documents can be split into parts called tokens. The most common way to do this is to use the Tokenize operator. This operator takes the tokens of a document and converts them into an ExampleSet or a collection of documents, where each document contains exactly one token.
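
The conversion described above can be illustrated with a small Python sketch. This is not the operator's implementation or API. The function names `tokenize` and `extract_tokens`, the `token` column name, and the split-on-non-letters rule are all assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Hypothetical stand-in for a tokenization step.

    Assumption: tokens are runs of ASCII letters; everything else is a separator.
    """
    return [t for t in re.split(r"[^A-Za-z]+", text) if t]


def extract_tokens(text):
    """Return the three results that mirror the operator's output ports.

    - example_set: one row per token (assumed column name: "token")
    - collection:  one single-token "document" (a plain string here) per token
    - original:    the input text, passed through unchanged
    """
    tokens = tokenize(text)
    example_set = [{"token": t} for t in tokens]
    collection = list(tokens)
    return example_set, collection, text


example_set, collection, original = extract_tokens(
    "Documents can be split into tokens."
)

for row in example_set:
    print(row["token"])
```

In this sketch the row-per-token list plays the role of the ExampleSet output and the list of strings plays the role of the document collection. The real operator works on tokens that an upstream operator such as Tokenize has already produced.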

Input

doc

The document you want to extract the tokens from.

Output

exa

The extracted tokens as an ExampleSet.

col

A collection of documents. Each document contains one token.

ori

The original document, passed through unchanged.