Split Document into Collection

Synopsis

This operator splits a document (for example, one produced by Read Document) into a collection of documents according to the split string parameter.

Description

This operator receives a document at its input port and splits it into a collection of documents according to the split string parameter. The input document can originate, for example, from a Read Document operator, which reads a complete text file and provides its content as a single document. You can use the Split Document into Collection operator to split this document into a collection and then process the resulting documents one by one. For example, to process a file line by line, use the end-of-line character ("\n") as the split string.
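The splitting behavior described above can be illustrated outside the product with a minimal Python sketch. It is not the operator's implementation: the function name split_document is hypothetical, and the sketch assumes the split string is matched literally. This page does not say whether the operator interprets the split string as a pattern or how it treats empty pieces, so check those cases against your own data.

```python
def split_document(text: str, split_string: str) -> list[str]:
    """Split one document's text into a list of smaller documents.

    Assumption: the split string is matched literally, not as a pattern.
    """
    if not split_string:
        raise ValueError("split_string must not be empty")
    # str.split drops the separator, so it does not appear in any piece.
    return text.split(split_string)


# One document holding a whole file's content, as Read Document would provide.
document = "first line\nsecond line\nthird line"

# Use the end-of-line character as the split string to go line by line.
collection = split_document(document, "\n")

for doc in collection:
    print(doc)
```

Each element of collection stands in for one document of the resulting collection, ready to be processed individually.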

The split documents are also converted into an ExampleSet with a single attribute that contains the documents.

Input

document

The input document.

Output

collection

The resulting collection of documents.

example set

An ExampleSet containing the split documents as an attribute. Each document is one example.
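As a rough mental model of this output, the ExampleSet can be pictured as a table with one column and one row per document. The Python sketch below only illustrates that shape. The attribute name "document" is a hypothetical placeholder, since this page does not state what the attribute is called.

```python
# Documents as they might come out of the split step.
documents = ["alpha", "beta", "gamma"]

# Hypothetical attribute name; the real name is not given on this page.
ATTRIBUTE_NAME = "document"

# One example (row) per document, each with a single attribute (column).
example_set = [{ATTRIBUTE_NAME: doc} for doc in documents]

for example in example_set:
    print(example)
```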

Parameters

Split string

The string at which the input document is split. The split string is not included in the resulting documents.