Split Document into Collection
Synopsis
This operator splits a document (for example, one produced by Read Document) into a collection of documents according to the
split string parameter.
Description
This operator receives a document at its input port and splits it into a collection of documents according to the split string parameter. The input document can come, for example, from a Read Document operator, which reads a complete text file and provides its content as a single document. You can use the Split Document into Collection operator to split this document into a collection and then process the resulting documents one by one. For example, if you want to process a file line by line, you can use the newline character ("\n") as the split string.
The resulting documents are also converted into an ExampleSet with one attribute, which contains the documents.
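The behavior described above can be approximated outside RapidMiner. The following Python sketch is an illustration only, not the operator's implementation: the function names split_document and to_example_rows and the attribute name "document" are hypothetical, and the split string is assumed to be matched literally.

```python
def split_document(text, split_string):
    """Split text on a literal split string; the split string itself is dropped."""
    if not split_string:
        raise ValueError("split_string must not be empty")
    return text.split(split_string)


def to_example_rows(documents, attribute="document"):
    """Build one row per document, with a single attribute holding the text.

    This stands in for the ExampleSet output; the attribute name is an assumption.
    """
    return [{attribute: doc} for doc in documents]


# Content of a file as one document, as a Read Document step might provide it.
content = "first line\nsecond line\nthird line"

collection = split_document(content, "\n")  # one document per line
rows = to_example_rows(collection)          # one example per document
```

In this sketch a trailing split string would yield an empty last piece; whether the operator keeps or discards such empty documents is not stated here.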
Input
document
The input document.
Output
collection
The resulting collection of documents.
example set
An ExampleSet containing the split documents as an attribute. Each document is one example.
Parameters
Split string
String on which the input document is split. The split string is not included in the resulting documents.
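As a rough illustration of this parameter, the Python sketch below splits a text on a multi-character split string and checks that the split string does not appear in any resulting piece. The sample text and variable names are made up for the example, and the split string is treated as a literal string; whether the operator interprets it differently (for instance as a regular expression) is not stated here.

```python
record = "alpha;;beta;;gamma"
split_string = ";;"

# Literal split: the separator is consumed and does not appear in the pieces.
pieces = record.split(split_string)

# True if any piece still contains the split string.
contains_separator = any(split_string in piece for piece in pieces)
```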