Pre-Process Images
Synopsis
This operator builds a pre-processing pipeline for images, covering image transformations and tensor conversion, and outputs it as a model that can be applied to other images.
Description
This operator either takes an ExampleSet containing the meta data of several images at its input port (example set) or uses its parameters to create one internally.
It then converts the ExampleSet into a tensor object and a pre-processing model. The operator does not read the images or execute any transformation itself; instead, it stores all information needed to transform the given images in the tensor object provided at the output port. To transform images, add operators from the transform operator list inside this operator. Together they form the list of image transformations to apply (the pre-processing model). These transformations are applied when:
1) training / scoring images using a neural network,
2) explicitly using the image pre-processing model in combination with the "Apply Model (Generic)" operator,
3) writing images back to disk using the "Store images" operator.
When applied to image meta data, the created image transformation model outputs a tensor object, which can be combined with deep learning tensor models using the "Group Model (Generic)" operator. If not otherwise transformed, images are processed as 100x100 with 3 color channels by default.
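The deferred behaviour described above can be illustrated with a minimal Python sketch. It is not the operator's implementation: the class name PreprocessingModel, its methods, the two example transformations, and the use of nested lists in place of real image data are all hypothetical. The sketch only shows the idea of recording transformation steps first and executing them later, when the model is applied.

```python
from typing import Callable, List

# An "image" here is just a nested list of pixel values (rows of ints).
# This stands in for real image data, purely for illustration.
Image = List[List[int]]
Transform = Callable[[Image], Image]


class PreprocessingModel:
    """Hypothetical stand-in: stores steps, runs them only on apply()."""

    def __init__(self) -> None:
        self.steps: List[Transform] = []

    def add(self, step: Transform) -> "PreprocessingModel":
        # Only record the step; nothing is executed here.
        self.steps.append(step)
        return self

    def apply(self, image: Image) -> Image:
        # Execute the recorded steps in the order they were added.
        for step in self.steps:
            image = step(image)
        return image


def flip_horizontal(image: Image) -> Image:
    # Reverse each row of pixels.
    return [list(reversed(row)) for row in image]


def invert(image: Image) -> Image:
    # Invert 8-bit pixel values.
    return [[255 - px for px in row] for row in image]


# Building the model performs no image work yet.
model = PreprocessingModel().add(flip_horizontal).add(invert)

# The steps run only now, when the model is applied to an image.
result = model.apply([[0, 10], [20, 30]])
```

Keeping the steps as a list that is separate from the data is what would allow the same model to be reused on other images later.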
Input
example set
ExampleSet containing the image meta data (Path, Width, Height, Filename, Label).
Output
tensor
Tensor containing the image meta data (Path, Width, Height, Filename, Label) and the pre-processing steps.
preprocessing model
Contains the image pre-processing steps and the details needed to build a tensor object.
throughput
ExampleSet containing the image meta data (Path, Width, Height, Filename, Label) as provided at the input port.
Parameters
Path
Attribute of the input ExampleSet that contains the paths to the images that should be used for transformation / tensor conversion.
Directory
Path to the directory containing the images whose meta data should be read (it may contain several sub-directories and multiple images). Supported image formats are: bmp, jpg, jpeg, jp2, pbm, pgm, ppm, pnm, png, tif, tiff, exr, webp.
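As a rough illustration of what reading such a directory involves, the following Python sketch recursively collects files whose extension appears in the list above. It is not how the operator is implemented. The function name find_images is hypothetical, and matching extensions case-insensitively is an assumption, not documented behaviour. The demo files are created in a temporary directory and are empty placeholders.

```python
import tempfile
from pathlib import Path
from typing import List

# Extensions taken from the list of supported formats above.
SUPPORTED_EXTENSIONS = {
    "bmp", "jpg", "jpeg", "jp2", "pbm", "pgm", "ppm", "pnm",
    "png", "tif", "tiff", "exr", "webp",
}


def find_images(directory: str) -> List[Path]:
    """Recursively collect files with a supported extension (hypothetical helper)."""
    root = Path(directory)
    return sorted(
        p for p in root.rglob("*")
        # Assumption: extensions are compared case-insensitively.
        if p.is_file() and p.suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS
    )


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "a.png").touch()
    (Path(tmp) / "sub" / "b.JPG").touch()
    (Path(tmp) / "notes.txt").touch()  # not an image, should be skipped
    found = [p.name for p in find_images(tmp)]
```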
Use label
Use the name of the image's parent directory as its label. If you are working with images representing handwritten numbers (as in the well-known MNIST data set), point the operator to a folder containing multiple sub-folders, each named after the number that the images inside it represent. For example, the main folder (set as the value of the directory parameter) could be "training". This folder could contain multiple sub-folders named "0", "1", ..., "9". Each of those sub-folders would contain images representing the number used to name the sub-folder. Reading this folder results in an ExampleSet whose label column holds, for each image, the name of the sub-folder that contains it.
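The labelling rule can be sketched in a few lines of Python. This is an illustration only, not the operator's code: the function name label_rows and the file names img_001.png and img_042.png are hypothetical, and the Width and Height columns are left out because filling them would require actually opening the images.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def label_rows(image_paths: List[str]) -> List[Dict[str, str]]:
    """Build meta data rows whose Label is the parent directory name."""
    rows = []
    for raw in image_paths:
        p = PurePosixPath(raw)
        rows.append({
            "Path": str(p),
            "Filename": p.name,
            # The label is the name of the folder that directly contains the image.
            "Label": p.parent.name,
        })
    return rows


# Hypothetical file names inside the "training" folder from the example above.
rows = label_rows([
    "training/0/img_001.png",
    "training/7/img_042.png",
])
labels = [row["Label"] for row in rows]
```

With the folder layout from the example, the first image is labelled "0" and the second "7".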