Read Image Meta Data

Synopsis

Reads the meta data of images (recursively) from a directory into an ExampleSet. This includes their path, filename, width, height and label (optional). Use this operator as a starting point for image transformations or training neural networks.

Description

This operator expects a directory path as a parameter and traverses it recursively, looking for images. It does not load the image contents; it only records meta-information about them (width, height, label, etc.).

Supported image formats are: bmp, jpg, jpeg, jp2, pbm, pgm, ppm, pnm, png, tif, tiff, exr, webp
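
As a rough illustration of the recursive traversal described above, the following Python sketch walks a directory and keeps only files whose extension appears in the supported list. It is not the operator's implementation: the function name `find_images` and the case-insensitive extension match are assumptions made for this example, and it collects only the path and filename (reading width and height would need an image library).

```python
import os
from pathlib import Path

# Extensions copied from the list of supported formats above.
SUPPORTED_EXTENSIONS = {
    "bmp", "jpg", "jpeg", "jp2", "pbm", "pgm", "ppm", "pnm",
    "png", "tif", "tiff", "exr", "webp",
}


def find_images(directory):
    """Recursively collect the path and filename of supported image files.

    Hypothetical helper for illustration; matching extensions
    case-insensitively is an assumption, not documented behavior.
    """
    rows = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            extension = Path(name).suffix.lower().lstrip(".")
            if extension in SUPPORTED_EXTENSIONS:
                rows.append({"Path": os.path.join(root, name), "Filename": name})
    return rows
```

Each returned dictionary corresponds loosely to one row of the resulting ExampleSet, without the Width, Height and Label columns.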

To use the referenced images for deep learning or to transform them, use the pre-process images operator. It has a tensor output port that can be used as an input for the Deep Learning (Tensor) operator of the Deep Learning Extension available on the RapidMiner Marketplace.

The image references can be used for creating image transformation pipelines or loading images during training/application of neural networks. To change images, use operators from the transform operator list inside the image pre-processing operator. The image pre-processing operator will create a list of image transformations to apply. These changes will be applied when: 1) training / scoring images using a neural network, 2) explicitly using the image pre-processing model in combination with the "Apply Model (Generic)" operator, 3) writing images back to disk using the "Store images" operator.

If you need to visualize image data in RapidMiner Studio use the "Read Image as ExampleSet" operator to convert it into ExampleSets (one per color channel) and use the two-dimensional heatmap.

Input

Output

output

ExampleSet containing the image meta data (Path, Width, Height, Filename, Label).

Parameters

Directory

Path to the directory containing the images whose meta data should be read (may contain several sub-directories and multiple images). Supported image formats are: bmp, jpg, jpeg, jp2, pbm, pgm, ppm, pnm, png, tif, tiff, exr, webp

Use label

Use the parent directory name of the image as label. If you are working with images representing handwritten numbers (like with the famous MNIST data set), point the operator to a folder containing multiple sub-folders named after the number the images inside represent. For example: the main folder (set as the value of the directory parameter) could be "training". This folder could contain multiple sub-folders named "0", "1", ..., "9". Each of those sub-folders would contain images representing the number used for naming the sub-folder. Reading this would result in an ExampleSet, where the label column has the given folder name for each image inside the sub-folder.
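
The labeling rule above can be sketched in a few lines of Python. This is only an illustration of the idea (label = name of the image's parent directory); the function `label_from_parent` and the example file names are hypothetical and do not come from the operator itself.

```python
from pathlib import Path


def label_from_parent(image_path):
    """Return the name of the directory that directly contains the image."""
    return Path(image_path).parent.name


# Hypothetical file names inside a "training" folder laid out as described.
paths = ["training/0/img_001.png", "training/9/img_042.png"]
rows = [{"Path": p, "Label": label_from_parent(p)} for p in paths]
```

With this layout, the first row gets the label "0" and the second the label "9", mirroring the label column described for the ExampleSet.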