Read Parquet

Synopsis

This operator reads Apache Parquet files.

Description

Apache Parquet is a column-oriented data storage format from the Apache Hadoop ecosystem. It enables efficient data storage and processing. Although the Parquet format allows more complex structures such as lists and maps, this operator does not currently support them. This means that the repetition count of a field should not be higher than 1; where it is, only the first data item is considered.
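The effect of the "first data item only" rule can be illustrated with a small sketch in plain Python. It is not the operator's implementation: the helper name first_item_only and the sample record are hypothetical, and mapping an empty repeated field to None is an assumption made only for this illustration.

```python
from typing import Any, Dict


def first_item_only(record: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the first item of every repeated (list-like) value.

    Hypothetical helper that mimics the behaviour described above;
    it is not the operator's actual code.
    """
    flat: Dict[str, Any] = {}
    for name, value in record.items():
        if isinstance(value, (list, tuple)):
            # Repeated field: keep the first item.
            # Assumption: an empty repeated field becomes None.
            flat[name] = value[0] if value else None
        else:
            # Anything else is passed through unchanged.
            flat[name] = value
    return flat


# Sample record with one single-valued and two repeated fields.
row = {"id": 7, "tags": ["red", "green", "blue"], "scores": []}
flat_row = first_item_only(row)
print(flat_row)
```

In this sketch the "tags" values "green" and "blue" are dropped, which is why files containing repeated fields may lose data when read with this operator.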

Input

file

A Parquet file can optionally be passed in as a file object. Such an object can be created by operators with file output ports, such as the Open File operator.

Output

output

This port delivers a data table created from the Parquet file provided at the input port or loaded from the path given in the Parquet file parameter.

Parameters

Parquet file

The path of the Parquet file is specified here. It can also be selected using the 'Choose a file' button.