Read Parquet
Synopsis
This operator reads Apache Parquet files.
Description
Apache Parquet is a column-oriented data storage format of the Apache Hadoop ecosystem. It enables efficient data storage and processing. Although the Parquet format allows more complex structures such as lists and maps, this operator does not currently support them. The repetition count should therefore not be higher than 1; for repeated fields, only the first data item is considered.
Input
file
A Parquet file can optionally be passed in as a file object. Such an object can be created by operators that have file output ports, such as the Open File operator.
Output
output
This port delivers a data table created from the Parquet file provided at the input port or loaded from the path given in the Parquet file parameter.
Parameters
Parquet file
The path of the Parquet file is specified here. It can also be selected using the 'Choose a file' button.
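Before pointing the operator at a path, it can help to confirm that the file is plausibly a Parquet file at all. The following is a minimal sketch in Python, run outside the operator and not part of it; the function name looks_like_parquet is hypothetical. It relies only on the fact that the Parquet format places the 4-byte marker PAR1 at both the start and the end of a file. It does not parse the footer metadata, so a passing check does not guarantee that the file is readable or free of nested structures.

```python
import os

PARQUET_MAGIC = b"PAR1"  # 4-byte marker defined by the Parquet file format


def looks_like_parquet(path):
    """Return True if the file at `path` starts and ends with the Parquet marker.

    Hypothetical helper for illustration; it is a quick sanity check only.
    """
    if not os.path.isfile(path):
        return False
    # A valid file holds at least the leading marker, a 4-byte footer
    # length, and the trailing marker, so anything shorter is rejected.
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as handle:
        head = handle.read(4)
        handle.seek(-4, os.SEEK_END)  # jump to the last four bytes
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A call such as looks_like_parquet("data.parquet"), where the file name is only a placeholder, returns False for a missing file, a CSV file, or a truncated download, which are common reasons for a read to fail.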