Get Pages
Synopsis
Gets the pages at the URLs in an attribute and stores them in a new attribute.
Description
This operator retrieves the pages whose URLs are contained in the input data set. For each row in the data set, the URL is extracted from the specified link attribute, a GET request is sent, and the returned page is stored in a new attribute whose name is given by the page attribute parameter.
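The per-row behavior described above can be illustrated with a minimal Python sketch. This is not the operator's implementation: the names get_pages and fetch_page, the representation of rows as dictionaries, the default user agent string, and the 10 second timeout are all assumptions made for illustration.

```python
import urllib.request


def fetch_page(url, user_agent="example-agent/0.1", timeout_s=10.0):
    """Send a GET request to `url` and return the response body as text.

    The default user agent and timeout are illustrative assumptions.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def get_pages(rows, link_attribute, page_attribute, fetch=fetch_page):
    """Return copies of `rows` with the fetched page stored under `page_attribute`.

    `rows` is a list of dicts standing in for the example set (an assumption).
    `fetch` can be replaced, for instance to avoid network access in tests.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # leave the input rows unchanged
        new_row[page_attribute] = fetch(row[link_attribute])
        result.append(new_row)
    return result
```

Calling get_pages(rows, "link", "page") on a list such as [{"link": "https://example.com/"}] would return the same rows, each with an additional "page" entry holding the retrieved text.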
Input
Example Set
The input example set, which contains the attribute holding the URLs.
Output
Example Set
The output example set, extended with the new page attribute.
Parameters
Link attribute
The attribute that contains the URLs.
Page attribute
The name of the attribute that should contain the pages.
Random user agent
Specifies whether the user agent is chosen randomly from a set of 7000 user agents.
User agent
The user agent string sent with each request.
Connection timeout
The timeout (in ms) for the connection.
Read timeout
The timeout (in ms) for reading from the URL.
Follow redirects
Specifies whether redirects should be followed.
Accept cookies
Specifies whether cookies should be accepted.
Cookie scope
Specifies the scope of the cookies used.
Request method
Specifies the HTTP request method.
Delay
Specifies whether execution is not delayed, delayed by a fixed amount of time, or delayed by a random amount of time.
Delay amount
The delay amount in ms.
Min delay amount
The minimum delay amount in ms.
Max delay amount
The maximum delay amount in ms.
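How the delay parameters might interact can be sketched in Python as follows. The mode names "none", "fixed", and "random", the function names, the default values, and the uniform choice between the minimum and maximum are assumptions for illustration; the operator's actual option names and behavior may differ.

```python
import random
import time


def delay_ms(mode, delay_amount=1000, min_delay_amount=0,
             max_delay_amount=1000, rng=random):
    """Return the delay in milliseconds for the given (hypothetical) mode."""
    if mode == "none":
        return 0
    if mode == "fixed":
        return delay_amount
    if mode == "random":
        # Uniform integer in [min_delay_amount, max_delay_amount], inclusive.
        return rng.randint(min_delay_amount, max_delay_amount)
    raise ValueError(f"unknown delay mode: {mode!r}")


def wait_before_request(mode, **amounts):
    """Sleep for the computed delay before the next request is sent."""
    time.sleep(delay_ms(mode, **amounts) / 1000.0)
```

In this sketch, wait_before_request("random", min_delay_amount=500, max_delay_amount=2000) would pause for between 0.5 and 2 seconds before each request, which spreads the load on the target server.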