WebAutomation Extension

In the cloud age, JSON (JavaScript Object Notation) has prevailed as a format for flexible data exchange in most areas. It is about as expressive as XML, but more memory-efficient and more convenient to use within the web infrastructure, as it can be processed directly in JavaScript. Thus, most web services today reply in JSON format. The WebAutomation Extension for RapidMiner adds functionality to harness this extensive pool of information and functions. With it, any desired JSON data structure can be read and converted to one or more relational tables to make it usable for analyses. Furthermore, the extension adds support for connecting efficiently to web services for data enhancement and deployment.

JSON Import

The core functionality of the extension is the extremely efficient reading of JSON data into one or more tables. To this end, the user creates a RapidMiner process mirroring the JSON structure and specifying in which table and column the data is to be stored. This enables a dynamic and data-driven format design. If the JSON contains relational structures, it is possible to extract multiple tables connected by distinct IDs, ensuring no information is lost when the hierarchical nature of JSON is broken down into tables. 
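
To illustrate the idea of breaking hierarchical JSON into tables connected by IDs, here is a minimal Python sketch. It is not how the extension works internally (there, the format is defined as a RapidMiner process); the sample data, the field names and the helper `to_tables` are assumptions made purely for illustration.

```python
import json

# Hypothetical sample document; the field names are illustrative only.
RAW = """
[
  {"name": "Alice", "orders": [{"item": "book", "price": 12.5},
                               {"item": "pen", "price": 1.2}]},
  {"name": "Bob", "orders": [{"item": "lamp", "price": 30.0}]}
]
"""


def to_tables(customers):
    """Split nested customer records into two flat tables linked by an ID."""
    customer_rows = []
    order_rows = []
    for customer_id, customer in enumerate(customers, start=1):
        # One row per parent object, with a generated ID.
        customer_rows.append({"customer_id": customer_id, "name": customer["name"]})
        for order in customer.get("orders", []):
            # Each child row carries the parent's ID as a foreign key.
            order_rows.append({
                "customer_id": customer_id,
                "item": order["item"],
                "price": order["price"],
            })
    return customer_rows, order_rows


customers_table, orders_table = to_tables(json.loads(RAW))
```

Because every order row keeps the ID of its parent, the two tables can be joined again later, so the nesting is not lost when the data is flattened.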

The defined JSON structure can be utilized to process JSON data from various sources: data can either be found in a data table created by another source in the process, in a file represented as a FileObject within RapidMiner, or can come directly from a web service request. In combination with the core operators of RapidMiner, the extension covers virtually all possible sources for JSON data.

Accordingly, you can integrate the new operators into every RapidMiner process you need JSON data for. Every operator comes with a comprehensive info text with example processes to help you get started. Please also have a look at our blog, where you will find several tutorials for the WebAutomation extension examining the functions in detail:

  • Defining a format to extract a simple table from JSON data
  • Extracting two or more relational example sets from one JSON
  • Extracting Arrays of Scalar Values with the WebAutomation Extension

 

Web Services

The extension includes extended functionality to integrate web services into the process sequence, eliminating numerous shortcomings of the existing web extension.

The Operators

The extension provides three operators to access web services. Two of these send a single call, one for every row of a given data set. The individual results can be returned as FileObjects, making these operators the suitable choice for the retrieval of binary files such as .zip files and CSVs. Alternatively, the result can be immediately interpreted as JSON, streaming the data while interpreting it, avoiding unnecessary copies and thus reducing memory usage. With the third operator, the results of the request are added as new attributes into the data set, useful especially for data enrichment.
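
The enrichment pattern described for the third operator (one request per row, results added as new columns) can be sketched in Python as follows. This is an illustration under assumptions, not the extension's implementation: `enrich_rows` and `fake_geocoder` are hypothetical names, and the "web service" is a stub that returns canned data so that the example runs without network access.

```python
def enrich_rows(rows, fetch):
    """Call `fetch` once per row and merge the returned fields into a copy of the row."""
    enriched = []
    for row in rows:
        extra = fetch(row)  # stands in for one web service call per row
        enriched.append({**row, **extra})
    return enriched


def fake_geocoder(row):
    # Hypothetical stand-in for a real web service; returns canned data.
    lookup = {"Berlin": {"country": "DE"}, "Paris": {"country": "FR"}}
    return lookup.get(row["city"], {"country": None})


table = [{"id": 1, "city": "Berlin"}, {"id": 2, "city": "Paris"}]
result = enrich_rows(table, fake_geocoder)
```

Each input row keeps its original columns and gains the fields returned for it, which mirrors the idea of adding request results as new attributes.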

All three operators use connections in which the user can not only provide authentication details, but also set a rate limit. This limit is enforced across the whole RapidMiner instance, so it is observed even when several processes or process parts request data from the same server simultaneously. This makes it easier to prevent bothersome account blocks and IP bans on limited web services, which would otherwise take a lot of effort to rule out.
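
The principle of one rate limit shared by all concurrent callers can be sketched in Python like this. It is a simplified illustration, not the extension's code: the class `RateLimiter` and its parameters are hypothetical, and the limit is modelled as a minimum interval between two requests.

```python
import threading
import time


class RateLimiter:
    """Allow at most one call every `min_interval` seconds across all threads."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = None

    def acquire(self):
        """Block until the caller may send a request; return the reserved time slot."""
        with self._lock:
            now = self._clock()
            # Reserve the next free slot while holding the lock, so that
            # concurrent callers never receive the same slot.
            slot = now if self._next_allowed is None else max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        wait = slot - now
        if wait > 0:
            self._sleep(wait)  # wait outside the lock so other callers can queue up
        return slot
```

A single instance, for example `limiter = RateLimiter(0.5)`, would be shared by every worker that talks to the same server, and each worker would call `limiter.acquire()` before sending its request.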

Internally, the extension utilizes a powerful modern HTTP library that keeps connections open and reuses them where possible, significantly reducing response time. With the extension it is now finally possible to dynamically control headers and cookies, as they are returned as data sets and can also be passed in as a data set for the next request. This enables you to very easily represent any HTTP logic through RapidMiner processes.
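
As a rough illustration of treating cookies as tabular data, the following Python sketch turns the cookies of one response into rows and builds the cookie header for the next request from those rows. It uses only the Python standard library; the helper names and the sample cookie values are assumptions for illustration and do not reflect the extension's operators.

```python
from http.cookies import SimpleCookie


def cookies_to_rows(set_cookie_headers):
    """Turn Set-Cookie header values into table-like rows (one dict per cookie)."""
    rows = []
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            rows.append({"name": name, "value": morsel.value, "path": morsel["path"]})
    return rows


def rows_to_cookie_header(rows):
    """Build the Cookie request header for the next call from the rows."""
    return "; ".join(f"{row['name']}={row['value']}" for row in rows)


# Hypothetical Set-Cookie values as a server might return them.
rows = cookies_to_rows(["session=abc123; Path=/; HttpOnly", "lang=en; Path=/docs"])
next_header = rows_to_cookie_header(rows)
```

Between the two steps, the rows could be filtered or modified like any other table, which is the kind of control over HTTP logic described above.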

List of Operators

JSON Parsing

  • Parse JSON
  • Parse JSON from Data
  • Parse JSON from File
  • Process Object
  • Process Array
  • Extract Properties
  • Extract Scalar
  • Commit Row

 

Web Service Requests

  • Send Request
  • Send Request from Data
  • Send JSON Request

 

Pricing

Number of Users    1-Year Subscription    2-Year Subscription    Perpetual License    Maintenance
1 named user       129.00€ *              227.00€ *              295.00€ *            59.00€ *
5 named users      533.00€ *              932.00€ *              1,201.00€ *          240.00€ *
Company license    1,227.00€ *            2,144.00€ *            2,761.00€ *          552.00€ *

* plus VAT

Do you have any questions, criticisms or suggestions about our extension?

Please do not hesitate to contact us.

