The core functionality of the extension is the extremely efficient reading of JSON data into one or more tables. To this end, the user creates a RapidMiner process mirroring the JSON structure and specifying in which table and column the data is to be stored. This enables a dynamic and data-driven format design. If the JSON contains relational structures, it is possible to extract multiple tables connected by distinct IDs, ensuring no information is lost when the hierarchical nature of JSON is broken down into tables.
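To illustrate the idea of breaking hierarchical JSON into tables linked by IDs, here is a minimal Python sketch. It is not how the extension is implemented; the document layout, the field names (`customers`, `orders`, `item`, `amount`) and the function `flatten_orders` are assumptions made purely for illustration.

```python
import json

def flatten_orders(doc):
    """Split a nested JSON document into two flat tables linked by an ID."""
    customers, orders = [], []
    for customer_id, customer in enumerate(doc["customers"], start=1):
        # One row per customer in the parent table.
        customers.append({"customer_id": customer_id, "name": customer["name"]})
        for order in customer.get("orders", []):
            # Each nested order becomes a row in the child table and
            # carries the parent's ID so the relation is preserved.
            orders.append({
                "customer_id": customer_id,
                "item": order["item"],
                "amount": order["amount"],
            })
    return customers, orders

# Hypothetical input, invented for this example.
raw = """
{"customers": [
  {"name": "Ada", "orders": [{"item": "book", "amount": 2},
                             {"item": "pen", "amount": 5}]},
  {"name": "Bob", "orders": []}
]}
"""
customers, orders = flatten_orders(json.loads(raw))
```

Joining the two tables on `customer_id` recovers which orders belonged to which customer, so nothing is lost by flattening.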
The defined JSON structure can be used to process JSON data from various sources: the data can come from a data table created by another source in the process, from a file represented as a FileObject within RapidMiner, or directly from a web service request. In combination with the core operators of RapidMiner, the extension covers virtually all possible sources for JSON data.
Accordingly, you can integrate the new operators into any RapidMiner process that needs JSON data. Every operator comes with a comprehensive info text and example processes to help you get started. Please also have a look at our blog, where you will find several tutorials for the WebAutomation extension that examine its functions in detail.
The extension includes extended functionality to integrate web services into the process sequence, eliminating numerous shortcomings of the existing web extension.
The extension provides three operators to access web services. Two of these send a single call, one for every row of a given data set. The individual results can be returned as FileObjects, making these operators the suitable choice for the retrieval of binary files such as .zip files and CSVs. Alternatively, the result can be immediately interpreted as JSON, streaming the data while interpreting it, avoiding unnecessary copies and thus reducing memory usage. With the third operator, the results of the request are added as new attributes into the data set, useful especially for data enrichment.
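The data-enrichment pattern described above (one request per row, with the result added as new attributes) can be sketched roughly as follows. This is an illustrative Python sketch, not the extension's operator logic; the helper names `http_get_json` and `enrich_rows`, the `ws_` column prefix and the URL template are all hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def http_get_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)

def enrich_rows(rows, url_template, fetch=http_get_json):
    """Send one request per row and append the result as new columns."""
    enriched = []
    for row in rows:
        # Fill the URL template from the row, URL-encoding every value.
        url = url_template.format(**{k: quote(str(v)) for k, v in row.items()})
        result = fetch(url)
        # Prefix the new attributes so they cannot overwrite existing ones.
        enriched.append({**row, **{f"ws_{k}": v for k, v in result.items()}})
    return enriched
```

Passing the `fetch` function as a parameter keeps the row logic separate from the HTTP call, so the sketch can be tried with a stub instead of a real web service.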
All three operators use connections in which the user can not only provide authentication details, but also set a rate limit. This limit is enforced across the whole RapidMiner instance, so it is observed even when several processes or process parts request data from the same server simultaneously. This makes it much easier to avoid account blocks and IP bans on rate-limited web services, which would otherwise take considerable effort to rule out.
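The principle of a rate limit that holds across all concurrent callers can be sketched in Python with one shared, lock-protected limiter. This is only an assumption about how such a limit could work, not the extension's actual mechanism; the class `SharedRateLimiter` and its parameters are hypothetical.

```python
import threading
import time

class SharedRateLimiter:
    """Hands out evenly spaced time slots to all threads that share it."""

    def __init__(self, max_calls_per_second, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / max_calls_per_second
        self._lock = threading.Lock()
        self._next_slot = 0.0
        self._clock = clock
        self._sleep = sleep

    def acquire(self):
        """Block until the caller's slot is due, then return the slot time."""
        with self._lock:
            now = self._clock()
            # Reserve the next free slot; later callers queue up behind it.
            slot = max(now, self._next_slot)
            self._next_slot = slot + self._interval
        wait = slot - now
        if wait > 0:
            # Sleep outside the lock so other threads can reserve their slots.
            self._sleep(wait)
        return slot
```

Every thread or process part that calls `acquire()` on the same limiter object before sending a request is spaced at least `1 / max_calls_per_second` seconds after the previous caller, regardless of how many are running at once.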
Internally, the extension uses a modern HTTP library that keeps connections open and reuses them where possible, significantly reducing response time. With the extension, headers and cookies can be controlled dynamically: they are returned as data sets and can be passed in as a data set for the next request. This makes it easy to represent any HTTP logic through RapidMiner processes.
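Treating cookies as tabular data that flows from one response into the next request can be sketched in Python with the standard library. This is a rough illustration rather than the extension's data format; the column names `name` and `value` and both helper functions are assumptions.

```python
from http.cookies import SimpleCookie

def cookies_to_rows(set_cookie_headers):
    """Turn Set-Cookie header values into table rows (one row per cookie)."""
    rows = []
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            rows.append({"name": name, "value": morsel.value})
    return rows

def rows_to_cookie_header(rows):
    """Build the Cookie header for the next request from table rows."""
    return "; ".join(f"{row['name']}={row['value']}" for row in rows)

# Hypothetical Set-Cookie values from a first response.
rows = cookies_to_rows(["session=abc123; Path=/", "theme=dark; Path=/"])
cookie_header = rows_to_cookie_header(rows)
```

Because the cookies sit in ordinary rows between the two steps, they can be filtered, edited or joined like any other data before the next request is sent.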
List of Operators
- Parse JSON
- Parse JSON from Data
- Parse JSON from File
- Process Object
- Process Array
- Extract Properties
- Extract Scalar
- Commit Row
- Send Request
- Send Request from Data
- Send JSON Request
|Number of Users|1-Year Subscription|2-Year Subscription|Perpetual License|Maintenance|
|---|---|---|---|---|
|1 named user|129.00€ *|227.00€ *|295.00€ *|59.00€ *|
|5 named users|533.00€ *|932.00€ *|1,201.00€ *|240.00€ *|
|Company license|1,227.00€ *|2,144.00€ *|2,761.00€ *|552.00€ *|
* plus VAT
Do you have any questions, criticisms or suggestions about our extension?
Please do not hesitate to contact us.