Major Update for the Jackhammer Extension

We are pleased to announce a major update to our Jackhammer Extension! It adds 17 new operators that make everyday work easier for data scientists, along with a compatibility fix for RapidMiner Server 9.6.

 

The new operators at a glance:

Io:

Open Process – opens a process as a process object, which is also a file object. This operator enables the analysis of processes, the automatic creation of reports, versioning, etc.

Io/Compression:

Read and Write GZIP File – these two operators read or write a GZIP file. GZIP-compressed data is common in web services, and these operators make it easier to process.
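For readers who want to see what reading and writing GZIP data involves outside RapidMiner, here is a minimal Python sketch using the standard gzip module. It is an illustration of the concept only, not the operators' implementation; the helper names write_gzip and read_gzip, the file name, and the payload are assumptions made for this example.

```python
import gzip
import os
import tempfile


def write_gzip(path, text):
    """Compress text and write it to path as a GZIP file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write(text)


def read_gzip(path):
    """Read a GZIP file and return its decompressed text."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


# Round trip through a temporary file (hypothetical payload).
payload = '{"status": "ok", "rows": 3}'
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "response.json.gz")
    write_gzip(target, payload)
    restored = read_gzip(target)
```

After the round trip, restored holds the same text as payload.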

 

Blending:

Rename (Advanced) – simplifies the renaming of attributes. A table can be used to conveniently rename multiple attributes at once.

Blending/Generation:

Generate Hash – calculates a hash value for each row of a data set from a set of attributes and adds it as a new attribute.
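The idea behind row-wise hashing can be sketched in a few lines of Python. This is not how the operator is implemented; the function name add_row_hash, the choice of SHA-256, the separator character, and the sample rows are all assumptions made for illustration.

```python
import hashlib


def add_row_hash(rows, attributes, new_attribute="hash"):
    """Return copies of rows with a SHA-256 hash of the chosen attributes added."""
    result = []
    for row in rows:
        # Join with a separator so that ("ab", "c") and ("a", "bc") hash differently.
        joined = "\x1f".join(str(row[name]) for name in attributes)
        digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        result.append({**row, new_attribute: digest})
    return result


# Hypothetical sample data: rows 1 and 2 agree on the hashed attributes.
rows = [
    {"id": 1, "city": "Berlin", "amount": 10.0},
    {"id": 2, "city": "Berlin", "amount": 10.0},
    {"id": 3, "city": "Hamburg", "amount": 7.5},
]
hashed = add_row_hash(rows, ["city", "amount"])
```

Rows that agree on the selected attributes receive the same hash, which is what makes such a column useful for spotting duplicates or changes.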

 

Process Control:

Synchronize – prevents the operators inside its subprocess from running in parallel. This lets you force sequential execution of selected sections in processes that otherwise run in parallel. It is useful wherever parallel execution would cause bottlenecks, for example with parallel reads from hard drives.

Extract Macro from Collection – provides the size of a collection as a macro.

Extract Macro from Performance – provides the performance values of a performance vector as macros.
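The effect of Synchronize can be pictured as a lock around a section of otherwise parallel work. The Python sketch below shows that general pattern and is not the extension's implementation; read_from_disk, the worker count, and the placeholder computation are assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

disk_lock = threading.Lock()
active = 0      # number of threads currently inside the synchronized region
max_active = 0  # highest value of `active` observed during the run


def read_from_disk(task_id):
    """Stand-in for an I/O-heavy step; only one thread may run it at a time."""
    global active, max_active
    with disk_lock:  # the synchronized region
        active += 1
        max_active = max(max_active, active)
        result = task_id * 2  # placeholder for the real work
        active -= 1
    return result


# Four workers run in parallel, but the locked region executes sequentially.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(read_from_disk, range(8)))
```

Everything outside the locked region still runs in parallel, so only the bottleneck itself is serialized.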

 

Generation:

Generate Description Data – creates a table with metadata about a selected ExampleSet.

Generate Data from Expressions – creates a new, one-row ExampleSet whose values are determined by evaluating the expressions.

 

Transformation:

Lag (Advanced) – shifts selected attributes up or down relative to the rest of the table. This makes it possible to create lags in time series that are stored as equidistant rows.
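Shifting an attribute relative to the other columns can be sketched as follows. This illustrates the concept only; the function lag_attribute, its naming scheme for the new column, and the sample series are assumptions, not the operator's actual behavior or parameters.

```python
def lag_attribute(rows, attribute, lag, new_attribute=None):
    """Shift one attribute by `lag` rows; a positive lag moves values down."""
    new_attribute = new_attribute or f"{attribute}_lag{lag}"
    result = []
    for index, row in enumerate(rows):
        source = index - lag
        # Rows without a source value (start or end of the table) get None.
        value = rows[source][attribute] if 0 <= source < len(rows) else None
        result.append({**row, new_attribute: value})
    return result


# Hypothetical equidistant series.
series = [
    {"t": 1, "load": 10},
    {"t": 2, "load": 12},
    {"t": 3, "load": 9},
    {"t": 4, "load": 15},
]
lagged = lag_attribute(series, "load", 1)               # previous value
lead = lag_attribute(series, "load", -1, "load_next")   # next value
```

Because the shift counts rows rather than time, the result is only a true time lag when the rows are equidistant.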

 

Series:

Aggregate Windows – a significantly improved version of the Moving Average operator. It supports several windows, lets you define them flexibly via the data set, and is significantly faster.

Aggregate Time Windows – like Aggregate Windows, but uses time indices, so it can also be applied to non-equidistant data sets.

Define Windows – an operator for easy window definition for the two operators above. It makes it possible to optimize even complex window definitions through the operator's parameters.
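The difference between row-based and time-based windows is easiest to see on non-equidistant data. The Python sketch below computes a trailing mean over a time span; it illustrates the concept and is not the operators' implementation. The function time_window_mean, the half-open window (t - window, t], and the sample data are assumptions.

```python
def time_window_mean(timestamps, values, window):
    """For each point, average all values whose timestamp lies in (t - window, t].

    Assumes timestamps are sorted in ascending order and window > 0.
    """
    means = []
    start = 0
    running = 0.0
    for end, (t, v) in enumerate(zip(timestamps, values)):
        running += v
        # Drop points that have fallen out of the time window.
        while timestamps[start] <= t - window:
            running -= values[start]
            start += 1
        means.append(running / (end - start + 1))
    return means


# Hypothetical series with a gap between t=2 and t=10.
timestamps = [0, 1, 2, 10, 11]
values = [2.0, 4.0, 6.0, 8.0, 10.0]
means = time_window_mean(timestamps, values, window=3)
# means == [2.0, 3.0, 4.0, 8.0, 9.0]
```

A row-based window of three rows would average 4.0, 6.0 and 8.0 at t=10, mixing values from before and after the gap; the time-based window uses only the value at t=10.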

 

Validation:

Split Validation (Advanced) – like the core operator but uses multithreading.

Sliding Window Validation (Advanced) – like the core operator, but uses multithreading and splits by time indices instead of rows, enabling realistic validation of models that are retrained regularly in deployment.
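Splitting by time rather than by row count can be sketched like this. It is a simplified, single-threaded illustration and not the operator's implementation; the function sliding_time_splits, its parameters train_span, test_span and step, and the sample days are assumptions.

```python
def sliding_time_splits(timestamps, train_span, test_span, step):
    """Yield (train_indices, test_indices) pairs for windows defined in time units.

    Training covers [start, start + train_span) and testing covers the
    test_span time units that follow. Windows with an empty side are skipped.
    """
    start, last = min(timestamps), max(timestamps)
    while start + train_span <= last:
        split = start + train_span
        train = [i for i, t in enumerate(timestamps) if start <= t < split]
        test = [i for i, t in enumerate(timestamps) if split <= t < split + test_span]
        if train and test:
            yield train, test
        start += step


# Hypothetical observation days with gaps (non-equidistant).
days = [0, 1, 2, 5, 6, 9, 10, 11]
splits = list(sliding_time_splits(days, train_span=5, test_span=3, step=3))
# splits == [([0, 1, 2], [3, 4]), ([3, 4], [5, 6]), ([4, 5, 6], [7])]
```

Each fold trains on a fixed time span and tests on the period right after it, which mirrors retraining a model on a schedule; the number of rows per fold varies with the density of the data.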

 

Cleansing/Validity:

Declare Valid Values – marks values as invalid by having you specify only the valid ones. This makes cleansing easier in cases where there are many invalid values and only a few valid ones.