The New InDatabase Extension by Old World Computing is here!
We’re proud to present: Old World Computing’s new InDatabase Extension. The new extension for RapidMiner makes working with relational databases a piece of cake – but there’s more: it brings transaction safety to the data science tool. The included operators are based on established database query languages, without requiring specialist knowledge.
The Idea behind the Extension
Until now, there was no way to ensure transaction safety with existing database operations or extensions in RapidMiner. However, since we often need this critical aspect of a database system in our own work to guarantee reliable results, we tasked our development team with programming an extension for it. We then decided to make the new features available to the public.
Moreover, tasks involving relational databases are fundamentally simplified by returning autogenerated IDs, which can be processed directly in RapidMiner in the subsequent process.
Who is the Extension for?
This extension is perfect for everyone who requires transaction safety when using databases in their data science project. It also has the advantage of being similar to known query languages such as SQL, which facilitates working with relational databases. Another advantage is that users are free to either enter the SQL code manually or build the query with the extension’s operators, so they can work quickly and easily even without deeper knowledge.
The extension is therefore aimed both at skilled technicians who already have background knowledge about databases and want to use the expert functionalities, and at RapidMiner users who would like to build database queries easily and conveniently using operators.
In our experience, transaction safety becomes relevant in practically all projects using databases, thus making this extension important for basically all such data science projects.
Transaction Safety: What Is It and Why Do I Need It?
Transaction safety is a crucial aspect of database systems. It ensures that data is left in a consistent state. To this end, while a write is in progress, the database is locked against other simultaneous transactions, for example ones that would overwrite the data. Furthermore, if a transaction fails, for instance because of an unexpected stop or a conflict, the data is restored to its original, consistent state.
What Problems Can Occur When Lacking Transaction Safety?
Without transaction safety, irregularities can creep into the data: wrong data, overwritten data, and operations that influence each other. This is especially problematic for database systems on which read and write operations are executed at very short intervals.
Let’s use an example to illustrate: In a factory, the machines send data about the number of products made, the number of operation cycles etc. These are regularly written into a database. During one of these write operations, a power outage occurs: the number of operation cycles has already been entered into the database, but the number of products made has not yet been written.
If you simply repeated the write operation without a transaction once the system has been rebooted, it would lead to wrong values in the database: the number of operation cycles would be doubled, as a value has already been entered for it. This could, for instance, suggest apparent shortages, because only half the number of products that should have been made in that many operation cycles appears to have been produced, or lead to premature maintenance or unnecessary replacement of wear parts. Such errors in a database can therefore entail costs.
Transaction safety in this case means that the incomplete write operation, interrupted by the power cut, will not lead to wrong entries but will be undone entirely, strictly following the all-or-nothing principle. The values can then be entered anew and correctly.
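To make the all-or-nothing behaviour concrete, here is a minimal sketch of the factory example. It uses Python's built-in sqlite3 module rather than the extension's operators, and the table name, column names, and the fail_midway flag are assumptions made purely for illustration.

```python
import sqlite3


def write_machine_report(conn, machine_id, cycles, products, fail_midway=False):
    """Add both counters in one transaction: either both are stored or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE machine_stats SET cycles = cycles + ? WHERE machine_id = ?",
                (cycles, machine_id),
            )
            if fail_midway:
                # Hypothetical stand-in for the power outage between the two writes.
                raise RuntimeError("simulated power outage")
            conn.execute(
                "UPDATE machine_stats SET products = products + ? WHERE machine_id = ?",
                (products, machine_id),
            )
        return True
    except RuntimeError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE machine_stats ("
        "machine_id INTEGER PRIMARY KEY, cycles INTEGER, products INTEGER)"
    )
    conn.execute("INSERT INTO machine_stats VALUES (1, 0, 0)")
    conn.commit()
    # First attempt is interrupted after the cycle count was written.
    write_machine_report(conn, 1, 100, 50, fail_midway=True)
    # Repeating the write is safe because the first attempt was undone entirely.
    write_machine_report(conn, 1, 100, 50)
    return conn.execute(
        "SELECT cycles, products FROM machine_stats WHERE machine_id = 1"
    ).fetchone()
```

Because the interrupted attempt is rolled back, repeating the write does not double the cycle count.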
How Can I Carry Out Transactions with This Extension?
The extension includes three operators for transactions: Start Transaction, Commit Transaction and Rollback Transaction, with the Start Transaction operator providing the subprocess in which the write operations take place.
Start Transaction is the most basic of the transaction operators: all write operations executed in its subprocess are subject to transaction safety. Commit Transaction and Rollback Transaction can be used within that subprocess and give the user finer-grained control, making it possible to complete or roll back individual parts of a transaction.
Beyond the known standard database functionalities such as drop table, truncate table, write table, and update table, the InDatabase Extension provides the option to define queries using operators and, if required, execute these in batches.
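The idea of completing or rolling back individual parts can be sketched outside RapidMiner as well. The following is only an analogy in Python's sqlite3 module, not the extension's implementation; the readings table and the rule that a negative count invalidates a batch are hypothetical.

```python
import sqlite3


def load_batches(conn, batches):
    """Commit each valid batch; roll back any batch that contains a negative count."""
    committed = 0
    for batch in batches:
        try:
            for machine_id, products in batch:
                if products < 0:
                    raise ValueError("negative product count")
                conn.execute(
                    "INSERT INTO readings (machine_id, products) VALUES (?, ?)",
                    (machine_id, products),
                )
            conn.commit()  # this part of the work is now permanent
            committed += 1
        except ValueError:
            conn.rollback()  # discard only the rows of the faulty batch
    return committed


def make_connection():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (machine_id INTEGER, products INTEGER)")
    return conn
```

Each batch is committed on its own, so a faulty batch is undone without affecting the batches that were already completed.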
Further, data often has to be inserted into a relational database that assigns an automatically generated ID to the inserted rows. Up to now, reading that ID was only possible using complicated and cumbersome query constructs. With the InDatabase Extension, the automatically generated ID can be returned with a simple click of the mouse.
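For readers who want to see why a returned ID is useful, here is a small sketch in plain Python with sqlite3. It does not show the extension's operators; the customers and orders tables are hypothetical. The generated ID of the first insert is read from the cursor's lastrowid attribute and reused as a foreign key in the second insert.

```python
import sqlite3


def insert_customer_with_order(conn, name, item):
    """Insert a customer, then use its autogenerated ID for a dependent order row."""
    with conn:  # both inserts belong to one transaction
        cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
        customer_id = cur.lastrowid  # ID generated by the database for the new row
        conn.execute(
            "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
            (customer_id, item),
        )
    return customer_id


def make_shop_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "customer_id INTEGER REFERENCES customers(id), item TEXT)"
    )
    return conn
```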
And should you ever be overcome by the desire to write real SQL queries, the extension supports you with Universal Quoting to ensure compatibility across the numerous quoting standards used by the common database systems.
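To illustrate the problem that quoting has to solve: standard SQL wraps identifiers in double quotes, MySQL uses backticks by default, and SQL Server uses square brackets. The helper below is a hypothetical sketch of such dialect-aware quoting in Python, not the extension's Universal Quoting; the function name and the dialect keys are assumptions.

```python
# Opening and closing quote characters per dialect (keys are illustrative names).
QUOTE_CHARS = {
    "ansi": ('"', '"'),   # standard SQL, e.g. PostgreSQL, Oracle, SQLite
    "mysql": ("`", "`"),  # MySQL default
    "mssql": ("[", "]"),  # SQL Server
}


def quote_identifier(name, dialect="ansi"):
    """Quote a table or column name, doubling any embedded closing quote character."""
    open_q, close_q = QUOTE_CHARS[dialect]
    return open_q + name.replace(close_q, close_q * 2) + close_q


def build_select(table, columns, dialect="ansi"):
    """Build a simple SELECT statement with every identifier quoted for the dialect."""
    cols = ", ".join(quote_identifier(c, dialect) for c in columns)
    return "SELECT " + cols + " FROM " + quote_identifier(table, dialect)
```

The same logical query then renders differently per system, which is the kind of difference a quoting layer hides from the user.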