Type of course: classroom training, duration: 2 days
Predictive Analytics in a Big Data Context is a two-day course focusing on data science technologies for handling large amounts of data. We start by giving an overview of the technologies available and put keywords like data lake, in-memory and Hadoop into a meaningful overall context. Participants will learn how big data technology can be used to solve data science problems.
This training course continues the scenario from the basic training courses and scales it to big data. For this purpose we use RapidMiner Radoop, which lets you process large amounts of data analytically in the familiar environment of RapidMiner Studio. Because execution is distributed across a Hadoop cluster, processing capacity grows with the cluster rather than being limited by a single machine.
After the course, participants will have in-depth knowledge of the pros and cons of big data technologies and will know how to process large amounts of data using RapidMiner on Hadoop. Participants work on their personal laptops during the course, so they can take home the example solutions and use them as a basis for their own big data challenges.
The course alternates throughout between theory and proven best practices on the one hand and hands-on application of what has been learned on the other. Participants form a data science team that works together on the tasks set by the course instructor.
The skills acquired by participants of the training course include:
- Understanding of big data infrastructure with its possibilities and limitations
- Connecting a desktop computer with a Hadoop cluster
- Exploration of large volumes of data
- Extracting and loading data
- Producing big data analyses with RapidMiner
- Knowledge of methods to efficiently process large volumes of data
What is big data and when does it make sense?
When can analytics benefit from big data?
Introduction to Hadoop
- General infrastructure
- Hadoop integration with RapidMiner: Radoop
- Introduction to the Radoop user interface
- Connecting desktop computers or laptops with a Hadoop cluster
- Searching tables
- Statistics and data aggregation for high level information
Extracting and Loading Data
- Formulating queries
- Loading data into Hadoop
Analytics Processes on Hadoop
- In-Hadoop training
- Profiling and natural aggregation
- In-memory training, in-Hadoop scoring
Beyond Natural Aggregation
- In-Hadoop modelling
Batch-oriented Processing of Large Volumes of Data
Previous Knowledge Required
For this training course, you will need the knowledge from the previous courses Basics 1 & 2 and Deployment - Predictive Analytics in Live Use. If you have already acquired equivalent knowledge from comparable training courses, please contact us.
Data scientists, advanced analysts
After you have attended the training courses Deployment - Predictive Analytics in Live Use, Predictive Analytics in a Big Data Context and the course on Text and Web Mining, you can take an exam to acquire the “RapidMiner Expert” certificate and show off your new qualification.
Predictive Analytics in a Big Data Context: € 2000 per participant (plus VAT)
Certificate: free of charge for participants*
*A later examination can be taken online at any time for an additional € 200.
If two or more participants register, a price of € 1440 per participant will be charged (plus VAT).