smott
Introduction to MRS for Analysts

Updated a year ago

Introduction

Introduction to MRS for Analysts is designed to help R users learn to process, query, transform, and summarize large datasets, and to build models on them, using Microsoft R Server's RevoScaleR package. This course takes a use-case-based approach by walking through a knowledge discovery and data mining example using MRS.

Prerequisites

Ideally, this course is for intermediate or advanced R users who have a solid grounding in R basics (especially data types, writing functions, and using the apply family of functions) and experience in data analysis with R using third-party packages such as dplyr and ggplot2. It was written for users who come from a business analyst background, such as users of R, SAS, SPSS or similar tools. These users are familiar with computer science and programming concepts but are not necessarily experts in computer programming or distributed computing. They want to learn how to use R to run analyses on big datasets and, in the future, to deploy their analytics workflow in a production environment such as Hadoop, Spark or SQL Server. Additionally, the course assumes some familiarity with a basic modeling workflow, i.e. ingesting data, preparing data for analysis, building and comparing models, choosing a good fit, and scoring new data.

Learning objectives

After completing this course, participants will be able to use R and Microsoft R Server's RevoScaleR package to:

  1. Read and process flat files (CSV) efficiently
  2. Clean and prepare data for analysis
  3. Write complex transformations to add new features to the data
  4. Visualize, explore, and summarize data
  5. Build analytical models on large datasets and compare them
  6. Score new data with a model
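
As a rough orientation, the six steps above can be sketched on a small in-memory dataset. This is a minimal sketch in base R, not the course's own code: it uses the built-in mtcars data, and the derived column wt_kg and the 24/8 train/test split are assumptions made purely for illustration. The comments name the RevoScaleR function that plays the corresponding role on large datasets (rxImport, rxDataStep, rxSummary, rxLinMod, rxPredict); their arguments are not shown here.

```r
# Base R stand-in for the six-step workflow on a toy dataset.
# RevoScaleR counterparts are named in comments only.
set.seed(1)

# 1. Read a flat file (RevoScaleR counterpart: rxImport)
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)
cars <- read.csv(csv_path)

# 2. Clean: keep only rows without missing values
cars <- cars[complete.cases(cars), ]

# 3. Transform: add a new feature (RevoScaleR counterpart: rxDataStep)
#    mtcars stores weight in units of 1000 lb; convert to kilograms.
cars$wt_kg <- cars$wt * 1000 * 0.45359237

# 4. Summarize (RevoScaleR counterpart: rxSummary)
print(summary(cars[, c("mpg", "wt_kg", "hp")]))

# 5. Build two models on a training split and compare them
#    (RevoScaleR counterpart: rxLinMod)
train_idx <- sample(nrow(cars), 24)
train <- cars[train_idx, ]
test  <- cars[-train_idx, ]
fit_small <- lm(mpg ~ wt_kg, data = train)
fit_large <- lm(mpg ~ wt_kg + hp, data = train)
print(AIC(fit_small, fit_large))  # lower AIC suggests the better trade-off

# 6. Score held-out data (RevoScaleR counterpart: rxPredict)
test$mpg_pred <- predict(fit_large, newdata = test)
```

The base R version holds everything in memory, which is exactly the limitation RevoScaleR is meant to lift by working on data in chunks.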

Throughout this course, we provide enough code examples using RevoScaleR that intermediate to advanced R users can learn how to integrate RevoScaleR into their R workflow and use it to build scalable solutions to problems involving large datasets.

Please let us know how we can improve our content.

Created by a Microsoft Employee.