When you think about what data analysts and data scientists do day to day, you might have a general sense of the types of conclusions they draw, but how do they arrive at those conclusions? The statistical programming language R is widely used in data science; understanding the basics of how it works can help you manipulate and visualize data quickly and flexibly, and it may improve your communication with data scientists on your team.

In this course, you will explore the basics of statistical programming and develop R skills. As you hone your ability to use commands in R, you will combine those basic skills to complete more complex tasks, such as data manipulation and visualization. Finally, you will examine how to repeat tasks in R, which makes it easier to manipulate large data sets. This course involves many hands-on coding exercises to help you gain confidence in your newfound programming skills.

System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.

The real world is extremely complex, and revealing the patterns that underlie these complexities can be challenging. However, unlocking the power of a data set can provide you with remarkable insights and help guide decision-making. This course will prepare you to use summarization and visualization techniques to reveal patterns in real-world data, using examples from a variety of disciplines, including business and medicine.

In this course, Professor Basu will guide you as you begin to understand key data collection principles and how to draw conclusions from data. The right analysis depends on your question, so you will use a framework to help you choose appropriate methods for your data. Then, you will use R to perform exploratory data analyses, which will allow you to identify key patterns and trends in a ready-to-analyze data set. You will also learn the importance of quantifying the uncertainty associated with your results, and how to measure variability in your data. This course involves many hands-on coding exercises in R to help you gain confidence in your programming skills.

“Exploring Data Sets With R” must be completed prior to starting this course.

Data scientists make decisions by inferring the characteristics of a large population based on the characteristics of samples from that population. Basing a decision on samples is necessary because it is usually not feasible to measure every individual or unit in a population. However, it also means that data scientists need to consider the potential variability among samples before using those samples to draw conclusions about the population. The variability across samples leads to uncertainty in decision-making, and understanding and quantifying that uncertainty is a key aspect of data science.

Throughout this course, Professor Basu will guide you through the nuances of understanding and quantifying the uncertainty around your results, and through making decisions in the face of that uncertainty. In data science, simulations offer a powerful framework for understanding the uncertainty around your data, so you will learn to perform simulations in R and use a simulation-based framework to quantify uncertainty when studying the relationship between categorical variables. You will also use resampling techniques to understand numerical variables and compare their summary statistics across different levels of a categorical variable. Data scientists often search for relationships between numerical variables and use one numerical variable to predict another. You will do this by building a prediction rule with linear regression while keeping the uncertainty of your results in mind. Finally, you will use the errors from linear regression to compare prediction rules and determine which ones fit your data best. This course involves many hands-on coding exercises in R to help you gain confidence in your programming skills.

“Exploring Data Sets With R” and “Summarizing and Visualizing Data” must be completed prior to starting this course.
