Course list

By some estimates, 90% of the data that has ever existed has been created in the last two years. This is a staggering figure and has given rise to new challenges and opportunities in almost every industry: what kind of data do you need to collect to compete, and how can you make sense of it once you have collected it? As technology evolves and the volume of data increases, how can you make the best use of all this information? How can you use the data to help drive your decision-making? How can you make data work for you? How can you ensure your data accurately reflects the population in which you're interested?

In this course, you will determine the types of engineering and business questions you can answer, the kinds of problems you can solve, and the decisions you can make, all by using data analytics. You will explore best practices for collecting information so that you can make informed predictions, develop insights, and better inform organizational decision-making. You will also explore best practices for sampling and examine how different types of sampling suit different situations. Finally, you will see real-world examples that demonstrate how these practices work, practice sampling techniques in case study scenarios, and have a chance to apply some of the concepts to your own work.

Visualization is one of the simplest and most effective ways to find patterns in data. It can help you answer questions such as: What is the general range and shape of the data set? Are there any clusters of observations? Which variables correlate with each other? Are there any obvious outliers?

As your data set grows in terms of the number of data points and variables, however, it becomes increasingly difficult to visualize all this information at once. At most, you can plot data points on three-dimensional axes and add further distinctions of size, color, shape, and so on. Yet such a plot can easily become too busy and difficult to read. How, then, do we find patterns in very large data sets?

In this course, you will explore several powerful and commonly utilized techniques for distilling patterns from data. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following course or have equivalent experience before taking this course:

  • Understanding Data Analytics

When you have a large collection of objects, it is often helpful to split them into meaningful groups or clusters. One example would be identifying different types of customers so that a company can route calls to its helpline more efficiently. As a second example, suppose an automobile manufacturer wanted to segment its market to target its ads more carefully. One approach might be to take a database of recent car sales, including the demographic information associated with each customer, and segment the population purchasing each type of automobile into meaningful groups.

Specialized approaches exist if your data contains information that relates to time and geography. You can use this additional information to identify geographical and temporal hotspots. Hotspots are regions of high activity or a high value of a particular variable. These results can help you focus your attention on a particular region where a problem is occurring more than usual, such as the incidence of asthma in a large city. In both cluster and hotspot analysis, the results can help you discover new and interesting features, problems, and red flags regarding the data being analyzed.

In this course, you will explore several powerful and commonly utilized techniques for performing both cluster and hotspot analysis. You will implement these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible and applicable to your work.

You are required to have completed the following courses or have equivalent experience before taking this course:

  • Understanding Data Analytics
  • Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis

A story can play an important role in understanding data. It can help distill complex information into something manageable, something we can think about easily, relate to, and use to make decisions. For many problems that we encounter globally, however, a story that describes what already happened is not precise enough for the job we want to perform. Often, we would like to use available data to make numerically accurate predictions about what might happen in the future. This task requires the construction of mathematical models that are well suited to our real-world problems.

In this course, you will explore several types of statistical models used with data to make predictions. These models bring with them a set of important concerns, such as estimation and validation, that make the entire process both an art and a science. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following courses or have equivalent experience before taking this course:

  • Understanding Data Analytics
  • Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
  • Finding Patterns in Data Using Cluster and Hotspot Analysis

Supervised learning is a general term for any machine learning technique that attempts to discover the relationship between a data set and some associated labels so that it can be used for prediction. In regression, the labels are continuous numbers. This course will focus on classification, where the labels are taken from a finite set of numbers or characters. The prototypical and perhaps best-known example of classification is image recognition. The goal is to take an image (represented by its pixel values) and determine what objects are in the image. Is it a dog? A grapefruit? A stop sign?

There are many practical classification tasks, such as determining whether an individual's financial history makes them high risk for a loan, whether there is a defect in a material based on some sensor readings, or whether a new email is spam or not. These problems share the same basic form and can be solved with many different types of mathematical, statistical, and probabilistic models developed by the machine learning community.

In this course, you will explore several powerful and commonly utilized techniques for supervised learning. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following courses or have equivalent experience before taking this course:

  • Understanding Data Analytics
  • Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
  • Finding Patterns in Data Using Cluster and Hotspot Analysis
  • Regression Analysis and Discrete Choice Models

Neural networks, a nonlinear supervised learning modeling tool, have become hugely popular within the last two decades because they have been successfully applied to a wide range of problems, including automatic language processing, image classification, object detection, speech recognition, and pattern recognition. They are mathematical models loosely based on an analogy to the interconnected neurons in the brain. They take in a vector or matrix of input data and output either a classification value or an approximation to a functional value. The beauty is that the relationships between the inputs and outputs can be highly nonlinear and complex.

In this course, you will explore the mechanics of neural networks and the intricacies involved in fitting them to data for prediction. Using packages in the free and open-source statistical programming language R with real-world data sets, you will implement these techniques. The focus will be on making these methods accessible for you in your own work.

You are required to have completed the following courses or have equivalent experience before taking this course:

  • Understanding Data Analytics
  • Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
  • Finding Patterns in Data Using Cluster and Hotspot Analysis
  • Regression Analysis and Discrete Choice Models
  • Supervised Learning Techniques

How It Works

I like to think outside of the box, and this program from eCornell helped me conceptualize how I want to approach data problems going forward. I was able to actually apply new course concepts to my work, rather than simply repeat steps with different values.
‐ Mark T.
