Linda Nozick is Professor and Director of Civil and Environmental Engineering at Cornell University. She is co-founder and a past director of the College Program in Systems Engineering and has been the recipient of several awards, including a CAREER award from the National Science Foundation and a Presidential Early Career Award for Scientists and Engineers from President Clinton for “the development of innovative solutions to problems associated with the transportation of hazardous waste.” Dr. Nozick has authored over 60 peer-reviewed publications, many focused on transportation, the movement of hazardous materials, and the modeling of critical infrastructure systems. She has been an associate editor for Naval Research Logistics and a member of the editorial board of Transportation Research Part A. Dr. Nozick has served on two National Academy Committees to advise the U.S. Department of Energy on renewal of their infrastructure. During the 1998-1999 academic year, she was a Visiting Associate Professor in the Operations Research Department at the Naval Postgraduate School in Monterey, California. Dr. Nozick holds a B.S. in Systems Analysis and Engineering from the George Washington University and an MSE and Ph.D. in Systems Engineering from the University of Pennsylvania.
Finding Patterns in Data Using Association Rules, PCA, and Factor Analysis
Cornell Course
Visualization is one of the simplest and most effective ways to find patterns in data. It helps answer questions such as: What is the general range and shape of the data set? Are there any clusters of observations? Which variables correlate with each other? Are there any obvious outliers?
As your data set grows in the number of data points and variables, however, it becomes increasingly difficult to visualize all this information at once. At most, you can plot data points in three dimensions and add further distinctions of size, color, shape, and so on. Yet such a plot can easily become too busy and difficult to read. How, then, do we find patterns in really big data sets?
In this course, you will explore several powerful and commonly used techniques for distilling patterns from data. You will implement each of these techniques with real-world data sets using R, a free and open-source statistical programming language. The focus will be on making these methods accessible for you in your own work.
You are required to have completed the following course or have equivalent experience before taking this course:
- Understanding Data Analytics
Key Course Takeaways
- Use data to identify useful association rules
- Identify the principal components in a data set
- Use the principal components to draw insights from the data
- Use factor analysis to draw insights from a data set
Who Should Enroll
- Current and aspiring data scientists
- Technical managers