Sumanta Basu is an Assistant Professor in the Department of Statistics and Data Science at Cornell University. Broadly, his research interests are structure learning and the prediction of large systems from data, with a particular emphasis on developing learning algorithms for time series data. Professor Basu also collaborates with biological and social scientists on a wide range of problems, including genomics, large-scale metabolomics, and systemic risk monitoring in financial markets. His research is supported by multiple awards from the National Science Foundation and the National Institutes of Health. At Cornell, Professor Basu teaches “Introductory Statistics” for graduate students outside the Statistics Department and “Computational Statistics” for Statistics Ph.D. students. He also serves as a faculty consultant at the Cornell Statistical Consulting Unit, which assists the broader Cornell community with various aspects of analyzing empirical research. Professor Basu received his Ph.D. from the University of Michigan and was a postdoctoral scholar at the University of California, Berkeley, and Lawrence Berkeley National Laboratory. Before he received his Ph.D., Professor Basu was a business analyst, working with large retail companies on the design and data analysis of their promotional campaigns.
The real world is extremely complex, and revealing the patterns that underlie these complexities can be challenging. However, unlocking the power of a data set can provide you with remarkable insights and help guide decision-making. This course will prepare you to use summarization and visualization techniques to reveal patterns in real-world data, using examples from a variety of disciplines, including business and medicine.
In this course, Professor Basu will guide you as you begin to understand key data collection principles and how to draw conclusions from data. The right analysis depends on the question you are asking, so you will use a framework to help you choose appropriate methods for your data. Then, you will use R to perform exploratory data analyses, which will allow you to identify key patterns and trends in a ready-to-analyze data set. You will also learn why it is important to quantify the uncertainty associated with your results, and how to measure variability in your data. This course involves many hands-on coding exercises in R to help you gain confidence in your programming skills.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
“Exploring Data Sets With R” must be completed prior to starting this course.
Key Course Takeaways
- Understand data collection concepts and why correlation is not the same as causation
- Summarize a data set with the appropriate visualizations and statistics
- Use summarization techniques to interpret data and develop conclusions
- Determine the uncertainty of your conclusions about a data set
Who Should Enroll
- Current and aspiring data scientists and analysts
- Business decision makers
- Marketing analysts
- Anyone seeking to gain deeper exposure to data science