Sumanta Basu is an Assistant Professor in the Department of Statistics and Data Science at Cornell University. Broadly, his research interests are structure learning and the prediction of large systems from data, with a particular emphasis on developing learning algorithms for time series data. Professor Basu also collaborates with biological and social scientists on a wide range of problems, including genomics, large-scale metabolomics, and systemic risk monitoring in financial markets. His research is supported by multiple awards from the National Science Foundation and the National Institutes of Health. At Cornell, Professor Basu teaches “Introductory Statistics” for graduate students outside the Statistics Department and “Computational Statistics” for Statistics Ph.D. students. He also serves as a faculty consultant at the Cornell Statistical Consulting Unit, which assists the broader Cornell community with various aspects of analyzing empirical research. Professor Basu received his Ph.D. from the University of Michigan and was a postdoctoral scholar at the University of California, Berkeley, and Lawrence Berkeley National Laboratory. Before he received his Ph.D., Professor Basu was a business analyst, working with large retail companies on the design and data analysis of their promotional campaigns.
In this course, you will explore the steps associated with testing a hypothesis and use a variety of simulation methods to test hypotheses in R; these different methods will allow you to test hypotheses for various possible scenarios. As you perform hypothesis tests, you will discover how to assess the uncertainty associated with your data set and the test. You will also analyze the relationship between two or more variables using linear regression analysis and determine how to assess these relationships with simple diagnostic tools.
Throughout this course, you will perform hands-on coding exercises to practice simulations in R, which will help you gain confidence in both your programming and statistical skills. After completing this course, you will be able to test hypotheses that involve two or more variables in a ready-to-analyze data set using simulations in the programming language R. You will also understand the uncertainty associated with your hypothesis tests and how it impacts your conclusions.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
You must complete the following courses before taking this course:
- Exploring Data Sets With R
- Summarizing and Visualizing Data
Key Course Takeaways
- Answer questions by formulating hypotheses and testing them with real data
- Use simulation techniques to compare two groups
- Assess uncertainty in your results using simulation techniques
- Measure the strength of association among variables with linear regression
Who Should Enroll
- Current and aspiring data scientists and analysts
- Business decision makers
- Marketing analysts
- Anyone seeking to gain deeper exposure to data science