Sumanta Basu is an Assistant Professor in the Department of Statistics and Data Science at Cornell University. Broadly, his research interests are structure learning and the prediction of large systems from data, with a particular emphasis on developing learning algorithms for time series data. Professor Basu also collaborates with biological and social scientists on a wide range of problems, including genomics, large-scale metabolomics, and systemic risk monitoring in financial markets. His research is supported by multiple awards from the National Science Foundation and the National Institutes of Health. At Cornell, Professor Basu teaches “Introductory Statistics” for graduate students outside the Statistics Department and “Computational Statistics” for Statistics Ph.D. students. He also serves as a faculty consultant at the Cornell Statistical Consulting Unit, which assists the broader Cornell community with various aspects of analyzing empirical research. Professor Basu received his Ph.D. from the University of Michigan and was a postdoctoral scholar at the University of California, Berkeley, and Lawrence Berkeley National Laboratory. Before he received his Ph.D., Professor Basu was a business analyst, working with large retail companies on the design and data analysis of their promotional campaigns.
In this course, you will encounter examples of how data science can be used for business, including key concepts in data collection and analysis. You will explore these concepts through a simulation framework in R that uses a data science approach. You will also summarize data sets by making visualizations and calculating summary statistics, such as the mean, standard deviation, and correlation.
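As a rough illustration of the summary statistics named above, the following minimal sketch uses base R only. The `ad_spend` and `sales` vectors are hypothetical values invented for this example; they are not taken from the course materials.

```r
# Hypothetical data, invented for illustration only:
# weekly advertising spend and weekly sales for eight weeks.
ad_spend <- c(10, 12, 15, 11, 18, 20, 14, 16)
sales    <- c(100, 110, 135, 105, 160, 170, 125, 140)

# Summary statistics with base R functions.
mean_sales <- mean(sales)            # arithmetic mean
sd_sales   <- sd(sales)              # sample standard deviation (n - 1 denominator)
r          <- cor(ad_spend, sales)   # Pearson correlation coefficient

# A simple visualization: scatterplot of the two variables.
plot(ad_spend, sales,
     xlab = "Ad spend", ylab = "Sales",
     main = "Sales vs. ad spend (hypothetical data)")

print(c(mean = mean_sales, sd = sd_sales, correlation = r))
```

A strong correlation in data like these would not by itself show that advertising causes sales, which is the distinction the course draws between correlation and causation.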
After you complete this course, you will be able to use the basic tools in R to perform exploratory data analysis, which will allow you to identify key patterns and trends in a ready-to-analyze data set. You will also discover how to judge the uncertainty surrounding your data and identify methods to determine whether you can trust your results. This course involves many hands-on coding exercises in R to help you gain confidence in your programming skills.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
“Exploring Data Sets With R” must be completed prior to starting this course.
- Understand data collection concepts and why correlation is not the same as causation
- Summarize a data set with the appropriate visualizations and statistics
- Use summarization techniques to interpret data and develop conclusions
- Determine the uncertainty of your conclusions about a data set
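One common simulation-based way to gauge the uncertainty of a summary statistic is the bootstrap. The sketch below is an assumption about how this could look in base R; the description above does not say which method the course uses, and the `sales` values are hypothetical.

```r
# Hypothetical data, invented for illustration only.
sales <- c(100, 110, 135, 105, 160, 170, 125, 140)

set.seed(1)  # make the resampling reproducible

# Bootstrap: resample the data with replacement many times
# and recompute the mean on each resample.
boot_means <- replicate(2000, mean(sample(sales, replace = TRUE)))

# Percentile interval: the middle 95% of the bootstrap means.
ci <- quantile(boot_means, c(0.025, 0.975))

print(mean(sales))
print(ci)
```

A wide interval suggests that the estimated mean could shift noticeably with a different sample, so conclusions drawn from it deserve more caution.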
Who Should Enroll
- Current and aspiring data scientists and analysts
- Business decision makers
- Marketing analysts
- Anyone seeking to gain deeper exposure to data science