Data Science Essentials
Cornell Certificate Program
Overview and Courses
In recent years, the field of data science has taken off, as every industry and function increasingly relies on data-driven insights to make decisions.
The statistical programming language R is widely used in data science, and understanding the fundamentals of how it works can be helpful, whether you’re considering a career in data science or looking to better communicate with data scientists on your team. In this certificate program, you will develop an essential foundation in R programming skills, then use those skills to understand and summarize data.
In the first course, you will study R programming principles and use R for data manipulation, visualization, and sampling. Building on your skills, you will summarize and visualize real data sets, draw conclusions from those data, and evaluate the uncertainty surrounding those conclusions. Throughout the process, you will develop hypotheses about your data, then use simulations and statistical techniques to evaluate your hypotheses. You will also practice using the Tidyverse open-source R packages to clean and organize your data sets. Finally, you will have the opportunity to manipulate and visualize data using more advanced techniques.
This certificate will ultimately introduce you to the fundamentals of data science and enhance your ability to draw meaningful conclusions from data.
System requirements: This course contains a virtual programming environment that is compatible with Chrome, Firefox, or Internet Explorer.
The courses in this certificate program must be completed in the order in which they appear.
Course list
When you think about what data analysts and data scientists do on a day-to-day basis, you might have a general understanding of the types of conclusions they draw, but how do they arrive at those conclusions? The statistical programming language R is widely used in data science; understanding the basics of how it works can help you manipulate and visualize data in a quick, flexible manner, and it may improve your communication with data scientists on your team.
In this course, you will explore the basics of statistical programming and develop R skills. As you hone your ability to use commands in R, you will combine those basic skills to complete more complex tasks, such as data manipulation and visualization. Finally, you will examine how to repeat tasks in R, which makes it easier to manipulate large data sets. This course involves many hands-on coding exercises to help you gain confidence in your newfound programming skills.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
- Apr 29, 2026
- Jun 24, 2026
- Aug 19, 2026
- Oct 14, 2026
- Dec 9, 2026
- Feb 3, 2027
- Mar 31, 2027
The real world is extremely complex, and revealing the patterns that underlie these complexities can be challenging. However, unlocking the power of a data set can provide you with remarkable insights and help guide decision-making. This course will prepare you to use summarization and visualization techniques to reveal patterns in real-world data, using examples from a variety of disciplines, including business and medicine.
In this course, Professor Basu will guide you as you begin to understand key data collection principles and how to draw conclusions from data. The right analysis depends on your question, so you will use a framework to help you choose appropriate methods for your data. Then, you will use R to perform exploratory data analyses, which will allow you to identify key patterns and trends in a ready-to-analyze data set. You will also learn the importance of quantifying the uncertainty associated with your results, and how to measure variability in your data. This course involves many hands-on coding exercises in R to help you gain confidence in your programming skills.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
“Exploring Data Sets With R” must be completed prior to starting this course.
- May 13, 2026
- Jul 8, 2026
- Sep 2, 2026
- Oct 28, 2026
- Dec 23, 2026
- Feb 17, 2027
- Apr 14, 2027
Data scientists make decisions by inferring the characteristics of a large population based on the characteristics of samples from that population. Basing a decision on samples is necessary because it is rarely feasible to measure every individual or unit in a population. However, it also means that data scientists need to consider the potential variability among samples before using those samples to draw conclusions about the population. The variability across samples leads to uncertainty in decision-making, and understanding and quantifying that uncertainty is a key aspect of data science.
Throughout this course, Professor Basu will guide you through the nuances of understanding and quantifying the uncertainty around your results, and through making decisions in the face of that uncertainty. In data science, simulations offer a powerful framework with which to understand the uncertainty around your data, so you will learn to perform simulations in R and use a simulation-based framework to quantify uncertainty when studying the relationship between categorical variables. You will also use resampling techniques to understand numerical variables and compare their summary statistics across different levels of a categorical variable. Data scientists often look for relationships between numerical variables and use one numerical variable to predict another. You will do this by building a prediction rule with linear regression while keeping the uncertainty of your results in mind. Finally, you will use the errors from linear regression to compare prediction rules and determine which prediction rules fit your data best. This course involves many hands-on coding exercises in R that will help you gain confidence in your programming skills.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
“Exploring Data Sets With R” and “Summarizing and Visualizing Data” must be completed prior to starting this course.
- May 27, 2026
- Jul 22, 2026
- Sep 16, 2026
- Nov 11, 2026
- Jan 6, 2027
- Mar 3, 2027
- Apr 28, 2027
Data scientists use data collected from the real world to answer questions and solve problems that would otherwise be intractable. But since the world is complex, data collected to describe the world can also be complex, which makes it messy and difficult to work with. To successfully analyze data, data scientists need to spend time cleaning — or organizing and manipulating — their data to put it into a form that is easier to work with and understand.
In this course, you will delve into the world of data cleaning by presenting and manipulating your data with the Tidyverse in R. You will organize data by selecting only the variables you're interested in, creating new groups of data, and summarizing data in a way that makes sense for the questions you're trying to ask. You will also create high-quality plots to quickly summarize complex data. You will become familiar with the concept of tidy data and organize data sets in a way that allows for the most efficient analysis. Finally, you will work with data types of more complexity so that you can answer increasingly difficult questions as you take your new skills into your workplace. You will practice all these skills by working with four real-world, complex data sets. This course involves many hands-on coding exercises that will help you take your programming skills to the next level.
System requirements: This course contains a virtual programming environment that does not support the use of Safari, Edge, tablets, or mobile devices. Please use Chrome, Firefox, or Internet Explorer on a computer for this course.
“Exploring Data Sets With R” and “Measuring Relationships and Uncertainty” must be completed prior to starting this course.
- Jun 10, 2026
- Aug 5, 2026
- Sep 30, 2026
- Nov 25, 2026
- Jan 20, 2027
- Mar 17, 2027
- May 12, 2027
eCornell Online Workshops are live, interactive 3-hour learning experiences led by Cornell faculty experts. These premium short-format sessions focus on AI topics and are designed for busy professionals who want to gain immediately applicable skills and strategic perspectives. Workshops include faculty presentations, breakout discussions, and guided hands-on practice.
The AI Workshops All-Access Pass provides you with unlimited participation for 6 months from your date of purchase. Whether you choose to attend one workshop per month, or several per week, the All-Access Pass will allow you to customize your AI journey and stay on top of the latest AI trends.
Workshops cover a range of cutting-edge AI topics applicable across industries, hosted by Cornell faculty at the forefront of their fields. Whether you are just getting started with AI, seeking to build your AI skillset, or exploring advanced applications of AI, Workshops will provide you with an action-oriented learning experience for immediate application in your career. Sample Workshops include:
- Work Smarter with AI Agents: Individual and Team Effectiveness
- Leading AI Transformation: Bigger Than You Imagine, Harder Than You Expect
- Using AI at Work: Practical Choices and Better Results
- Search & Discoverability in the Era of AI
- Don't Just Prompt AI - Govern it
- AI-Powered Product Manager
- Leverage AI and Human Connection to Lead through Uncertainty
Faculty Authors
Sumanta Basu is an Assistant Professor in the Department of Statistics and Data Science at Cornell University. Broadly, his research interests are structure learning and the prediction of large systems from data, with a particular emphasis on developing learning algorithms for time series data. Professor Basu also collaborates with biological and social scientists on a wide range of problems, including genomics, large-scale metabolomics, and systemic risk monitoring in financial markets. His research is supported by multiple awards from the National Science Foundation and the National Institutes of Health. At Cornell, Professor Basu teaches “Introductory Statistics” for graduate students outside the Statistics Department and “Computational Statistics” for Statistics Ph.D. students. He also serves as a faculty consultant at the Cornell Statistical Consulting Unit, which assists the broader Cornell community with various aspects of analyzing empirical research. Professor Basu received his Ph.D. from the University of Michigan and was a postdoctoral scholar at the University of California, Berkeley, and Lawrence Berkeley National Laboratory. Before he received his Ph.D., Professor Basu was a business analyst, working with large retail companies on the design and data analysis of their promotional campaigns.
Kara Karpman is an Adjunct Assistant Professor of Data Science and Statistics at Cornell University, as well as an Assistant Professor of Mathematics at Middlebury College in Middlebury, VT. Her research focuses on statistical modeling techniques for studying biological and financial data. Professor Karpman holds a B.S. in Mathematics from Duke University and an M.S. and Ph.D. in Applied Mathematics from Cornell University.

Jeremy Entner, Ph.D., joined Cornell’s Department of Statistics and Data Science as a Lecturer in 2019, where he teaches several courses including “Biological Statistics,” “The Theory of Interest,” and “Statistics for Risk Modeling.” He previously spent six years at the University of Tennessee at Martin teaching courses on mathematics and statistics. Dr. Entner holds a B.S. and M.A. in Mathematics from SUNY Brockport. He earned his Ph.D. in Mathematics with an Emphasis on Statistics from Syracuse University.

Key Course Takeaways
- Use R to perform mathematical operations, create sets of data, and apply functions to data
- Summarize a data set with the appropriate visualizations and statistics
- Use summarization techniques to interpret data, develop conclusions, and measure the uncertainty of those conclusions
- Answer questions by formulating hypotheses and testing them with real data
- Use simulation techniques to assess uncertainty
- Use linear regression to measure the strength of association between variables
- Clean a data set to answer specific questions
- Create data visualizations for exploratory data analysis and presentations

What You'll Earn
- Data Science Essentials Certificate from Cornell Ann S. Bowers College of Computing and Information Science
- 64 Professional Development Hours (6.4 CEUs)
Who Should Enroll
- Current and aspiring data scientists and analysts
- Business decision makers
- Marketing analysts
- Consultants
- Executives
- Anyone seeking to gain deeper exposure to data science
Frequently Asked Questions
Data-driven decision making now touches nearly every role, and the professionals who stand out are the ones who can move from a messy dataset to a clear, defensible takeaway. Cornell’s Data Science Essentials Certificate helps you build that capability by teaching you how to work in R, explore real datasets, visualize patterns, and communicate what the data is actually saying.
In this certificate program, authored by faculty from the Cornell Bowers College of Computing and Information Science, you will practice hands-on coding in a browser-based RStudio environment as you learn how to manipulate data, summarize categorical and numerical variables, quantify uncertainty, and build simple predictive models using regression. The learning experience is designed to stick, with short lessons followed by practice exercises, quizzes, and applied projects that reinforce the same workflow you use on the job.
You will also learn to clean and restructure real-world data using tidyverse tools, create high-quality visualizations with ggplot2, and work with common “messy” formats including missing values, text, and date-time fields. Throughout, expert facilitation and a small-cohort learning model help you stay on track and apply what you learn to questions that matter in your work.
If you want practical R fluency, confident statistical thinking, and the ability to turn raw data into clear, usable insights, you should choose Cornell's Data Science Essentials Certificate.
Many online data science courses rely on passive videos and isolated practice. Cornell’s Data Science Essentials Certificate is built to help you develop working capability through an applied, supported experience that mirrors how data work happens in practice.
You learn by doing, with frequent hands-on coding in R and immediate reinforcement through exercises, quizzes, and graded projects. The Data Science Essentials Certificate goes beyond “how to run code” by teaching you how to choose appropriate summaries and visualizations, avoid common inference mistakes like confusing correlation with causation, and quantify uncertainty when drawing conclusions from samples.
The learning model is also deliberately human centered. You will move through the program in a small cohort with an expert facilitator who guides discussions and provides feedback on your work. Live sessions offer opportunities to talk through questions and implementation challenges with peers. Inside the coding activities, an optional AI Coach can help you interpret error messages so you can debug more efficiently while keeping expert support at the center of the experience.
By the end of Cornell's Data Science Essentials Certificate, you are not just collecting syntax; you’re building a repeatable workflow for cleaning, exploring, and explaining data using R and tidyverse tools.
Enrolling in this certificate also provides you with a 6-month All-Access Pass to eCornell's live online AI Workshops, interactive sessions led by world-class Cornell faculty that combine Ivy League insight with practical applications for busy professionals. Each 3-hour Workshop features structured instruction, guided practice, and real tools to build competitive AI capabilities, plus the opportunity to connect with a global cohort of growth-oriented peers. While AI Workshops are not required, they enhance certificate programs through:
- Integrating AI perspectives across most curricula
- Responding to emerging AI developments and trends
- Offering direct engagement with Cornell faculty at the forefront of AI research
Cornell’s Data Science Essentials Certificate is designed for professionals who want an applied introduction to data science using R, whether you are building new analytics skills or strengthening your ability to collaborate with data scientists.
The Data Science Essentials Certificate is a strong fit if you:
- Are a current or aspiring data analyst or data scientist who wants practical experience working with data in R
- Make business decisions and want to interpret data summaries, visualizations, and uncertainty with more confidence
- Work in marketing, consulting, operations, or another function where you need to turn data into clear recommendations
- Want a structured learning experience with hands-on coding practice and feedback rather than a purely self-directed course
Because Cornell's Data Science Essentials Certificate program starts with R fundamentals and builds toward data cleaning, visualization, and regression-based prediction, it works well for motivated learners who want to develop a solid foundation and a repeatable workflow they can use on the job.
Across Cornell’s Data Science Essentials Certificate, your project work is designed to help you practice the core workflow you will use in real analysis: import data, clean and reshape it, explore patterns with summaries and visualizations, and draw conclusions while accounting for uncertainty.
You will complete applied, multi-part projects such as:
- Using R scripts to compute and compare statistical measures (for example, variance) as a way to build reproducible analysis habits
- Manipulating real datasets in R and answering specific questions by filtering, subsetting, and exporting analysis-ready tables
- Creating and customizing data visualizations, including comparing trends across multiple groups and saving plots for communication
- Designing a representative sample from a large dataset by grouping observations and sampling in a way that preserves key proportions
- Using simulations and resampling techniques to test hypotheses and quantify uncertainty in conclusions
- Building simple and multiple linear regression prediction rules, then evaluating model fit using residuals and R-squared
- Cleaning messy, real-world data with tidyverse tools, including reshaping wide and long data, joining datasets on a key, and addressing missing values
- Working with text and date-time fields to extract usable information for analysis and visualization
The result is a portfolio of practical outputs that reflect day-to-day analytics tasks, not just end-of-chapter exercises.
Cornell’s Data Science Essentials Certificate equips you to contribute more confidently to data-driven work by building practical skill in R, data cleaning, visualization, and statistical reasoning, so you can move from raw data to defensible insights.
After completing the Data Science Essentials Certificate, you will be prepared to:
- Use R to perform mathematical operations, create sets of data, and apply functions to data
- Summarize a data set with the appropriate visualizations and statistics
- Use summarization techniques to interpret data, develop conclusions, and measure the uncertainty of those conclusions
- Answer questions by formulating hypotheses and testing them with real data
- Use simulation techniques to assess uncertainty
- Use linear regression to measure the strength of association between variables
- Clean a data set to answer specific questions
- Create data visualizations for exploratory data analysis and presentations
Students often describe long-term benefits that show up quickly at work: stronger confidence writing and debugging R code, a more “data scientist” way of thinking about questions and evidence, and a smoother path from learning concepts to applying them through hands-on practice, quizzes, and projects. Many also highlight that the program’s real-world datasets, downloadable reference tools, and integrated coding environment make it easier to reuse what you learned after the courses end. Learners frequently note that the pacing works alongside busy schedules and that support from facilitators and teaching staff helps them stay engaged while building job-relevant analytics capability.
What truly sets eCornell apart is how our programs unlock genuine career transformation. Learners earn promotions to senior positions, enjoy meaningful salary growth, build valuable professional networks, and navigate successful career transitions.
Cornell’s Data Science Essentials Certificate, which consists of 4 short courses, is designed to be completed in 2 months. Each course runs for 2 weeks, with a typical weekly time commitment of 5 to 8 hours.
Designed for working professionals, each course is intensive and focused. The core work (videos, readings, coding exercises, quizzes, and projects) is largely asynchronous, so you can learn when it fits your schedule.
To keep you moving forward, the Data Science Essentials Certificate also includes cohort pacing, facilitator-guided discussions, and opportunities for live sessions. That structure adds accountability and support while still giving you flexibility week to week.
Students in Cornell's Data Science Essentials Certificate consistently describe a practical, job-relevant learning experience that helps them build real capability in data analysis and coding with R, even within a busy work and life schedule. They often highlight how the program blends clear instruction with immediate application so concepts stick and confidence grows.
What students commonly emphasize includes:
- Hands-on R coding practice that builds foundational data science skills
- Real-world datasets and examples that mirror workplace analytics tasks
- A learn-then-do structure with short lessons followed by practice, quizzes, and projects
- Tools and downloadable reference materials that are useful after the course ends
- An integrated coding environment that makes it easy to test and refine code as you learn
- A logical progression across modules that helps students think more like a data scientist
- Clear, concise teaching that explains not just how to do something, but why it works
- A flexible, self-paced format that suits working adults and busy parents
- A well-organized, easy-to-navigate online platform with a professional learning experience
- Responsive facilitators and teaching support with timely, actionable feedback
- A challenging but achievable level of rigor that keeps learners engaged and progressing
- Skills students report applying quickly to day-to-day work, with increased confidence in analytics
Prior programming experience is not required to begin Cornell’s Data Science Essentials Certificate, but you should be ready to learn by writing code frequently.
You will start with R fundamentals such as arithmetic, variables, functions, scripts, and importing data, then build toward data manipulation, visualization, simulation-based inference, regression, and tidyverse-based data cleaning. Because the Data Science Essentials Certificate includes many hands-on exercises and projects in R, comfort with focused practice and troubleshooting will help you succeed.
For additional support while you learn, the coding activities include an optional AI Coach (also called Coding Coach) that can help explain error messages, and your facilitator can help you work through concepts and application questions.
Cornell’s Data Science Essentials Certificate uses a virtual programming environment for hands-on work in R. Plan to use a computer with a compatible browser so you can complete the coding exercises smoothly.
For the built-in R environment, use Chrome, Firefox, or Internet Explorer on a desktop or laptop. Safari, Microsoft Edge, tablets, and mobile devices are not supported for the virtual coding environment in this certificate. A reliable internet connection is also important for accessing course materials and interactive activities.
A major focus of Cornell’s Data Science Essentials Certificate is learning how to take messy data and make it usable for analysis. You will practice a repeatable cleaning workflow in R using tidyverse tools that data scientists rely on in real projects.
In the Data Science Essentials Certificate, you will learn how to filter and select variables, create new variables, group and summarize data, and produce clear visualizations for quick exploratory analysis. You’ll also work on the kinds of structural problems that slow teams down, including reshaping data between wide and long formats, joining related datasets on a key, and addressing missing values.
Because real work often includes nonstandard fields, you will also practice cleaning and extracting meaning from text and date-time data so you can analyze and visualize patterns that would otherwise stay hidden.

| Program | Cost |
|---|---|
| Data Science Essentials | $3,750 |