Overview and Courses

In today's data-driven world, advanced data modeling techniques are essential for enabling informed decision making and strategic planning.

This certificate program is designed to help you understand predictive modeling, with a focus on making accurate predictions using various types of data. Throughout this program, you will explore models such as polynomial regression, splines, and generalized additive models. These models are used to analyze complex relationships within datasets that may include both numerical and categorical variables. You’ll also gain practical skills in building models using R, which will allow you to examine how different types of information can be combined to make predictions.

You will have the opportunity to practice modeling interactions between different types of data, such as categories and numbers, and use decision trees to understand complex relationships that linear models are unable to capture. By the end of the program, you’ll be able to create and evaluate predictive models, equipping you with valuable skills for decision making in a variety of industries.

To be successful in this program, you should have a foundation in R programming and be able to use it to create and summarize datasets, clean and interpret data, build visualizations, run simulations, and fit linear regression models. Experience with R is critical because this certificate does not explicitly teach how to use R. High school-level or college-level math and algebra are also recommended. If you do not have this experience, start with the Data Science Essentials certificate program.

You’ll have six months to complete the required elements of this certificate program, and the flexible format lets you finish sooner if your schedule allows.

COURSE 1: Nonlinear Regression Models

Nonlinear regression models are essential for capturing complex relationships between predictor and response variables that linear regression cannot adequately describe. In this course, you will engage with the theoretical foundations of these models, gain practical experience in their application, and develop the skills necessary to interpret and evaluate their results. This course is designed to equip you with a comprehensive understanding of nonlinear regression models, with a focus on polynomial regression, splines, and generalized additive models (GAMs).

COURSE 2: Modeling Interactions Between Predictors

In this course, you will explore strategies for incorporating categorical predictors in a regression model, including using dummy variables to represent different categories. You will inspect binary and nonbinary categorical variables and discover how to interpret the estimated coefficients of dummy variables.

As you progress through the course, you will practice modeling and interpreting interactions between categorical and quantitative predictors in a linear model. Finally, you’ll focus on defining and implementing decision trees, which are advantageous for capturing complex interactions between predictors that linear models may be unable to capture. By the end of the course, you’ll be equipped to transform categorical variables into numerical variables, fit regression models with categorical predictors, interpret dummy variable coefficients, and use decision trees for modeling complex relationships between predictors.
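As a rough illustration of the dummy-variable idea described above, the sketch below encodes a nonbinary categorical predictor as 0/1 indicator columns, leaving one level out as the baseline. The course works in R; Python is used here only for illustration, and the helper name `dummy_encode` and the `regions` data are hypothetical.

```python
# Sketch: dummy (indicator) coding for a categorical predictor.
# `dummy_encode` and the example data are illustrative, not from the course.

def dummy_encode(values, baseline=None):
    """Return (column_names, rows) with one 0/1 column per non-baseline level."""
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]  # default reference level: first in sorted order
    kept = [lv for lv in levels if lv != baseline]
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows

regions = ["north", "south", "west", "south", "north"]
names, encoded = dummy_encode(regions, baseline="north")

print(names)
for region, row in zip(regions, encoded):
    print(region, row)
```

In a linear model with an intercept, the coefficient on each dummy column is the estimated difference in mean response between that level and the baseline level, holding the other predictors fixed.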

COURSE 3: Foundations of Predictive Modeling

The goal of this course is to introduce you to the fundamental concepts and techniques used in predictive modeling. You will evaluate the balance between model flexibility and interpretability, examine how to select the best parameters using cross-validation, and practice building models that generalize well to new data. You’ll also explore techniques for splitting datasets, selecting tuning parameters, and fitting models using loss functions. By the end of the course, you’ll have a solid understanding of model flexibility, interpretability, and the bias-variance trade-off, equipping you to effectively build and evaluate predictive models.
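One way to picture tuning-parameter selection with cross-validation is the minimal sketch below. It assumes a k-nearest-neighbors regression on a single predictor, squared-error loss, and simulated data; none of these choices come from the course, which works in R. The function names are hypothetical.

```python
import random

def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the row indices and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def knn_predict(train_x, train_y, x0, k):
    """Average the responses of the k training points closest to x0."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def cv_mse(x, y, k, n_folds=5, seed=0):
    """Cross-validated mean squared error for a given number of neighbors k."""
    total = 0.0
    for held_out in k_fold_indices(len(x), n_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        tx = [x[i] for i in train]
        ty = [y[i] for i in train]
        for i in held_out:
            total += (y[i] - knn_predict(tx, ty, x[i], k)) ** 2
    return total / len(x)

# Simulated data (an assumption): a curved signal plus Gaussian noise.
rng = random.Random(1)
x = [i / 10 for i in range(60)]
y = [xi ** 2 + rng.gauss(0, 1) for xi in x]

# Small k is flexible (low bias, high variance); large k is smoother.
candidates = [1, 3, 5, 9, 15]
scores = {k: cv_mse(x, y, k) for k in candidates}
best_k = min(scores, key=scores.get)
print(scores)
print("selected k:", best_k)
```

Each observation is predicted only by a model that never saw it, so the score estimates how well each setting would generalize to new data rather than how well it fits the training data.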

COURSE 4: Ensemble Methods

Real-world datasets are often too complex for a single model to capture. Ensemble methods address this by combining simpler models so that together they capture more of the patterns in the data, which improves predictive power.

In this course, you will discover how to use two ensemble methods: random forests and boosted decision trees. You’ll practice both with datasets in R and apply them to build robust predictive models. You’ll first improve on the performance of a single decision tree using random forest models and practice interpreting those models. You’ll then apply boosting, which reduces errors by aggregating the predictions of many decision trees.
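The idea of combining simple models can be sketched with bagging: fit many one-split regression trees ("stumps") on bootstrap resamples and average their predictions. This is a simplified relative of a random forest, not a full implementation (a random forest also samples predictors at each split), and it is written in Python for illustration even though the course uses R. All function names and the simulated data are hypothetical.

```python
import random

def fit_stump(x, y):
    """Fit a one-split regression tree by minimizing squared error."""
    best = None
    # Candidate thresholds skip the smallest x so both sides are nonempty.
    for t in sorted(set(x))[1:]:
        left = [yi for xi, yi in zip(x, y) if xi < t]
        right = [yi for xi, yi in zip(x, y) if xi >= t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every x is identical: fall back to the overall mean
        m = sum(y) / len(y)
        return (float("inf"), m, m)
    _, t, lm, rm = best
    return (t, lm, rm)

def predict_stump(stump, x0):
    t, left_mean, right_mean = stump
    return left_mean if x0 < t else right_mean

def fit_bagged(x, y, n_models=25, seed=0):
    """Fit one stump per bootstrap resample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(x)
    models = []
    for _ in range(n_models):
        sample = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([x[i] for i in sample], [y[i] for i in sample]))
    return models

def predict_bagged(models, x0):
    """Aggregate by averaging the individual stump predictions."""
    return sum(predict_stump(m, x0) for m in models) / len(models)

# Simulated data (an assumption): a noisy step from about 0 to about 3 at x = 5.
rng = random.Random(2)
x = [i / 4 for i in range(40)]
y = [(0.0 if xi < 5 else 3.0) + rng.gauss(0, 0.5) for xi in x]

models = fit_bagged(x, y)
print(predict_bagged(models, 1.0), predict_bagged(models, 9.0))
```

Averaging over resamples mainly reduces variance. Boosting takes a different route, fitting trees one after another so that each new tree targets the errors left by the trees before it.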

AI Workshops All-Access Pass

LIVE

6-month subscription included

eCornell online Workshops are live, interactive learning experiences lasting 1 to 4 hours and led by Cornell faculty experts. These premium, short-format sessions focus on AI topics and are designed for busy professionals who want to gain immediately applicable skills and strategic perspectives. Workshops may include faculty presentations, breakout discussions, and guided hands-on practice.

The AI Workshops All-Access Pass provides you with unlimited participation for 6 months from your date of purchase. Whether you choose to attend one Workshop per month or several per week, the All-Access Pass allows you to customize your AI journey and stay on top of the latest AI trends.

Hosted by Cornell faculty at the forefront of their fields, Workshops cover a range of cutting-edge AI topics applicable across industries. Workshops are offered at three levels to allow you to choose topics that match your experience.

AI Foundations
- These Workshops introduce core AI concepts, terminology, capabilities, limitations, and practical applications. No prior AI experience is required.
- Best for: Beginners, AI-curious professionals, and teams starting their AI journey.
AI in Practice
- These Workshops focus on practical skills, workflows, and strategies that help participants use AI more effectively in their day-to-day work. Some familiarity with AI tools is recommended.
- Best for: Professionals who have experimented with AI and want to build confidence and capability.
AI Leadership and Transformation
- These advanced Workshops explore emerging technologies, strategic implementation, governance, organizational impact, and specialized applications. AI fluency is expected, and some Workshops may have prerequisites.
- Best for: AI leaders, transformation teams, executives, and advanced practitioners.

How It Works

Format

Mentored Learning
All online

Time Commitment

64 hours with 6 months of access at your own pace

Engagement

100% self-paced

Power Your Career

Gain today’s most in-demand skills to stand apart.

Flexibility Fits Your Life

Learn on your schedule without stepping out of your job.

Personalized Facilitation

Receive expert feedback and guidance.

Real-world Projects

Apply learning and insights to your work to make an impact right away.

Learn From Top Minds

Courses are developed by Cornell faculty.

Faculty Author

Sumanta Basu

Assistant Professor

Cornell Bowers Computing and Information Science

Assistant Professor, Cornell Bowers CIS; Shayegani Bruno Family Faculty Fellow, Cornell Department of Computational Biology

Sumanta Basu is an Assistant Professor in the Department of Statistics and Data Science at Cornell University. Broadly, his research interests are structure learning and the prediction of large systems from data, with a particular emphasis on developing learning algorithms for time series data. Professor Basu also collaborates with biological and social scientists on a wide range of problems, including genomics, large-scale metabolomics, and systemic risk monitoring in financial markets. His research is supported by multiple awards from the National Science Foundation and the National Institutes of Health. At Cornell, Professor Basu teaches “Introductory Statistics” for graduate students outside the Statistics Department and “Computational Statistics” for Statistics Ph.D. students. He also serves as a faculty consultant at the Cornell Statistical Consulting Unit, which assists the broader Cornell community with various aspects of analyzing empirical research. Professor Basu received his Ph.D. from the University of Michigan and was a postdoctoral scholar at the University of California, Berkeley, and Lawrence Berkeley National Laboratory. Before he received his Ph.D., Professor Basu was a business analyst, working with large retail companies on the design and data analysis of their promotional campaigns.

Key Course Takeaways

Select an optimal model based on modeling goals and characteristics of a dataset
Identify when data characteristics call for a nonlinear model, and implement it
Detect when an interaction between predictors would improve a model
Improve predictive accuracy by combining different models into an ensemble

Enroll Now

Download a Brochure

Not ready to enroll but want to learn more? Download the certificate brochure to review program details.

Download Now

“I like to think outside of the box, and this program from eCornell helped me conceptualize how I want to approach data problems going forward. I was able to actually apply new course concepts to my work, rather than simply repeat steps with different values.”

‐ Mark T.

What You'll Earn

Data Science Modeling Certificate from Cornell’s Ann S. Bowers College of Computing and Information Science
64 Professional Development Hours (6.4 CEUs)

Start Now

Watch the Video

Hear eCornell students share their stories.

Discover More

Who Should Enroll

Current and aspiring data scientists and analysts
Business decision makers
Marketing analysts
Consultants
Executives
Anyone seeking to gain deeper exposure to data science

Data Science for Machine Learning (Cornell Certificate Program)

Explore Related Programs

Data Science

Business Analytics

Data Management in SQL

Supply Chain Analytics

Big Data for Policy

Data Science With SQL and Tableau

Data Analytics

Python for Data Science

Data Analytics 360

Data Science Essentials

Business Statistics

Data Science and Decision Making

Data Analytics in R

Applied Statistics

Data Ethics

Operations Analytics

Address: 950 Danby Rd., Suite 150, Ithaca, NY 14850