Natural Language Processing With Python
Cornell Certificate Program
Overview and Courses
There’s an abundance of textual information in the world, and more is being created each day. Working with this vast amount of text is a significant challenge for humans, as it would be impossible for individuals to read millions of web search queries, product descriptions, emails, and articles. The answer is natural language processing (NLP). NLP solutions continue to expand, with more and more applications in machine learning and beyond being discovered every day. Organizations employ NLP for textual analysis and classification as well as more advanced tasks such as writing, coding, and reasoning.
In this certificate program, you’ll cover the fundamentals of NLP, including how to teach a computer where a word starts and ends, as well as more advanced skills like how to program a computer to determine what sentences mean. Throughout the courses, you’ll have the opportunity to implement numerous string and text processing techniques, work with machine learning algorithms to determine how similar documents are to one another, and train machine learning models to optimize the extraction of meaningful data from documents. While gaining valuable practice with Python functions and expressions, you will also learn to process text with NLP-specific packages that extend Python’s capabilities, including the Natural Language Toolkit (NLTK), Gensim, spaCy, regex, and SentenceTransformers. By the end of the program, you will have the theoretical basis and technical expertise to apply NLP in the workplace and in your own projects.
In order to be successful in this program, students should have a working knowledge of Python programming as well as college-level knowledge of linear algebra and statistics.
The courses in this certificate program must be completed in the order in which they appear.
Course list
- Mar 26, 2025
- Jun 4, 2025
- Aug 13, 2025
- Oct 22, 2025
- Dec 31, 2025
If you want to compare two large bodies of text, you can compare the text directly: turn each text into tokens, then measure how many tokens overlap. Sometimes, however, you don't just want to know whether two texts are different (a binary comparison); you want to know how different they are (a fuzzy comparison). In this course, you will transform text into numeric vectors, which allows you to perform arithmetic operations on textual information to calculate similarity. This is a classical natural language processing (NLP) technique, and it begins by creating different kinds of vectors. You will create both sparse and dense vectors, and you will compare vectors of different sizes to see how information is captured. Finally, you will measure similarity among document vectors, which is the real power of turning text into vectors. Determining how similar two or more documents are is a common use of NLP, and you will practice this technique through hands-on exercises and projects.
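As a rough illustration of the two kinds of comparison described above, the sketch below measures token overlap (Jaccard) and then cosine similarity between sparse term-count vectors. It uses only the Python standard library; the function names and the two sample sentences are assumptions made for this example, not material from the course.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def jaccard_overlap(tokens_a, tokens_b):
    """Share of distinct tokens that two token lists have in common."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two sparse term-count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents.
doc_a = tokenize("The cat sat on the mat")
doc_b = tokenize("The cat lay on the rug")

print(round(jaccard_overlap(doc_a, doc_b), 3))    # 3 shared of 7 distinct tokens
print(round(cosine_similarity(doc_a, doc_b), 3))  # uses counts, so "the" weighs more
```

The Counter objects act as sparse vectors: only terms that occur are stored. Dense vectors, such as sentence embeddings, would be compared with the same cosine formula over fixed-length lists of floats.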
You are required to have completed the following course or have equivalent experience before taking this course:
- Natural Language Processing Fundamentals
- Apr 16, 2025
- Jun 25, 2025
- Sep 3, 2025
- Nov 12, 2025
In this course, you will start to use machine learning methods to further your exploration of document-term matrices (DTMs). You will use a DTM to create train and test sets with the scikit-learn package in Python, an important first step in categorizing documents. You will also examine different models and determine how to select the most appropriate one for your particular natural language processing task. Finally, after you have chosen, trained, and tested a model, you will work with several evaluation metrics to measure how well it performed. The technical skills and evaluation processes you study in the course will provide valuable experience for the workplace and beyond.
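To make the evaluation step concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1 by hand for a single positive class. The label lists are hypothetical and the helper name is an assumption for this example; in practice scikit-learn's `sklearn.metrics` module offers equivalent functions.

```python
def classification_metrics(y_true, y_pred, positive_label):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical true and predicted labels for eight held-out test documents.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "spam"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "spam", "ham", "spam"]

metrics = classification_metrics(y_true, y_pred, positive_label="spam")
print(metrics)
```

Precision asks how many documents flagged as the positive class really belong to it, while recall asks how many of the true positives were found. Reporting both, along with F1, guards against a model that looks accurate only because one class dominates the test set.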
You are required to have completed the following courses or have equivalent experience before taking this course:
- Natural Language Processing Fundamentals
- Transforming Text Into Numeric Vectors
- May 7, 2025
- Jul 16, 2025
- Sep 24, 2025
- Dec 3, 2025
Can a computer tell the difference between an article on “jaguar” the animal and “Jaguar” the car? It can if we teach it how. In this course, you will extract key phrases or words from a document, which is a key step in the process of text summarization. Part of what makes natural language processing (NLP) so powerful is that it processes text at scale, when a human would simply take too long to perform the same task given the sheer number of text documents to be read and processed. A classic use of NLP, then, is to summarize long documents, whether they are articles or books, in order to create a more easily readable abstract, or summary.
Extracting keywords or key phrases is a first step in this direction, which is where you will start in this course. Once you have trained a computer to identify the most important words in a document, you have to train it to identify the most important sentences. This is the second step in extracting information from a document to help create an abstract, and you will perform this step on larger text documents as well. Finally, you will calculate and interpret similarity metrics to compute the degree of similarity among documents that are possibly related to one another. The techniques you use throughout this course will prove useful in specific situations at work and beyond as you support your team or achieve your personal goals.
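The two steps described above, picking important words and then important sentences, can be sketched with simple word frequencies. This is a deliberately basic stand-in rather than the method taught in the course; the stop-word list, function names, and sample text are all assumptions for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real projects use a much longer one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to"}


def content_words(text):
    """Lowercase word tokens with stop words removed."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def top_keywords(text, k=3):
    """The k most frequent content words (ties keep first-seen order)."""
    return [word for word, _ in Counter(content_words(text)).most_common(k)]


def top_sentences(text, k=1):
    """The k sentences whose content words are most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(sentences, key=score, reverse=True)[:k]
    # Restore the original order so the extract reads naturally.
    return [s for s in sentences if s in ranked]


# Hypothetical sample document.
text = (
    "Jaguars are large cats. "
    "Jaguars hunt in the rain forest. "
    "The forest is shrinking quickly."
)

print(top_keywords(text, 2))
print(top_sentences(text, 1))
```

Each sentence is scored by the average document-wide frequency of its content words, so a sentence that mentions several recurring terms ranks above one that mentions only rare ones.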
You are required to have completed the following courses or have equivalent experience before taking this course:
- Natural Language Processing Fundamentals
- Transforming Text Into Numeric Vectors
- Classifying Documents With Supervised Machine Learning
- May 28, 2025
- Aug 6, 2025
- Oct 15, 2025
- Dec 24, 2025
In this course, you will focus on measuring distance — the dissimilarity of various documents. The goal is to discover how alike or unlike various groups of text documents are to one another. At scale, this is a problem you might encounter if you need to group thousands of products together purely by using their product description or if you would like to recommend a movie to someone based on whether they liked a different movie. You will work with several different data sets and use both hierarchical and k-means clustering to create clusters, and you will practice with several distance measures to analyze document similarity. Finally, you will create visualizations that help to convey similarity in powerful ways so stakeholders can easily understand the key takeaways of any clustering or distance measure that you create.
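The sketch below illustrates two of the ideas in this description: distance measures between document vectors and k-means clustering. It is a bare-bones version under stated assumptions: the four document vectors are made-up term counts, the starting centroids are supplied by hand so the result is deterministic, and hierarchical clustering is not shown.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def k_means(vectors, centroids, iterations=10):
    """Plain k-means with caller-supplied starting centroids."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: euclidean(v, centroids[i]))
            for v in vectors
        ]
        # Move each centroid to the mean of its assigned vectors.
        for i in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical term counts for the columns: battery, screen, plot, actor.
docs = [
    [3, 2, 0, 0],  # phone description
    [2, 3, 0, 0],  # tablet description
    [0, 0, 3, 2],  # movie synopsis
    [0, 1, 2, 3],  # movie review
]

labels, centroids = k_means(docs, centroids=[docs[0], docs[2]])
print(labels)
print(round(cosine_distance(docs[0], docs[1]), 3))
print(cosine_distance(docs[0], docs[2]))
```

The two product descriptions land in one cluster and the two movie texts in the other. Cosine distance ignores vector length, which often suits documents of different sizes better than Euclidean distance does.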
You are required to have completed the following courses or have equivalent experience before taking this course:
- Natural Language Processing Fundamentals
- Transforming Text Into Numeric Vectors
- Classifying Documents With Supervised Machine Learning
- Topic Modeling With Unsupervised Machine Learning
- Apr 9, 2025
- Jun 18, 2025
- Aug 27, 2025
- Nov 5, 2025
We have all been misunderstood when sending a text message or email, as tone often does not translate well in written communication. Similarly, computers can have a hard time discerning the meaning of words that are used sarcastically, such as when we say “Great weather” when it's raining. If you are automatically processing reviews of your product, a negative review will have many of the same keywords as a positive one, so you will need to train a model to distinguish between a good review and a bad review. This is where semantic and sentiment analysis come in.
In this course, you will examine many kinds of semantic relationships that words can have (such as hypernyms, hyponyms, or meronyms), which go a long way toward extracting the meaning of documents at scale. You will also implement named entity recognition to identify proper nouns within a document and use several techniques to determine the sentiment of text: Is the tone positive or negative? These invaluable skills can easily turn the tide in a difficult project for your team at work or on the path toward achieving your personal goals.
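One simple way to answer the question "Is the tone positive or negative?" is a lexicon-based score, sketched below. The word weights and negation list are toy assumptions invented for this example, and the approach is only one of many possible techniques, not necessarily one used in the course.

```python
import re

# Toy sentiment lexicon; the words and weights are illustrative assumptions.
LEXICON = {"great": 2, "good": 1, "love": 2, "bad": -1, "poor": -1, "terrible": -2}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum lexicon weights, flipping a word's sign when it follows a negation."""
    score = 0
    previous = ""
    for word in re.findall(r"[a-z']+", text.lower()):
        weight = LEXICON.get(word, 0)
        if previous in NEGATIONS:
            weight = -weight
        score += weight
        previous = word
    return score


def sentiment_label(text):
    """Map the numeric score to a positive, negative, or neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("I love this phone, great battery"))
print(sentiment_label("The screen is not good"))
print(sentiment_label("Great weather"))
```

The last call shows the limitation raised earlier: a sarcastic "Great weather" still scores as positive, because a word list has no view of context. That gap is one reason trained models are used for sentiment analysis.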
You are required to have completed the following courses or have equivalent experience before taking this course:
- Natural Language Processing Fundamentals
- Transforming Text Into Numeric Vectors
- Classifying Documents With Supervised Machine Learning
- Topic Modeling With Unsupervised Machine Learning
- Clustering Documents With Unsupervised Machine Learning
Key Course Takeaways
- Apply classic NLP techniques to text in order to identify patterns and make processing tasks more computationally efficient
- Transform words into numeric vectors that preserve their semantic information so that you can perform calculations on text
- Apply supervised machine learning classification models in order to assign categories to text
- Apply unsupervised machine learning models to summarize text as a short paragraph or set of keywords and assign topics to text
- Apply unsupervised machine learning models to relate similar groups of documents and apply different metrics to determine text similarity
- Conduct semantic and sentiment analysis on text in order to extract meaning from documents

What You'll Earn
- Natural Language Processing With Python Certificate from Cornell Bowers College of Computing and Information Science
- 144 Professional Development Hours (14.4 CEUs)
Who Should Enroll
- Engineers
- Software developers
- Computer scientists new to NLP
- Data scientists
- Analysts
- Researchers
- Linguists

“Completing a program from eCornell really has allowed me to think outside the box at work. It gave me the confidence I needed to take a seat at that table and say I am ready.”

Natural Language Processing With Python
Cost: $3,750