What you will learn
- Fundamental R programming skills
- Statistical concepts such as probability, inference, and modeling, and how to apply them in practice
- Experience with the tidyverse, including data visualization with ggplot2 and data wrangling with dplyr
- Familiarity with essential tools for practicing data scientists, such as Unix/Linux, git and GitHub, and RStudio
- How to implement machine learning algorithms
- In-depth knowledge of fundamental data science concepts through motivating real-world case studies
Program Class List
1. Data Science: R Basics
Build a foundation in R and learn how to wrangle, analyze, and visualize data.
2. Data Science: Visualization
Learn basic data visualization principles and how to apply them using ggplot2.
3. Data Science: Probability
Learn probability theory -- essential for a data scientist -- using a case study on the financial crisis of 2007-2008.
4. Data Science: Inference and Modeling
Learn inference and modeling, two of the most widely used statistical tools in data analysis.
5. Data Science: Productivity Tools
Keep your projects organized and produce reproducible reports using GitHub, git, Unix/Linux, and RStudio.
6. Data Science: Wrangling
Learn to process and convert raw data into formats needed for analysis.
7. Data Science: Linear Regression
Learn how to use R to implement linear regression, one of the most common statistical modeling approaches in data science.
8. Data Science: Machine Learning
Build a movie recommendation system and learn the science behind one of the most popular and successful data science techniques.
9. Data Science: Capstone
Show what you've learned from the Professional Certificate Program in Data Science.
Meet your instructor
Professor of Biostatistics at Harvard University
Rafael Irizarry is a Professor of Biostatistics at the Harvard T.H. Chan School of Public Health and a Professor of Biostatistics and Computational Biology at the Dana-Farber Cancer Institute. For the past 15 years, Dr. Irizarry’s research has focused on the analysis of genomics data. During this time, he has also taught several classes, all related to applied statistics. Dr. Irizarry is one of the founders of the Bioconductor Project, an open source and open development software project for the analysis of genomic data. His publications on these topics have been highly cited, and his software implementations have been widely downloaded.