Data Science: Inferential Thinking through Simulations

Course 2 of 3: Professional Certificate® in the Foundations of Data Science

5 Weeks, 4–6 hours per week
1354

About this course

This course will teach you the power of statistical inference: given a random sample, how do we predict some quantity that we cannot observe directly?

Using real-world examples from a wide array of domains including law, medicine and football, you’ll learn how data scientists draw conclusions about unknowns based on the data available. Often, the data we have is incomplete, yet we’d still like to draw inferences about the world and quantify the uncertainty in our conclusions. This is called statistical inference. In this course, you will learn the framework for statistical inference and apply it to real-world data sets.

Notably, you will develop the practice of hypothesis testing: comparing theoretical predictions to actual data, and deciding whether the data are consistent with those predictions. This method allows us to evaluate theories or hypotheses about how the world works.
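As a rough illustration of the idea, here is a minimal Python sketch of a simulation-based hypothesis test. The scenario (60 heads in 100 coin tosses), the function name `simulate_p_value`, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the course materials. The sketch simulates what a fair coin would produce and checks how often the simulated result is at least as far from the prediction as the observed one.

```python
import random


def simulate_p_value(observed_heads, num_tosses, num_simulations=10_000, seed=0):
    """Estimate how often a fair coin looks at least as extreme as the data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    expected = num_tosses / 2  # prediction under the "fair coin" hypothesis
    observed_stat = abs(observed_heads - expected)

    at_least_as_extreme = 0
    for _ in range(num_simulations):
        # Simulate one experiment under the hypothesis being tested.
        heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
        if abs(heads - expected) >= observed_stat:
            at_least_as_extreme += 1

    # Proportion of simulations at least as far from the prediction as the data.
    return at_least_as_extreme / num_simulations


# Hypothetical observation: 60 heads in 100 tosses.
p_value = simulate_p_value(observed_heads=60, num_tosses=100)
print(f"Empirical p-value: {p_value:.3f}")
```

A small proportion suggests the observed data would be unusual if the prediction were true; a large one suggests the data are consistent with it.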

You will also learn how to quantify the uncertainty in the conclusions you draw from hypothesis testing. This helps assess whether patterns that appear to be present in the data actually represent a true relationship in the world, or whether they might merely reflect random fluctuations due to noise. Throughout this course, we will go over multiple methods for estimation and hypothesis testing, based on simulations and the bootstrap method. Finally, you will learn about randomized controlled experiments and how to draw conclusions about causality.
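To make the bootstrap idea concrete, the following is a minimal sketch, not the course's own code, of a percentile bootstrap confidence interval for a mean. The data are synthetic, and the function name `bootstrap_mean_interval` and its parameters are assumptions made for this example. The sample is resampled with replacement many times, and the middle 95% of the resampled means is reported as the interval.

```python
import random
import statistics


def bootstrap_mean_interval(sample, num_resamples=5_000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n values from the original sample with replacement.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(num_resamples)
    )
    tail = (1 - confidence) / 2
    lower = means[round(tail * num_resamples)]
    upper = means[round((1 - tail) * num_resamples) - 1]
    return lower, upper


# Synthetic data for illustration only: 200 draws centered near 50.
data_rng = random.Random(42)
sample = [data_rng.gauss(50, 10) for _ in range(200)]

lower, upper = bootstrap_mean_interval(sample)
print(f"Approximate 95% interval for the mean: ({lower:.2f}, {upper:.2f})")
```

The width of the interval is one way to express the uncertainty in the estimate: a larger sample would generally produce a narrower interval.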

The course emphasizes the conceptual basis of inference, the logic of the decision-making process, and the sound interpretation of results.

What you’ll learn

  • The logical and conceptual frameworks of statistical inference
  • How to conduct hypothesis testing, permutation testing, and A/B testing
  • The purpose and power of resampling methods
  • Relations between sample size and accuracy
  • P-values, quantifying uncertainty, and generating confidence intervals using the bootstrap method
  • How to interpret the results from hypothesis testing
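As an example of the kind of technique listed above, here is a hedged sketch of a permutation test for an A/B comparison. The two groups are synthetic, and the function name `permutation_test` and its parameters are hypothetical. The group labels are shuffled many times to see how often a difference in means as large as the observed one arises by chance alone.

```python
import random
import statistics


def permutation_test(group_a, group_b, num_permutations=5_000, seed=0):
    """Empirical p-value for the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))

    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(num_permutations):
        # Shuffling the pooled values reassigns the A/B labels at random.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / num_permutations


# Synthetic groups for illustration only, with different underlying means.
data_rng = random.Random(1)
group_a = [data_rng.gauss(10, 2) for _ in range(50)]
group_b = [data_rng.gauss(12, 2) for _ in range(50)]

ab_p_value = permutation_test(group_a, group_b)
print(f"Permutation test p-value: {ab_p_value:.4f}")
```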

Prerequisites

Foundations of Data Science: Computational Thinking with Python

Meet Your Instructors

Ani Adhikari

Teaching Professor of Statistics at UC Berkeley

Ani Adhikari, Senior Lecturer in Statistics at UC Berkeley, has received the Distinguished Teaching Award at Berkeley and the Dean's Award for Distinguished Teaching at Stanford University. While her research interests are centered on applications of statistics in the natural sciences, her primary focus has always been on teaching and mentoring students. She teaches courses at all levels and has a particular affinity for teaching statistics to students who have little mathematical preparation. She received her undergraduate degree from the Indian Statistical Institute, and her Ph.D. in Statistics from Berkeley.

John DeNero

Giancarlo Teaching Fellow in the EECS Department at UC Berkeley

John DeNero is the Giancarlo Teaching Fellow in the UC Berkeley EECS Department. He joined the Cal faculty in 2014 to focus on undergraduate education in computer science and data science. He teaches and co-develops two of the largest courses on campus: introductory computer science for majors (3000 students per year) and introductory data science (1500 students per year).

David Wagner

Professor of Computer Science at UC Berkeley

David Wagner is Professor of Computer Science at the University of California at Berkeley. He has published over 100 peer-reviewed papers in the scientific literature and has co-authored two books on encryption and computer security. His research has analyzed and contributed to the security of cellular networks, 802.11 wireless networks, electronic voting systems, and other widely deployed systems.

Experience Level

Introductory

Learning Partner

University of California, Berkeley

Program Type

Professional Certificate

Subject

Data Analysis & Statistics, Data Science, IT