
What you will learn

  • The history of data science, tangible illustrations of how data science and analytics are used in decision making across multiple sectors today, and expert opinion on what the future might hold
  • A practical understanding of the fundamental methods used by data scientists, including statistical thinking and conditional probability, machine learning and algorithms, and effective approaches for data visualization
  • The major components of the Internet of Things (IoT) and the potential of IoT to transform the way we live and work in the not-too-distant future
  • How data scientists are using natural language processing (NLP) and audio and video processing to extract useful information from books, scientific articles, Twitter feeds, voice recordings, YouTube videos, and much more

Program Class List

1
Statistical Thinking for Data Science and Analytics

Course Details
Learn how statistics plays a central role in the data science approach.

2
Machine Learning for Data Science and Analytics

Course Details
Learn the principles of machine learning and the importance of algorithms.

3
Enabling Technologies for Data Science and Analytics: The Internet of Things

Course Details
Discover the relationship between Big Data and the Internet of Things (IoT).

Meet your instructors

Tian Zheng

About Me

Tian Zheng is an associate professor of Statistics at Columbia University. She obtained her PhD from Columbia in 2002. Her research develops novel methods and improves existing methods for exploring and analyzing interesting patterns in complex data from different application domains. Her current projects are in the fields of statistical genetics, bioinformatics and computational biology, feature selection and classification for high dimensional data, and network analysis. In particular, Dr. Zheng has been developing statistical and computational tools for high dimensional data, searching for genetic interactions associated with complex human disorders, quantifying social structure, and studying hard-to-reach populations using survey questions, with more than 40 peer-reviewed publications in journals including JASA, AOAS and PNAS. Her work was recognized with the 2008 Outstanding Statistical Application Award from the American Statistical Association, the Mitchell Prize from ISBA, and a Google research award. She is on the editorial boards of Statistical Analysis and Data Mining and Frontiers in Genetics. She was Associate Editor for JASA from 2007 to 2013.

Kathy McKeown

About Me

A leading scholar and researcher in the field of natural language processing, McKeown focuses her research on big data; her interests include text summarization, question answering, natural language generation, multimedia explanation, digital libraries, and multilingual applications. Her research group's Columbia Newsblaster, which has been live since 2001, is an online system that automatically tracks the day's news and demonstrates the group's new technologies for multi-document summarization, clustering, and text categorization, among others. Currently, she leads a large research project involving prediction of technology emergence from a large collection of journal articles. McKeown joined Columbia in 1982, immediately after earning her Ph.D. from the University of Pennsylvania. In 1989, she became the first woman professor in the school to receive tenure, and later the first woman to serve as a department chair (1998-2003).

Ansaf Salleb-Aouissi

About Me

Ansaf is a Lecturer in Discipline in the Computer Science Department at the School of Engineering and Applied Science at Columbia University. She received her BS in Computer Science in 1996 from the University of Science and Technology (USTHB), Algeria. She earned her master's and Ph.D. degrees in Computer Science from the University of Orleans (France) in 1999 and 2003, respectively.

Cliff Stein

About Me

His research interests include the design and analysis of algorithms, combinatorial optimization, operations research, network algorithms, scheduling, algorithm engineering and computational biology. Professor Stein has published many influential papers in the leading conferences and journals in his field, and has occupied a variety of editorial positions at journals including ACM Transactions on Algorithms, Mathematical Programming, Journal of Algorithms, SIAM Journal on Discrete Mathematics and Operations Research Letters. His work has been supported by the National Science Foundation and Sloan Foundation. He is the winner of several prestigious awards including an NSF Career Award, an Alfred Sloan Research Fellowship and the Karen Wetterhahn Award for Distinguished Creative or Scholarly Achievement. He is also the co-author of two textbooks. Introduction to Algorithms, with T. Cormen, C. Leiserson and R. Rivest, is currently the best-selling textbook in algorithms; it has sold over half a million copies and been translated into 15 languages. Discrete Math for Computer Scientists, with Ken Bogart and Scot Drysdale, is a new textbook that covers discrete math at an undergraduate level.

David Blei

About Me

David Blei joined Columbia in Fall 2014 as a Professor of Computer Science and Statistics. His research involves probabilistic topic models, Bayesian nonparametric methods, and approximate posterior inference. He works on a variety of applications, including text, images, music, social networks, user behavior, and scientific data. Professor Blei earned his Bachelor's degree in Computer Science and Mathematics from Brown University (1997) and his PhD in Computer Science from the University of California, Berkeley (2004). Before arriving at Columbia, he was an Associate Professor of Computer Science at Princeton University. He has received several awards for his research, including a Sloan Fellowship (2010), Office of Naval Research Young Investigator Award (2011), Presidential Early Career Award for Scientists and Engineers (2011), and Blavatnik Faculty Award (2013).

Itsik Pe’er

About Me

Itsik Pe’er is an associate professor in the Department of Computer Science. His laboratory develops and applies computational methods for the analysis of high-throughput data in germline human genetics. Specifically, he has a strong interest in isolated populations such as Pacific Islanders and Ashkenazi Jews. The Pe’er Lab has developed methodology to identify hidden relatives — primarily in such isolated populations — that involves inferring their past demography, detecting associations between phenotypes and genetic segments co-inherited from the joint ancestors of hidden relatives, and establishing the exceptional utility of whole-genome sequencing in population genetics. With the arrival of high-throughput sequencing methods, Pe’er has focused on characterizing genetic variation that is unique to isolated populations, including the effects of such variation on phenotype.

Mihalis Yannakakis

About Me

He studied at the National Technical University of Athens (Diploma in Electrical Engineering, 1975), and at Princeton University (PhD in Computer Science, 1979). He worked at Bell Labs Research from 1978 until 2001, as Member of Technical Staff (1978-1991) and as Head of the Computing Principles Research Department (1991-2001). He was Director of Computing Principles Research at Avaya Labs (2001-2002), and Professor of Computer Science at Stanford University (2002-2003). He joined Columbia University in 2004. His research interests include design and analysis of algorithms, complexity theory, combinatorial optimization, game theory, databases, and modeling, verification and testing of reactive systems.

Peter Orbanz

About Me

Before coming to New York, he was a Research Fellow in the Machine Learning Group of Zoubin Ghahramani at the University of Cambridge, and previously a graduate student of Joachim M. Buhmann at ETH Zurich. His main research interests are the statistics of discrete objects and structures: permutations, graphs, partitions, and binary sequences. Most of his recent work concerns representation problems and latent variable algorithms in Bayesian nonparametrics. More generally, he is interested in all mathematical aspects of machine learning and artificial intelligence.

Fred Jiang

Assistant Professor in the Electrical Engineering Department at Columbia University
Fred received his B.Sc. (2004) and M.Sc. (2007) in Electrical Engineering and Computer Science, and his Ph.D. (2010) in Computer Science, all from UC Berkeley. Before joining SEAS, he was Senior Staff Researcher and Director of Analytics and IoT Research at Intel Labs China. Fred’s research interests include cyber physical systems and data analytics, smart and sustainable buildings, mobile and wearable systems, environmental monitoring and control, and connected health & fitness. His ACme building energy platform has been widely adopted by universities and industries, including Lawrence Berkeley National Laboratory, National Taiwan University, and several commercial companies. His project on wearable and mobile fitness, in collaboration with University of Virginia, was featured in New Scientist and The Economist. His air-quality monitoring project has been featured on China Central Television and in People’s Daily, and was successfully incubated into a startup. He is actively serving on several technical and organizing committees including ACM SenSys, ACM/IEEE IPSN, and ACM BuildSys. He was a National Science Foundation (NSF) Graduate Fellow and a Vodafone-US Foundation Fellow.

Julia Hirschberg

Percy K. and Vida LW Hudson Professor of Computer Science at Columbia University
Julia Hirschberg does research in prosody, spoken dialogue systems, and emotional and deceptive speech. She received her PhD in Computer Science from the University of Pennsylvania in 1985. She worked at Bell Laboratories and AT&T Laboratories -- Research from 1985 to 2003 as a Member of Technical Staff and as a Department Head, creating the Human-Computer Interface Research Department at Bell Labs and moving with it to AT&T Labs. She served as editor-in-chief of Computational Linguistics from 1993 to 2003 and as an editor-in-chief of Speech Communication from 2003 to 2006. She is on the Editorial Board of Speech Communication and of the Journal of Pragmatics. She was on the Executive Board of the Association for Computational Linguistics (ACL) from 1993 to 2003, has been on the Permanent Council of the International Conference on Spoken Language Processing (ICSLP) since 1996, and served on the board of the International Speech Communication Association (ISCA) from 1999 to 2007 (as President 2005-2007). She is currently the chair of the ISCA Distinguished Lecturers selection committee. She is on the IEEE SLTC, the executive board of the North American chapter of the Association for Computational Linguistics, the CRA Board of Directors, and the board of the CRA-W. She has been active in working for diversity at AT&T and at Columbia. She has been a fellow of the American Association for Artificial Intelligence since 1994, an ISCA Fellow since 2008, and became an ACL Fellow in the founding group in 2012. She received a Columbia Engineering School Alumni Association (CESAA) Distinguished Faculty Teaching Award in 2009, received an honorary doctorate (hedersdoktor) from KTH in 2007, and is the 2011 recipient of the IEEE James L. Flanagan Speech and Audio Processing Award; she also received the ISCA Medal for Scientific Achievement in the same year.

Michael Collins

Vikram S. Pandit Professor of Computer Science at Columbia University
Michael J. Collins is a researcher in the field of computational linguistics. His research interests are in natural language processing as well as machine learning and he has made important contributions in statistical parsing and in statistical machine learning. One notable contribution is a state-of-the-art parser for the Penn Wall Street Journal corpus. His research covers a wide range of topics such as parse re-ranking, tree kernels, semi-supervised learning, machine translation and exponentiated gradient algorithms with a general focus on discriminative models and structured prediction.

Shih-Fu Chang

Richard Dicker Chair Professor at Columbia University
Shih-Fu Chang’s research focuses on multimedia retrieval, computer vision, signal processing, and machine learning. He and his students have developed some of the earliest image/video search engines, such as VisualSEEk, VideoQ, and WebSEEk, contributing to the foundation of the vibrant field of content-based visual search and commercial systems for Web image search. Recognized by many best paper awards and high citation impacts, his scholarly work set trends in several important areas, such as compressed-domain video manipulation, video structure parsing, image authentication, large-scale indexing, and video content analysis. His group demonstrated the best performance in video annotation (2008) and multimedia event detection (2010) in the international video retrieval evaluation forum TRECVID. The video concept classifier library, ontology, and annotated video corpora released by his group have been used by more than 100 groups. He co-led the ADVENT university-industry research consortium with the participation of more than 25 industry sponsors. He has received the IEEE Signal Processing Society Technical Achievement Award, the ACM SIGMM Technical Achievement Award, the IEEE Kiyo Tomiyasu Award, an IBM Faculty Award, and Service Recognition Awards from IEEE and ACM. He served as the general co-chair of the ACM Multimedia conference in 2000 and 2010, Editor-in-Chief of the IEEE Signal Processing Magazine (2006-2008), Chairman of Columbia Electrical Engineering Department (2007-2010), Senior Vice Dean of Columbia Engineering School (2012-date), and advisor for several companies and research institutes. His research has been broadly supported by government agencies as well as many industry sponsors. He is a Fellow of IEEE and the American Association for the Advancement of Science.

Zoran Kostic

About Me

Zoran Kostic completed his Ph.D. in Electrical Engineering at the University of Rochester and his Dipl. Ing. degree at the University of Novi Sad. He spent most of his career in industry where he worked in research, product development and in leadership positions. Zoran's expertise spans mobile data systems, wireless communications, signal processing, multimedia, system-on-chip development and applications of parallel computing. His work comprises a mix of research, system architecture and software/hardware development, which resulted in a notable publication record, three dozen patents, and critical contributions to successful products. He has experience in Intellectual Property consulting. Dr. Kostic is an active member of the IEEE, and he has served as an associate editor of the IEEE Transactions on Communications and IEEE Communications Letters.

Andrew Gelman

About Me

Andrew Gelman is a professor of statistics and political science and director of the Applied Statistics Center at Columbia University. He has received the Outstanding Statistical Application award from the American Statistical Association, the award for best article published in the American Political Science Review, and the Council of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40. Andrew has done research on a wide range of topics, including: why it is rational to vote; why campaign polls are so variable when elections are so predictable; why redistricting is good for democracy; reversals of death sentences; police stops in New York City; the statistical challenges of estimating small effects; the probability that your vote will be decisive; seats and votes in Congress; social network structure; arsenic in Bangladesh; radon in your basement; toxicology; medical imaging; and methods in surveys, experimental design, statistical inference, computation, and graphics.

David Madigan

About Me

David Madigan received a bachelor’s degree in Mathematical Sciences and a Ph.D. in Statistics, both from Trinity College Dublin. He has previously worked for AT&T Inc., Soliloquy Inc., the University of Washington, Rutgers University, and SkillSoft, Inc. He has over 100 publications in such areas as Bayesian statistics, text mining, Monte Carlo methods, pharmacovigilance and probabilistic graphical models. He is an elected Fellow of the American Statistical Association and of the Institute of Mathematical Statistics. He recently completed a term as Editor-in-Chief of Statistical Science.

Lauren Hannah

About Me

Lauren Hannah is an Assistant Professor in the Department of Statistics at Columbia University. Dr. Hannah received a Ph.D. in Operations Research and Financial Engineering from Princeton University, and an A.B. in Classics, again from Princeton University. After completing her Ph.D., Dr. Hannah completed a postdoc at Duke in the Statistical Science Department. Her interests include machine learning, Bayesian statistics, and energy applications.

Eva Ascarza

About Me

Eva Ascarza is an Assistant Professor of Marketing at Columbia Business School. She is a marketing modeler who uses tools from statistics and economics to answer marketing questions. Her main research areas are customer analytics and pricing in the context of subscription businesses. She specializes in understanding and predicting changes in customer behavior, such as customer retention and usage. Another stream of her research focuses on developing statistical methodologies to be used by marketing practitioners. She received her PhD from London Business School (UK) and an MS in Economics and Finance from Universidad de Navarra (Spain).

James Curley

About Me

Dr. Curley has very broad interests in behavioral development. He has conducted and published research at molecular, systems, organismal and evolutionary levels of analysis in both animals and humans. The focus of Dr. Curley’s lab at Columbia is on the development of social behavior. Dr. Curley is interested in how both inherited genetic variability and social experiences during development can shift individual differences in various aspects of social behavior and what the neuroendocrinological basis of these differences may be. He also researches the reliability and validity of social behavioral tests conducted in the laboratory and whether it is possible to use alternative statistical and methodological approaches to assess social behavior more appropriately. Dr. Curley believes that it is critical to understand how the 'social brains' of humans and other animals have been differentially shaped by evolution and to acknowledge how this should better inform translational research.

Duration

4 months

Experience Level

Introductory

Learning Partner

Columbia University

Program Type

Professional Certificate

Subject

Data Science