From 5a90f1446e6b3639712ea866cbdd53a062104ed6 Mon Sep 17 00:00:00 2001
From: zmx721
Date: Fri, 5 Jul 2024 13:57:53 -0700
Subject: [PATCH 1/3] update 2024-25 instructors

---
 descriptions.md | 53 +++++++++++++++++++++++++------------------------
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/descriptions.md b/descriptions.md
index 4b05b27b04aa..d3d20ff502c9 100644
--- a/descriptions.md
+++ b/descriptions.md
@@ -4,30 +4,31 @@ title: MDS Courses
 css: /css/course_desc.css
 ---
 
-Course Number | Block | Course Title | Short Description |Expanded Description| 2023-24 Lecture Instructor | 2023-24 Lab Instructor |
----------------|---------|----------------------------------------------|------------------------|------------|------------|------------|
-[DSCI 511](https://github.com/UBC-MDS/DSCI_511_prog-dsci) | 1 | Programming for Data Science | Pseudo-code. Program design and structure. Flow control. Iteration. Lists (arrays). Functions. File I/O. Classes, objects, methods, and libraries. | Program design and data manipulation with Python. Overview of data structures, iteration, flow control, and program design relevant to data exploration and analysis. When and how to exploit pre-existing libraries. | [Quan Nguyen](https://quannguyen.rbind.io/) | [Quan Nguyen](https://quannguyen.rbind.io/)
-[DSCI 521](https://github.com/UBC-MDS/DSCI_521_platforms-dsci) | 1 | Computing Platforms for Data Science | Introduction to software, shells, tools, and file systems for use in the Data Science program. Installation, configuration, and use of statistical and programming software including Integrated Development Environments (IDEs). Problem resolution skills. | How to install, maintain, and use the data scientific software stack. The Unix shell, version control, and problem solving strategies. Literate programming documents. | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/)
-[DSCI 523](https://github.com/UBC-MDS/DSCI_523_data-wrangling) | 1 | Programming for Data Manipulation | Program design and data manipulation using industry-standard software tools designed for statistical work. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | Program design and data manipulation with R. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/)
-[DSCI 551](https://github.com/UBC-MDS/DSCI_551_stat-prob-dsci) | 1 | Descriptive Statistics and Probability for Data Science | Descriptive statistics including measures of location and spread. Random variables, distributions, and parameters. Categorical variables. Uncertainty. Missing data. | Fundamental concepts in probability including conditional, joint, and marginal distributions. Statistical view of data coming from a probability distribution. | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/), [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/), [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/)
-[DSCI 512](https://github.com/UBC-MDS/DSCI_512_alg-data-struct) | 2 | Algorithms and Data Structures | Basic algorithms. Recursion. Data structures including linked lists, queues, stacks, trees, graphs, and hash tables. Searching and sorting. Introduction to complexity including Big-O notation, efficiency, and scalability. | How to choose and use appropriate algorithms and data structures to help solve data science problems. Key concepts such as recursion and algorithmic complexity (e.g., efficiency, scalability). | [Jordan Schalm](https://www.jordanschalm.com/) | [Jordan Schalm](https://www.jordanschalm.com/)
-[DSCI 531](https://github.com/UBC-MDS/DSCI_531_viz-1) | 2 | Data Visualization I | Descriptive plots using statistical and programming software. Basics, mechanics, and principles of data visualization. | Exploratory data analysis. Design of effective static visualizations. Plotting tools in R and Python. | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
-[DSCI 552](https://github.com/UBC-MDS/DSCI_552_stat-inf-1) | 2 | Statistical Inference and Computation I | Random variables, parameters, observed data, statistics (distinctions and connections). Estimation: point and interval. Two-group comparisons, frequentist version. Simulation-based approaches. | The statistical and probabilistic foundations of inference. Large sample results. The frequentist paradigm. | [Tiffany Timbers](http://tiffanytimbers.com/), [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Tiffany Timbers](http://tiffanytimbers.com/), [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/)
-[DSCI 571](https://github.com/UBC-MDS/DSCI_571_sup-learn-1) | 2 | Supervised Learning I | Decision trees. k-th nearest neighbour classifiers. Naive Bayes classifiers. Logistic regression. | Introduction to supervised machine learning. Basic machine learning concepts such as generalization error and overfitting. Various approaches such as K-NN, decision trees, linear classifiers. | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
-[DSCI 513](https://github.com/UBC-MDS/DSCI_513_database-data-retr) | 3 | Databases and Data Retrieval | Relational schemas. SQL queries. Database programming using embedded SQL. XML and XQuery. | How to work with data stored in relational database systems. Storage structures and schemas, data relationships, and ways to query and aggregate such data. | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/)
-[DSCI 522](https://github.com/UBC-MDS/DSCI_522_dsci-workflows) | 3 | Data Science Workflows | Interactive and non-interactive data analysis. Scripting. Dynamic reporting. Reproducibility. Project and file management. Version control. Automated workflows. | Interactive vs. scripted/unattended analyses and how to move fluidly between them. Reproducibility through automation and containerization. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/)
-[DSCI 561](https://github.com/UBC-MDS/DSCI_561_regr-1) | 3 | Regression I | Linear models: continuous response; one or more categorical covariates and/or one or more continuous covariates. | Linear models for a quantitative response variable, with multiple categorical and/or quantitative predictors. Matrix formulation of linear regression. Model assessment and prediction. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/)
-[DSCI 573](https://github.com/UBC-MDS/DSCI_573_feat-model-select) | 3 | Feature and Model Selection | Performance of a classification model. Generalization error, overfitting of training data. Shrinkage, feature selection, Akaike Information Criterion, Bayesian Information Criterion. k-fold cross validation. Bootstrapping. Receiver Operating Characteristic curve. Elastic nets, regularization. | How to evaluate and select features and models. Cross-validation, ROC curves, feature engineering, and regularization. | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
-[DSCI 524](https://github.com/UBC-MDS/DSCI_524_collab-sw-dev) | 4 | Collaborative Software Development | Software life cycle. Unit testing. Continuous integration. Submission to a relevant repository for distribution. Packaging for installation and use by others. Software licenses. Classes and abstraction. | How to exploit practices from collaborative software development techniques in data scientific workflows. Appropriate use of abstraction, the software life cycle, unit testing / continuous integration, and packaging for use by others. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/)
-[DSCI 541](https://github.com/UBC-MDS/DSCI_541_priv-eth-sec) | 4 | Privacy, Ethics, and Security | Privacy and data. Ethics boards, legal issues, licensing. Physical and logical data security, social engineering. Encryption, data anonymization, privacy-preserving techniques. Case studies. | The legal, ethical, and security issues concerning data, including aggregated data. Proactive compliance with rules and, in their absence, principles for the responsible management of sensitive data. Case studies. | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
-[DSCI 562](https://ubc-mds.github.io/DSCI_562_regr-2/) | 4 | Regression II | Non-parametric regression and smoothing. Data-driven parameter selection. Robust regression. Mixed effects. | Useful extensions to basic regression, e.g., generalized linear models, mixed effects, smoothing, robust regression, and techniques for dealing with missing data. | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
-[DSCI 572](https://github.com/UBC-MDS/DSCI_572_sup-learn-2) | 4 | Supervised Learning II | Support Vector Machines. Random Forests. Ensemble Classifiers. Graphical models. | Introduction to numerical optimization (e.g., gradient descent). Neural networks and deep learning. | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
-[DSCI 525](https://github.com/UBC-MDS/DSCI_525_web-cloud-comp) | 5 | Web and Cloud Computing | Networks and the Internet, scraping data, APIs, cloud computing, Web services for scalable computing, Web hosting, Web publication platforms, introduction to parallel computing. | How to use the web as a platform for data collection, computation, and publishing. Accessing data via scraping and APIs. Using the cloud for tasks that are beyond the capability of your local computing resources. | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/)
-[DSCI 553](https://github.com/UBC-MDS/DSCI_553_stat-inf-2) | 5 | Statistical Inference and Computation II | Multiple hypothesis testing, false discovery rate. Two-group comparisons, Bayesian paradigm. | Bayesian reasoning for data science. How to formulate and implement inference using the prior-to-posterior paradigm. | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
-[DSCI 563](https://github.com/UBC-MDS/DSCI_563_unsup-learn) | 5 | Unsupervised Learning | Unsupervised learning. K-means/medoids. Model-based clustering. Expectation-maximization algorithm. Hierarchical clustering. Dimension reduction. Matrix decomposition. Heatmaps, contour plots, dendograms. | How to find groups and other structure in unlabeled, possibly high dimensional data. Dimension reduction for visualization and data analysis. Clustering, association rules, model fitting via the EM algorithm. | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
-[DSCI 574](https://github.com/UBC-MDS/DSCI_574_spat-temp-mod) | 5 | Spatial and Temporal Models | Time series. State space and change point detection. Hidden Markov Models. Gaussian processes. | Model fitting and prediction in the presence of correlation due to temporal and/or spatial association. ARIMA models. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/)
-[DSCI 532](https://github.com/UBC-MDS/DSCI_532_viz-2) | 6 | Data Visualization II | Interactive visualization, design choices, dynamic change over time, multiple views, data reduction, dealing with complexity. | How to make principled and effective choices with respect to marks, spatial arrangement, and colour. Analysis, design, and implementation of interactive figures. How to provide multiple views, deal with complexity, and make difficult decisions about data reduction. | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
-[DSCI 542](https://github.com/UBC-MDS/DSCI_542_comm-arg) | 6 | Communication and Argumentation | Claims, reasons, and evidence. Strengths and weaknesses of models. Effective oral and written presentation of scientific results, including interpretation of data and recognition of assumptions, bias, validity, and reliability. Citations, references, and peer-review. | How to interpret and present data science findings to a variety of audiences. Written and spoken presentation skills. | [Quan Nguyen](https://quannguyen.rbind.io/) | [Quan Nguyen](https://quannguyen.rbind.io/)
-[DSCI 554](https://github.com/UBC-MDS/DSCI_554_exper-causal-inf) | 6 | Experimentation and Causal Inference | Randomization. A/B testing. Blocked designs. Orthogonality. Batch effects, confounding. Causality. Contemporary examples. Simulations. | Statistical evidence from randomized experiments versus observational studies. Applications of randomization, e.g., A/B testing for website optimization. Methods for dealing with the multiple testing problem. | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
-[DSCI 575](https://github.com/UBC-MDS/DSCI_575_adv-mach-learn) | 6 | Advanced Machine Learning | Neural networks trained with backpropagation. Deep learning. Overfitting and underfitting. Active data acquisition. Hyperparameter optimization. | Advanced machine learning methods in the context of natural language processing (NLP) applications. Bag of words, recommender systems, topic models, natural language as sequence data, Markov chains, and recurrent neural networks. | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
+
+Course Number | Block | Course Title | Short Description |Expanded Description| Section 1 Lecture Instructor | Section 1 Lab Instructor | Section 2 Lecture Instructor | Section 2 Lab Instructor |
+---------------|---------|----------------------------------------------|------------------------|------------|------------|------------|------------|------------|
+[DSCI 511](https://github.com/UBC-MDS/DSCI_511_prog-dsci) | 1 | Programming for Data Science | Pseudo-code. Program design and structure. Flow control. Iteration. Lists (arrays). Functions. File I/O. Classes, objects, methods, and libraries. | Program design and data manipulation with Python. Overview of data structures, iteration, flow control, and program design relevant to data exploration and analysis. When and how to exploit pre-existing libraries. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/)
+[DSCI 521](https://github.com/UBC-MDS/DSCI_521_platforms-dsci) | 1 | Computing Platforms for Data Science | Introduction to software, shells, tools, and file systems for use in the Data Science program. Installation, configuration, and use of statistical and programming software including Integrated Development Environments (IDEs). Problem resolution skills. | How to install, maintain, and use the data scientific software stack. The Unix shell, version control, and problem solving strategies. Literate programming documents. | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) | [Andy Tai]() | [Andy Tai]()
+[DSCI 523](https://github.com/UBC-MDS/DSCI_523_data-wrangling) | 1 | Programming for Data Manipulation | Program design and data manipulation using industry-standard software tools designed for statistical work. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | Program design and data manipulation with R. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/)
+[DSCI 551](https://github.com/UBC-MDS/DSCI_551_stat-prob-dsci) | 1 | Descriptive Statistics and Probability for Data Science | Descriptive statistics including measures of location and spread. Random variables, distributions, and parameters. Categorical variables. Uncertainty. Missing data. | Fundamental concepts in probability including conditional, joint, and marginal distributions. Statistical view of data coming from a probability distribution. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
+[DSCI 512](https://github.com/UBC-MDS/DSCI_512_alg-data-struct) | 2 | Algorithms and Data Structures | Basic algorithms. Recursion. Data structures including linked lists, queues, stacks, trees, graphs, and hash tables. Searching and sorting. Introduction to complexity including Big-O notation, efficiency, and scalability. | How to choose and use appropriate algorithms and data structures to help solve data science problems. Key concepts such as recursion and algorithmic complexity (e.g., efficiency, scalability). | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Hedayat Zarkoob]() | [Hedayat Zarkoob]()
+[DSCI 531](https://github.com/UBC-MDS/DSCI_531_viz-1) | 2 | Data Visualization I | Descriptive plots using statistical and programming software. Basics, mechanics, and principles of data visualization. | Exploratory data analysis. Design of effective static visualizations. Plotting tools in R and Python. | [Payman]() | [Payman]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
+[DSCI 552](https://github.com/UBC-MDS/DSCI_552_stat-inf-1) | 2 | Statistical Inference and Computation I | Random variables, parameters, observed data, statistics (distinctions and connections). Estimation: point and interval. Two-group comparisons, frequentist version. Simulation-based approaches. | The statistical and probabilistic foundations of inference. Large sample results. The frequentist paradigm. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
+[DSCI 571](https://github.com/UBC-MDS/DSCI_571_sup-learn-1) | 2 | Supervised Learning I | Decision trees. k-th nearest neighbour classifiers. Naive Bayes classifiers. Logistic regression. | Introduction to supervised machine learning. Basic machine learning concepts such as generalization error and overfitting. Various approaches such as K-NN, decision trees, linear classifiers. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
+[DSCI 513](https://github.com/UBC-MDS/DSCI_513_database-data-retr) | 3 | Databases and Data Retrieval | Relational schemas. SQL queries. Database programming using embedded SQL. XML and XQuery. | How to work with data stored in relational database systems. Storage structures and schemas, data relationships, and ways to query and aggregate such data. | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Prajeet Bajpai]() | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Andy Tai]()
+[DSCI 522](https://github.com/UBC-MDS/DSCI_522_dsci-workflows) | 3 | Data Science Workflows | Interactive and non-interactive data analysis. Scripting. Dynamic reporting. Reproducibility. Project and file management. Version control. Automated workflows. | Interactive vs. scripted/unattended analyses and how to move fluidly between them. Reproducibility through automation and containerization. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/)
+[DSCI 561](https://github.com/UBC-MDS/DSCI_561_regr-1) | 3 | Regression I | Linear models: continuous response; one or more categorical covariates and/or one or more continuous covariates. | Linear models for a quantitative response variable, with multiple categorical and/or quantitative predictors. Matrix formulation of linear regression. Model assessment and prediction. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman]() | [Payman]()
+[DSCI 573](https://github.com/UBC-MDS/DSCI_573_feat-model-select) | 3 | Feature and Model Selection | Performance of a classification model. Generalization error, overfitting of training data. Shrinkage, feature selection, Akaike Information Criterion, Bayesian Information Criterion. k-fold cross validation. Bootstrapping. Receiver Operating Characteristic curve. Elastic nets, regularization. | How to evaluate and select features and models. Cross-validation, ROC curves, feature engineering, and regularization. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
+[DSCI 524](https://github.com/UBC-MDS/DSCI_524_collab-sw-dev) | 4 | Collaborative Software Development | Software life cycle. Unit testing. Continuous integration. Submission to a relevant repository for distribution. Packaging for installation and use by others. Software licenses. Classes and abstraction. | How to exploit practices from collaborative software development techniques in data scientific workflows. Appropriate use of abstraction, the software life cycle, unit testing / continuous integration, and packaging for use by others. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/)
+[DSCI 541](https://github.com/UBC-MDS/DSCI_541_priv-eth-sec) | 4 | Privacy, Ethics, and Security | Privacy and data. Ethics boards, legal issues, licensing. Physical and logical data security, social engineering. Encryption, data anonymization, privacy-preserving techniques. Case studies. | The legal, ethical, and security issues concerning data, including aggregated data. Proactive compliance with rules and, in their absence, principles for the responsible management of sensitive data. Case studies. | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
+[DSCI 562](https://ubc-mds.github.io/DSCI_562_regr-2/) | 4 | Regression II | Non-parametric regression and smoothing. Data-driven parameter selection. Robust regression. Mixed effects. | Useful extensions to basic regression, e.g., generalized linear models, mixed effects, smoothing, robust regression, and techniques for dealing with missing data. | [Payman]() | [Payman]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
+[DSCI 572](https://github.com/UBC-MDS/DSCI_572_sup-learn-2) | 4 | Supervised Learning II | Support Vector Machines. Random Forests. Ensemble Classifiers. Graphical models. | Introduction to numerical optimization (e.g., gradient descent). Neural networks and deep learning. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
+[DSCI 525](https://github.com/UBC-MDS/DSCI_525_web-cloud-comp) | 5 | Web and Cloud Computing | Networks and the Internet, scraping data, APIs, cloud computing, Web services for scalable computing, Web hosting, Web publication platforms, introduction to parallel computing. | How to use the web as a platform for data collection, computation, and publishing. Accessing data via scraping and APIs. Using the cloud for tasks that are beyond the capability of your local computing resources. | [Ilya Musabirov]() | [Ilya Musabirov]() | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/)
+[DSCI 553](https://github.com/UBC-MDS/DSCI_553_stat-inf-2) | 5 | Statistical Inference and Computation II | Multiple hypothesis testing, false discovery rate. Two-group comparisons, Bayesian paradigm. | Bayesian reasoning for data science. How to formulate and implement inference using the prior-to-posterior paradigm. | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
+[DSCI 563](https://github.com/UBC-MDS/DSCI_563_unsup-learn) | 5 | Unsupervised Learning | Unsupervised learning. K-means/medoids. Model-based clustering. Expectation-maximization algorithm. Hierarchical clustering. Dimension reduction. Matrix decomposition. Heatmaps, contour plots, dendograms. | How to find groups and other structure in unlabeled, possibly high dimensional data. Dimension reduction for visualization and data analysis. Clustering, association rules, model fitting via the EM algorithm. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
+[DSCI 574](https://github.com/UBC-MDS/DSCI_574_spat-temp-mod) | 5 | Spatial and Temporal Models | Time series. State space and change point detection. Hidden Markov Models. Gaussian processes. | Model fitting and prediction in the presence of correlation due to temporal and/or spatial association. ARIMA models. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/)
+[DSCI 532](https://github.com/UBC-MDS/DSCI_532_viz-2) | 6 | Data Visualization II | Interactive visualization, design choices, dynamic change over time, multiple views, data reduction, dealing with complexity. | How to make principled and effective choices with respect to marks, spatial arrangement, and colour. Analysis, design, and implementation of interactive figures. How to provide multiple views, deal with complexity, and make difficult decisions about data reduction. | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
+[DSCI 542](https://github.com/UBC-MDS/DSCI_542_comm-arg) | 6 | Communication and Argumentation | Claims, reasons, and evidence. Strengths and weaknesses of models. Effective oral and written presentation of scientific results, including interpretation of data and recognition of assumptions, bias, validity, and reliability. Citations, references, and peer-review. | How to interpret and present data science findings to a variety of audiences. Written and spoken presentation skills. | [Andy Tai]() | [Andy Tai]() | [Hedayat Zarkoob]() | [Hedayat Zarkoob]()
+[DSCI 554](https://github.com/UBC-MDS/DSCI_554_exper-causal-inf) | 6 | Experimentation and Causal Inference | Randomization. A/B testing. Blocked designs. Orthogonality. Batch effects, confounding. Causality. Contemporary examples. Simulations. | Statistical evidence from randomized experiments versus observational studies. Applications of randomization, e.g., A/B testing for website optimization. Methods for dealing with the multiple testing problem. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman]() | [Payman]()
+[DSCI 575](https://github.com/UBC-MDS/DSCI_575_adv-mach-learn) | 6 | Advanced Machine Learning | Neural networks trained with backpropagation. Deep learning. Overfitting and underfitting. Active data acquisition. Hyperparameter optimization. | Advanced machine learning methods in the context of natural language processing (NLP) applications. Bag of words, recommender systems, topic models, natural language as sequence data, Markov chains, and recurrent neural networks. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
 [DSCI 591](https://github.com/UBC-MDS/DSCI_591_capstone-proj) | 7 | Capstone Project | A capstone design project designed to give students experience in leading complex multidisciplinary projects relevant to data science. | A mentored group project based on real data and questions from a partner within or outside the university. Students will formulate questions and design and execute a suitable analysis plan. The group will work collaboratively to produce a reproducible analysis pipeline, project report, presentation and possibly other products, such as a dashboard. | MDS teaching team | MDS teaching team

From 0416bb376119614a16d28dac7836693653f2c5f2 Mon Sep 17 00:00:00 2001
From: zmx721
Date: Fri, 5 Jul 2024 14:08:52 -0700
Subject: [PATCH 2/3] update instructors

---
 descriptions.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/descriptions.md b/descriptions.md
index d3d20ff502c9..55d94f91bab7 100644
--- a/descriptions.md
+++ b/descriptions.md
@@ -12,16 +12,16 @@ Course Number | Block | Course Title | Short
 [DSCI 523](https://github.com/UBC-MDS/DSCI_523_data-wrangling) | 1 | Programming for Data Manipulation | Program design and data manipulation using industry-standard software tools designed for statistical work. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | Program design and data manipulation with R. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/)
 [DSCI 551](https://github.com/UBC-MDS/DSCI_551_stat-prob-dsci) | 1 | Descriptive Statistics and Probability for Data Science | Descriptive statistics including measures of location and spread. Random variables, distributions, and parameters. Categorical variables. Uncertainty. Missing data. | Fundamental concepts in probability including conditional, joint, and marginal distributions. Statistical view of data coming from a probability distribution. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
 [DSCI 512](https://github.com/UBC-MDS/DSCI_512_alg-data-struct) | 2 | Algorithms and Data Structures | Basic algorithms. Recursion. Data structures including linked lists, queues, stacks, trees, graphs, and hash tables. Searching and sorting. Introduction to complexity including Big-O notation, efficiency, and scalability. | How to choose and use appropriate algorithms and data structures to help solve data science problems. Key concepts such as recursion and algorithmic complexity (e.g., efficiency, scalability). | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Hedayat Zarkoob]() | [Hedayat Zarkoob]()
-[DSCI 531](https://github.com/UBC-MDS/DSCI_531_viz-1) | 2 | Data Visualization I | Descriptive plots using statistical and programming software. Basics, mechanics, and principles of data visualization. | Exploratory data analysis. Design of effective static visualizations. Plotting tools in R and Python. | [Payman]() | [Payman]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
+[DSCI 531](https://github.com/UBC-MDS/DSCI_531_viz-1) | 2 | Data Visualization I | Descriptive plots using statistical and programming software. Basics, mechanics, and principles of data visualization. | Exploratory data analysis. Design of effective static visualizations. Plotting tools in R and Python. | [Payman Nickchi]() | [Payman Nickchi]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
 [DSCI 552](https://github.com/UBC-MDS/DSCI_552_stat-inf-1) | 2 | Statistical Inference and Computation I | Random variables, parameters, observed data, statistics (distinctions and connections). Estimation: point and interval. Two-group comparisons, frequentist version. Simulation-based approaches. | The statistical and probabilistic foundations of inference. Large sample results. The frequentist paradigm. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
 [DSCI 571](https://github.com/UBC-MDS/DSCI_571_sup-learn-1) | 2 | Supervised Learning I | Decision trees. k-th nearest neighbour classifiers. Naive Bayes classifiers. Logistic regression. | Introduction to supervised machine learning. Basic machine learning concepts such as generalization error and overfitting. Various approaches such as K-NN, decision trees, linear classifiers. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/)
 [DSCI 513](https://github.com/UBC-MDS/DSCI_513_database-data-retr) | 3 | Databases and Data Retrieval | Relational schemas. SQL queries. Database programming using embedded SQL. XML and XQuery. | How to work with data stored in relational database systems. Storage structures and schemas, data relationships, and ways to query and aggregate such data. | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Prajeet Bajpai]() | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Andy Tai]()
 [DSCI 522](https://github.com/UBC-MDS/DSCI_522_dsci-workflows) | 3 | Data Science Workflows | Interactive and non-interactive data analysis. Scripting. Dynamic reporting. Reproducibility. Project and file management. Version control. Automated workflows. | Interactive vs. scripted/unattended analyses and how to move fluidly between them. Reproducibility through automation and containerization. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/)
-[DSCI 561](https://github.com/UBC-MDS/DSCI_561_regr-1) | 3 | Regression I | Linear models: continuous response; one or more categorical covariates and/or one or more continuous covariates. | Linear models for a quantitative response variable, with multiple categorical and/or quantitative predictors. Matrix formulation of linear regression. Model assessment and prediction. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman]() | [Payman]()
+[DSCI 561](https://github.com/UBC-MDS/DSCI_561_regr-1) | 3 | Regression I | Linear models: continuous response; one or more categorical covariates and/or one or more continuous covariates. | Linear models for a quantitative response variable, with multiple categorical and/or quantitative predictors. Matrix formulation of linear regression. Model assessment and prediction. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman Nickchi]() | [Payman Nickchi]()
 [DSCI 573](https://github.com/UBC-MDS/DSCI_573_feat-model-select) | 3 | Feature and Model Selection | Performance of a classification model. Generalization error, overfitting of training data. Shrinkage, feature selection, Akaike Information Criterion, Bayesian Information Criterion. k-fold cross validation. Bootstrapping. Receiver Operating Characteristic curve. Elastic nets, regularization. | How to evaluate and select features and models. Cross-validation, ROC curves, feature engineering, and regularization. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/)
 [DSCI 524](https://github.com/UBC-MDS/DSCI_524_collab-sw-dev) | 4 | Collaborative Software Development | Software life cycle. Unit testing. Continuous integration. Submission to a relevant repository for distribution. Packaging for installation and use by others. Software licenses. Classes and abstraction. | How to exploit practices from collaborative software development techniques in data scientific workflows. Appropriate use of abstraction, the software life cycle, unit testing / continuous integration, and packaging for use by others. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/)
 [DSCI 541](https://github.com/UBC-MDS/DSCI_541_priv-eth-sec) | 4 | Privacy, Ethics, and Security | Privacy and data. Ethics boards, legal issues, licensing. Physical and logical data security, social engineering. Encryption, data anonymization, privacy-preserving techniques. Case studies. | The legal, ethical, and security issues concerning data, including aggregated data.
Proactive compliance with rules and, in their absence, principles for the responsible management of sensitive data. Case studies. | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/) -[DSCI 562](https://ubc-mds.github.io/DSCI_562_regr-2/) | 4 | Regression II | Non-parametric regression and smoothing. Data-driven parameter selection. Robust regression. Mixed effects. | Useful extensions to basic regression, e.g., generalized linear models, mixed effects, smoothing, robust regression, and techniques for dealing with missing data. | [Payman]() | [Payman]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) +[DSCI 562](https://ubc-mds.github.io/DSCI_562_regr-2/) | 4 | Regression II | Non-parametric regression and smoothing. Data-driven parameter selection. Robust regression. Mixed effects. | Useful extensions to basic regression, e.g., generalized linear models, mixed effects, smoothing, robust regression, and techniques for dealing with missing data. | [Payman Nickchi]() | [Payman Nickchi]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) [DSCI 572](https://github.com/UBC-MDS/DSCI_572_sup-learn-2) | 4 | Supervised Learning II | Support Vector Machines. Random Forests. Ensemble Classifiers. Graphical models. | Introduction to numerical optimization (e.g., gradient descent). Neural networks and deep learning. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) [DSCI 525](https://github.com/UBC-MDS/DSCI_525_web-cloud-comp) | 5 | Web and Cloud Computing | Networks and the Internet, scraping data, APIs, cloud computing, Web services for scalable computing, Web hosting, Web publication platforms, introduction to parallel computing. 
| How to use the web as a platform for data collection, computation, and publishing. Accessing data via scraping and APIs. Using the cloud for tasks that are beyond the capability of your local computing resources. | [Ilya Musabirov]() | [Ilya Musabirov]() | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/)
 [DSCI 553](https://github.com/UBC-MDS/DSCI_553_stat-inf-2) | 5 | Statistical Inference and Computation II | Multiple hypothesis testing, false discovery rate. Two-group comparisons, Bayesian paradigm. | Bayesian reasoning for data science. How to formulate and implement inference using the prior-to-posterior paradigm. | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/)
@@ -29,6 +29,6 @@ Course Number | Block | Course Title | Short
 [DSCI 574](https://github.com/UBC-MDS/DSCI_574_spat-temp-mod) | 5 | Spatial and Temporal Models | Time series. State space and change point detection. Hidden Markov Models. Gaussian processes. | Model fitting and prediction in the presence of correlation due to temporal and/or spatial association. ARIMA models. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/)
 [DSCI 532](https://github.com/UBC-MDS/DSCI_532_viz-2) | 6 | Data Visualization II | Interactive visualization, design choices, dynamic change over time, multiple views, data reduction, dealing with complexity. | How to make principled and effective choices with respect to marks, spatial arrangement, and colour. Analysis, design, and implementation of interactive figures. How to provide multiple views, deal with complexity, and make difficult decisions about data reduction.
| [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/) [DSCI 542](https://github.com/UBC-MDS/DSCI_542_comm-arg) | 6 | Communication and Argumentation | Claims, reasons, and evidence. Strengths and weaknesses of models. Effective oral and written presentation of scientific results, including interpretation of data and recognition of assumptions, bias, validity, and reliability. Citations, references, and peer-review. | How to interpret and present data science findings to a variety of audiences. Written and spoken presentation skills. | [Andy Tai]() | [Andy Tai]() | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() -[DSCI 554](https://github.com/UBC-MDS/DSCI_554_exper-causal-inf) | 6 | Experimentation and Causal Inference | Randomization. A/B testing. Blocked designs. Orthogonality. Batch effects, confounding. Causality. Contemporary examples. Simulations. | Statistical evidence from randomized experiments versus observational studies. Applications of randomization, e.g., A/B testing for website optimization. Methods for dealing with the multiple testing problem. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman]() | [Payman]() +[DSCI 554](https://github.com/UBC-MDS/DSCI_554_exper-causal-inf) | 6 | Experimentation and Causal Inference | Randomization. A/B testing. Blocked designs. Orthogonality. Batch effects, confounding. Causality. Contemporary examples. Simulations. | Statistical evidence from randomized experiments versus observational studies. Applications of randomization, e.g., A/B testing for website optimization. Methods for dealing with the multiple testing problem. 
| [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman Nickchi]() | [Payman Nickchi]() [DSCI 575](https://github.com/UBC-MDS/DSCI_575_adv-mach-learn) | 6 | Advanced Machine Learning | Neural networks trained with backpropagation. Deep learning. Overfitting and underfitting. Active data acquisition. Hyperparameter optimization. | Advanced machine learning methods in the context of natural language processing (NLP) applications. Bag of words, recommender systems, topic models, natural language as sequence data, Markov chains, and recurrent neural networks. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) [DSCI 591](https://github.com/UBC-MDS/DSCI_591_capstone-proj) | 7 | Capstone Project | A capstone design project designed to give students experience in leading complex multidisciplinary projects relevant to data science. | A mentored group project based on real data and questions from a partner within or outside the university. Students will formulate questions and design and execute a suitable analysis plan. The group will work collaboratively to produce a reproducible analysis pipeline, project report, presentation and possibly other products, such as a dashboard. 
| MDS teaching team | MDS teaching team

From d59b7593b6cea90b7b0b7d0adcc2eb5653c38d76 Mon Sep 17 00:00:00 2001
From: zmx721
Date: Tue, 16 Jul 2024 09:59:10 -0700
Subject: [PATCH 3/3] update 2024-25 course descriptions

---
 descriptions.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/descriptions.md b/descriptions.md
index 55d94f91bab7..2b426cc48b3c 100644
--- a/descriptions.md
+++ b/descriptions.md
@@ -7,28 +7,28 @@ css: /css/course_desc.css
 Course Number | Block | Course Title | Short Description |Expanded Description| Section 1 Lecture Instructor | Section 1 Lab Instructor | Section 2 Lecture Instructor | Section 2 Lab Instructor |
 ---------------|---------|----------------------------------------------|------------------------|------------|------------|------------|------------|------------|
-[DSCI 511](https://github.com/UBC-MDS/DSCI_511_prog-dsci) | 1 | Programming for Data Science | Pseudo-code. Program design and structure. Flow control. Iteration. Lists (arrays). Functions. File I/O. Classes, objects, methods, and libraries. | Program design and data manipulation with Python. Overview of data structures, iteration, flow control, and program design relevant to data exploration and analysis. When and how to exploit pre-existing libraries. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/)
-[DSCI 521](https://github.com/UBC-MDS/DSCI_521_platforms-dsci) | 1 | Computing Platforms for Data Science | Introduction to software, shells, tools, and file systems for use in the Data Science program. Installation, configuration, and use of statistical and programming software including Integrated Development Environments (IDEs). Problem resolution skills. | How to install, maintain, and use the data scientific software stack. The Unix shell, version control, and problem solving strategies. Literate programming documents.
| [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) | [Andy Tai]() | [Andy Tai]() +[DSCI 511](https://github.com/UBC-MDS/DSCI_511_prog-dsci) | 1 | Programming for Data Science | Pseudo-code. Program design and structure. Flow control. Iteration. Lists (arrays). Functions. File I/O. Classes, objects, methods, and libraries. | Program design and data manipulation with Python. Overview of data structures, iteration, flow control, and program design relevant to data exploration and analysis. When and how to exploit pre-existing libraries. | [Prajeet Bajpai](https://p-bajpai.github.io) | [Prajeet Bajpai](https://p-bajpai.github.io) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) +[DSCI 521](https://github.com/UBC-MDS/DSCI_521_platforms-dsci) | 1 | Computing Platforms for Data Science | Introduction to software, shells, tools, and file systems for use in the Data Science program. Installation, configuration, and use of statistical and programming software including Integrated Development Environments (IDEs). Problem resolution skills. | How to install, maintain, and use the data scientific software stack. The Unix shell, version control, and problem solving strategies. Literate programming documents. | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) | [Andy Tai](https://andytai7.github.io/Andy-Tai/) | [Andy Tai](https://andytai7.github.io/Andy-Tai/) [DSCI 523](https://github.com/UBC-MDS/DSCI_523_data-wrangling) | 1 | Programming for Data Manipulation | Program design and data manipulation using industry-standard software tools designed for statistical work. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. | Program design and data manipulation with R. Organizing, filtering, sorting, grouping, reformatting, converting, and cleaning data to prepare it for further analysis. 
| [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/) [DSCI 551](https://github.com/UBC-MDS/DSCI_551_stat-prob-dsci) | 1 | Descriptive Statistics and Probability for Data Science | Descriptive statistics including measures of location and spread. Random variables, distributions, and parameters. Categorical variables. Uncertainty. Missing data. | Fundamental concepts in probability including conditional, joint, and marginal distributions. Statistical view of data coming from a probability distribution. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) [DSCI 512](https://github.com/UBC-MDS/DSCI_512_alg-data-struct) | 2 | Algorithms and Data Structures | Basic algorithms. Recursion. Data structures including linked lists, queues, stacks, trees, graphs, and hash tables. Searching and sorting. Introduction to complexity including Big-O notation, efficiency, and scalability. | How to choose and use appropriate algorithms and data structures to help solve data science problems. Key concepts such as recursion and algorithmic complexity (e.g., efficiency, scalability). | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() [DSCI 531](https://github.com/UBC-MDS/DSCI_531_viz-1) | 2 | Data Visualization I | Descriptive plots using statistical and programming software. Basics, mechanics, and principles of data visualization. | Exploratory data analysis. Design of effective static visualizations. Plotting tools in R and Python. 
| [Payman Nickchi]() | [Payman Nickchi]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/) [DSCI 552](https://github.com/UBC-MDS/DSCI_552_stat-inf-1) | 2 | Statistical Inference and Computation I | Random variables, parameters, observed data, statistics (distinctions and connections). Estimation: point and interval. Two-group comparisons, frequentist version. Simulation-based approaches. | The statistical and probabilistic foundations of inference. Large sample results. The frequentist paradigm. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) -[DSCI 571](https://github.com/UBC-MDS/DSCI_571_sup-learn-1) | 2 | Supervised Learning I | Decision trees. k-th nearest neighbour classifiers. Naive Bayes classifiers. Logistic regression. | Introduction to supervised machine learning. Basic machine learning concepts such as generalization error and overfitting. Various approaches such as K-NN, decision trees, linear classifiers. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) -[DSCI 513](https://github.com/UBC-MDS/DSCI_513_database-data-retr) | 3 | Databases and Data Retrieval | Relational schemas. SQL queries. Database programming using embedded SQL. XML and XQuery. | How to work with data stored in relational database systems. Storage structures and schemas, data relationships, and ways to query and aggregate such data. | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Prajeet Bajpai]() | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Andy Tai]() +[DSCI 571](https://github.com/UBC-MDS/DSCI_571_sup-learn-1) | 2 | Supervised Learning I | Decision trees. k-th nearest neighbour classifiers. Naive Bayes classifiers. 
Logistic regression. | Introduction to supervised machine learning. Basic machine learning concepts such as generalization error and overfitting. Various approaches such as K-NN, decision trees, linear classifiers. | [Prajeet Bajpai](https://p-bajpai.github.io) | [Prajeet Bajpai](https://p-bajpai.github.io) | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) +[DSCI 513](https://github.com/UBC-MDS/DSCI_513_database-data-retr) | 3 | Databases and Data Retrieval | Relational schemas. SQL queries. Database programming using embedded SQL. XML and XQuery. | How to work with data stored in relational database systems. Storage structures and schemas, data relationships, and ways to query and aggregate such data. | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Prajeet Bajpai](https://p-bajpai.github.io) | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Andy Tai](https://andytai7.github.io/Andy-Tai/) [DSCI 522](https://github.com/UBC-MDS/DSCI_522_dsci-workflows) | 3 | Data Science Workflows | Interactive and non-interactive data analysis. Scripting. Dynamic reporting. Reproducibility. Project and file management. Version control. Automated workflows. | Interactive vs. scripted/unattended analyses and how to move fluidly between them. Reproducibility through automation and containerization. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) [DSCI 561](https://github.com/UBC-MDS/DSCI_561_regr-1) | 3 | Regression I | Linear models: continuous response; one or more categorical covariates and/or one or more continuous covariates. | Linear models for a quantitative response variable, with multiple categorical and/or quantitative predictors. Matrix formulation of linear regression. Model assessment and prediction. 
| [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman Nickchi]() | [Payman Nickchi]() [DSCI 573](https://github.com/UBC-MDS/DSCI_573_feat-model-select) | 3 | Feature and Model Selection | Performance of a classification model. Generalization error, overfitting of training data. Shrinkage, feature selection, Akaike Information Criterion, Bayesian Information Criterion. k-fold cross validation. Bootstrapping. Receiver Operating Characteristic curve. Elastic nets, regularization. | How to evaluate and select features and models. Cross-validation, ROC curves, feature engineering, and regularization. | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/) [DSCI 524](https://github.com/UBC-MDS/DSCI_524_collab-sw-dev) | 4 | Collaborative Software Development | Software life cycle. Unit testing. Continuous integration. Submission to a relevant repository for distribution. Packaging for installation and use by others. Software licenses. Classes and abstraction. | How to exploit practices from collaborative software development techniques in data scientific workflows. Appropriate use of abstraction, the software life cycle, unit testing / continuous integration, and packaging for use by others. | [Tiffany Timbers](http://tiffanytimbers.com/) | [Tiffany Timbers](http://tiffanytimbers.com/) | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) [DSCI 541](https://github.com/UBC-MDS/DSCI_541_priv-eth-sec) | 4 | Privacy, Ethics, and Security | Privacy and data. Ethics boards, legal issues, licensing. Physical and logical data security, social engineering. Encryption, data anonymization, privacy-preserving techniques. Case studies. | The legal, ethical, and security issues concerning data, including aggregated data. 
Proactive compliance with rules and, in their absence, principles for the responsible management of sensitive data. Case studies. | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/) [DSCI 562](https://ubc-mds.github.io/DSCI_562_regr-2/) | 4 | Regression II | Non-parametric regression and smoothing. Data-driven parameter selection. Robust regression. Mixed effects. | Useful extensions to basic regression, e.g., generalized linear models, mixed effects, smoothing, robust regression, and techniques for dealing with missing data. | [Payman Nickchi]() | [Payman Nickchi]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) -[DSCI 572](https://github.com/UBC-MDS/DSCI_572_sup-learn-2) | 4 | Supervised Learning II | Support Vector Machines. Random Forests. Ensemble Classifiers. Graphical models. | Introduction to numerical optimization (e.g., gradient descent). Neural networks and deep learning. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) +[DSCI 572](https://github.com/UBC-MDS/DSCI_572_sup-learn-2) | 4 | Supervised Learning II | Support Vector Machines. Random Forests. Ensemble Classifiers. Graphical models. | Introduction to numerical optimization (e.g., gradient descent). Neural networks and deep learning. | [Prajeet Bajpai](https://p-bajpai.github.io) | [Prajeet Bajpai](https://p-bajpai.github.io) | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) [DSCI 525](https://github.com/UBC-MDS/DSCI_525_web-cloud-comp) | 5 | Web and Cloud Computing | Networks and the Internet, scraping data, APIs, cloud computing, Web services for scalable computing, Web hosting, Web publication platforms, introduction to parallel computing. 
| How to use the web as a platform for data collection, computation, and publishing. Accessing data via scraping and APIs. Using the cloud for tasks that are beyond the capability of your local computing resources. | [Ilya Musabirov]() | [Ilya Musabirov]() | [Gittu George](https://www.linkedin.com/in/georgegit/) | [Gittu George](https://www.linkedin.com/in/georgegit/) [DSCI 553](https://github.com/UBC-MDS/DSCI_553_stat-inf-2) | 5 | Statistical Inference and Computation II | Multiple hypothesis testing, false discovery rate. Two-group comparisons, Bayesian paradigm. | Bayesian reasoning for data science. How to formulate and implement inference using the prior-to-posterior paradigm. | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) | [Alexi Rodríguez-Arelis](https://alexrod.netlify.app/) -[DSCI 563](https://github.com/UBC-MDS/DSCI_563_unsup-learn) | 5 | Unsupervised Learning | Unsupervised learning. K-means/medoids. Model-based clustering. Expectation-maximization algorithm. Hierarchical clustering. Dimension reduction. Matrix decomposition. Heatmaps, contour plots, dendograms. | How to find groups and other structure in unlabeled, possibly high dimensional data. Dimension reduction for visualization and data analysis. Clustering, association rules, model fitting via the EM algorithm. | [Prajeet Bajpai]() | [Prajeet Bajpai]() | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) +[DSCI 563](https://github.com/UBC-MDS/DSCI_563_unsup-learn) | 5 | Unsupervised Learning | Unsupervised learning. K-means/medoids. Model-based clustering. Expectation-maximization algorithm. Hierarchical clustering. Dimension reduction. Matrix decomposition. Heatmaps, contour plots, dendograms. | How to find groups and other structure in unlabeled, possibly high dimensional data. Dimension reduction for visualization and data analysis. 
Clustering, association rules, model fitting via the EM algorithm. | [Prajeet Bajpai](https://p-bajpai.github.io) | [Prajeet Bajpai](https://p-bajpai.github.io) | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) [DSCI 574](https://github.com/UBC-MDS/DSCI_574_spat-temp-mod) | 5 | Spatial and Temporal Models | Time series. State space and change point detection. Hidden Markov Models. Gaussian processes. | Model fitting and prediction in the presence of correlation due to temporal and/or spatial association. ARIMA models. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) [DSCI 532](https://github.com/UBC-MDS/DSCI_532_viz-2) | 6 | Data Visualization II | Interactive visualization, design choices, dynamic change over time, multiple views, data reduction, dealing with complexity. | How to make principled and effective choices with respect to marks, spatial arrangement, and colour. Analysis, design, and implementation of interactive figures. How to provide multiple views, deal with complexity, and make difficult decisions about data reduction. | [Daniel Chen](https://daniel.rbind.io/) | [Daniel Chen](https://daniel.rbind.io/) | [Joel Östblom](https://joelostblom.com/) | [Joel Östblom](https://joelostblom.com/) -[DSCI 542](https://github.com/UBC-MDS/DSCI_542_comm-arg) | 6 | Communication and Argumentation | Claims, reasons, and evidence. Strengths and weaknesses of models. Effective oral and written presentation of scientific results, including interpretation of data and recognition of assumptions, bias, validity, and reliability. Citations, references, and peer-review. | How to interpret and present data science findings to a variety of audiences. Written and spoken presentation skills. 
| [Andy Tai]() | [Andy Tai]() | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() +[DSCI 542](https://github.com/UBC-MDS/DSCI_542_comm-arg) | 6 | Communication and Argumentation | Claims, reasons, and evidence. Strengths and weaknesses of models. Effective oral and written presentation of scientific results, including interpretation of data and recognition of assumptions, bias, validity, and reliability. Citations, references, and peer-review. | How to interpret and present data science findings to a variety of audiences. Written and spoken presentation skills. | [Andy Tai](https://andytai7.github.io/Andy-Tai/) | [Andy Tai](https://andytai7.github.io/Andy-Tai/) | [Hedayat Zarkoob]() | [Hedayat Zarkoob]() [DSCI 554](https://github.com/UBC-MDS/DSCI_554_exper-causal-inf) | 6 | Experimentation and Causal Inference | Randomization. A/B testing. Blocked designs. Orthogonality. Batch effects, confounding. Causality. Contemporary examples. Simulations. | Statistical evidence from randomized experiments versus observational studies. Applications of randomization, e.g., A/B testing for website optimization. Methods for dealing with the multiple testing problem. | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Katie Burak](https://www.linkedin.com/in/katie-burak-a9a12615b/) | [Payman Nickchi]() | [Payman Nickchi]() [DSCI 575](https://github.com/UBC-MDS/DSCI_575_adv-mach-learn) | 6 | Advanced Machine Learning | Neural networks trained with backpropagation. Deep learning. Overfitting and underfitting. Active data acquisition. Hyperparameter optimization. | Advanced machine learning methods in the context of natural language processing (NLP) applications. Bag of words, recommender systems, topic models, natural language as sequence data, Markov chains, and recurrent neural networks. 
| [Vincent Liu](https://vincentliu3.github.io/) | [Vincent Liu](https://vincentliu3.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) | [Varada Kolhatkar](https://kvarada.github.io/) -[DSCI 591](https://github.com/UBC-MDS/DSCI_591_capstone-proj) | 7 | Capstone Project | A capstone design project designed to give students experience in leading complex multidisciplinary projects relevant to data science. | A mentored group project based on real data and questions from a partner within or outside the university. Students will formulate questions and design and execute a suitable analysis plan. The group will work collaboratively to produce a reproducible analysis pipeline, project report, presentation and possibly other products, such as a dashboard. | MDS teaching team | MDS teaching team +[DSCI 591](https://github.com/UBC-MDS/DSCI_591_capstone-proj) | 7 | Capstone Project | A capstone design project designed to give students experience in leading complex multidisciplinary projects relevant to data science. | A mentored group project based on real data and questions from a partner within or outside the university. Students will formulate questions and design and execute a suitable analysis plan. The group will work collaboratively to produce a reproducible analysis pipeline, project report, presentation and possibly other products, such as a dashboard. | MDS teaching team | MDS teaching team | MDS teaching team | MDS teaching team