SISG 2021 Module 16: Computational Pipeline for WGS Data
2021-08-09
1 Introduction
This site contains course materials for SISG Module 16: Computational Pipeline for WGS Data, July 21-23, 2021. Course evaluations and certificates of completion are available via the official SISG web page (requires login).
Lectures
Slides for lectures are linked below in the schedule. All lectures will be given via Zoom, and recordings of lectures will be posted afterwards.
Zoom
Link for lectures: https://washington.zoom.us/j/92319788832
Tutorials and Exercises
To work through the tutorials and exercises, log in to NHLBI BioData Catalyst powered by Seven Bridges with your username and password. We will use this platform for all live demonstrations during the course.
All of the R code and data can also be downloaded from the GitHub repository from which this site is built, so you can run the exercises on your local machine after the course has ended. Download the workshop data and exercises: https://github.com/UW-GAC/SISG_2021/archive/master.zip
Slack
Join the Slack channel here: https://uwbiostatisticssisg.slack.com
1.1 Schedule
NOTE: All times are Pacific Daylight Time (GMT-07:00)
Wednesday, July 21
- Zoom session 11:30-13:15 PDT - Introduction and Genomic Data Storage
  - Lecture
    - Introduction (slides) (recorded lecture)
    - Using BioData Catalyst for SISG exercises (instructions) (recorded lecture)
    - Sequencing data formats (slides) (recorded lecture)
    - Intro to Genomic Data Structure (slides) (recorded lecture)
  - Interactive breakout rooms
  - Lecture
- Zoom session 13:45-14:30 PDT - Phenotype harmonization
  - Lecture
    - Phenotype harmonization (slides) (recorded lecture)
  - Interactive breakout rooms
  - Lecture
Thursday, July 22
- Zoom session 8:00-10:00 PDT - Association tests, Part I
- Zoom session 10:30-11:45 PDT - Association tests, Part II
- Zoom session 12:45-14:30 PDT - Population Structure and Relatedness
  - Lecture
    - Population structure inference (slides) (recorded lecture)
    - Relatedness inference (slides) (recorded lecture)
    - Ancestry and relatedness inference with GDS (slides) (recorded lecture)
  - Interactive breakout rooms
  - Lecture
Friday, July 23
- Zoom session 8:00-9:30 PDT - Mixed models
- Zoom session 9:45-11:30 PDT - Variant annotation
  - Lecture
    - Variant annotation (slides) (recorded lecture)
    - Annotation Explorer demo (video tutorial) (text tutorial) (Exercise) (Solution)
  - Interactive breakout rooms
  - Lecture
- Zoom session 12:30-13:45 PDT - Working in the cloud
  - Lecture
    - Analysis pipelines on the cloud (slides) (recorded lecture)
    - Running a workflow on BioData Catalyst (recorded demo)
  - Interactive breakout rooms
  - Lecture
- Zoom session 14:00-14:30 PDT - Open session for questions/advice
  - Discussion (video)
1.2 R packages used
1.3 Resources
NHLBI BioData Catalyst Powered by Seven Bridges
If you are new to R, you might find the following material helpful:
- Introduction to R materials from SISG Module 3
- Graphics with ggplot2
- Data manipulation with dplyr