SISG Module 17: Computational Pipeline for WGS Data
2019-07-25
1 Introduction
This site contains course materials for SISG Module 17: Computational Pipeline for WGS Data, July 24-26, 2019. The data used are located in the GitHub repository from which the site is built, as well as in the TOPMed analysis pipeline.
To work through the exercises, log in to http://bit.ly/datastage-sb with your username and password.
Slides for lectures are linked below in the schedule. A detailed description of the course and instructor biographies can be found at https://www.biostat.washington.edu/suminst/SISG2019/modules/SM1917
Join the Slack channel here: https://sisg2019module17.slack.com (link to sign up)
1.1 Schedule
Wednesday, July 24
- Introduction
- Setup for interactive exercises (Part 1)
- Sequencing data formats
- Phenotype harmonization
- Association tests
  - Methods and motivation (Part 1)
Thursday, July 25
- Association tests
  - Methods and motivation (Part 2)
  - GENESIS for association tests
  - Exercises
  - Aggregate tests (Part 3)
  - Exercises
- Population structure and relatedness
- Mixed model association testing
- Variant annotation
Friday, July 26
- Variant annotation
- Analysis pipeline on the cloud
- Cloud platforms
  - Analysis Commons
  - Seven Bridges (Part 2)
  - Terra (Account instructions)
Download the workshop data and exercises: https://github.com/UW-GAC/SISG_2019/archive/master.zip
1.2 R packages used
1.3 Resources
If you are new to R, you might find the following material helpful:
- Introduction to R materials from SISG Module 3
- Graphics with ggplot2 tutorial
- Data manipulation with dplyr