1 Introduction

This site contains course materials for SISG Module 17: Computational Pipeline for WGS Data, July 24-26, 2019. The data used are located in the GitHub repository from which the site is built, as well as in the TOPMed analysis pipeline.

To work through the exercises, log in at http://bit.ly/datastage-sb with your username and password.

Slides for lectures are linked below in the schedule. A detailed description of the course and instructor biographies can be found at https://www.biostat.washington.edu/suminst/SISG2019/modules/SM1917

Join the Slack channel here: https://sisg2019module17.slack.com (link to sign up)

1.1 Schedule

Wednesday, July 24

Introduction
- Setup for interactive exercises (Part 1)
Sequencing data formats
Phenotype harmonization
- Exercises
Association tests
- Methods and motivation (Part 1)

Thursday, July 25

Association tests
- Methods and motivation (Part 2)
- GENESIS for association tests
- Exercises
- Aggregate tests (Part 3)
- Exercises
Population structure and relatedness
Mixed model association testing
- Exercises
Variant annotation

Friday, July 26

Variant annotation
- Exercises
Analysis pipeline on the cloud
Cloud platforms
- Analysis Commons
- Seven Bridges (Part 2)
- Terra (Account instructions)

Download the workshop data and exercises: https://github.com/UW-GAC/SISG_2019/archive/master.zip

1.2 R packages used

1.3 Resources

If you are new to R, you might find the following material helpful:

Introduction to R materials from SISG Module 3
Graphics with ggplot2 tutorial
Data manipulation with dplyr