Difference between revisions of "Tutorials"

From Statistical Genetics Courses

(RV-TDT)
(Fine-mapping (SuSiE method))
(16 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
==Running Tutorials on Your Computer==
 
==Running Tutorials on Your Computer==
Starting Fall 2019 we adopt [https://www.docker.com/ docker] to run our course material . We have created various [https://hub.docker.com/u/statisticalgenetics docker repositories] with source material freely available from [https://github.com/statgenetics/statgen-courses github] for users to readily setup and reproduce our tutorials on their own computers. These docker images can also be used as production tool to run relevant software on your computer (Mac, Linux or Windows) for your own data analysis.
+
Starting Fall 2019 we adopt [https://www.docker.com/ docker] to run our course material . We have created various [https://hub.docker.com/u/statisticalgenetics docker repositories] with source material freely available from [https://github.com/statgenetics/statgen-courses github] for users to readily setup and reproduce our tutorials on their own computers. These docker images can also be used as production tool to run relevant software on your computer (Mac, Linux or Windows) or even a high performance computing cluster (if properly configured) for your own data analysis.
  
In this document we will focus on discussing how to set it up and run course tutorials on your computer, using these docker images and optionally a utility script we created to streamline various docker commands.
+
===General instructions===
 +
* [https://github.com/statgenetics/statgen-courses/wiki/How-to-launch-course-tutorials#alternative-to-cloud-server-use-your-own-computer Instructions to setup course tutorial environment on your computer]
 +
* [https://github.com/statgenetics/statgen-courses/wiki/How-to-launch-course-tutorials#option-1-launch-exercise-in-jupyterlab Instructions to run course tutorial through JupyterLab]
 +
* [https://github.com/statgenetics/statgen-courses/wiki/How-to-launch-course-tutorials#option-2-launch-from-command-shell Instructions to run course tutorial through command line terminal]
  
===Pre-requisites===
 
Software you need to install on your computer are <code>SoS</code> (a workflow system to run our course utility script) and <code>docker</code>.
 
  
====Mac and Linux users====
+
===Tutorial specific instructions===
<code>SoS</code> requires Python 3.6+ to run. It is recommended that you install [https://docs.conda.io/en/latest/miniconda.html Miniconda] to run Python 3 if you don't have it already. Once you have Python 3 installed, simply type <code>pip install sos</code> to install <code>SoS</code>, or, check out [https://vatlab.github.io/sos-docs/running.html here for alternative installation methods] if you have troubles with that command. To install <code>docker</code> from command line please follow our instructions [http://statgen.us/lab-wiki/orientation/jupyter-setup.html#install-docker here]. Alternatively Mac users can download docker app for Mac and install from a graphical interface.
+
We use a script [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup "statgen-setup"] to start the docker based environments for these tutorials. Please refer to the previous section for instructions on the installation of this script.
  
Finally please download our utility script [https://raw.githubusercontent.com/statgenetics/statgen-courses/master/src/statgen-setup <code>src/statgen-setup</code>] to your <code>PATH</code> and change it to executable, eg, <code>chmod +x ~/bin/statgen-setup</code> if you put it under <code>~/bin</code> which is part of your <code>PATH</code>. To verify your setup, type:
+
Material and instructions for specific exercise are listed in each section below (''only those using statgen-setup command are relevant to our docker based tutorials''). They provide links to materials and a minimal set of commands to use for launching and running an exercise.
 
+
<pre>statgen-setup -h</pre>
+
you should see some meaningful output.
+
 
+
====Windows users====
+
Running these tutorials in Windows is currently not supported. Although in principle these docker images will also work in Windows, this has not yet been tested out and we are unable to provide sure instructions to setting it up. The utility script "statgen-setup" that we provide will certainly need adjustments (though minor) to work with Windows.
+
 
+
===Tutorial specific instructions===
+
Material and instructions for specific exercise are listed in each section below (''only those using statgen-setup command are relevant to our docker based tutorials''). They provide links to materials and a minimal set of commands to use for launching and running an exercise. For advanced options and other features provided by our utility script please read here our [https://github.com/statgenetics/statgen-courses/blob/master/README.md complete documentation to the utility script].
+
  
 
==Alohomora==
 
==Alohomora==
Line 26: Line 18:
 
* [http://gmc.mdc-berlin.de/alohomora/ Software Link]
 
* [http://gmc.mdc-berlin.de/alohomora/ Software Link]
  
==Annotation==
+
==Annovar complex traits==
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/FunctionalAnnotation.docx Functional Annotation Exercise <nowiki>[DOCX]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/FunctionalAnnotation.pdf Functional Annotation Exercise <nowiki>[PDF]</nowiki>]
 
* [[Commands in Annotation Exercise|Exercise Commands]]
 
* [[Commands in Annotation Exercise|Exercise Commands]]
  
Line 35: Line 27:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
+
==Annovar Mendelian traits==
 
+
==Annovar MEndelian==
+
 
* [http://statgen.us/files/tutorials/FunctionalAnnotation_Annovar_final.pdf Exercise <nowiki>[PDF]</nowiki>]
 
* [http://statgen.us/files/tutorials/FunctionalAnnotation_Annovar_final.pdf Exercise <nowiki>[PDF]</nowiki>]
* [https://statgen.research.bcm.edu/files/2017/09/commands/annovar-functional_annotation.txt Commands Part I - Functional Annotation]
+
* [https://statgen.us/files/2017/09/commands/annovar-functional_annotation.txt Commands Part I - Functional Annotation]
* [https://statgen.research.bcm.edu/files/2017/09/commands/annovar-variant_filtering.txt Commands Part II - Variant Filtering]
+
* [https://statgen.us/files/2017/09/commands/annovar-variant_filtering.txt Commands Part II - Variant Filtering]
  
 
==Cochran Armitage Trend Test==
 
==Cochran Armitage Trend Test==
Line 54: Line 44:
 
<pre>statgen-setup login --tutorial epistasis
 
<pre>statgen-setup login --tutorial epistasis
 
</pre>
 
</pre>
 
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
 
  
  
Line 67: Line 54:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
+
 
 +
==Fine-mapping (SuSiE method)==
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/finemapping.docx susieR Exercise <nowiki>[DOCX]</nowiki>]
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/finemapping_answers.docx susieR Exercise Answers <nowiki>[DOCX]</nowiki>]
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/notebooks/finemapping.ipynb susieR Exercise <nowiki>[Ipython notebook]</nowiki>]
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/notebooks/finemapping_answers.ipynb susieR Exercise Answers <nowiki>[Ipython notebook]</nowiki>]
 +
 
 +
 
 +
To run the exercise from docker image provided,
 +
 
 +
<pre>statgen-setup launch --tutorial finemap
 +
</pre>
  
 
==GCTA==
 
==GCTA==
Line 77: Line 75:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
  
 
==Gemini==
 
==Gemini==
Line 86: Line 83:
 
<pre>statgen-setup login --tutorial gemini
 
<pre>statgen-setup login --tutorial gemini
 
</pre>
 
</pre>
 
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
 
  
  
Line 102: Line 96:
  
 
==GWAS: Data Quality Control==
 
==GWAS: Data Quality Control==
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/PLINK_data_QC.docx Exercise <nowiki>[PDF]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/PLINK_data_QC.pdf Exercise <nowiki>[PDF]</nowiki>]
 
* [[GWAS Data QC Exercise|Exercise Commands]]
 
* [[GWAS Data QC Exercise|Exercise Commands]]
  
Line 109: Line 103:
 
<pre>statgen-setup login --tutorial plink
 
<pre>statgen-setup login --tutorial plink
 
</pre>
 
</pre>
 
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
 
  
  
 
==GWAS: Association Analysis Controlling for Population Substructure==
 
==GWAS: Association Analysis Controlling for Population Substructure==
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/PLINK_Substructure.docx Exercise <nowiki>[PDF]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/PLINK_Substructure.pdf Exercise <nowiki>[PDF]</nowiki>]
 
* [[GWAS_Controlling_for_Population_Substructure|Exercise Commands]]
 
* [[GWAS_Controlling_for_Population_Substructure|Exercise Commands]]
  
Line 122: Line 113:
 
<pre>statgen-setup login --tutorial plink
 
<pre>statgen-setup login --tutorial plink
 
</pre>
 
</pre>
 
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
 
  
  
Line 136: Line 124:
 
* [http://statgen.us/files/igv_exercise.zip Exercise files (VCF and BAM)]
 
* [http://statgen.us/files/igv_exercise.zip Exercise files (VCF and BAM)]
  
To run the exercise from docker image provided,
 
 
<pre>statgen-setup login --tutorial igv
 
</pre>
 
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
  
 
==Linkage/FastLinkage==
 
==Linkage/FastLinkage==
Line 152: Line 135:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
  
 
==Pleiotropy==
 
==Pleiotropy==
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/Pleiotropy.docx Pleiotropy Exercise <nowiki>[DOCX]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/Pleiotropy.pdf Pleiotropy Exercise <nowiki>[PDF]</nowiki>]
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/Pleiotropy_answers.docx Pleiotropy Answers to Questions <nowiki>[DOCX]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/Pleiotropy_answers.pdf Pleiotropy Answers to Questions <nowiki>[PDF]</nowiki>]
  
 
To run the exercise from docker image provided,
 
To run the exercise from docker image provided,
Line 163: Line 145:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
  
 
==Polygenic risk prediction (NPS method)==
 
==Polygenic risk prediction (NPS method)==
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/NPS.docx PRS NPS Exercise <nowiki>[DOCX]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/NPS.pdf PRS NPS Exercise <nowiki>[DOCX]</nowiki>]
  
 
To run the exercise from docker image provided,
 
To run the exercise from docker image provided,
Line 173: Line 154:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
+
 
 +
==Polygenic risk prediction (LDpred2 method)==
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/ldpred2_example.pdf PRS LDpred2 Exercise <nowiki>[PDF]</nowiki>]
 +
* [https://github.com/cumc/bioworkflows/blob/master/ldpred/ldpred2_example.ipynb PRS LDpred2 Exercise <nowiki>[Ipython Notebook]</nowiki>]
 +
 
 +
 
 +
To run the exercise from docker image provided,
 +
 
 +
<pre>statgen-setup launch --tutorial ldpred2
 +
</pre>
 +
 
 +
Then follow prompts on the terminal output to open up the JupyterLab server in your web browser. If it is the first time you start this server, please open a command terminal inside JupyterLab, and type
 +
 
 +
<pre>get-data</pre>
 +
 
 +
to load the data-set to the JupyterLab workspace.
 +
 
  
 
<!--
 
<!--
Line 191: Line 188:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
  
 
==PSEQ==
 
==PSEQ==
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/PSEQ.doc PSEQ Exercise <nowiki>[DOCX]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/PSEQ.pdf PSEQ Exercise <nowiki>[PDF]</nowiki>]
* [[PSEQ Commands in Exercise|Exercise Commands]]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/notebooks/PSEQ.ipynb PSEQ Exercise <nowiki>[Ipython Notebook]</nowiki>]
  
 
To run the exercise from docker image provided,
 
To run the exercise from docker image provided,
 +
 +
<pre>statgen-setup launch --tutorial pseq
 +
</pre>
 +
 +
Notice that since PSEQ exercise does not involve generating and visualizing plots, it is also fine to use a command terminal, instead of the JupyterLab server, to run this exercise and reproduce exactly what was described in the tutorial. To do so,
  
 
<pre>statgen-setup login --tutorial pseq
 
<pre>statgen-setup login --tutorial pseq
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
+
==REGENIE==
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/regenie_example.pdf REGENIE Exercise <nowiki>[PDF]</nowiki>]
 +
* [https://github.com/statgenetics/statgen-courses/blob/master/notebooks/regenie_example.ipynb REGENIE Exercise <nowiki>[Ipython Notebook]</nowiki>]
 +
 
 +
 
 +
To run the exercise from docker image provided,
 +
 
 +
<pre>statgen-setup launch --tutorial regenie
 +
</pre>
 +
 
 +
Then follow prompts on the terminal output to open up the JupyterLab server in your web browser. If it is the first time you start this server, please open a command terminal inside JupyterLab, and type
 +
 
 +
<pre>get-data</pre>
 +
 
 +
to load the data-set to the JupyterLab workspace.
 +
 
 +
 
  
 
==Regression==
 
==Regression==
Line 213: Line 230:
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
 
  
 
==RV-TDT==
 
==RV-TDT==
Line 252: Line 268:
 
To run the exercise from docker image provided,
 
To run the exercise from docker image provided,
  
<pre>statgen-setup login --tutorial slink
+
<pre>statgen-setup login --tutorial slink</pre>
</pre>
+
 
+
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
+
 
+
 
+
  
 
==SUPERLINK==
 
==SUPERLINK==
Line 264: Line 275:
  
 
==Variant Association Tools==
 
==Variant Association Tools==
* [https://statgenetics.github.io/statgen-courses/notebooks/VAT.html VAT Exercise <nowiki>[HTML]</nowiki>]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/handout/VAT.docx VAT Exercise <nowiki>[DOCX]</nowiki>]
* [[VAT Commands in Exercise|Exercise Commands]]
+
* [https://github.com/statgenetics/statgen-courses/blob/master/notebooks/VAT.ipynb VAT Exercise <nowiki>[Ipython notebook]</nowiki>]
  
  
 
To run the exercise from docker image provided,
 
To run the exercise from docker image provided,
  
<pre>statgen-setup login --tutorial vat
+
<pre>statgen-setup launch --tutorial vat
 
</pre>
 
</pre>
  
The "statgen-setup" script is available [https://github.com/statgenetics/statgen-courses/blob/master/src/statgen-setup here] and can be installed following [https://github.com/statgenetics/statgen-courses/blob/master/README.md#prepare-your-computer-to-manage-the-tutorials these instructions].
+
Then follow the prompts on the terminal output to open up the JupyterLab server in your web browser. You should find the exercise notebook in the side panel, and you can click to open it.

Revision as of 16:45, 9 November 2021

Running Tutorials on Your Computer

Since Fall 2019 we have used docker to run our course material. We have created various docker repositories, with source material freely available from github, so that users can readily set up and reproduce our tutorials on their own computers. These docker images can also be used as a production tool to run the relevant software on your computer (Mac, Linux or Windows), or even on a high performance computing cluster (if properly configured), for your own data analysis.

General instructions


Tutorial specific instructions

We use the script "statgen-setup" to start the docker-based environments for these tutorials. Please refer to the previous section for instructions on installing this script.

Material and instructions for each specific exercise are listed in the sections below (only those using the statgen-setup command are relevant to our docker-based tutorials). Each section provides links to materials and a minimal set of commands for launching and running the exercise.
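
Before launching any tutorial, it can help to confirm that the two commands these instructions rely on, docker and statgen-setup, are reachable from your shell. The following is a minimal sketch and not part of the course scripts: the helper name check_tool is hypothetical, and it assumes only the standard shell built-in command -v.

```shell
# Hypothetical helper (not part of statgen-setup): report whether a
# command can be found on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the two commands the tutorials depend on; keep going even if
# one is missing so that both results are reported.
for tool in docker statgen-setup; do
    check_tool "$tool" || true
done
```

If either command is reported as missing, revisit the setup instructions before trying the commands in the sections below.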

Alohomora

Annovar complex traits

To run the exercise from the docker image provided,

statgen-setup login --tutorial annovar

Annovar Mendelian traits

Cochran Armitage Trend Test


Epistasis (PLINK and CASSI)

To run the exercise from the docker image provided,

statgen-setup login --tutorial epistasis


FastLMM

To run the exercise from the docker image provided,

statgen-setup login --tutorial fastlmm-gcta


Fine-mapping (SuSiE method)


To run the exercise from the docker image provided,

statgen-setup launch --tutorial finemap

GCTA

To run the exercise from the docker image provided,

statgen-setup login --tutorial fastlmm-gcta


Gemini

To run the exercise from the docker image provided,

statgen-setup login --tutorial gemini


Genehunter


To install from packages, follow the configuration steps above and run the following command.

sudo apt-get install genehunter-tutorial

The exercise's files will then be installed in the folder /home/shared/genehunter. You can run from there or copy the files into your user's home directory and proceed with the exercise.
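
Copying the installed files into your home directory, as described above, could be sketched as follows. The function name copy_exercise and the destination folder $HOME/genehunter are illustrative assumptions; only the source folder /home/shared/genehunter comes from the instructions above.

```shell
# Hypothetical helper: copy an installed exercise folder to a working
# location, refusing to overwrite an existing destination.
copy_exercise() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "source folder not found: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "destination already exists: $dest" >&2
        return 1
    fi
    cp -r "$src" "$dest"
}

# Example call (commented out so that nothing is copied by accident):
# copy_exercise /home/shared/genehunter "$HOME/genehunter"
```

Working on a copy leaves the shared installation untouched, so the exercise can be restarted from a clean state by copying it again to a new destination.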

GWAS: Data Quality Control

To run the exercise from the docker image provided,

statgen-setup login --tutorial plink


GWAS: Association Analysis Controlling for Population Substructure

To run the exercise from the docker image provided,

statgen-setup login --tutorial plink


Homozygosity Mapper

IGV


Linkage/FastLinkage


To run the exercise from the docker image provided,

statgen-setup login --tutorial mlink


Pleiotropy

To run the exercise from the docker image provided,

statgen-setup login --tutorial pleiotropy


Polygenic risk prediction (NPS method)

To run the exercise from the docker image provided,

statgen-setup login --tutorial nps


Polygenic risk prediction (LDpred2 method)


To run the exercise from the docker image provided,

statgen-setup launch --tutorial ldpred2

Then follow the prompts in the terminal output to open the JupyterLab server in your web browser. If this is the first time you have started this server, open a command terminal inside JupyterLab and type

get-data

to load the data set into the JupyterLab workspace.


Population Genetics


To run the exercise from the docker image provided,

statgen-setup login --tutorial popgen


PSEQ

To run the exercise from the docker image provided,

statgen-setup launch --tutorial pseq

Note that because the PSEQ exercise does not involve generating or visualizing plots, you can also use a command terminal, instead of the JupyterLab server, to run this exercise and reproduce exactly what is described in the tutorial. To do so,

statgen-setup login --tutorial pseq
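
The two commands above differ only in the subcommand: launch starts the JupyterLab server, while login opens a command terminal. As a small sketch of that choice, the hypothetical helper below (tutorial_command is not part of statgen-setup) prints the matching command for a given interface without running it; the interface names jupyter and terminal are assumptions made for illustration.

```shell
# Hypothetical helper: print the statgen-setup command for a tutorial.
# "jupyter" selects the JupyterLab server (launch); "terminal" selects
# a command-line session (login).
tutorial_command() {
    case "$1" in
        jupyter)  echo "statgen-setup launch --tutorial $2" ;;
        terminal) echo "statgen-setup login --tutorial $2" ;;
        *)        echo "unknown interface: $1" >&2; return 1 ;;
    esac
}

# Print both variants for the PSEQ exercise.
tutorial_command jupyter pseq
tutorial_command terminal pseq
```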

REGENIE


To run the exercise from the docker image provided,

statgen-setup launch --tutorial regenie

Then follow the prompts in the terminal output to open the JupyterLab server in your web browser. If this is the first time you have started this server, open a command terminal inside JupyterLab and type

get-data

to load the data set into the JupyterLab workspace.


Regression

To run the exercise from the docker image provided,

statgen-setup login --tutorial regression


RV-TDT

Installing Packages

To install from packages, follow the configuration steps above and run the following command.

sudo apt-get install rvtdt-tutorial

The exercise's files will then be installed in the folder /home/shared/rvtdt. You can run from there or copy the files into your user's home directory and proceed with the exercise.

SEQLinkage


To install from packages, follow the configuration steps above and run the following command.

sudo apt-get install seqlinkage-tutorial

The exercise's files will then be installed in the folder /home/shared/seqlinkage. You can run from there or copy the files into your user's home directory and proceed with the exercise.

SEQSpark

Installing Packages

To install from packages, follow the configuration steps above and run the following command.

sudo apt-get install seqspark-tutorial

The exercise's files will then be installed in the folder /home/shared/seqspark. You can run from there or copy the files into your user's home directory and proceed with the exercise. For the commands to work correctly you do not need to reboot, but you should log out and log back in to make sure that the computer's environment is correctly configured.

SLINK

To run the exercise from the docker image provided,

statgen-setup login --tutorial slink

SUPERLINK

Variant Association Tools


To run the exercise from the docker image provided,

statgen-setup launch --tutorial vat

Then follow the prompts in the terminal output to open the JupyterLab server in your web browser. You should find the exercise notebook in the side panel; click it to open it.