Difference between revisions of "GRCh37/hg19 GRCh38/hg38 Multi-Genome Tutorial"

From GenPlay, Einstein Genome Analyzer

 
*[http://genplay.einstein.yu.edu/library/Human/hg19/Gene_Annotation/Genes_RefSeq_hg19_09.20.2013.bed Refseq BED file for GRCh37/hg19] (right click on the link and select ''Save Link As...'')
 
 
*[http://genplay.einstein.yu.edu/library/Human/hg38/DNA_Sequence/hg38.2bit DNA sequence file for GRCh38/hg38]
 
*[http://genplay.einstein.yu.edu/library/Human/hg19/DNA_Sequence/DNA_hg19_09.20.2013.2bit DNA sequence file for GRCh37/hg19]
  
 

Revision as of 15:44, 27 June 2014

Goal: This tutorial illustrates how the multi-genome mode of GenPlay can be used to simultaneously display data aligned on different reference genomes. In this tutorial we will compare gene annotation data aligned on GRCh37/hg19 with gene annotation data aligned on GRCh38/hg38.

Prerequisite: GenPlay needs to be installed on your computer. If you haven't installed GenPlay yet, please visit the Downloads page and follow the instructions to download and install GenPlay.

Note: The final result of this tutorial is available as a project that can be loaded from the Projects page of this website.

Getting started

To set up and manage a Multi-Genome Project in GenPlay, please refer to the following sections of the documentation:

Downloading Files

Starting a New Project

Selecting the Reference Assembly

After starting GenPlay you will be prompted to select a name, a clade, a genome and an assembly for your project. You can enter "hg19 - hg38 Tutorial" for the name, select the mammal clade, the human genome and the hg38 assembly (figure 1).

Figure 1: New Project Window

Then, click on the tool box button on the assembly line. A new window will appear allowing you to select chromosomes. For this tutorial we will work only on the basic chromosomes (chr1 to chr22 plus chrX and chrY). You can select the basic chromosomes by clicking on Basics (figure 2).

Figure 2: Project Chromosomes
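For reference, the set of basic chromosomes described above can be written out explicitly. The following is a minimal Python sketch, independent of GenPlay; the names `BASIC_CHROMOSOMES` and `is_basic` are hypothetical and only illustrate which chromosome names fall in the "basic" set (chr1 to chr22, chrX and chrY).

```python
# The 24 "basic" chromosomes used in this tutorial: chr1..chr22 plus chrX and chrY.
BASIC_CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def is_basic(chromosome_name):
    """Return True if the name is one of the 24 basic chromosomes."""
    return chromosome_name in BASIC_CHROMOSOMES


# Example names; the non-basic ones are only illustrations of
# sequences that would be left out of the project.
names = ["chr1", "chr22", "chrX", "chrY", "chrM", "chr1_KI270706v1_random"]
print([name for name in names if is_basic(name)])
# prints ['chr1', 'chr22', 'chrX', 'chrY']
```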

Setting the Multi-Genome Parameters

Manually

Next we need to set up a multi-genome project. To do so, click on the Multi Genome Project radio button at the bottom of the screen and click on Select VCF. Click on the Add... label of the File column to select the VCF file to load. Select the VCF downloaded earlier. Only one VCF file is going to be loaded for this tutorial. The VCF file contains differences between the reference genome GRCh37/hg19 and the reference genome GRCh38/hg38.

File column

Click on the Add... label and then on the Add... menu and select the VCF file downloaded earlier (hg19ToHg38.vcf.gz).

Raw name(s) column

The raw name is filled in automatically. In this tutorial there is only one genome besides the hg38 assembly: hg19.
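If you want to check which genome names a VCF file declares before loading it, the sample columns of its header line can be listed outside GenPlay. The sketch below is an assumption-laden illustration: it relies only on the standard VCF layout (nine fixed columns on the `#CHROM` line, followed by one column per sample) and presumes, without confirmation from the GenPlay documentation, that the raw names correspond to those sample columns. The function name `vcf_sample_names` is hypothetical.

```python
import gzip


def vcf_sample_names(vcf_path):
    """Return the sample names listed on the #CHROM header line of a VCF file."""
    # Use gzip for compressed files such as hg19ToHg38.vcf.gz, plain open otherwise.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                columns = line.rstrip("\n").split("\t")
                # Columns 0-8 are CHROM, POS, ID, REF, ALT, QUAL, FILTER,
                # INFO and FORMAT; any remaining columns are sample names.
                return columns[9:]
            if not line.startswith("#"):
                break  # reached the data lines without finding a header line
    return []


# Usage (assuming the file is in the current directory):
# print(vcf_sample_names("hg19ToHg38.vcf.gz"))
```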

Nickname column

The nickname can be used to differentiate samples that have the same raw name. In this example we can keep the default nickname.

Group column

Since this tutorial is about comparing reference genomes, a generic group name such as Reference genome can be used. Click on the Group 1 text of the Group column and then click on the pencil to edit the group name.

The result is shown in figure 3.

Figure 3: VCF Loader Window

Automatically

You can set up the multi-genome project automatically by clicking on the Import Config button at the bottom of the project screen and selecting the XML file downloaded earlier. When you choose this option, make sure that the VCF file and the XML file are in the same directory.
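The same-directory requirement can be checked before importing the configuration. The following Python sketch is only an illustration: the function name `in_same_directory` is hypothetical, and the XML file name in the usage comment is a placeholder, since this tutorial does not name the XML file.

```python
from pathlib import Path


def in_same_directory(vcf_path, xml_path):
    """Return True if both files exist and are located in the same directory."""
    vcf = Path(vcf_path).resolve()
    xml = Path(xml_path).resolve()
    return vcf.is_file() and xml.is_file() and vcf.parent == xml.parent


# Usage ("config.xml" is a placeholder name, not the actual file name):
# print(in_same_directory("hg19ToHg38.vcf.gz", "config.xml"))
```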

Displaying SNPs, Insertions and Deletions

Once you're done with the previous step, click on Create to initialize the project. This should only take a few seconds.

We can now display variants. Let's start by loading SNPs. To do so, right click on the handler of the first track (the blue part of the track with a number on it) and then select the Add Variant Layer option (figure 4).

Figure 4: Add Variant Layer

Then click on the SNPs' check box (figure 5).

Figure 5: VCF Select Variants to Add

Using the same method, load insertions on track 2 and deletions on track 3. The result should be similar to what is shown in figure 6.

Figure 6: Variant Layers Added
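To make the three variant types concrete, the sketch below classifies a VCF record by comparing the lengths of its REF and ALT alleles. This is a common convention and not necessarily the exact rule GenPlay applies; the function name `classify_variant` is hypothetical, and symbolic alleles such as `<DEL>` are simply reported as "other".

```python
def classify_variant(ref, alt):
    """Classify one REF/ALT allele pair as SNP, insertion, deletion or other."""
    # Symbolic or missing alleles (for example "<DEL>", "*" or ".") carry no
    # explicit sequence, so they are not classified by length here.
    if alt.startswith("<") or alt in ("*", "."):
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # same length but longer than one base


# Made-up allele pairs, for illustration only.
for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
    print(ref, alt, classify_variant(ref, alt))
```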

Displaying Gene Annotation Layers

Let's start by loading the hg38 gene annotation.

Right click on the handler of track 4 and select the Add Layer(s) option (figure 7).

Figure 7: Add Layer

Then select the hg38 gene annotation file downloaded at the beginning of this tutorial. On the next screen select Gene Annotation Layer (figure 8).

Figure 8: Load Gene Annotation

We then need to tell GenPlay that the data were aligned on the hg38 reference genome (figure 9).

Figure 9: Select hg38

We now need to repeat the same operation for hg19. You will need to select the other gene annotation file and then select hg19 as the genome used for the alignment (figure 10).

Figure 10: Select hg19
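Because the two annotation files are easy to mix up, it can help to inspect them before loading. The sketch below counts features per chromosome in a BED file, assuming only the standard BED layout in which the first column is the chromosome name; it does not tell the assemblies apart by itself, and the function name `count_features_per_chromosome` is hypothetical.

```python
from collections import Counter


def count_features_per_chromosome(bed_lines):
    """Count BED features per chromosome (first column of each data line)."""
    counts = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments and optional track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        counts[line.split()[0]] += 1
    return counts


# Usage with the hg19 file linked in this tutorial, once downloaded:
# with open("Genes_RefSeq_hg19_09.20.2013.bed") as handle:
#     print(count_features_per_chromosome(handle).most_common(5))
```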

The final result of this tutorial is shown in figure 11.

Figure 11: Final Result