Difference between revisions of "Multi-Genome Tutorial"

From GenPlay, Einstein Genome Analyzer

(Chromosome selection)
(VCF Loading)
Line 35: Line 35:
  
 
===== VCF Loading =====
 
===== VCF Loading =====
'''''Manually'''''<br/>
+
======Manually======
To load VCF files , users must first fill the column lists and then select from the list the appropriate data. The VCF Loader appears after clicking on the ''Edit'' button from the welcome screen. The bottom left part of the VCF Loader contains the ''Column list edition'' section. User has to select a column and click on ''Edit'' button in order to show the associated list.
+
The next thing to do to setup a multi-genome project if to select the list of VCF files to load. Click on the ''Multi Genome Project'' radio button at the bottom of the screen and click on ''Select VCF''.
Only one VCF file is going to be loaded for this tutorial. The VCF file contains differences between the reference genome NCBI36/hg18 and the reference genome GRCh37/hg19.
+
 
<br/><br/>
+
To load VCF files , users must first fill the column lists and then select from the list the appropriate data. The VCF Loader appears after clicking on the ''Edit'' button from the welcome screen. The bottom left part of the VCF Loader contains the ''Column list edition'' section. User has to select a column and click on ''Edit'' button in order to show the associated list. Click on the ''Add...'' label of the File column to select the vcf file to load. Select the VCF download ealier. Only one VCF file is going to be loaded for this tutorial. The VCF file contains differences between the reference genome NCBI36/hg18 and the reference genome GRCh37/hg19.
 +
 
 +
 
 +
''Group'' column
  
''Group'' column<br/>
 
 
This tutorial compares reference genome; a generic group name can be ''Reference genome''.
 
This tutorial compares reference genome; a generic group name can be ''Reference genome''.
 
On the ''Group name list editor'', user clicks on the plus button to show the input text box and fills it (Figure 4).<br/>
 
On the ''Group name list editor'', user clicks on the plus button to show the input text box and fills it (Figure 4).<br/>
Line 50: Line 52:
 
image:mg_hg18tohg19_group_editor.png|Figure 5: Group name editor
 
image:mg_hg18tohg19_group_editor.png|Figure 5: Group name editor
 
</gallery>
 
</gallery>
<br/><br/>
 
  
''Genome'' column<br/>
+
 
 +
''Genome'' column
 +
 
 
The genome name is an Alias for the selected raw name. In this tutorial, the genome name is going to be '''Hg18'''.
 
The genome name is an Alias for the selected raw name. In this tutorial, the genome name is going to be '''Hg18'''.
 
On the ''Genome name list editor'', user clicks on the plus button to invoke the input text box and fills it (Figure 6).<br/>
 
On the ''Genome name list editor'', user clicks on the plus button to invoke the input text box and fills it (Figure 6).<br/>
Line 62: Line 65:
 
image:mg_hg18tohg19_genome_editor.png|Figure 7: Genome name editor
 
image:mg_hg18tohg19_genome_editor.png|Figure 7: Genome name editor
 
</gallery>
 
</gallery>
<br/><br/>
 
  
''Type'' column<br/>
+
 
 +
''Type'' column
 +
 
 
This field cannot be edited by the users. The provided VCF file is a Structural Variant type, user therefore has to choose '''SV''' (Figure 8).<br/>
 
This field cannot be edited by the users. The provided VCF file is a Structural Variant type, user therefore has to choose '''SV''' (Figure 8).<br/>
 
''value:'' '''SV'''
 
''value:'' '''SV'''
 
[[image:mg_hg18tohg19_type.png|center|frame|Figure 8: VCF type list]]
 
[[image:mg_hg18tohg19_type.png|center|frame|Figure 8: VCF type list]]
<br/><br/>
 
  
''File'' column<br/>
+
 
 +
''File'' column
 +
 
 
Once the VCF file is downloaded, user has to open the ''File list editor'', user clicks on the plus button to show the file chooser dialog and choose the VCF file according to its location.<br/>
 
Once the VCF file is downloaded, user has to open the ''File list editor'', user clicks on the plus button to show the file chooser dialog and choose the VCF file according to its location.<br/>
 
''value:'' '''VCF path'''
 
''value:'' '''VCF path'''
 
[[image:mg_hg18tohg19_file_editor.png|center|frame|Figure 9: VCF File editor]]
 
[[image:mg_hg18tohg19_file_editor.png|center|frame|Figure 9: VCF File editor]]
<br/><br/>
 
  
''Raw name(s)'' column<br/>
+
 
 +
''Raw name(s)'' column
 
The raw name list is automatically filled. In the case of this tutorial there is only one genome: '''NCBI36''' (Figure 10).<br/>
 
The raw name list is automatically filled. In the case of this tutorial there is only one genome: '''NCBI36''' (Figure 10).<br/>
 
''value:'' '''NCBI36'''
 
''value:'' '''NCBI36'''
 
[[image:mg_hg18tohg19_raw_name.png|center|frame|Figure 10: Raw name list]]
 
[[image:mg_hg18tohg19_raw_name.png|center|frame|Figure 10: Raw name list]]
 
Again, value is saved by closing the windows
 
Again, value is saved by closing the windows
<br/><br/>
 
  
'''''Import XML settings'''''<br/>
+
 
 +
======Import XML settings======
 
In order to set the project with ease, user can import the settings using the XML file above. Please be careful about the VCF path, user must changes it directly on the xml file if he wants to use the import function.
 
In order to set the project with ease, user can import the settings using the XML file above. Please be careful about the VCF path, user must changes it directly on the xml file if he wants to use the import function.
 
<br/>
 
<br/>

Revision as of 14:19, 23 April 2014

Getting started

To set up and manage a Multi-Genome Project in GenPlay, please refer to the following sections of the documentation:




Conversion between NCBI36/hg18 and GRCh37/hg19

Description

This tutorial describes how to concurrently display tracks mapped on the genome assembly NCBI36/hg18 and tracks mapped on the genome assembly GRCh37/hg19. In the example, the user will be able to see all the modifications to the NCBI36/hg18 genome that lead to the GRCh37/hg19 reference genome.

Note: The final result of this tutorial is available as a project that can be loaded from the Projects page of this website.

Files

Steps

Project settings

Project name

The user must choose a name for a new project; here the name is GenPlay-MG – Reference genome tutorial (Figure 1).

Figure 1: Project name
Project assembly

The reference genome for this tutorial is GRCh37/hg19. The mammal clade and human genome need to be selected (Figure 2).

Figure 2: Project assembly
Chromosome selection

The VCF file contains Structural Variants for chromosomes 1 to 22 and chromosomes X and Y. The list of chromosomes available in the project can be set by clicking on the settings button next to the assembly name (Figure 3).

Figure 3: Chromosome chooser
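
Before choosing chromosomes in the dialog, it can help to check which chromosomes the VCF file actually contains. The following is a minimal sketch, not part of GenPlay: it relies only on the standard VCF layout, where header lines start with "#" and the first tab-separated column of each record is the chromosome name. The function name and the command-line usage are illustrative, and the input is assumed to be an uncompressed text file.

```python
from collections import Counter
from typing import Iterable


def count_records_per_chromosome(lines: Iterable[str]) -> Counter:
    """Count VCF data records per chromosome (first tab-separated column)."""
    counts: Counter = Counter()
    for line in lines:
        # Metadata and header lines start with '#'; blank lines are skipped.
        if not line.strip() or line.startswith("#"):
            continue
        chrom = line.split("\t", 1)[0]
        counts[chrom] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python list_chroms.py variants.vcf
    with open(sys.argv[1], encoding="utf-8") as handle:
        for chrom, n in count_records_per_chromosome(handle).items():
            print(f"{chrom}\t{n}")
```

If the VCF file is compressed, the file would have to be opened with gzip.open(path, "rt") instead of open.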
VCF Loading
Manually

The next step in setting up a multi-genome project is to select the list of VCF files to load. Click on the Multi Genome Project radio button at the bottom of the screen, then click on Select VCF.

To load VCF files, users must first fill the column lists and then select the appropriate data from each list. The VCF Loader appears after clicking on the Edit button from the welcome screen. The bottom left part of the VCF Loader contains the Column list edition section. Select a column and click on the Edit button to show the associated list. Click on the Add... label of the File column to select the VCF file to load, and select the VCF file downloaded earlier. Only one VCF file is loaded in this tutorial; it contains the differences between the reference genome NCBI36/hg18 and the reference genome GRCh37/hg19.


Group column

This tutorial compares reference genomes, so a generic group name such as Reference genome is suitable. In the Group name list editor, the user clicks on the plus button to show the input text box and fills it in (Figure 4).
The Group name list editor should look like Figure 5 below.
Once the value has been added to the list, it is saved by closing the Group name list editor window.
value: Reference genome


Genome column

The genome name is an alias for the selected raw name. In this tutorial, the genome name is going to be Hg18. In the Genome name list editor, the user clicks on the plus button to show the input text box and fills it in (Figure 6).
The Genome name list editor should look like Figure 7 below.
Once the value has been added to the list, it is saved by closing the Genome name list editor window.
value: Hg18


Type column

This list cannot be edited by the user. The provided VCF file contains Structural Variants, so the user has to choose SV (Figure 8).
value: SV

Figure 8: VCF type list
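
To confirm that a VCF file describes structural variants before picking SV, one option is to tally the SVTYPE key of the INFO column, which the VCF specification reserves for structural variant records. This is only a sketch under assumptions: whether the tutorial's file uses SVTYPE, and how GenPlay itself distinguishes variant types, is not stated here. The function name is illustrative.

```python
from collections import Counter
from typing import Iterable


def count_svtypes(lines: Iterable[str]) -> Counter:
    """Tally the SVTYPE value found in the INFO column of each VCF record."""
    counts: Counter = Counter()
    for line in lines:
        # Skip blank lines, metadata lines and the column header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        svtype = None
        # INFO is the 8th column: semicolon-separated KEY=VALUE entries.
        for entry in fields[7].split(";"):
            key, _, value = entry.partition("=")
            if key == "SVTYPE":
                svtype = value
                break
        counts[svtype if svtype else "(no SVTYPE)"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python svtypes.py variants.vcf
    with open(sys.argv[1], encoding="utf-8") as handle:
        for svtype, n in count_svtypes(handle).items():
            print(f"{svtype}\t{n}")
```

A file in which most records fall under "(no SVTYPE)" would suggest that another type is more appropriate.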


File column

Once the VCF file is downloaded, the user opens the File list editor, clicks on the plus button to show the file chooser dialog, and selects the VCF file from its location.
value: VCF path

Figure 9: VCF File editor


Raw name(s) column

The raw name list is filled automatically. In this tutorial there is only one genome: NCBI36 (Figure 10).
value: NCBI36

Figure 10: Raw name list

Again, the value is saved by closing the window.
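
The raw names presumably come from the sample columns of the VCF header; that is an assumption about GenPlay, not something this page states. In the VCF format, the line starting with #CHROM lists eight fixed columns, an optional FORMAT column, and then one column per sample. The sketch below reads those sample names; the function name is illustrative.

```python
from typing import Iterable, List


def sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample column names from the '#CHROM' header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            columns = line.rstrip("\r\n").split("\t")
            # Columns 0-7 are fixed, column 8 is FORMAT, samples start at 9.
            return columns[9:]
        if not line.startswith("#"):
            break  # data records reached without finding a header line
    return []


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python samples.py variants.vcf
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(sample_names(handle))
```

For the file used in this tutorial, the expectation would be a single name, NCBI36, matching the value shown in the raw name list.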


Import XML settings

To set up the project more easily, the user can import the settings from the XML file above. Be careful with the VCF path: it must be changed directly in the XML file before using the import function.
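
The path change can be made in any text editor. As an alternative, the sketch below replaces every occurrence of the old path in an XML document, whether it is stored as element text or as an attribute value. The layout of GenPlay's settings file is not shown on this page, so the code deliberately assumes no element or attribute names; the function name and command-line arguments are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def replace_path(xml_text: str, old_path: str, new_path: str) -> Tuple[str, int]:
    """Replace old_path with new_path in element texts and attribute values.

    Returns the updated XML text and the number of replacements made.
    """
    root = ET.fromstring(xml_text)
    replaced = 0
    for element in root.iter():
        # Case 1: the path is the text content of an element.
        if element.text is not None and element.text.strip() == old_path:
            element.text = new_path
            replaced += 1
        # Case 2: the path is the value of an attribute.
        for name, value in list(element.attrib.items()):
            if value == old_path:
                element.set(name, new_path)
                replaced += 1
    return ET.tostring(root, encoding="unicode"), replaced


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python fix_path.py settings.xml OLD_PATH NEW_PATH
    settings_file, old, new = sys.argv[1:4]
    with open(settings_file, encoding="utf-8") as handle:
        updated, count = replace_path(handle.read(), old, new)
    print(f"{count} occurrence(s) replaced")
    if count:
        with open(settings_file, "w", encoding="utf-8") as handle:
            handle.write(updated)
```

Note that ElementTree drops the XML declaration and any comments when the file is rewritten, so keeping a backup copy of the original settings file is advisable.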

Conclusion
Finally, the screen should look like the one in Figure 11.

Figure 11: VCF loader
Conclusion

The welcome screen should now look similar to Figure 12.

Figure 12: Welcome screen

The "Create" button creates the project and runs the synchronization.

GRCh37/hg19 genes loading

To load a file, the user right-clicks on the left part of the track and chooses "Load Gene Track". A file chooser appears to select the file given in this tutorial. Once the BED file has been chosen, a new selection box appears (Figure 13).

Figure 13: Genome selection dialog for GRCh37/hg19 genes file

This box asks which genome the BED file relates to. Here, the user has to choose the "Feb 2009 (GRCh37/hg19)" option because the BED file contains information about that genome. The gene file for the GRCh37/hg19 reference is then loaded.

NCBI36/hg18 genes loading

The operation is the same as loading a gene file for the GRCh37/hg19 reference genome. The only difference is to choose the "Reference genome - hg18 (NCBI36)" option after the BED file selection (Figure 14).

Figure 14: Genome selection dialog for NCBI36/hg18 genes file

Conclusion

The user can navigate through the different chromosomes and visualize the differences between both genomes using the stripes. All genes are synchronized and are displayed according to the meta-genome coordinates.

Figure 15 shows an example of the result of this tutorial. It is possible to see deletions (in red) and insertions (in green) in the NCBI36/hg18 reference genome compared to the GRCh37/hg19 reference genome.
Chromosome: chr1
Position: 143,822,670

Figure 15: GenPlay-MG (chr1:143,822,670)