Difference between revisions of "Documentation"

From GenPlay, Einstein Genome Analyzer

(Score Repartition Around Start)
Line 1: Line 1:
__FORCETOC__
+
== Getting started ==
 
+
=== Starting GenPlay ===
== Starting GenPlay ==
 
 
GenPlay is freely available at http://www.genplay.net/wiki/index.php/Web_Start. To start the software, click the button corresponding to the amount of memory that you wish to allocate to the Java virtual machine.
 
GenPlay is freely available at http://www.genplay.net/wiki/index.php/Web_Start. To start the software, click the button corresponding to the amount of memory that you wish to allocate to the Java virtual machine.
  
The amount of memory determines how many tracks you will be able to load simultaneously. The programming philosophy behind GenPlay is to provide fast performances once the data is loaded. To achieve that goal the entire genome need to be loaded in memory for multiple tracks at the same time. This results in high quality performance, but requires a lot of memory. The amount of memory needed per track depends on the genome, the track type, the window size, the data precision etc.
+
The amount of memory determines how many layers you will be able to load simultaneously. The programming philosophy behind GenPlay is to provide fast performance once the data is loaded. To achieve that goal, the entire genome needs to be loaded in memory for multiple layers at the same time. This gives fast operation but requires a lot of memory. The amount of memory needed per layer depends on the genome, the layer type, the window size, the data precision, etc.
  
You should generally choose as much memory as you can afford on your system (generally about 70% of the total RAM memory that exists on your system). For mammalian genomes we recommend allocating at least 4 GB of RAM although you should be able to load a couple of genome-wide tracks with 1GB or 1.5GB of RAM. Selecting analysis of only one chromosome at a time will drastically reduce the memory requirement and should allow you to load many tracks at very high resolutions. Tracks loaded in GenPlay can also be compressed as explained later in this documentation.
+
You should generally allocate as much memory as your system can spare (about 70% of the total RAM installed). For mammalian genomes we recommend allocating at least 4 GB of RAM, although you should be able to load a couple of genome-wide layers with 1 GB or 1.5 GB of RAM. Analyzing only one chromosome at a time drastically reduces the memory requirement and should allow you to load many layers at very high resolution. Layers loaded in GenPlay can also be compressed, as explained later in this documentation.
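The 70% guideline is simple arithmetic. The sketch below is only an illustration: the function name <code>recommended_heap_mb</code> is hypothetical and is not part of GenPlay.

```python
def recommended_heap_mb(total_ram_mb: int, fraction: float = 0.70) -> int:
    """Return the memory (in MB) to allocate, as a fraction of the total RAM.

    The default fraction of 0.70 follows the guideline in the text above.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    return int(total_ram_mb * fraction)


# A machine with 8 GB (8192 MB) of RAM
print(recommended_heap_mb(8192))
```

With this guideline, a machine with 8 GB of RAM comfortably exceeds the 4 GB recommended for mammalian genomes.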
  
 
The amount of RAM memory available to GenPlay is displayed in the lower right corner of the screen.
 
The amount of RAM memory available to GenPlay is displayed in the lower right corner of the screen.
<br/>
+
<br /><br /><br />
<br/>
+
=== The Welcome screen ===
<br/>
+
The welcome screen is the first screen of GenPlay-MG and allows users to create or load a project.
 +
 
 +
==== New Project ====
 +
In order to create a new project, users must give it a name as shown in Figure 1.
 +
[[image:mg_basics_project name.png|center|frame|Figure 1: Text field to define the project name]]
 +
 
 +
The second step is to choose a reference genome. Users select it from the lists for the clade, the genome and the assembly (Figure 2).
 +
[[image:mg_basics_assembly_chooser.png|center|frame|Figure 2: Assembly chooser]]
 +
 
 +
Several chromosomes are available for each assembly but users can choose to select only some of them.<br/>
 +
To open the chromosome chooser (Figure 3), users have to click on the tools button next to the assembly name.
 +
[[image:mg_basics_chromosome_chooser.png|center|frame|Figure 3: Chromosome chooser]]
 +
 
 +
The third and last step is to choose between a ''Simple Genome Project'' and a ''Multi Genome Project''. If the multi-genome project option is selected, the welcome screen should look like the one shown in Figure 4.
 +
[[image:mg_basics_empty_welcome_screen.png|center|frame|Figure 4: Empty welcome screen for multi-genome project]]
 +
===== Single Genome Project =====
 +
 
 +
===== Multi Genome Project =====
 +
====== Introduction ======
 +
====== VCF Files ======
 +
VCF files describe differences between genomes, usually between one or several genomes of interest and the reference genome used for the mapping process. VCF files define multiple types of variation; GenPlay is able to read and represent the following:
 +
* InDels
 +
* SNPs
 +
* SV (Structural Variation)
 +
 
 +
A complete description of VCF files is given on the 1000 genomes project website:<br/>
 +
[http://www.1000genomes.org/wiki/analysis/variant-call-format/vcf-variant-call-format-version-42 Variant Call Format specification]<br/>
 +
 
 +
====== Tabix ======
 +
: 1. Introduction
 +
VCF files contain a lot of information, which makes scanning (loading) them slow.<br/>
 +
In order to increase the scanning efficiency, VCF files have to be compressed and indexed. The compression is done using BGZip and the indexing with Tabix.<br/>
 +
[http://samtools.sourceforge.net/tabix.shtml Tabix manual reference pages]<br/>
 +
[http://sourceforge.net/projects/samtools/files/tabix/ Tabix download]
 +
 
 +
: 2. VCF files indexing methods
 +
:: 2.1. Using GenPlay
 +
GenPlay is now able to compress and index VCF files using the VCF Loader.<br/>
 +
The way the VCF Loader works is explained below. When you want to select the compressed file (.vcf.gz), simply select the VCF file (.vcf) instead. You may need to change the file extension filter in the file chooser in order to see .vcf files.<br/>
 +
GenPlay will then look for compressed and indexed files at the same location. If none are found, it will offer to compress and index the selected VCF file (Figure 1).
 +
[[image:mg_vcf_loader_compress_index.png|center|frame|Figure 1: VCF Loader compress/index]]
 +
 
 +
The process is fully automatic and platform independent (it works on Windows, Linux and Mac).
 +
 
 +
:: 2.2. Manually
 +
First, please note the following process must be performed in either Linux or Mac environments.<br/>
 +
Each VCF file must first be compressed to the BGZF format (bgzip produces a .gz file). Tabix provides a tool to perform the compression.
 +
After compression, VCF files must be indexed using the associated command.
 +
Once Tabix is installed, two commands are needed to compress and index a file.
 +
<br/><br/>
 +
 
 +
Available commands from the Tabix folder:<br/>
 +
''bgzip -f VCF_PATH;''<br/>
 +
''tabix -p vcf VCF_PATH.gz;''
 +
<br/><br/>
 +
For example, a VCF file named my_vcf.vcf located in the same folder as Tabix can be indexed with the following commands (Figure 2):<br/>
 +
''bgzip -f ./my_vcf.vcf;''<br/>
 +
''tabix -p vcf ./my_vcf.vcf.gz;''
 +
[[image:mg_basics_indexation_commands.png|center|frame|Figure 2: VCF file indexation command]]
 +
<br/><br/>
 +
'''Note:''' the first command '''replaces''' the current VCF file with the compressed VCF file (.vcf.gz). The second command '''creates''' the index file (.vcf.gz.tbi) in the same folder.<br/>
 +
More options are available on [http://samtools.sourceforge.net/tabix.shtml Tabix manual reference pages].
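The two manual steps can also be scripted. The sketch below assumes that the <code>bgzip</code> and <code>tabix</code> executables are installed and on the PATH; the function names are hypothetical and are not part of GenPlay or Tabix.

```python
import subprocess
from pathlib import Path
from typing import List


def build_index_commands(vcf_path: str) -> List[List[str]]:
    """Return the bgzip and tabix command lines for one VCF file."""
    vcf = Path(vcf_path)
    # bgzip -f replaces my_vcf.vcf with my_vcf.vcf.gz
    compressed = vcf.with_name(vcf.name + ".gz")
    return [
        ["bgzip", "-f", str(vcf)],
        ["tabix", "-p", "vcf", str(compressed)],
    ]


def compress_and_index(vcf_path: str) -> None:
    """Run both commands in order, stopping if either one fails."""
    for command in build_index_commands(vcf_path):
        subprocess.run(command, check=True)


for command in build_index_commands("my_vcf.vcf"):
    print(" ".join(command))
```

Building the command lists separately from running them makes it possible to inspect the commands before anything touches the original file, which matters because the compression step replaces it.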
 +
 
 +
====== The VCF Loader ======
 +
: 1. Introduction
 +
The VCF Loader is the most important part of multi-genome project settings. It allows users to load all necessary VCF files and to define how to extract information from them. It appears when users click on the "Edit" button from the welcome screen.<br/>
 +
Figure 3 shows an empty VCF Loader screen.
 +
[[image:Mg_welcome_screen_vcf_loader.png|center|frame|Figure 3: VCF loader]]
 +
 
 +
GenPlay-MG does not use the VCF file directly; it uses a compressed version of it (.gz). GenPlay-MG also needs the compressed VCF file to be indexed with Tabix. The compressed file and its index must be in the '''same folder''' and must have the '''same name'''; only the file extensions differ (.gz and .tbi).
 +
In order to use GenPlay to generate additional files, please refer to the [[#Tabix|section above]].
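The same-folder, same-name requirement can be checked before loading. This is a minimal sketch with hypothetical helper names; it only tests that the files exist and does not validate their content.

```python
from pathlib import Path
from typing import Tuple


def companion_files(vcf_path) -> Tuple[Path, Path]:
    """Return the expected .gz and .gz.tbi paths for a .vcf or .vcf.gz file."""
    vcf = Path(vcf_path)
    compressed = vcf if vcf.suffix == ".gz" else vcf.with_name(vcf.name + ".gz")
    index = compressed.with_name(compressed.name + ".tbi")
    return compressed, index


def is_ready_for_genplay(vcf_path) -> bool:
    """True when both the compressed file and its index are in the same folder."""
    compressed, index = companion_files(vcf_path)
    return compressed.is_file() and index.is_file()


print(companion_files("my_vcf.vcf"))
```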
 +
 
 +
The user can add or remove rows by right clicking on the table.
 +
 
 +
: 2. Columns description
 +
'''''File'''''<br/>
 +
This column refers to the VCF file path. Once loaded, the raw name column is automatically filled with every raw genome name contained in the selected VCF file.<br/>
 +
'''''Raw name'''''<br/>
 +
The ''Raw name'' column list is automatically filled when a VCF file has been chosen. The list contains every genotype header found in the selected VCF file. Because genome names might be difficult to remember, GenPlay-MG offers users the option of adding another name (an alias) using the ''Genome'' column.<br/>
 +
'''''Nickname'''''<br/>
 +
The ''Nickname'' column allows users to associate an alias with the selected genome. This alias will appear in GenPlay-MG and can be useful because genome names in VCF files are often non-descriptive numbers that can be hard to remember.<br/>
 +
'''''Group'''''<br/>
 +
Users can gather genomes into groups. Group names are used to distinguish genomes and to perform some specific functions.<br/>
 +
 
 +
: 3. Column editing<br/>
 +
The ''Group'', ''Nickname'' and ''File'' columns each have their own editable list. To edit a cell, click on it, hover over the item you want to edit and choose one of the following actions:<br/>
 +
- Add (green symbol on empty item)<br/>
 +
- Edit (pen symbol on an item)<br/>
 +
- Delete (red symbol on an item)<br/>
 +
 
 +
That way, users can set up all columns before (or while) filling the table.<br/>
 +
 
 +
'''Note: ''' The ''Raw name(s)'' column is automatically filled with the genome names from the selected VCF file; that column cannot be edited manually.
 +
 
 +
====== Import/Export ======
 +
Once a project has been set up, it can be saved using the import/export function. Pressing the export button saves an XML file to the hard drive. This XML file can then be imported to reload the project.
 +
 
 +
The XML file structure is simple. Each row is stored in a ''row'' element whose attributes include ''group'', ''genome'', ''file'' and ''raw_name''. The settings file is formatted as shown in Figure 4.
 +
[[image:mg_basics_xml_settings.png|center|frame|Figure 4: XML file settings]]
 +
 
 +
'''Note:''' If the user moves a VCF file or changes one of its genotype headers, the XML file will no longer work. The user has to update the ''file'' and/or ''raw_name'' attribute values accordingly.<br/>
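Updating the ''file'' attribute after moving a VCF file can be scripted. The sketch below is hypothetical: only the ''row'' element and its four attribute names come from the description above, while the root element name, the sample values and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Assumed layout: the root element name "settings" and all values are invented.
EXAMPLE_SETTINGS = """
<settings>
  <row group="Family 1" genome="Father" file="/data/family1.vcf.gz" raw_name="NA00001"/>
  <row group="Family 1" genome="Mother" file="/data/family1.vcf.gz" raw_name="NA00002"/>
</settings>
"""


def read_rows(xml_text: str) -> List[Dict[str, str]]:
    """Return the attributes of every row element as a list of dictionaries."""
    root = ET.fromstring(xml_text.strip())
    return [dict(row.attrib) for row in root.iter("row")]


def relocate_vcf(xml_text: str, old_path: str, new_path: str) -> str:
    """Return the settings with every 'file' equal to old_path set to new_path."""
    root = ET.fromstring(xml_text.strip())
    for row in root.iter("row"):
        if row.get("file") == old_path:
            row.set("file", new_path)
    return ET.tostring(root, encoding="unicode")


updated = relocate_vcf(EXAMPLE_SETTINGS, "/data/family1.vcf.gz", "/archive/family1.vcf.gz")
for row in read_rows(updated):
    print(row["genome"], row["file"])
```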
  
 +
==== Load Project ====
 +
Documentation in progress...
 +
<br /><br /><br /><br />
 
== GUI Overview ==
 
== GUI Overview ==
 
[[image:gui_overview.png|right|thumb|200px|GUI Overview 1.Ruler 2.Track List 3.Control Panel 4.Status Bar]]
 
[[image:gui_overview.png|right|thumb|200px|GUI Overview 1.Ruler 2.Track List 3.Control Panel 4.Status Bar]]
Line 21: Line 123:
 
# Control Panel
 
# Control Panel
 
# Status Bar
 
# Status Bar
<br style="clear: both" />
+
<br /><br /><br />
 
 
 
=== Ruler ===
 
=== Ruler ===
 
The ruler shows the coordinates of the current displayed position.  
 
The ruler shows the coordinates of the current displayed position.  
  
 
[[image:ruler.png|left|thumb|500px|Ruler 1.Option Button 2.Absolute Positions 3.Relative Positions]]
 
[[image:ruler.png|left|thumb|500px|Ruler 1.Option Button 2.Absolute Positions 3.Relative Positions]]
<br style="clear: both" />
 
  
 
==== General Option Button ====
 
==== General Option Button ====
Line 43: Line 143:
 
==== Relative Position ====
 
==== Relative Position ====
 
The numbers written in black on the second line represent the distance from the middle in base pairs.
 
The numbers written in black on the second line represent the distance from the middle in base pairs.
 
+
<br /><br /><br />
 
=== Track List ===
 
=== Track List ===
The track list is the cornerstone of the GUI.  From here you can load tracks and execute operations.
+
The track list is the cornerstone of the GUI.  From here you can load layers and execute operations.
  
 
The tracks are divided into two parts.   
 
The tracks are divided into two parts.   
  
On the left, there is the track handler that becomes highlighted when the mouse is over it. By right clicking on the track handler, a contextual menu appears with all the operations that can be executed on the track.
+
On the left, there is the track handler that becomes highlighted when the mouse is over it. By right clicking on the track handler, a contextual menu appears with all the operations that can be executed on the track and its layer(s).
  
 
On the right, the data can be visualized.
 
On the right, the data can be visualized.
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
 
 
=== Control Panel ===
 
=== Control Panel ===
 
[[image:control_panel.png|right|thumb|500px|Control Panel 1.Position Bar 2.Zoom Bar 3.Chromosome Box 4.Position Text Field]]
 
[[image:control_panel.png|right|thumb|500px|Control Panel 1.Position Bar 2.Zoom Bar 3.Chromosome Box 4.Position Text Field]]
Line 63: Line 160:
 
# Chromosome Box: set the selected chromosome with the chromosome box  
 
# Chromosome Box: set the selected chromosome with the chromosome box  
 
# Position Text Field: the position text field follows the format of the UCSC Genome Browser position field so it is easy to copy and paste the position from one browser to the other
 
# Position Text Field: the position text field follows the format of the UCSC Genome Browser position field so it is easy to copy and paste the position from one browser to the other
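A position written in the UCSC Genome Browser style (for example <code>chr1:1,000,000-2,000,000</code>) can be parsed as sketched below. This is an illustrative helper, not GenPlay's own parser; it handles only the chromosome:start-stop form, and whether GenPlay accepts other forms is not covered here.

```python
import re
from typing import Tuple

# chromosome name, then start and stop positions that may contain commas
_POSITION = re.compile(r"^\s*(\w+)\s*:\s*([\d,]+)\s*-\s*([\d,]+)\s*$")


def parse_position(text: str) -> Tuple[str, int, int]:
    """Split 'chr1:1,000-2,000' into ('chr1', 1000, 2000)."""
    match = _POSITION.match(text)
    if match is None:
        raise ValueError("not a chromosome:start-stop position: %r" % text)
    chromosome, start, stop = match.groups()
    start_bp = int(start.replace(",", ""))
    stop_bp = int(stop.replace(",", ""))
    if start_bp > stop_bp:
        raise ValueError("start must not be greater than stop")
    return chromosome, start_bp, stop_bp


print(parse_position("chr1:1,000,000-2,000,000"))
```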
 
+
<br /><br /><br />
 
=== Status Bar ===
 
=== Status Bar ===
 
[[image:status_bar.png|right|thumb|500px|Status Bar 1.Progress Bar 2.Stop Button 3.Operation Description 4.Memory Bar]]
 
[[image:status_bar.png|right|thumb|500px|Status Bar 1.Progress Bar 2.Stop Button 3.Operation Description 4.Memory Bar]]
Line 70: Line 167:
 
# Stop button, allows users to stop the current operation. If the button is not bright red the operation can't be stopped
 
# Stop button, allows users to stop the current operation. If the button is not bright red the operation can't be stopped
 
# Operation description, displays a short text describing the current operation as well as the elapsed time from the beginning of the operation
 
# Operation description, displays a short text describing the current operation as well as the elapsed time from the beginning of the operation
# Memory bar, shows the amount of memory used and the amount of memory available. Make sure that you have enough memory before starting a new operation. You can delete or compress tracks to free up memory.
+
# Memory bar, shows the amount of memory used and the amount of memory available. Make sure that you have enough memory before starting a new operation. You can delete or compress layers to free up memory.
<br/><br/><br/>
+
<br /><br /><br /><br />
 
 
 
== Browsing the Genome ==
 
== Browsing the Genome ==
<br/>
 
<br/>
 
<br/>
 
 
=== Changing the Position ===
 
=== Changing the Position ===
 
You can change the position of the displayed window by:
 
You can change the position of the displayed window by:
Line 85: Line 178:
 
# Using the keyboard left and right arrows
 
# Using the keyboard left and right arrows
 
# Double-clicking on a track where you want to center the view
 
# Double-clicking on a track where you want to center the view
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
=== Changing the Chromosomes ===
 
=== Changing the Chromosomes ===
 
You can switch the selected chromosome by:
 
You can switch the selected chromosome by:
 
# Changing the selection in the chromosome box on the control panel
 
# Changing the selection in the chromosome box on the control panel
 
# Changing the text of the position text field on the control panel
 
# Changing the text of the position text field on the control panel
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
=== Changing the Zoom ===
 
=== Changing the Zoom ===
 
The level of the zoom can be modified by:
 
The level of the zoom can be modified by:
Line 100: Line 189:
 
# Using the zoom bar on the control panel
 
# Using the zoom bar on the control panel
 
# Changing the text of the position text field on the control panel
 
# Changing the text of the position text field on the control panel
<br/>
+
<br /><br /><br /><br />
<br/>
+
== Loading a Layer ==
<br/>
+
=== Introduction ===
 +
Layers display information from files and can represent it in different ways.<br />
 +
A layer is created within a track; each track can contain one or several layers.<br />
 +
To load a layer in a track, right click on its handler (the blue part on the left of the track). This opens a contextual menu with the different actions available on the track.
 +
The menu of a track that has no layers looks like the one in Figure 1.<br />
 +
Clicking "Add Layer" opens a dialog for selecting one of the layer types that GenPlay offers (Figure 2).<br />
 +
Examples of layers that can be loaded in GenPlay are available for download from the GenPlay Library accessible from the GenPlay.net website.
  
== Loading a Track ==
+
<gallery widths=350px perrow=2>
To load a track in any row, right click on the handler of any empty track (the blue part on the left of the track). This opens a menu including options to load the various types of tracks that exist in GenPlay.
+
image:add_layer.png|Figure 1: Track Contextual Menu
[[image:load_track.png|center|thumb|100px|Loading a Track]]
+
image:layer_type.png|Figure 2: Layer Types
Examples of tracks that can be loaded in GenPlay are available for download from the GenPlay Library accessible from the GenPlay.net website.
+
</gallery>
 +
<br /><br /><br />
 +
=== Loading a Sequencing/Microarray Layer ===
 +
The Sequencing/Microarray layer allows the visualization of windows of variable or fixed size, each with an associated score.
 +
Select the “Sequencing/Microarray Layer” option. This opens up a file chooser dialog box. Load the file of your choice from the list of available window files and click the open button.
  
=== Loading a Variable Window Track ===
+
Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a sequencing/microarray layer.
[[image:load_vwt1.png|right|thumb|100px|File Chooser]]
 
Variable window tracks allow the visualization of windows of variable sizes with a score associated to these windows.
 
  
Select the “Load Variable Window Track” option. This opens up a file chooser dialog box. Load the file of your choice from the list of available fixed window files and click the open button.
+
This opens a new dialog to set different parameters for the new layer (as shown in the figure below). The dialog is divided into six sections, detailed below.
 +
[[image:Add_layer_seq_micro.png|right|thumb|300px|New Layer Settings Dialog]]
  
Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a variable window track.
+
==== Layer Name ====
<br style="clear: both" />
+
Gives a name to the layer.
[[image:load_vwt2.png|left|thumb|100px|Chromosome Selection]]
 
  
==== Chromosome Selection ====
+
==== Bin ====
After selecting your file, a new window will appear and ask which chromosome to extract. By default all the chromosomes of the project are selected. If you want to change this selection, click on the "modify selection" button and uncheck the undesired chromosomes. Working on fewer chromosomes will save memory and loading time.
+
By default, the windows generated in a sequencing/microarray layer have a variable size, which represents the content of the file very precisely.<br />
 +
For some purposes, users may prefer windows of a fixed size (bins). Fixed-size windows are useful to represent the results of many types of experiments including, but not limited to: ChIP-seq, RNA-seq, and TimEX-seq.  Files containing the results of alignments (SAM, bowtie, Eland) and files containing already created bin lists (bed, bgr, etc.) can be loaded using this option. In the case of alignment files, bin lists will be created on the fly as described below. Files containing the results of micro-array experiments can also be loaded as long as they are in one of the accepted formats.
 +
Binning lowers the resolution but usually uses less memory.<br />
 +
This is implemented here by enabling the "Bin Data" option. The "Bin Size" field will then be available in order to give the size of the windows in base pairs.<br />
  
'''Important Note:''' GenPlay can accelerate the loading if you know that your file is sorted by chromosome. If you press Yes when GenPlay asks you if the file is sorted when your file is actually not sorted, the file may load incompletely, leading to a loss of valuable information.  The chromosomes must be ordered the same way it is ordered in the chromosome selection combo-box.
+
'''Important Note:''' A bin size of 1 bp will use a lot of memory. Depending on the experiment, it may be more efficient to disable the bin data option and stay in variable window size mode.
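The effect of the "Bin Data" option can be sketched as follows. This is a simplified illustration with a hypothetical function name: each window is assigned to the bin that contains its start position, which is an assumption, and GenPlay's own handling of windows that span several bins may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Window = Tuple[int, int, float]  # (start, stop, score)


def bin_scores(windows: Iterable[Window], bin_size: int, method: str = "addition") -> Dict[int, float]:
    """Group windows into fixed-size bins and combine their scores.

    Returns a dictionary mapping each bin start position to its score.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    combine = {
        "addition": sum,
        "average": lambda scores: sum(scores) / len(scores),
        "maximum": max,
        "minimum": min,
    }[method]
    per_bin = defaultdict(list)
    for start, _stop, score in windows:
        # Simplification: the bin is chosen from the start position only
        per_bin[start // bin_size].append(score)
    return {index * bin_size: combine(scores) for index, scores in sorted(per_bin.items())}


reads = [(10, 46, 1), (50, 86, 1), (120, 156, 1)]
print(bin_scores(reads, bin_size=100))
```

With a score of one per read and the addition method, the score of each bin is simply the number of reads that start in it.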
<br style="clear: both" />
 
  
 
==== Score Calculation ====
 
==== Score Calculation ====
[[image:load_vwt3.png|right|thumb|100px|Name and Score Calculation]]
+
[[image:score_calculation_methods.png|right|thumb|100px|Name and Score Calculation]]
Once the chromosome selection is done, a final window will pop-up and ask you to name the track. The default name is the loaded file name.  If there are overlapping windows in your data file, you will also be prompted to select a method for calculating the score of the windows. Overlapping windows will be split into smaller windows using a simple algorithm.
+
It can happen that files contain overlapping windows. In this case, GenPlay splits them into smaller windows using a simple algorithm.<br />
<br style="clear: both" />
+
The method used to score the resulting windows is chosen in this section from the following options:<br />
 +
* Addition
 +
* Average
 +
* Maximum
 +
* Minimum
 +
 
 +
Some examples are shown in the sections below for both [[#For non bined layer|non-binned]] and [[#For bined layer|binned]] layers.
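One plausible way to split overlapping windows is sketched below. It is not taken from the GenPlay source: the approach (cutting at every window boundary and combining the scores of the windows that cover each piece) and the function name are assumptions, and windows are treated as half-open intervals.

```python
from typing import List, Tuple

Window = Tuple[int, int, float]  # (start, stop, score), stop is exclusive


def split_overlaps(windows: List[Window], method: str = "addition") -> List[Window]:
    """Cut windows at every boundary and score each piece from the windows covering it."""
    combine = {
        "addition": sum,
        "average": lambda scores: sum(scores) / len(scores),
        "maximum": max,
        "minimum": min,
    }[method]
    # Every start and stop position is a potential cut point
    breakpoints = sorted({position for start, stop, _ in windows for position in (start, stop)})
    result = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        covering = [score for start, stop, score in windows if start <= left and right <= stop]
        if covering:  # skip gaps that no window covers
            result.append((left, right, combine(covering)))
    return result


overlapping = [(0, 100, 10), (50, 150, 20)]
for method in ("addition", "average", "maximum", "minimum"):
    print(method, split_overlaps(overlapping, method))
```

In this sketch, two windows that overlap between positions 50 and 100 produce three pieces, and only the middle piece depends on the chosen method.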
 +
 
 +
==== Strand ====
 +
If your input file contains information regarding the strands, you'll be able to choose to load the data from either both or only one strand. 
 +
 
 +
You can also decide to shift the reads from both strands as shown in the figure on the left. To shift the strands just put a value in the "Shift" input box.
 +
 
 +
The value you entered is going to be added to the position of the data on the 5' strand and subtracted from the ones on the 3' strand.
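The shift rule described above can be written out as follows. This is a minimal sketch with a hypothetical function name; the strand labels are written here as "5'" and "3'" to match the text, which is an assumption about how an input file would encode them.

```python
from typing import Tuple


def shift_read(start: int, stop: int, strand: str, shift: int) -> Tuple[int, int]:
    """Add the shift to reads on the 5' strand and subtract it on the 3' strand."""
    if strand == "5'":
        offset = shift
    elif strand == "3'":
        offset = -shift
    else:
        raise ValueError("strand must be \"5'\" or \"3'\"")
    return start + offset, stop + offset


print(shift_read(100, 136, "5'", 50))
print(shift_read(100, 136, "3'", 50))
```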
 +
 
 +
==== Fragment Length ====
 +
==== Selected Chromosomes ====
 +
By default all the chromosomes of the project are selected. If you want to change this selection, click on the "modify selection" button and uncheck the undesired chromosomes. Working on fewer chromosomes will save memory and loading time.
 +
 
 +
'''Important Note:''' GenPlay can accelerate the loading if you know that your file is sorted by chromosome. If you press Yes when GenPlay asks whether the file is sorted and it is actually not sorted, the file may load incompletely, leading to a loss of valuable information. The chromosomes must be ordered the same way as they are in the chromosome selection combo-box.
  
 
==== Examples of Score Calculations ====
 
==== Examples of Score Calculations ====
 
+
===== For non bined layer =====
<br/>
+
====== Example 1 ======
<br/>
 
<br/>
 
===== Example 1 =====
 
 
''' Input file '''
 
''' Input file '''
 
{|  cellpadding="4" cellspacing="0" border="1"
 
{|  cellpadding="4" cellspacing="0" border="1"
Line 188: Line 303:
 
| 1
 
| 1
 
|}
 
|}
 +
  
 
''' Result '''
 
''' Result '''
 +
[[image:loadVWT_ex1.png|center|frame|Loading of an alignment file as a variable window layer]]
  
  
[[image:loadVWT_ex1.png|center|frame|Loading of an alignment file as a variable window track]]
+
----
 
 
  
----
 
  
===== Example 2 =====
+
====== Example 2 ======
 
{|  cellpadding="4" cellspacing="0" border="1"
 
{|  cellpadding="4" cellspacing="0" border="1"
 
! Chr  
 
! Chr  
Line 221: Line 336:
  
  
[[image:loadVWT_ex2.png|center|thumb|400px|Loading of an interval file as a variable window track]]
+
[[image:loadVWT_ex2.png|center|thumb|400px|Loading of an interval file as a variable window layer]]
  
  
Line 261: Line 376:
 
| 100
 
| 100
 
|}
 
|}
<br/>
 
<br/>
 
<br/>
 
  
=== Loading Fixed Window Tracks ===
 
[[image:load_fwt1.png|left|thumb|100px|File Chooser]]
 
Fixed window tracks display bin lists.  They are useful to represent the results of many types of experiments including, but not limited to: CHIP-seq, RNA seq, and TimEX-seq.  Files containing the results of alignments (SAM, bowtie, Eland) and files containing already created bin lists (bed, bgr, etc.) can be loaded using this option. In the case of alignment files, bin lists will be created on the fly as described below. Files containing the results of micro-array experiments can also be loaded as long as they are in one of the accepted formats.
 
  
By right clicking on an empty track handler, the contextual menu will pop up.  Select the “Load Fixed Window Track” option. This opens up a file chooser dialog box as shown in the figure on the left.
+
===== For bined layer =====
 
+
====== Example 1 ======
Load the track of your choice from the list of files and click the open button. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a fixed window track.
+
Loading of an alignment file as a fixed window layer with a window size of 100:  
<br/>
 
<br/>
 
<br/>
 
==== Track Name ====
 
[[image:load_fwt2.png|Right|thumb|200px|Fixed Window Track Options]]
 
The default track name will be the file name. The name of the track can be changed later after the track is loaded.
 
<br/>
 
<br/>
 
<br/>
 
==== Window Size ====
 
This specifies the size of the genomic windows (bins) in base pair (bp) for the track that will be created to summarize the results.
 
<br/>
 
<br/>
 
<br/>
 
==== Score Calculation ====
 
This option allows you to choose how the scores of the bins are calculated. You may choose between three options: average, maximum or sum. The algorithm of the score calculation is explained below.
 
<br/>
 
<br/>
 
<br/>
 
==== Strand Selection ====
 
[[image:strand_shifting.png|left|thumb|100px|Strand Shifting]]
 
If your input file contains information regarding the strands, you'll be able to choose to load the data from either both or only one strand. 
 
 
 
You can also decide to shift the reads from both strands as shown in the figure on the left. To shift the strands just put a value in the "Shift" input box.
 
 
 
The value you entered is going to be added to the position of the data on the 5' strand and subtracted from the ones on the 3' strand.
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Data Precision ====
 
Because GenPlay requires a lot of RAM memory, we provide the option of changing the precision at which the score for each bin is stored.
 
* Scores in 64 bit are stored in floating value double precision (which can represent extremely large numbers unlikely to be useful for genomic experiments).
 
* Scores in 32 bit are stored in floating value single precision (which can also represent very large numbers).
 
* Scores stored in 16 bits can range between -3276.8 and +3276.7 (with one decimal place).
 
* Scores stored in 8 bits can range between 0 and 255 (with no decimal).
 
* Score in 1 bit can be equal to zero or 1 (useful to create masks for instance).
 
 
 
We recommend storing scores in 32 or 16 bits.
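The effect of the precision on memory can be estimated with simple arithmetic. The sketch below is a rough lower bound under stated assumptions: it counts only the raw score storage (one score per bin), ignores any overhead of the Java virtual machine, and uses a hypothetical function name and an illustrative genome length.

```python
def estimate_layer_bytes(genome_length_bp: int, bin_size_bp: int, bits_per_score: int) -> int:
    """Estimate the raw storage, in bytes, for one score per bin."""
    if bits_per_score not in (1, 8, 16, 32, 64):
        raise ValueError("bits_per_score must be 1, 8, 16, 32 or 64")
    if genome_length_bp <= 0 or bin_size_bp <= 0:
        raise ValueError("lengths must be positive")
    bins = -(-genome_length_bp // bin_size_bp)  # ceiling division
    return -(-bins * bits_per_score // 8)       # round up to whole bytes


# Illustrative genome of 3,000,000,000 bp loaded with a 100 bp window size
for bits in (64, 32, 16, 8, 1):
    print(bits, "bit:", estimate_layer_bytes(3_000_000_000, 100, bits), "bytes")
```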
 
<br/>
 
<br/>
 
<br/>
 
==== Chromosome Selection ====
 
You can load either the whole genome or only specific chromosomes (which saves time and memory).
 
 
 
'''Important Note:''' When specific chromosomes are selected, you will be asked whether your file is sorted by chromosome. If you answer that your file is sorted by chromosome when it actually is not, your file may load incompletely, leading to a loss of valuable information. The chromosomes must be ordered the same way as they are in the chromosome selection combo-box.
 
 
 
When the OK button is clicked, the track is loaded in the location desired.
 
<br/>
 
<br/>
 
<br/>
 
 
 
==== Examples of Score Calculations ====
 
<br/>
 
<br/>
 
<br/>
 
===== Example 1 =====
 
Loading of an alignment file as a fixed window track with a window size of 100:  
 
  
 
(each line represents one read position, score is always one)
 
(each line represents one read position, score is always one)
Line 387: Line 438:
  
  
[[image:loadFWT_ex1.png|center|frame|Loading of an alignment file as a fixed window track with a window size of 100]]
+
[[image:loadFWT_ex1.png|center|frame|Loading of an alignment file as a fixed window layer with a window size of 100]]
  
  
Line 423: Line 474:
  
 
----
 
----
<br/>
+
 
<br/>
+
 
<br/>
+
====== Example 2 ======
===== Example 2 =====
+
Loading of an alignment file as a fixed window layer with a window size of 100:  
Loading of an alignment file as a fixed window track with a window size of 100:  
 
  
 
(each line represents one read position, score varies)
 
(each line represents one read position, score varies)
 
  
 
''' Input file '''
 
''' Input file '''
Line 486: Line 535:
  
  
[[image:loadFWT_ex2.png|center|frame|Loading of an alignment file as a fixed window track with a window size of 100]]
+
[[image:loadFWT_ex2.png|center|frame|Loading of an alignment file as a fixed window layer with a window size of 100]]
  
  
Line 522: Line 571:
  
 
----
 
----
<br/>
 
<br/>
 
<br/>
 
  
===== Example 3 =====
+
 
Loading of an interval file as a fixed window track with a window size of 100:
+
====== Example 3 ======
 +
Loading of an interval file as a fixed window layer with a window size of 100:
  
 
''' Input file '''
 
''' Input file '''
Line 553: Line 600:
  
  
[[image:loadFWT_ex3.png|center|frame|Loading of an interval file as a fixed window track with a window size of 100]]
+
[[image:loadFWT_ex3.png|center|frame|Loading of an interval file as a fixed window layer with a window size of 100]]
  
  
Line 593: Line 640:
 
| 14.70
 
| 14.70
 
|}
 
|}
 
+
<br /><br /><br />
<br/>
+
=== Loading a Gene Annotation Layer ===
<br/>
+
[[image:gene_track.png|left|thumb||A Gene Layer]]
<br/>
 
 
 
=== Loading a Gene Track ===
 
[[image:gene_track.png|left|thumb||A Gene Track]]
 
 
[[image:score_color.png|right|thumb|40px|Score Color]]
 
[[image:score_color.png|right|thumb|40px|Score Color]]
After right clicking on the empty track handler, select the “Load Gene Track” option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a gene track.
+
Select the “Gene Layer” option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a gene layer.
  
Once it's done, just wait until the loading is complete and the gene track will appear in the track you selected.  
+
Once it's done, just wait until the loading is complete and the gene layer will appear in the track you selected.  
  
 
Note that the genes on the plus strand are in red and the genes on the minus strand are in blue. If the file contains expression values, the exons are color coded to represent the expression (red = high, blue = low, as shown on the right).
 
Note that the genes on the plus strand are in red and the genes on the minus strand are in blue. If the file contains expression values, the exons are color coded to represent the expression (red = high, blue = low, as shown on the right).
<br style="clear: both" />
+
<br /><br /><br />
 +
=== Loading a Repeat Family Layer ===
 +
Select the "Repeat Layer" option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a repeat layer.
  
=== Loading a sequence track ===
+
This layer type displays repeats organized by family or class.
After right clicking on the empty track handler, select the “Load Sequence Track” option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a sequence track.
+
<br /><br /><br />
[[image:sequence_track.png|center|thumb|300px|A Sequence Track]]
+
=== Loading a DNA Sequence Layer ===
Sequence tracks show DNA sequences from .2bit files.  
+
Select the “DNA Sequence Layer” option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a sequence layer.
 +
[[image:sequence_track.png|center|thumb|300px|A Sequence Layer]]
 +
Sequence layers show DNA sequences from .2bit files.  
  
 
The hg18, hg19, mm8 and mm9 sequence files can be downloaded from the [http://129.98.70.162/wiki/index.php/Library library] of GenPlay.
 
The hg18, hg19, mm8 and mm9 sequence files can be downloaded from the [http://129.98.70.162/wiki/index.php/Library library] of GenPlay.
<br/>
+
<br /><br /><br />
<br/>
+
=== Loading a Mask Layer ===
<br/>
+
[[image:Stripes.png|right|thumb|200px|CPG Islands Shown As Stripes On a Refseq Gene Layer]]
 
+
Select the "Mask Layer" option. The stripes acting as masks can be useful to show regions of interest such as CpG Islands or repeat regions.
=== Loading a SNP Track ===
 
First, select the “Load SNP Track” option on the track contextual menu. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a SNP track.
 
 
 
A SNP track shows the Single-Nucleotide Polymorphisms.
 
<br/>
 
<br/>
 
<br/>
 
 
 
=== Loading a Repeat Track ===
 
Select the “Load Repeat Track” option on the track contextual menu. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the [[#File formats|File formats]] section if you want to know what kind of file can be loaded as a repeat track.
 
 
 
This track type displays repeats organized by family or class.
 
<br/>
 
<br/>
 
<br/>
 
  
 +
Refer to the [[#File Formats|File Formats]] section to see what kinds of files can be loaded as stripes.
 +
<br /><br /><br />
 +
=== Loading a Variant Layer ===
 +
[[image:Mg_add_layer_variant_selection.png|right|thumb|200px|Add a Variant Layer]]
 +
Select the "Variant Layer" option, which is only available in multi-genome projects. A dialog opens in which you select the sample to load and the variation(s) to display.
 +
A variant layer corresponds to a single sample. You can change the color of each variation independently by clicking on the colored square next to the variation checkbox.
 +
<br /><br /><br />
 
=== Loading Data From a DAS Server ===
 
=== Loading Data From a DAS Server ===
 
The distributed annotation system (DAS) is a client-server system in which a client can retrieve data from one or multiple servers. GenPlay can connect to any server that follows the DAS/1 protocol as specified by [http://www.biodas.org/wiki/DAS/1 BioDAS]
 
The distributed annotation system (DAS) is a client-server system in which a client can retrieve data from one or multiple servers. GenPlay can connect to any server that follows the DAS/1 protocol as specified by [http://www.biodas.org/wiki/DAS/1 BioDAS]
  
 
[[image:DAS_dialog.png|left|thumb||DAS Dialog]]
 
[[image:DAS_dialog.png|left|thumb||DAS Dialog]]
The “Load from DAS Server” option from the track contextual menu will show the DAS Dialog.
+
The “Add Layer from DAS Server” option from the track handler menu will show the DAS Dialog.
  
 
Select the server from which you want to retrieve the data in the "Server" box.  
 
Select the server from which you want to retrieve the data in the "Server" box.  
Line 646: Line 685:
 
Once that's done you need to select the data that you want to retrieve in the "Data Type" box.
 
Once that's done you need to select the data that you want to retrieve in the "Data Type" box.
  
GenPlay can either generate a gene track or a variable window track from the retrieved data. You can select what type of output track you want in the "Generate" option.
+
GenPlay can either generate a gene layer or a variable window layer from the retrieved data. You can select what type of output layer you want in the "Generate" option.
  
 
Finally, you can also choose to download data on only a part of the genome. This can be useful because retrieving data from a DAS server can be time consuming.
 
Finally, you can also choose to download data on only a part of the genome. This can be useful because retrieving data from a DAS server can be time consuming.
  
 
'''Note:''' The [[#DAS server|DAS server]] section shows how to add new servers to the list of available servers in the DAS dialog.
 
'''Note:''' The [[#DAS server|DAS server]] section shows how to add new servers to the list of available servers in the DAS dialog.
<br/>
+
<br /><br /><br /><br />
<br/>
 
<br/>
 
 
 
=== Generating a Multi Curves Track ===
 
[[image:Multi_curves_track.png|center|thumb|200px|A Multi Curves Track]]
 
If more than one fixed or variable window tracks are loaded, you can overlay them in a multi curves track. To do so, first select the "Generate Multi Curves Track" in the track contextual menu.
 
[[image:load_mct1.png|left|thumb|100px|Multi Curves Dialog]]
 
Then a dialog will appear asking you which tracks you want to see in the multi curves track.
 
 
 
The available tracks are in the list on the left of the dialog and the selected tracks appear in the list on the right. Select a track by clicking on its name and use the left and right arrows in the middle of the screen to move a track from one list to the other. Double clicking on a track produces the same effect.
 
 
 
The order of the tracks in the right list determines the order in which the tracks are drawn. The track at the top of the list is drawn on top of the other tracks. You can change the order of the tracks by clicking on the name of a track in the right list and using the up and down arrows in the middle of the dialog.
 
 
 
'''Note:''' in order to change the appearance of a multi curves track, you need to change the appearance of the tracks that appear in it.
 
<br/>
 
<br/>
 
<br/>
 
 
 
=== Loading Stripes ===
 
[[image:Stripes.png|right|thumb|200px|CPG Islands Shown As Stripes On a Refseq Gene Track]]
 
By clicking on the "Load Stripes" option of the track contextual menu you can load transparent stripes superimposed on a track. The stripes can be useful to show regions of interest such as CpG Islands or repeat regions.
 
 
 
Refer to the [[#File Formats|File Formats]] section to see what kinds of files can be loaded as stripes.
 
<br/>
 
<br/>
 
<br/>
 
 
 
 
== Main Menu ==
 
== Main Menu ==
 
[[image:main_menu.png|right|thumb|200px|Main Menu]]
 
[[image:main_menu.png|right|thumb|200px|Main Menu]]
 
On GenPlay’s main screen, click on the top left button (shown by a little hammer and wrench) to pop up the main menu.
 
On GenPlay’s main screen, click on the top left button (shown by a little hammer and wrench) to pop up the main menu.
<br/>
 
<br/>
 
<br/>
 
  
 +
=== New Project ===
 +
This will pop up the welcome screen in order to start a new project. All work not saved will be lost.
 +
<br /><br /><br />
 
=== Load / Save Project ===
 
=== Load / Save Project ===
This menu allows you to load or to save a whole GenPlay project in a space efficient binary compressed format. When you load a GenPlay project, all the tracks of your current project will be replaced by the ones from the loaded project and all the information that hasn't been saved will be lost.
+
This menu allows you to load or to save a whole GenPlay project in a space efficient binary compressed format. When you load a GenPlay project, all the tracks and layers of your current project will be replaced by the ones from the loaded project and all the information that hasn't been saved will be lost.
 
'''Important Note:''' The GenPlay project files may be dependent on the version of GenPlay you're using. Be sure to remember with which version of GenPlay you saved a project and use the same version next time you load your project.
 
'''Important Note:''' The GenPlay project files may be dependent on the version of GenPlay you're using. Be sure to remember with which version of GenPlay you saved a project and use the same version next time you load your project.
 
+
<br /><br /><br />
'''Important Note 2:''' In the current GenPlay version, the genome selected in the configuration file is not saved with the project. A project will generally fail to load and show an error message if the genome kept in memory in the GenPlay temp file is different from the genome used when the project was saved. To change the genome, go to the upper left corner and open the configuration menu through the options menu.
 
 
 
<br/>
 
<br/>
 
<br/>
 
 
=== Full Screen ===
 
=== Full Screen ===
 
Click on this item from the main menu to toggle the full screen mode. When the full screen mode is on, the control panel and the status bar are hidden.  
 
Click on this item from the main menu to toggle the full screen mode. When the full screen mode is on, the control panel and the status bar are hidden.  
  
 
You can also toggle the full screen mode by pressing the F11 key.
 
You can also toggle the full screen mode by pressing the F11 key.
 
+
<br /><br /><br />
 +
=== Warnings report ===
 +
This option will pop up the Warnings report dialog in order to consult previous and current alerts.
 +
<br /><br /><br />
 
=== Option ===
 
=== Option ===
 
The option menu item allows you to modify the configuration of GenPlay. Please refer to the section [[#Changing the Configuration of GenPlay|Changing the configuration of GenPlay]] for further information.
 
The option menu item allows you to modify the configuration of GenPlay. Please refer to the section [[#Changing the Configuration of GenPlay|Changing the configuration of GenPlay]] for further information.
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
 
 
=== RNA To DNA Reference ===
 
=== RNA To DNA Reference ===
This option allows you to transform the coordinate system of the results of an RNA-Seq experiment based on alignment to a transcriptome (for instance all refseq genes) to a genomic coordinate system.
+
This option allows you to transform the coordinate system of the results of an RNA-Seq experiment based on alignment to a transcriptome (for instance all refseq genes) to a genomic coordinate system.
  
 
You need two files in order to use this functionality.
 
You need two files in order to use this functionality.
Line 739: Line 746:
 
And the result as a GdpGene file is:
 
And the result as a GdpGene file is:
 
  NM_000016 chr1 + 76190042 76229353 76190042,76194085,76198328,76198537,76199212,76200475,76205664,76211490,76215103,76216135,76226806,76228376 76190502,76194173,76198426,76198607,76199313,76200556,76205795,76211599,76215244,76216231,76227055,76229353 667888.95,1506024.1,0,0,0,0,0,0,0,0,0,0
 
  NM_000016 chr1 + 76190042 76229353 76190042,76194085,76198328,76198537,76199212,76200475,76205664,76211490,76215103,76216135,76226806,76228376 76190502,76194173,76198426,76198607,76199313,76200556,76205795,76211599,76215244,76216231,76227055,76229353 667888.95,1506024.1,0,0,0,0,0,0,0,0,0,0
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
 
 
=== Help and About GenPlay ===
 
=== Help and About GenPlay ===
 
The Help and About GenPlay options open a browser showing, respectively, the documentation and the about pages of the GenPlay website.

The Help and About GenPlay options open a browser showing, respectively, the documentation and the about pages of the GenPlay website.
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
=== Exit ===
 
=== Exit ===
 
This option closes the application after asking for confirmation.
 
This option closes the application after asking for confirmation.
 
+
<br /><br /><br /><br />
 
== Changing the Configuration of GenPlay ==
 
== Changing the Configuration of GenPlay ==
Click on the option item of the main menu to open the configuration screen.
+
[[image:main_menu.png|right|thumb|250px|Option Menu]]
[[image:changing_configuration.png|center|thumb|100px|Option Menu]]
+
Click on the option item of the [[#Main Menu|main menu]] to open the configuration screen.
 
+
<br /><br /><br />
 
=== General Options ===
 
=== General Options ===
The following screen lets you set the general options:
+
The following screen lets you set the general options.
[[image:general_options.png|center|thumb|100px|General Options]]
 
The Default Directory lets you specify where the files containing GenPlay tracks will be stored in your file system.
 
  
The Log File is a text file that contains a time-stamped history of the files extracted and loaded on GenPlay.
+
The Default Directory lets the user choose which folder opens by default in any file chooser within GenPlay.
  
 
From this screen, you can also modify the appearance of the software by changing the look & feel.
 
From this screen, you can also modify the appearance of the software by changing the look & feel.
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
 
=== Configuration Files ===
 
[[image:changing_configuration.png|right|thumb|100px|Configuration Files]]
 
The configuration files screen allows you to change the zoom file as well as the genome configuration file. It is necessary to restart GenPlay after modifying this option in order for the changes to take effect.
 
<br/>
 
<br/>
 
<br/>
 
==== Zoom File ====
 
The Zoom configuration file contains the predefined levels of zooming. To change the levels of zoom, just create a text file with one level of zooming (in bp) per line, ordered from the smallest to the greatest. Here is an example:
 
<br style="clear: both" />
 
 
 
10
 
100
 
1000
 
10000
 
100000
 
1000000
 
10000000
 
100000000
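
The zoom file layout described above can be checked with a few lines of Python. This is only an illustrative sketch, not GenPlay code: the function name `parse_zoom_levels` is hypothetical, and rejecting files that are not in increasing order is an assumption based on the ordering requirement stated above.

```python
from typing import List


def parse_zoom_levels(text: str) -> List[int]:
    """Parse zoom levels (in bp), one per line, skipping blank lines."""
    levels = [int(line) for line in text.splitlines() if line.strip()]
    # The documentation asks for levels ordered from smallest to greatest;
    # raising an error on other orders is an assumption of this sketch.
    if any(a >= b for a, b in zip(levels, levels[1:])):
        raise ValueError("zoom levels must be in strictly increasing order")
    return levels
```

For instance, `parse_zoom_levels("10\n100\n1000\n")` yields the three levels as integers.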
 
 
 
<br/>
 
<br/>
 
<br/>
 
==== Genome File ====
 
Once GenPlay is started, a configuration file describing the genome that you want to analyze is loaded (the default is human hg19).
 
Configurations are simple text files that specify the name and length of the chromosomes or scaffolds of the current genome. Configuration files for recent human and mouse genome assemblies can be downloaded from the GenPlay [http://www.GenPlay.net/wiki/index.php/Library library]. Genome configuration files for human and mouse come in two versions, full and basic. The basic version only contains the standard chromosomes. The full version of the files also allows the display of chromosome variants.
 
 
 
Configuration files for any genome can easily be created in any word processor using the provided examples as a model.
 
Here is an example of a genome file:
 
chr1 249250621
 
chr5 180915260
 
chr13 115169878
 
chrX 155270560
 
chrY 59373566
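
As an illustration of the genome file layout shown above (one chromosome name and its length per line, separated by whitespace), the following Python sketch reads such a file into a dictionary. The function name `parse_genome_file` is hypothetical and the strict two-field check is an assumption; GenPlay's own parser may accept more.

```python
from typing import Dict


def parse_genome_file(text: str) -> Dict[str, int]:
    """Map each chromosome or scaffold name to its length in bp."""
    chromosomes: Dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            # Assumption: exactly "name length" per line.
            raise ValueError(f"expected 'name length', got: {line!r}")
        name, length = fields
        chromosomes[name] = int(length)
    return chromosomes
```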
 
 
 
<br/>
 
<br/>
 
<br/>
 
 
 
 
=== Track Option ===
 
=== Track Option ===
[[image:track_option.png|right|thumb|100px|Track Option]]
 
 
The Number of Tracks text box defines the maximum number of tracks that can be loaded on GenPlay.
 
The Number of Tracks text box defines the maximum number of tracks that can be loaded on GenPlay.
  
Line 812: Line 770:
  
 
The Undo Count text box defines the number of operations that can be undone. Note that the higher the number of undos selected, the more memory will be required.
 
The Undo Count text box defines the number of operations that can be undone. Note that the higher the number of undos selected, the more memory will be required.
<br/>
 
<br/>
 
<br/>
 
  
 +
The reset option lets the user easily restore a layer to the state it was in when freshly loaded.
 +
 +
The legend showing layer names in the upper right of a track can also be enabled or disabled.
 +
<br /><br /><br />
 
=== DAS Server ===
 
=== DAS Server ===
[[image:das_server_option.png|right|thumb|100px|DAS Server Option]]
 
 
The DAS server option shows the list of existing DAS servers along with the URL where these servers are located. It also provides the options to add new servers and remove existing servers.
 
The DAS server option shows the list of existing DAS servers along with the URL where these servers are located. It also provides the options to add new servers and remove existing servers.
  
 
GenPlay can communicate and retrieve data from the servers implementing the [http://www.biodas.org/wiki/DAS/1 DAS/1 protocol]
 
GenPlay can communicate and retrieve data from the servers implementing the [http://www.biodas.org/wiki/DAS/1 DAS/1 protocol]
<br/>
+
<br /><br /><br />
<br/>
 
<br/>
 
 
 
 
=== Restore Default ===
 
=== Restore Default ===
 
The Restore Default configuration restores everything back to the factory settings.
 
The Restore Default configuration restores everything back to the factory settings.
 
+
<gallery widths=220px heights=100px perrow=3>
<br/><br/><br/>
+
image:Options_general.png|General Options
 +
image:options_track.png|Track Option
 +
image:das_server_option.png|DAS Server
 +
</gallery>
 +
<br /><br /><br /><br />
 
== File Formats ==
 
== File Formats ==
 
The different file formats used in GenPlay are described on this [[GenPlay File Formats|page]].
 
The different file formats used in GenPlay are described on this [[GenPlay File Formats|page]].
 
+
<br /><br /><br /><br />
<br/><br/><br/>
+
== Using Tracks ==
== Manipulating tracks ==
+
[[image:add_layer.png|right|thumb|150px|Track Menu]]
[[image:manipulating_tracks.png|right|thumb|150px|Track Menu]]
+
=== Handling Tracks ===
=== Moving a Track ===
+
==== Moving a Track ====
 
To move a track up or down in the track list, just click on the track handler (the left part of the track with the track number) and drag the track to the desired position.
 
To move a track up or down in the track list, just click on the track handler (the left part of the track with the track number) and drag the track to the desired position.
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
=== Inserting a Track ===
+
==== Inserting a Track ====
 
To insert a track, right click on the track handler of the track right under where you want to insert and choose the "Insert" option.
 
To insert a track, right click on the track handler of the track right under where you want to insert and choose the "Insert" option.
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
=== Copying, Cutting and Pasting a Track ===
+
==== Deleting a Track ====
To copy a track, select the desired track and click on the copy option in the contextual menu or press CTRL+C
 
 
 
To cut a track, select the desired track and click on the cut option in the contextual menu or press CTRL+X
 
 
 
To paste a track, select the empty track where you want to paste and click on the paste option in the contextual menu or press CTRL+P
 
<br/>
 
<br/>
 
<br/>
 
=== Deleting a Track ===
 
 
To delete, select  a track and click on the delete option of the contextual menu or press Delete on the keyboard.
 
To delete, select  a track and click on the delete option of the contextual menu or press Delete on the keyboard.
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
=== Renaming a Track ===
+
==== Copying, Cutting and Pasting a Layer ====
To rename, select a track and click on the rename option of the contextual menu or press the F2 key.
+
[[image:paste_layer.png|right|thumb|200px|Track Menu]]
<br/>
+
To copy layers, select the desired track where the layers are and click on the copy option in the contextual menu or press CTRL+C.
<br/>
+
 
 +
To cut layers, select the desired track where the layers are and click on the cut option in the contextual menu or press CTRL+X.
 +
 
 +
To paste layers, select the track where you want to paste and click on the paste option in the contextual menu or press CTRL+P.<br />
 +
A new window will appear showing all recently copied or cut layers that can be pasted on the track. Select the layers you want to paste and then click "Ok".
 
<br/>
 
<br/>
=== Setting the Height of a Track ===
 
To set the height, select  a track and click on the set height option of the contextual menu or click on the bottom of a track handler and drag the mouse up or down.
 
 
<br/>
 
<br/>
 
<br/>
 
<br/>
<br/>
+
==== Taking a Screenshot of the Track ====
=== Changing the Appearance of a Track ===
 
[[image:track_appearance.png|left|thumb|150px|Track Appearance]]
 
To change the appearance of a variable or fixed window track, click on the appearance option of the contextual menu.  For any other type of track you can set the number of vertical lines displayed from the contextual menu.
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
=== Taking a Screenshot of the Track ===
 
 
To take a screenshot, select a track and choose the "Save as Image" option in the contextual menu.
 
To take a screenshot, select a track and choose the "Save as Image" option in the contextual menu.
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
 
<br/>
=== Showing / Hiding the Stripes ===
+
==== Using the Undo / Redo / Reset Options ====
To show stripes on a track, select a track and choose the "Load Stripes" option in the contextual menu. Choose the "Remove Stripes" option to hide the stripes.
+
The undo, redo and reset options are only available for the Variable and Fixed Window layers. They are accessible from the contextual menu when you right click on the track handler.
<br/>
 
<br/>
 
<br/>
 
=== Using the Undo / Redo / Reset Options ===
 
The undo, redo and reset options are only available for the Variable and Fixed Window tracks. They are accessible from the contextual menu when you right click on the track handler.
 
  
 
The number of undo and redo operations available can be specified as described in the [[#Track Option|Track Option]] section. Note that these operations are memory consuming and reducing the number of undo / redo operations available can save memory.

The number of undo and redo operations available can be specified as described in the [[#Track Option|Track Option]] section. Note that these operations are memory consuming and reducing the number of undo / redo operations available can save memory.
  
 
The reset operation restores the track to the way it was right after being loaded. A reset operation can also be undone.

The reset operation restores the track to the way it was right after being loaded. A reset operation can also be undone.
<br/>
+
<br /><br /><br />
<br/>
+
=== Track/Layer Settings ===
<br/>
+
==== General ====
=== Compressing a Fixed Window Track ===
+
[[image:track_settings_track.png|right|thumb|300px|Track Settings - General]]
The Fixed Window tracks can also be compressed. To compress a Fixed Window track you need to click on the Compression option of the contextual menu. Compressing a track frees memory but it is not possible to use an operation on a compressed track. Therefore, you need to uncompress the track before using any operation.
+
===== Basic Options =====
<br/>
+
*Name: The name of the track.
<br/>
+
*Height: The height of the track.
<br/>
+
===== Axis Options =====
 +
*Show horizontal lines: Split the track horizontally.
 +
*Horizontal line count: Number of horizontal lines, equally separated.
 +
*Show vertical lines: Split the track vertically.
 +
*Vertical line count: Number of vertical lines, equally separated.
 +
===== Score Options =====
 +
*Minimum Score: The minimum score to show.
 +
*Maximum Score: The maximum score to show.
 +
*Auto-rescaled: Enable the automatic score rescaling.
 +
*Score Position: Choose where the score is shown (top/bottom).
 +
*Score Color: Set the font color of the score.
 +
==== Layers ====
 +
[[image:track_settings_layer.png|right|thumb|300px|Track Settings - Layers]]
 +
*Name: Click on the name to edit it.
 +
*Type: The type of layer.
 +
*Color: Click to edit the color of the layer.
 +
*Graph Type: Click to change the graph type:
 +
**Curve
 +
**Points
 +
**Bar
 +
**Dense
 +
*Visible: Show/hide the layer.
 +
*Active: Set the layer as "active". The active layer has direct interaction with the mouse pointer and clicks.
 +
*Set For Deletion: If set, the layer(s) will be deleted when clicking "Ok".
 +
<br /><br /><br /><br />
 +
== Operations ==
 +
Once a layer is loaded, a right click on the location of the track handler opens a popup menu as shown in the figure below.
 +
[[image:operation_menu.png|center|thumb|600px|Operation Menu]]
 +
The Operation sub-menu of the popup menu contains all the actions that you can use on the selected layer.
 +
=== Sequencing/Microarray Layer Operations ===
 +
Bin-ed and non bin-ed layers do not have all the same operations. They share most of them but some are specific.
 +
<gallery widths=250px heights=400px perrow=2>
 +
image:micro_seq_operations.png|Non bin-ed Microarray/Sequencing Layer Operations
 +
image:micro_seq_bin_operations.png|Bin-ed Microarray/Sequencing Layer Operations
 +
</gallery>
 +
==== Common operations ====
 +
===== Show History =====
 +
Shows the history of the layer: every change that has been made since it was loaded.
 +
===== Constant Operation =====
 +
[[image:Micro_seq_constant_operation.png|right|thumb|250px|Operation With Constant]]
 +
These operations apply a single constant to the score of each window in the following ways:
 +
* Addition: adds the constant to each window (F(x) = x + constant).
 +
* Subtraction: subtracts the constant from each window (F(x) = x - constant).
 +
* Multiplication: multiplies the score by the constant (F(x) = x * constant).
 +
* Division: divides the score by the constant (F(x) = x / constant).
 +
* Inversion: inverts the score of each window (F(x) = constant / x).
 +
* Unique Score: sets all windows to a unique score (F(x) = constant).
 +
The function can also be applied to null windows by checking the box.
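
The constant operations listed above can be sketched in Python as follows. This is an illustration only, not GenPlay's implementation: the names `CONSTANT_OPERATIONS` and `apply_constant` are hypothetical, and representing a null window by a score of 0 is an assumption made for the example.

```python
from typing import Callable, Dict, List

# One function per operation: x is the window score, c is the constant.
CONSTANT_OPERATIONS: Dict[str, Callable[[float, float], float]] = {
    "addition": lambda x, c: x + c,
    "subtraction": lambda x, c: x - c,
    "multiplication": lambda x, c: x * c,
    "division": lambda x, c: x / c,
    "inversion": lambda x, c: c / x,
    "unique score": lambda x, c: c,
}


def apply_constant(scores: List[float], operation: str, constant: float,
                   apply_to_null: bool = False) -> List[float]:
    """Apply F(x) to every window score.

    Assumption: a window with score 0 is a null window and is left
    untouched unless apply_to_null is True (the checkbox in the dialog).
    Inversion of a null window would divide by zero; this sketch does
    not guard against that case.
    """
    func = CONSTANT_OPERATIONS[operation]
    return [x if (x == 0 and not apply_to_null) else func(x, constant)
            for x in scores]
```

With these definitions, `apply_constant([1.0, 0.0, 4.0], "addition", 2.0)` leaves the null window at 0, while passing `apply_to_null=True` shifts it as well.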
 +
===== Two Layers Operation =====
 +
This allows operations between two Sequencing/Microarray layers, bin-ed and non bin-ed.<br />
 +
To set up the operation, a few windows appear in the following order:
 +
# A first window appears in order to select the second layer.
 +
# The second window asks in which track the resulting layer will be put.
 +
# The third and last window offers the algorithms to complete the operation (x1: score first layer; x2: score second layer):
 +
* Addition: add scores (x = x1 + x2).
 +
* Subtraction: subtract scores (x = x1 - x2).
 +
* Multiplication: multiply scores (x = x1 * x2).
 +
* Division: divide scores (x = x1 / x2).
 +
* Average: average score (x = (x1 + x2) / 2).
 +
* Maximum: keeps the highest score.
 +
* Minimum: keeps the lowest score.
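
The two-layer algorithms listed above can be sketched in Python for the simplest case. The sketch assumes both layers are bin-ed with the same bin size, so that their scores line up position by position; the names `TWO_LAYER_OPERATIONS` and `combine_layers` are hypothetical and this is not GenPlay's implementation.

```python
from typing import Callable, Dict, List

# x1 is the score of the first layer, x2 the score of the second layer.
TWO_LAYER_OPERATIONS: Dict[str, Callable[[float, float], float]] = {
    "addition": lambda x1, x2: x1 + x2,
    "subtraction": lambda x1, x2: x1 - x2,
    "multiplication": lambda x1, x2: x1 * x2,
    "division": lambda x1, x2: x1 / x2,
    "average": lambda x1, x2: (x1 + x2) / 2,
    "maximum": max,
    "minimum": min,
}


def combine_layers(first: List[float], second: List[float],
                   operation: str) -> List[float]:
    """Combine two aligned lists of bin scores, bin by bin."""
    if len(first) != len(second):
        # Assumption of this sketch: identical binning on both layers.
        raise ValueError("layers must have the same number of bins")
    func = TWO_LAYER_OPERATIONS[operation]
    return [func(x1, x2) for x1, x2 in zip(first, second)]
```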
 +
 
 +
'''Note:''' The resulting layer is a bin-ed layer only when the operation is made between two bin-ed layers having the same bin size. Any other case results in a non bin-ed layer.
 +
===== Index =====
 +
Indexation can be useful to compare multiple layers at the same scale. It "re-scales" existing scores to a new range defined by the user.<br />
 +
For example, if scores range from 10 to 600 but need to be viewed between 0 and 100, this operation will do the work.<br />
 +
It first asks for the new minimum and the new maximum. The next dialog asks whether to perform the re-scaling for each chromosome independently or genome wide.<br />
 +
Using the previous example, for a new scale of [0; 100], if the first chromosome has a maximum score of 600 and the second one has a maximum score of 800, then 800 becomes the reference value for 100 on both chromosomes if the operation is processed genome wide. If the operation is processed for each chromosome independently, 600 becomes the reference value for 100 on the first chromosome, and 800 on the second chromosome.
  
== Operations ==
+
Since this operation uses the minimum and maximum scores, it is very important to note that indexing does not work well in the presence of outliers. Indexing works best if outliers are eliminated or removed first using a filter (see below).
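
The difference between genome-wide and per-chromosome indexing can be illustrated with a short Python sketch. It assumes a plain linear (min-max) re-scaling, which is an assumption about the exact formula; the function name `index_scores` and the dictionary layout are hypothetical, and every chromosome is assumed to hold at least one score.

```python
from typing import Dict, List


def index_scores(scores_by_chromosome: Dict[str, List[float]],
                 new_min: float, new_max: float,
                 genome_wide: bool = True) -> Dict[str, List[float]]:
    """Linearly re-scale scores to the range [new_min, new_max]."""

    def rescale(values: List[float], old_min: float, old_max: float) -> List[float]:
        span = old_max - old_min
        if span == 0:
            # All scores identical: map them to the new minimum (assumption).
            return [new_min for _ in values]
        return [new_min + (v - old_min) * (new_max - new_min) / span
                for v in values]

    if genome_wide:
        # One reference minimum and maximum shared by all chromosomes.
        all_scores = [v for values in scores_by_chromosome.values() for v in values]
        old_min, old_max = min(all_scores), max(all_scores)
        return {chrom: rescale(values, old_min, old_max)
                for chrom, values in scores_by_chromosome.items()}
    # Each chromosome uses its own minimum and maximum.
    return {chrom: rescale(values, min(values), max(values))
            for chrom, values in scores_by_chromosome.items()}
```

With maxima of 600 and 800 on two chromosomes and a new scale of 0 to 100, the genome-wide mode maps 600 to 75 and 800 to 100, whereas the per-chromosome mode maps both maxima to 100.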
Once a track is loaded, a right click on the location of the track handler opens a popup menu as shown in the figure below.
+
===== Log =====
[[image:operation_menu.png|center|thumb|100px|Operation Menu]]
 
The Operation sub-menu of the popup menu contains all the actions that you can use on the selected track.
 
<br/>
 
<br/>
 
<br/>
 
=== Variable Window Track Operations ===
 
<br/>
 
<br/>
 
<br/>
 
==== Operations With a Constant (Addition, Subtraction, Multiplication, Division, Invert) ====
 
[[image:operation_constante.png|right|thumb|100px|Operation With Constant]]
 
These operations add, subtract, multiply or divide the score of each window by a constant value. The invert function inverts the score of each window. Clicking on any of these operations opens a dialog box where the user can input the value of the constant in a text field, as shown in the figure (example for addition).
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Two Tracks Operation ====
 
[[image:operation_2tracks.png|left|thumb|100px|Two Tracks Operation]]
 
This allows basic operations between tracks (fixed and variable window tracks only). It can be useful to subtract background, normalize data with a control track or perform many other track manipulations.
 
The available operations between two tracks are addition, subtraction, multiplication, division, average, minimum, maximum.
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Indexation ====
 
Indexation can be useful to compare multiple tracks at the same scale.  Importantly, indexing does not work well in the presence of outliers. Indexing works best if outliers are eliminated or removed first using a filter (see below). To index the scores of a track based on the greatest and the smallest value of the whole genome you need to choose a new minimum and a new maximum value.
 
<br/>
 
<br/>
 
<br/>
 
==== Indexation Per Chromosome ====
 
This operation indexes each chromosome separately. Users enter the new minimum and maximum score values in a text field. When the OK button is clicked, the resulting track is displayed.
 
<br/>
 
<br/>
 
<br/>
 
==== Log ====
 
 
[[image:operation_log.png|right|thumb|100px|Logarithm Bases]]
 
[[image:operation_log.png|right|thumb|100px|Logarithm Bases]]
 
For each window, the log operation applies the function f(x) = log(x), where x is the window score. The base of the logarithm function can be selected between either 2 (binary log), e (natural log) or 10 (common log).
 
For each window, the log operation applies the function f(x) = log(x), where x is the window score. The base of the logarithm function can be selected between either 2 (binary log), e (natural log) or 10 (common log).
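
A minimal Python sketch of the log operation with the three bases mentioned above is given here for illustration. The name `log_scores` is hypothetical, and leaving windows with a non-positive score unchanged is an assumption, since the logarithm is undefined for them and the behavior of GenPlay in that case is not described here.

```python
import math
from typing import List

# Supported bases: binary, natural and common logarithm.
LOG_FUNCTIONS = {"2": math.log2, "e": math.log, "10": math.log10}


def log_scores(scores: List[float], base: str = "2") -> List[float]:
    """Apply f(x) = log(x) to each window score."""
    log = LOG_FUNCTIONS[base]
    # Assumption: non-positive scores are returned unchanged.
    return [log(x) if x > 0 else x for x in scores]
```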
 
+
===== Normalize =====
==== Log With Damper ====
 
For each window, this operation applies the function f(x) = log((x + damper)  /  (avg + damper)), where x is the window score. The base of the logarithm function can be either 2 (binary log), e (natural log) or 10 (common log).
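
The damped log formula above can be written out as a short Python sketch. The name `log_with_damper` is hypothetical, and taking avg as the plain unweighted mean of the window scores is an assumption of the example; the scores and damper are expected to keep both x + damper and avg + damper positive.

```python
import math
from typing import List

LOG_FUNCTIONS = {"2": math.log2, "e": math.log, "10": math.log10}


def log_with_damper(scores: List[float], damper: float,
                    base: str = "2") -> List[float]:
    """Apply f(x) = log((x + damper) / (avg + damper)) to each score."""
    log = LOG_FUNCTIONS[base]
    # Assumption: avg is the unweighted mean of all window scores.
    avg = sum(scores) / len(scores)
    return [log((x + damper) / (avg + damper)) for x in scores]
```

A window whose score equals the average maps to 0, and the damper limits the effect of very small scores on the ratio.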
 
 
 
The log with damper operation is useful to normalize some micro array data (Nimblegen for instance) see [http://genome.cshlp.org/content/19/12/2288.short Desprat et al. Genome Res. 2009 Dec;19(12):2288-99]
 
 
 
==== Normalize ====
 
 
[[image:operation_Normalize.png|left|thumb|100px|Normalization Coefficient]]
 
[[image:operation_Normalize.png|left|thumb|100px|Normalization Coefficient]]
 
After a normalize operation the score of each window is divided by the result of the Score Count operation and multiplied by a specified fixed value. By default, after normalization the scores are expressed per 10 million reads.

After a normalize operation the score of each window is divided by the result of the Score Count operation and multiplied by a specified fixed value. By default, after normalization the scores are expressed per 10 million reads.
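The normalization described here can be sketched in Python as follows. The function name and the <code>factor</code> parameter are hypothetical; the default of 10,000,000 mirrors the default mentioned in the text.

```python
def normalize(scores, factor=10_000_000):
    """Divide each score by the score count (sum of all scores)
    and multiply by a fixed factor."""
    score_count = sum(scores)
    if score_count == 0:
        raise ValueError("score count is zero, nothing to normalize")
    return [s / score_count * factor for s in scores]


print(normalize([1.0, 3.0], factor=8))
```

After this operation the scores sum to the chosen factor, which makes tracks with different sequencing depths comparable.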
<br style="clear: both" />
+
===== Standard Score =====
<br/>
+
Calculates the standard score for the selected layer i.e. (x - avg) / stdev; where x is the score, avg is the average score of the layer and stdev is the standard deviation of the scores of the layer.
<br/>
+
===== Filter =====
<br/>
 
==== Standard Score ====
 
Calculates the standard score for the selected track i.e. (x - avg) / stdev; where x is the score, avg is the average score of the track and stdev is the standard deviation of the scores of the track.
 
<br/>
 
<br/>
 
<br/>
 
==== Minimum, Maximum ====
 
[[image:operation_choosechromo.png|right|thumb|100px|Select Chromosomes]]
 
The maximum and minimum operations display, respectively, the greatest and the smallest scores on the selected chromosomes. A menu first asks you to select the chromosomes.
 
<br/>
 
<br/>
 
<br/>
 
==== Score Count ====
 
The score count operation computes the sum of the window scores on the selected chromosomes.
 
<br/>
 
<br/>
 
<br/>
 
==== Average ====
 
This operation computes the average score of the windows of the selected chromosomes. Note that the score of each window is weighted by the length of the window.
 
 
 
==== Standard Deviation ====
 
[[image:operation_stdev.png|left|thumb|100px|Standard Deviation]]
 
This operation computes the standard deviation of the scores of the windows of the selected chromosomes. Note that the scores of each window are weighted by the length of the window.
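The length weighting used by the Average and Standard Deviation operations can be illustrated with the Python sketch below. It assumes windows are (start, stop, score) tuples and that the population form of the standard deviation is used; both are assumptions, and the function name is hypothetical.

```python
import math


def weighted_mean_and_stdev(windows):
    """Length-weighted mean and standard deviation of window scores.

    windows: list of (start, stop, score); the weight is stop - start.
    Assumption: population (not sample) standard deviation.
    """
    total_length = sum(stop - start for start, stop, _ in windows)
    mean = sum((stop - start) * score for start, stop, score in windows) / total_length
    variance = sum(
        (stop - start) * (score - mean) ** 2 for start, stop, score in windows
    ) / total_length
    return mean, math.sqrt(variance)


print(weighted_mean_and_stdev([(0, 10, 1.0), (10, 40, 5.0)]))
```

In this example the second window is three times longer than the first, so it pulls the mean toward its own score.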
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Count Non-Null Length ====
 
This operation returns the sum of the lengths of the windows with a score different from zero on the selected chromosomes.
 
<br/>
 
<br/>
 
<br/>
 
==== Filter ====
 
 
GenPlay provides four different filters:
 
GenPlay provides four different filters:
<br/>
+
====== Percentage Filter ======
<br/>
 
<br/>
 
===== Percentage Filter =====
 
 
[[image:operation_pfilter.png|right|thumb|130px|Percentage Filter]]
 
[[image:operation_pfilter.png|right|thumb|130px|Percentage Filter]]
 
This option filters the X% lowest values and the Y% greatest values where X and Y are two decimals and where X + Y <= 100.
 
This option filters the X% lowest values and the Y% greatest values where X and Y are two decimals and where X + Y <= 100.
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
<br style="clear: both" />
+
====== Threshold Filter ======
<br/>
 
<br/>
 
<br/>
 
===== Threshold Filter =====
 
 
[[image:operation_tfilter.png|right|thumb|130px|Threshold Filter]]
 
[[image:operation_tfilter.png|right|thumb|130px|Threshold Filter]]
 
This option removes the values that are lower than X OR greater than Y, where X and Y are two specified threshold values.
 
This option removes the values that are lower than X OR greater than Y, where X and Y are two specified threshold values.
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
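The difference between the remove and saturate options can be sketched in Python as follows. Removed values are represented by <code>None</code>, which is a choice made for this illustration only; the function name is hypothetical.

```python
def threshold_filter(scores, low, high, saturate=False):
    """Filter values lower than low or greater than high.

    remove (default): filtered values become None.
    saturate: filtered values are set to the nearest boundary value.
    """
    result = []
    for s in scores:
        if low <= s <= high:
            result.append(s)
        elif saturate:
            result.append(low if s < low else high)
        else:
            result.append(None)
    return result


print(threshold_filter([1, 5, 10], low=2, high=8))
print(threshold_filter([1, 5, 10], low=2, high=8, saturate=True))
```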
<br style="clear: both" />
+
====== Band-Stop Filter ======
<br/>
 
<br/>
 
<br/>
 
===== Band-Stop Filter =====
 
 
[[image:operation_bfilter.png|right|thumb|130px|Band-Stop Filter]]
 
[[image:operation_bfilter.png|right|thumb|130px|Band-Stop Filter]]
 
This option removes values between two specified thresholds.

This option removes values between two specified thresholds.
<br style="clear: both" />
+
====== Count Filter ======
<br/>
 
<br/>
 
<br/>
 
===== Count Filter =====
 
 
[[image:operation_cfilter.png|right|thumb|130px|Count Filter]]
 
[[image:operation_cfilter.png|right|thumb|130px|Count Filter]]
 
This option filters the X lowest values and the Y greatest values, where X and Y are two specified integers.
 
This option filters the X lowest values and the Y greatest values, where X and Y are two specified integers.
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
<br style="clear: both" />
+
===== Transfrag =====
<br/>
+
This operation aggregates the windows of the selected layer that are separated by a gap smaller than a specified size (in bp).  
<br/>
 
<br/>
 
==== Transfrag ====
 
This operation aggregates the windows of the selected track that are separated by a gap smaller than a specified size (in bp).  
 
  
 
The score of the new window can be the sum, the average or the maximum of the scores of the aggregated windows.
 
The score of the new window can be the sum, the average or the maximum of the scores of the aggregated windows.
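A minimal Python sketch of the aggregation is given below. It assumes windows are (start, stop, score) tuples with half-open coordinates and that two windows are merged when the gap between them is strictly smaller than the specified size; the function name and the <code>combine</code> parameter are hypothetical.

```python
def transfrag(windows, max_gap, combine=sum):
    """Merge windows separated by a gap smaller than max_gap (in bp).

    combine receives the list of scores of the merged windows,
    for example sum, max, or an averaging function.
    """
    merged = []
    current = None  # [start, stop, list of scores]
    for start, stop, score in sorted(windows):
        if current is not None and start - current[1] < max_gap:
            current[1] = max(current[1], stop)
            current[2].append(score)
        else:
            if current is not None:
                merged.append((current[0], current[1], combine(current[2])))
            current = [start, stop, [score]]
    if current is not None:
        merged.append((current[0], current[1], combine(current[2])))
    return merged


windows = [(0, 10, 1.0), (15, 20, 2.0), (100, 110, 4.0)]
print(transfrag(windows, max_gap=10))
print(transfrag(windows, max_gap=10, combine=max))
```

Passing <code>lambda scores: sum(scores) / len(scores)</code> as <code>combine</code> gives the average variant.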
<br/>
+
===== Score Distribution Histogram =====
<br/>
+
The show repartition operation generates a graph showing the distribution of the scores of the selected layers. The plot can show either score vs. window count or score vs. base pair count.
<br/>
 
==== Show Repartition ====
 
The show repartition operation generates a graph showing the distribution of the scores of the selected tracks. The plot can show either score vs. window count or score vs. base pair count.
 
  
 
The user needs to choose a size for the bins of scores. Depending on the selection, the graph shows how many windows or how many base pairs there are for each bin of scores.

The user needs to choose a size for the bins of scores. Depending on the selection, the graph shows how many windows or how many base pairs there are for each bin of scores.
<br/>
+
===== Convert Layer =====
<br/>
+
This operation converts the current layer into another layer among the following:
<br/>
+
*Gene Annotation Layer
==== Generate Fixed Window Track ====
+
*Microarray/Sequencing Layer bin/non-bin
This operation generates a fixed window track, with the specified bin size and data precision from the selected variable window track.
+
*Mask Layer
<br/>
+
==== Non Bin-ed Layers Only ====
<br/>
+
===== CG Methylation Profile =====
<br/>
+
This operation computes the methylation values on CG sequences by combining the value on the C position and the value on the G position.<br />
=== Fixed Window Track Operations ===
+
This operation uses data from a sequence layer to find the CG sequences.
<br/>
+
 
<br/>
+
==== Bin-ed Layers Only ====
<br/>
+
===== Smooth =====
==== Operations With a Constant (Addition, Subtraction, Multiplication, Division, Invert) ====
+
The smooth operation can be processed according to the 3 following algorithms:
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
+
====== Gauss Smoothing ======
<br/>
 
<br/>
 
<br/>
 
==== Two Tracks Operation ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Gauss ====
 
 
[[image:operation_fwt_gaussian.png|right|thumb|150px|Sigma Value]]
 
[[image:operation_fwt_gaussian.png|right|thumb|150px|Sigma Value]]
This operation applies a [http://en.wikipedia.org/wiki/Gaussian_filter Gaussian filter] to the track, depending on the sigma value provided by the user.
+
This operation applies a [http://en.wikipedia.org/wiki/Gaussian_filter Gaussian filter] to the layer, depending on the sigma value provided by the user.
 
 
G(x) = (1 / (σ √(2π))) * e^(-x² / (2σ²))
 
  
Where, x is the score and σ is the standard deviation of the track.
+
G(x) = (1 / (σ √(2π))) * e^(-x² / (2σ²))
  
You can choose the extrapolate option to "fill" the windows with a score of zero.
+
Where x is the score and σ is the standard deviation of the layer.
<br/>
 
<br/>
 
<br/>
 
==== Moving Average ====
 
For each window of the track, this operation computes the average on a region of a specified size centered on the window and scores the window with the result. The half-size of the region is prompted prior to the calculation.
 
  
 
You can choose the extrapolate option to "fill" the windows with a score of zero.
 
You can choose the extrapolate option to "fill" the windows with a score of zero.
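The moving average can be sketched in Python as shown below. The sketch assumes the half-size is expressed as a number of bins and that the region is truncated at the ends of the chromosome; neither detail is stated in the text, and the function name is hypothetical.

```python
def moving_average(scores, half_size):
    """Score each bin with the average of the region centered on it.

    Assumptions: half_size is a number of bins, and the region is
    truncated at both ends of the list.
    """
    n = len(scores)
    result = []
    for i in range(n):
        lo = max(0, i - half_size)
        hi = min(n, i + half_size + 1)
        region = scores[lo:hi]
        result.append(sum(region) / len(region))
    return result


print(moving_average([0, 3, 6, 3, 0], half_size=1))
```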
<br/>
+
====== Loess Smoothing ======
<br/>
+
This operation computes the Loess regression of degree 1 on the selected layer.  
<br/>
 
==== Loess Regression ====
 
This operation computes the Loess regression of degree 1 on the selected track.  
 
  
 
For each x value where a y value is to be calculated, the Loess technique performs a regression on points in a moving range around the x value, where the values in the moving range are weighted according to their distance from this X value.
 
For each x value where a y value is to be calculated, the Loess technique performs a regression on points in a moving range around the x value, where the values in the moving range are weighted according to their distance from this X value.
Line 1,082: Line 973:
  
 
You can choose the extrapolate option to "fill" the windows with a score of zero.
 
You can choose the extrapolate option to "fill" the windows with a score of zero.
<br/>
+
====== Moving Average Smoothing ======
<br/>
+
For each window of the layer, this operation computes the average on a region of a specified size centered on the window and scores the window with the result. The half-size of the region is prompted prior to the calculation.
<br/>
 
  
==== Indexation ====
+
You can choose the extrapolate option to "fill" the windows with a score of zero.
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
+
===== Find Peaks =====
<br/>
 
<br/>
 
<br/>
 
==== Indexation Per Chromosome ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Log ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Log With Damper ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Normalize ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Standard Score ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Minimum, Maximum ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Bin Count ====
 
The bin count operation displays the number of windows (bins) with a score different from 0 on the selected chromosomes. It shows a menu asking to select chromosomes.
 
<br/>
 
<br/>
 
<br/>
 
==== Score Count ====
 
[[image:operation_choosechromo.png|right|thumb|100px|Select Chromosomes]]
 
The score count operation returns the sum of the scores of each window of the selected chromosomes of the selected track. It shows a menu to select the chromosomes. If the track was initially loaded using the sum of the reads to summarize the data by windows, this returns the total number of mapped reads in the experiment.
 
<br/>
 
<br/>
 
<br/>
 
==== Average ====
 
Computes the average score of the windows of the selected chromosomes.
 
<br/>
 
<br/>
 
<br/>

==== Standard Deviation ====
 
Computes the standard deviation of the scores of the windows of the selected chromosomes.
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
 
==== Correlation ====
 
[[image:operation_fwt_correlation.png|right|thumb|100px|Correlation Report]]
 
The correlation operation computes the Pearson’s correlation between the score values of two tracks. The two tracks need to have the same bin size. The following formula is used to calculate the correlation:
 
 
 
ρ = ( ∑ xi yi – n x’ y’) / ((n - 1) σx σy)
 
 
 
Where:
 
* ρ is the Pearson’s correlation
 
* xi and yi are the scores of the tracks
 
* n is the number of values
 
* x’ and y’ are the means of the scores of the tracks
 
* σx and σy are the standard deviations of the scores of the tracks
 
 
 
The figure on the right shows a correlation report.
 
 
 
'''Note:''' The correlation is computed only on the windows that are different from zero on both tracks. If one of the tracks has a zero value window, the window of the other track with the same coordinate will be skipped as well.
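The formula and the zero-skipping rule can be sketched in Python as follows. The sketch assumes that σx and σy are sample standard deviations (divided by n - 1), which makes the formula equal to the usual Pearson coefficient; this is an assumption, and the function name is hypothetical.

```python
import math


def pearson_nonzero(xs, ys):
    """Pearson correlation over bins that are non-zero in both tracks."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x != 0 and y != 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two bins that are non-zero in both tracks")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sample standard deviations (assumption, see the text above).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs) / (n - 1))
    cross = sum(x * y for x, y in pairs)
    return (cross - n * mean_x * mean_y) / ((n - 1) * sd_x * sd_y)


# The third bin is skipped because it is zero in the first track.
print(pearson_nonzero([1, 2, 0, 3, 4], [2, 4, 5, 6, 8]))
```

If one of the tracks is constant over the retained bins, its standard deviation is zero and the correlation is undefined; the sketch does not handle that case.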
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Filter ====
 
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
 
<br/>
 
<br/>
 
<br/>
 
==== Find Peaks ====
 
 
The find peak operation offers three different algorithms that can be used to find the peaks:
 
The find peak operation offers three different algorithms that can be used to find the peaks:
<br/>
+
====== Standard Deviation Peak Finder ======
<br/>
 
<br/>
 
===== Standard Deviation Peak Finder =====
 
 
[[image:operation_fwt_sfinder.png|left|thumb|150px|Standard Deviation Peak Finder]]
 
[[image:operation_fwt_sfinder.png|left|thumb|150px|Standard Deviation Peak Finder]]
 
The standard deviation peak finder prompts the user to enter two parameters.
 
The standard deviation peak finder prompts the user to enter two parameters.
Line 1,183: Line 988:
  
 
For a window to be accepted, its standard deviation needs to be at least ‘T’ times greater than the value of the standard deviation of the chromosome.
 
For a window to be accepted, its standard deviation needs to be at least ‘T’ times greater than the value of the standard deviation of the chromosome.
<br style="clear: both" />
+
====== Density Peak Finder ======
<br/>
 
<br/>
 
<br/>
 
===== Density Peak Finder =====
 
 
[[image:operation_fwt_dfinder.png|right|thumb|150px|Density Peak Finder]]
 
[[image:operation_fwt_dfinder.png|right|thumb|150px|Density Peak Finder]]
 
The Density Finder works as follows:
 
The Density Finder works as follows:
Line 1,194: Line 995:
  
 
For the window under consideration to be accepted, at least ‘P’ percentage of values must be above the high threshold ‘H’ or at least ‘P’ percentage of values must be below the low threshold ‘L’.
 
For the window under consideration to be accepted, at least ‘P’ percentage of values must be above the high threshold ‘H’ or at least ‘P’ percentage of values must be below the low threshold ‘L’.
<br style="clear: both" />
+
====== Island Finder ======
<br/>
 
<br/>
 
<br/>
 
===== Island Finder =====
 
 
[[image:operation_fwt_ifinder.png|left|thumb|150px|Island Finder]]
 
[[image:operation_fwt_ifinder.png|left|thumb|150px|Island Finder]]
 
The Island Finder is based on the algorithm described in the paper  
 
The Island Finder is based on the algorithm described in the paper  
Line 1,208: Line 1,005:
 
* Island score: Depicts the islands by considering the score.
 
* Island score: Depicts the islands by considering the score.
 
* Island Summit: Depicts the island with the summit of the input island as a score.
 
* Island Summit: Depicts the island with the summit of the input island as a score.
<br style="clear: both" />
+
===== Correlation =====
<br/>
+
[[image:operation_fwt_correlation.png|right|thumb|100px|Correlation Report]]
<br/>
+
The correlation operation computes the Pearson’s correlation between the score values of two layers. The two layers need to have the same bin size. The following formula is used to calculate the correlation:
<br/>
+
 
==== Transfrag ====
+
ρ = ( ∑ xi yi – n x’ y’) / ((n - 1) σx σy)
[[image:operation_transfrag.png|right|thumb|150px|Tracks Before and After Transfrag]]
+
 
This operation aggregates the bins of the selected track that are separated by a gap (bins with a score of zero) smaller than a specified size.
+
Where:
 +
* ρ is the Pearson’s correlation
 +
* xi and yi are the scores of the layers
 +
* n is the number of values
 +
* x’ and y’ are the means of the scores of the layers
 +
* σx and σy are the standard deviations of the scores of the layers
 +
 
 +
The figure on the right shows a correlation report.
  
The score of the new window can be the sum, the average or the maximum of the scores of the aggregated windows. The result track can either be a fixed window track or a gene track.
+
'''Note:''' The correlation is computed only on the windows that are different from zero on both layers. If one of the layers has a zero value window, the window of the other layer with the same coordinate will be skipped as well.
<br style="clear: both" />
+
===== Density =====
<br/>
+
This operation generates a new fixed window layer where the score of each window represents the density of non-null windows in the neighborhood of that window.
<br/>
 
<br/>
 
==== Change Bin Size ====
 
The change bin size operation changes the size of the bins of the track. It shows a dialog box allowing the user to enter the new bin size.
 
<br/>
 
<br/>
 
<br/>
 
==== Change Precision ====
 
The change precision operation allows you to change the data precision of the selected track. Refer to the [[#Data Precision|Data Precision]] section for further information regarding the data precision.
 
<br/>
 
<br/>
 
<br/>
 
==== Density ====
 
This operation generates a new fixed window track where the score of each window represents the density of non-null windows in the neighborhood of that window.
 
 
You first need to enter the size S of the neighborhood.
 
You first need to enter the size S of the neighborhood.
 
For each window W, the algorithm counts how many of the S windows before W and the S windows after W have a score different from zero. This value is then divided by 2 * S + 1 and the result is the score of W.

For each window W, the algorithm counts how many of the S windows before W and the S windows after W have a score different from zero. This value is then divided by 2 * S + 1 and the result is the score of W.
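The density computation can be sketched in Python as below. Because the divisor is 2 * S + 1, the sketch assumes that the window W itself is counted together with its S neighbors on each side, and that the neighborhood is truncated at the ends of the chromosome while the divisor stays fixed. Both points are assumptions, and the function name is hypothetical.

```python
def density(scores, s):
    """Density of non-zero windows in a neighborhood of s windows per side.

    Assumptions: the window itself is counted, and the divisor is always
    2 * s + 1, even where the neighborhood is truncated at the ends.
    """
    n = len(scores)
    result = []
    for i in range(n):
        lo = max(0, i - s)
        hi = min(n, i + s + 1)
        non_zero = sum(1 for value in scores[lo:hi] if value != 0)
        result.append(non_zero / (2 * s + 1))
    return result


print(density([1, 0, 2, 0, 0], s=1))
```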
<br/>
+
===== Intervals Scoring =====
<br/>
+
This operation needs two layers:
<br/>
+
* The selected layer that defines the scores
==== Show Repartition ====
+
* A second layer that defines the intervals
Please refer to the equivalent operation in the [[#Variable Window Track Operations|Variable Window Track Operations]] section for information about this functionality.
+
This operation generates a new layer containing the intervals of the "interval track". For each interval the algorithm then looks at the corresponding scores in the score layer, and computes either the maximum, the average or the sum of all the scores that fall in the interval. This value is the new score value in the result layer.
<br/>
+
 
<br/>
+
You can also choose to use only a certain percentage of the greatest scores that fall in the interval.
<br/>
+
===== Concatenate =====
==== Concatenate ====
+
[[image:operation_select_tracks.png|right|thumb|150px|Select Layers to Concatenate]]
[[image:operation_select_tracks.png|right|thumb|150px|Select Tracks to Concatenate]]
+
The concatenate operation allows you to generate a file containing the scores of multiple fixed window layers that have the same bin size.
The concatenate operation allows you to generate a file containing the scores of multiple fixed window tracks that have the same bin size.
 
 
The output file contains the following fields:
 
The output file contains the following fields:
 
# chromosome
 
# chromosome
 
# start position
 
# start position
 
# stop position
 
# stop position
# score track 1
+
# score layer 1
# score track 2
+
# score layer 2
# score track 3
+
# score layer 3
 
# ...
 
# ...
<br style="clear: both" />
+
<br /><br /><br />
<br/>
+
=== Gene Layer Operations ===
<br/>
+
Directly on a gene layer, you can:
<br/>
+
# Double click on a gene to open a web page describing the gene. Make sure that your input file contains a geneDBURL line as described in the [[#File Formats|File Formats]] section in order to enable this option.
==== Interval Summarization ====
 
This operation needs two tracks:
 
* The selected track that defines the scores
 
* A second track that defines the intervals
 
This operation generates a new track containing the intervals of the "interval track". For each interval the algorithm then looks at the corresponding scores in the score track, and computes either the maximum, the average or the sum of all the scores that fall in the interval. This value is the new score value in the result track.
 
 
 
You can also choose to use only a certain percentage of the greatest scores that fall in the interval.
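The interval scoring can be sketched in Python as follows. The sketch assumes a fixed window score track stored as a list of bin scores, half-open intervals given as (start, stop) in bp, and that a bin belongs to an interval when it overlaps it. These details and all names are assumptions made for illustration.

```python
import math


def score_intervals(bin_scores, bin_size, intervals, method="average", top_percent=100.0):
    """Score each interval from the bins of a fixed window track.

    method: "maximum", "sum" or "average".
    top_percent: keep only this percentage of the greatest scores.
    """
    results = []
    for start, stop in intervals:
        first_bin = start // bin_size
        last_bin = (stop - 1) // bin_size
        values = sorted(bin_scores[first_bin:last_bin + 1], reverse=True)
        if not values:
            # The interval lies outside the scored region.
            results.append((start, stop, 0.0))
            continue
        keep = max(1, math.ceil(len(values) * top_percent / 100.0))
        values = values[:keep]
        if method == "maximum":
            score = values[0]
        elif method == "sum":
            score = sum(values)
        else:
            score = sum(values) / len(values)
        results.append((start, stop, score))
    return results


bins = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(score_intervals(bins, 10, [(0, 30), (30, 60)], method="sum"))
print(score_intervals(bins, 10, [(0, 30), (30, 60)], method="average", top_percent=50))
```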
 
<br/>
 
<br/>
 
<br/>
 
==== Generate Variable Window Track ====
 
This operation generates a variable window track from the selected fixed window track.
 
<br/>
 
<br/>
 
<br/>
 
=== Gene Track Operations ===
 
Directly on a gene track, you can:
 
# Double click on a gene to open a web page describing the gene. Make sure that your input file contains a searchURL line as described in the [[#File Formats|File Formats]] section in order to enable this option.
 
 
# Put the mouse over a gene to have some information about the name and the score of the gene. If the exons of the gene have different scores you can put your mouse over an exon to have the exon score.
 
# Put the mouse over a gene to have some information about the name and the score of the gene. If the exons of the gene have different scores you can put your mouse over an exon to have the exon score.
<br/>
+
==== Score Count ====
<br/>
+
This operation computes the sum of all scores.<br />
<br/>
+
A window asks first to select chromosomes to include in the calculation (all by default).
 +
==== Average ====
 +
This operation computes the average of all scores.<br />
 +
A window asks first to select chromosomes to include in the calculation (all by default).
 +
==== Count Genes ====
 +
This operation counts the total number of genes.<br />
 +
A window asks first to select chromosomes to include in the calculation (all by default).
 +
==== Count Genes with Non-Null Score ====
 +
This operation counts the total number of genes, excluding the ones with a score of 0.<br />
 +
A window asks first to select chromosomes to include in the calculation (all by default).
 +
==== Count Exons ====
 +
This operation counts the total number of exons.<br />
 +
A window asks first to select chromosomes to include in the calculation (all by default).
 
==== Search Gene ====
 
==== Search Gene ====
[[image:operation_gene_search.png|left|thumb|100px|Find Gene]]
+
[[image:gene_search_gene.png|left|thumb|100px|Search Gene]]
Use this option to search a gene on the selected track by typing the name of the gene.
+
Use this option to search a gene on the selected layer by typing the name of the gene.
  
 
Check the Match Case option if you want the search to be case sensitive.
 
Check the Match Case option if you want the search to be case sensitive.
 
Check the whole word option if you want to search for genes where the input matches the whole name of the gene.

Check the whole word option if you want to search for genes where the input matches the whole name of the gene.
 
Press next or previous to find respectively the next or previous gene found.
 
Press next or previous to find respectively the next or previous gene found.
You can also open the Find Gene dialog by pressing CTRL+F after selecting a gene track.
+
You can also open the Find Gene dialog by pressing CTRL+F after selecting a gene layer.
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
==== Extract Intervals ====
 
==== Extract Intervals ====
[[image:operation_gene_extract_intervals.png|right|thumb|300px|Extract Intervals]]
+
[[image:gene_extract_intervals.png|right|thumb|200px|Extract Intervals]]
This option allows you to extract intervals defined relatively to the beginning, the end or the middle of a gene and to generate a new gene track showing these intervals.
+
This option allows you to extract intervals defined relatively to the beginning, the end or the middle of a gene and to generate a new gene layer showing these intervals.
  
You can, for example, define promoters as regions that start 100bp before the beginning of genes and end 150bp after the beginning of genes. This option would allow you to generate a new track from these parameters.
+
You can, for example, define promoters as regions that start 100bp before the beginning of genes and end 150bp after the beginning of genes. This option would allow you to generate a new layer from these parameters.
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
==== Extract Exons ====
 
==== Extract Exons ====
[[image:operation_gene_extract_exon.png|right|thumb|300px|Extract Exons]]
+
[[image:gene_extract_exons.png|right|thumb|150px|Extract Exons]]
This option generates a new gene track showing only the exons of the genes of the selected track.
+
This option generates a new gene layer showing only the exons of the genes of the selected layer.
  
 
You can choose between the three following options:
 
You can choose between the three following options:
Line 1,309: Line 1,084:
 
# Extract the last exon
 
# Extract the last exon
 
# Extract all the exons
 
# Extract all the exons
<br style="clear: both" />
+
==== Unique Score ====
<br/>
+
[[image:gene_unique_score.png|right|thumb|150px|Unique Score]]
<br/>
+
This operation sets the same score for all exons.
<br/>
 
 
==== Score Exons ====
 
==== Score Exons ====
To execute this operation you need to have at least one fixed or variable window track loaded. For each exon of each gene of the selected gene track, this operation computes either the average, the maximum or the sum of all the windows of the specified fixed or variable window track that fall in the exon.
+
[[image:gene_score_exons.png|right|thumb|200px|Score Exons]]
<br/>
+
To execute this operation you need to have at least one microarray/sequencing layer loaded. For each exon of each gene of the selected gene layer, this operation computes a new score based on the window score from the selected layer that falls into the exon. There are 3 different ways to compute the new score:
<br/>
+
*Base Coverage Sum
<br/>
+
*Maximum coverage
 
+
*RPKM
 
==== Filter ====
 
==== Filter ====
This option provides four different filters for gene tracks:
+
This option provides four different filters for gene layers:
<br/>
 
<br/>
 
<br/>
 
 
===== Percentage Filter =====
 
===== Percentage Filter =====
[[image:operation_pfilter.png|right|thumb|130px|Percentage Filter]]
+
[[image:gene_filter_percentage.png|right|thumb|130px|Percentage Filter]]
 
This option filters the genes with the X% lowest overall score and the Y% greatest overall scores where X and Y are two decimals and where X + Y <= 100.
 
This option filters the genes with the X% lowest overall score and the Y% greatest overall scores where X and Y are two decimals and where X + Y <= 100.
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
===== Threshold Filter =====
 
===== Threshold Filter =====
[[image:operation_tfilter.png|right|thumb|130px|Threshold Filter]]
+
[[image:gene_filter_threshold.png|right|thumb|130px|Threshold Filter]]
 
This option filters the genes with an overall score that is lower than X OR greater than Y, where X and Y are two specified threshold values.

This option filters the genes with an overall score that is lower than X OR greater than Y, where X and Y are two specified threshold values.
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
===== Band-Stop Filter =====
 
===== Band-Stop Filter =====
[[image:operation_bfilter.png|right|thumb|130px|Band-Stop Filter]]
+
[[image:gene_filter_band-stop.png|right|thumb|130px|Band-Stop Filter]]
 
This option removes the genes with an overall score between two specified thresholds.

This option removes the genes with an overall score between two specified thresholds.
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
===== Count Filter =====
 
===== Count Filter =====
[[image:operation_cfilter.png|right|thumb|130px|Count Filter]]
+
[[image:gene_filter_count.png|right|thumb|130px|Count Filter]]
 
This option filters the X lowest scored genes and the Y greatest scored genes, where X and Y are two specified integers.
 
This option filters the X lowest scored genes and the Y greatest scored genes, where X and Y are two specified integers.
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
 
You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
 
 
==== Filter Strand ====
 
==== Filter Strand ====
You need to select a strand when prompted.  At the end of the operation the track will contain only the genes on the selected strand. All the other genes will have been removed.
+
You need to select a strand when prompted.  At the end of the operation the layer will contain only the genes on the selected strand. All the other genes will have been removed.
<br/>
 
<br/>
 
<br/>
 
 
 
 
==== Rename Genes ====
 
==== Rename Genes ====
 
This operation allows you to change the name of the genes. You need to provide a text file where each line contains the current gene name and the new gene name separated by a tab. Every time a gene with a name from the first column is found, this name will be replaced by the new gene name from the second column.

This operation allows you to change the name of the genes. You need to provide a text file where each line contains the current gene name and the new gene name separated by a tab. Every time a gene with a name from the first column is found, this name will be replaced by the new gene name from the second column.
<br/>
 
<br/>
 
<br/>
 
 
==== Distance Calculation ====
 
==== Distance Calculation ====
 
Development in progress, coming soon.
 
Development in progress, coming soon.
<br/>
 
<br/>
 
<br/>
 
 
==== Score Repartition Around Start ====
 
==== Score Repartition Around Start ====
You first need to select a Fixed window track containing the scores.  After that, you need to select the chromosomes on which you want to execute the operation. You also need to specify a bin size S, a bin count C and a method for the calculation of the scores.   
+
You first need to select a Fixed window layer containing the scores.  After that, you need to select the chromosomes on which you want to execute the operation. You also need to specify a bin size S, a bin count C and a method for the calculation of the scores.   
  
 
The operation will create C bins on each side of the start position of each gene. The size S of each bin is in base pairs. Depending on the calculation method chosen, the operation computes the sum, the maximum or the average of the scores for each corresponding bin from each gene and displays a bar graph of the result. The data can be exported by right-clicking on the graph and using the "save as" function.
<br/><br/>
+
 
Multi-curve graph can be generated using the following procedure:  
+
 
To generate a comparison between 2 fixed-window tracks: 1) Perform an analysis for the first track as described above. 2) Save it to your hard drive. 3) Close the graph window. 4) Perform the same analysis on the second track. 4) Right click on the second graph and choose the load data option. 5) Load the first analysis. Colors of the curves, type of graphs (bar, points, curve) and scale can be adjusted by right-clicking on the graph.  Procedure can be used to load more than two graphs. To produce more complex graphs we recommend loading the saved data on your favorites spreadsheet software.  
+
Multi-curve graphs can be generated using the following procedure:<br />
 +
To generate a comparison between 2 fixed-window layers: 1) Perform an analysis for the first layer as described above. 2) Save it to your hard drive. 3) Close the graph window. 4) Perform the same analysis on the second layer. 5) Right-click on the second graph and choose the load data option. 6) Load the first analysis. The colors of the curves, the type of graph (bar, points, curve) and the scale can be adjusted by right-clicking on the graph. The same procedure can be used to load more than two graphs. To produce more complex graphs we recommend loading the saved data into your favorite spreadsheet software.
 
Score Repartition Around Start
 
<br/>
+
<br /><br /><br />
<br/>
+
=== Repeat Layer Operations ===
<br/>
+
There is currently no operation available for the repeat layer.
 
+
<br /><br /><br />
=== Sequence Track Operations ===
+
=== DNA Sequence Layer Operations ===
There is currently no operation available for the sequence tracks.
+
There is currently no operation available for the sequence layers.
<br/>
+
<br /><br /><br />
<br/>
+
=== Mask Layer Operations ===
<br/>
+
Documentation in writing...
=== SNP Track Operations ===
+
<br /><br /><br />
Directly on a SNP track, you can put the mouse over a SNP to have some extra information about the name or the base counts ratio of the SNP.
+
=== Variant Layer Operations ===
<br/>
+
Documentation in writing...
<br/>
 
<br/>
 
==== Find Next / Find Previous ====
 
This operation sets the position of the screen middle bar (red line) on the position of the next or the previous SNP on the track.
 
<br/>
 
<br/>
 
<br/>
 
==== Threshold Filter ====
 
[[image:operation_SNP_threshold.png|right|thumb|100px|Threshold Filter]]
 
The threshold filter operation removes all the SNPs whose first or second base count is smaller than the specified thresholds.
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Ratio Filter ====
 
[[image:operation_SNP_ratio.png|left|thumb|100px|Ratio Filter]]
 
The ratio filter operation removes all the SNPs where the ratio (first base count) / (second base count) is smaller or greater than specified values.
 
<br style="clear: both" />
 
<br/>
 
<br/>
 
<br/>
 
==== Remove SNPs Not In Genes ====
 
This operation will ask you to select a gene track in order to remove all the SNPs from the selected track that are not inside the genes of the gene track.
 
<br/>
 
<br/>
 
<br/>
 
=== Repeat Track Operations ===
 
There is currently no operation available for the repeat track.
 

Revision as of 17:10, 4 October 2013


Getting started

Starting GenPlay

GenPlay is freely available at http://www.genplay.net/wiki/index.php/Web_Start. To start the software, click the button corresponding to the amount of memory that you wish to allocate to the Java virtual machine.

The amount of memory determines how many layers you will be able to load simultaneously. The programming philosophy behind GenPlay is to provide fast performance once the data is loaded. To achieve that goal, the entire genome needs to be loaded in memory for multiple layers at the same time. This results in fast, responsive browsing but requires a lot of memory. The amount of memory needed per layer depends on the genome, the layer type, the window size, the data precision, etc.

You should generally allocate as much memory as your system can afford (about 70% of the total RAM installed on your system). For mammalian genomes we recommend allocating at least 4 GB of RAM, although you should be able to load a couple of genome-wide layers with 1 GB or 1.5 GB of RAM. Restricting the analysis to one chromosome at a time drastically reduces the memory requirement and should allow you to load many layers at very high resolution. Layers loaded in GenPlay can also be compressed, as explained later in this documentation.
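
As a rough illustration of the 70% rule of thumb, the following sketch computes a suggested allocation from the total amount of RAM. The function name and the use of megabytes are illustrative assumptions and are not part of GenPlay.

```python
def suggested_allocation_mb(total_ram_mb, fraction_percent=70):
    """Return the memory to give to the Java virtual machine, in MB.

    The 70% default follows the rule of thumb in the text; it is a
    guideline, not a value enforced by GenPlay.
    """
    if total_ram_mb <= 0:
        raise ValueError("total RAM must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_ram_mb * fraction_percent // 100


print(suggested_allocation_mb(8192))  # machine with 8 GB of RAM
```

The result can then be compared with the memory buttons offered on the Web Start page, picking the largest one that does not exceed it.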

The amount of memory available to GenPlay is displayed in the lower right corner of the screen.


The Welcome screen

The welcome screen is the first screen of GenPlay-MG and allows users to create or load a project.

New Project

In order to create a new project, users must give it a name as shown in Figure 1.

Figure 1: Text field to define the project name

The second step is to choose a reference genome. Users select it from the lists of clades, genomes and assemblies (Figure 2).

Figure 2: Assembly chooser

Several chromosomes are available for each assembly but users can choose to select only some of them.
To open the chromosome chooser (Figure 3), users have to click on the tools button next to the assembly name.

Figure 3: Chromosome chooser

The third and last step is to choose between a Single Genome Project and a Multi Genome Project. If the Multi Genome Project option is selected, the welcome screen should look like the one shown in Figure 4.

Figure 4: Empty welcome screen for multi-genome project
Single Genome Project
Multi Genome Project
Introduction
VCF Files

VCF files describe differences between genomes, usually between one or several genomes of interest and the reference genome used for the mapping process. VCF files define multiple types of variation; GenPlay is able to read and display the following:

  • InDels
  • SNPs
  • SV (Structural Variation)

A complete description of VCF files is given on the 1000 genomes project website:
Variant Call Format specification

Tabix
1. Introduction

VCF files contain a lot of information, which makes the scanning (loading) process slow.
To speed up scanning, VCF files have to be compressed and indexed. The compression is done with BGZip and the indexing with Tabix.
Tabix manual reference pages
Tabix download

2. VCF files indexing methods
2.1. Using GenPlay

GenPlay is now able to compress and index VCF files using the VCF Loader.
The way the VCF Loader works is explained below. When the loader asks for the compressed file (.vcf.gz), simply select the uncompressed VCF file (.vcf) instead. You may need to change the file extension filter in the file chooser in order to see .vcf files.
GenPlay will then look for the compressed and indexed files at the same location; if nothing is found, it will offer to compress and index the selected VCF file (Figure 1).

Figure 1: VCF Loader compress/index

The process is fully automatic and platform independent (it works on Windows, Linux and Mac).

2.2. Manually

First, please note that the following process must be performed in a Linux or Mac environment.
Each VCF file must first be compressed to the BGZF format (.vcf.gz file). Tabix provides a tool, bgzip, to perform the compression. After compression, the VCF file must be indexed with the tabix command. Once Tabix is installed, two commands are therefore needed.

Available commands from the Tabix folder:
bgzip -f VCF_PATH;
tabix -p vcf VCF_PATH.gz;

For example, a VCF file named my_vcf.vcf located in the same folder as Tabix can be compressed and indexed with the following commands (Figure 2):
bgzip -f ./my_vcf.vcf;
tabix -p vcf ./my_vcf.vcf.gz;

Figure 2: VCF file indexation command



Note: the first command replaces the current VCF file with the compressed VCF file (.vcf.gz). The second command creates the index file (.vcf.gz.tbi) next to the compressed file.
More options are described in the Tabix manual reference pages.
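
When many VCF files have to be prepared, the two manual commands can be scripted. The sketch below is one possible way to do it in Python; it assumes that the bgzip and tabix executables are installed and reachable through the PATH, and the function names are hypothetical helpers, not part of GenPlay or Tabix.

```python
import subprocess


def indexing_commands(vcf_path):
    """Return the bgzip and tabix command lines for one VCF file."""
    return [
        # Compresses in place: my.vcf is replaced by my.vcf.gz.
        ["bgzip", "-f", vcf_path],
        # Indexes the compressed file: writes my.vcf.gz.tbi.
        ["tabix", "-p", "vcf", vcf_path + ".gz"],
    ]


def compress_and_index(vcf_path):
    """Run both commands in order, stopping at the first failure."""
    for command in indexing_commands(vcf_path):
        subprocess.run(command, check=True)
```

Calling compress_and_index("./my_vcf.vcf") would run the same two commands as in the example above. Passing check=True makes a failed compression raise an error instead of silently continuing to the indexing step.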

The VCF Loader
1. Introduction

The VCF Loader is the most important part of multi-genome project settings. It allows users to load all necessary VCF files and to define how to extract information from them. It appears when users click the "Edit" button on the welcome screen.
Figure 3 shows an empty VCF Loader screen.

Figure 3: VCF loader

GenPlay-MG does not use the VCF file directly; it uses a compressed version of it (.gz). GenPlay-MG also needs the compressed VCF file to be indexed with Tabix. Both files must be in the same folder and must have the same name; only the file extensions differ (.gz and .tbi). To generate these files with GenPlay, please refer to the section above.

The user can add or remove rows by right clicking on the table.

2. Columns description

File
This column holds the VCF file path. Once a file is loaded, the Raw name column is automatically filled with every raw genome name contained in the selected VCF file.
Raw name
The Raw name column list is automatically filled when a VCF file has been chosen. The list contains every genotype header found in the selected VCF file. Because genome names might be difficult to remember, GenPlay-MG offers users the option of adding another name (an alias) using the Nickname column.
Nickname
The Nickname column allows users to associate an alias with the selected genome. This alias will appear in GenPlay-MG and can be useful because genome names in VCF files are often non-descriptive numbers that can be hard to remember.
Group
Users can gather genomes into groups. Group names are used to distinguish genomes and by some group-specific functions.

3. Columns edition

The Group, Nickname and File columns each have their own editable list. To edit a cell, click on it, hover over the item you want to edit and choose one of the following actions:
- Add (green symbol on empty item)
- Edit (pen symbol on an item)
- Delete (red symbol on an item)

That way, users can set up all columns before filling the table, or while filling it.

Note: The Raw name column is automatically filled with the genome names from the selected VCF file; it cannot be edited manually.

Import/Export

Once a project has been set up, it can be saved using the import/export function. Pressing the export button saves an XML file to the hard drive. This XML file can then be imported to reload the project.

The XML file structure is simple. Each row of the table is stored in a row element whose attributes include group, genome, file and raw_name. The settings file is formatted as shown in Figure 4.

Figure 4: XML file settings

Note: If the user moves a VCF file or changes one of its genotype headers, the XML file will no longer work. In that case the user has to update the file and/or raw_name attribute values.
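
If VCF files have been moved, the file attribute can be updated with a short script instead of by hand. The sketch below assumes that each table row is stored as an element named row with a file attribute, as described above; the root element name (settings) and the sample values are made up for the illustration, so check them against a real exported file before use.

```python
import xml.etree.ElementTree as ET


def repoint_vcf(xml_text, old_path, new_path):
    """Replace old_path by new_path in the file attribute of every row.

    Returns the updated XML text and the number of rows changed.
    The element name "row" is an assumption based on the description
    of the settings file.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for row in root.iter("row"):
        if row.get("file") == old_path:
            row.set("file", new_path)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical settings content, for illustration only.
sample = (
    '<settings>'
    '<row group="g1" genome="Sample A" file="/old/a.vcf.gz" raw_name="NA12878"/>'
    '</settings>'
)
updated, count = repoint_vcf(sample, "/old/a.vcf.gz", "/new/a.vcf.gz")
print(count, updated)
```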

Load Project

Documentation in writing...



GUI Overview

GUI Overview 1.Ruler 2.Track List 3.Control Panel 4.Status Bar

The GenPlay main window is divided into 4 main parts:

  1. Ruler
  2. Track List
  3. Control Panel
  4. Status Bar




Ruler

The ruler shows the coordinates of the current displayed position.

Ruler 1.Option Button 2.Absolute Positions 3.Relative Positions

General Option Button

The button on the left of the ruler opens the pop-up menu with all the general options.

Absolute Positions

The numbers written in red on top of the ruler are the absolute positions on the selected chromosome or scaffold.

The number on the left is the position of the first displayed base. This value can be negative.

The number in the middle is the position of the red line. This value can go from 0 to the length of the current chromosome or scaffold as specified in the chromosome configuration file.

The value on the right is the last displayed position. This value ranges from 1 to 2*(chromosome length).

Relative Position

The numbers written in black on the second line represent the distance from the middle in base pairs.


Track List

The track list is the cornerstone of the GUI. From here you can load layers and execute operations.

The tracks are divided into two parts.

On the left, there is the track handler that becomes highlighted when the mouse is over it. By right clicking on the track handler, a contextual menu appears with all the operations that can be executed on the track and its layer(s).

On the right, the data can be visualized.


Control Panel

Control Panel 1.Position Bar 2.Zoom Bar 3.Chromosome Box 4.Position Text Field

The control panel is divided into 4 parts:

  1. Position Bar: the position bar allows you to change the position of the currently displayed window
  2. Zoom Bar: use the zoom bar to modify the level of zoom
  3. Chromosome Box: set the selected chromosome with the chromosome box
  4. Position Text Field: the position text field follows the format of the UCSC Genome Browser position field so it is easy to copy and paste the position from one browser to the other
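
To illustrate the position format, here is a minimal parser for strings of the form chromosome:start-stop, with optional thousands separators as used by the UCSC Genome Browser. It is only a sketch: the function name is hypothetical and GenPlay may accept additional variants that this code rejects.

```python
import re


def parse_position(text):
    """Split a position such as 'chr1:76,190,042-76,229,353'.

    Returns (chromosome, start, stop) with the commas removed.
    """
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", text)
    if match is None:
        raise ValueError("not a chromosome:start-stop position: " + repr(text))
    chromosome, start, stop = match.groups()
    return chromosome, int(start.replace(",", "")), int(stop.replace(",", ""))


print(parse_position("chr1:76,190,042-76,229,353"))
```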




Status Bar

Status Bar 1.Progress Bar 2.Stop Button 3.Operation Description 4.Memory Bar

The status bar helps monitor the progress of the current operation as well as memory usage. It is divided into 4 sub-components:

  1. Progress bar, shows the level of completion of the current operation
  2. Stop button, allows users to stop the current operation. If the button is not bright red the operation can't be stopped
  3. Operation description, displays a short text describing the current operation as well as the elapsed time from the beginning of the operation
  4. Memory bar, shows the amount of memory used and the amount of memory available. Make sure that you have enough memory before starting a new operation. You can delete or compress layers to free up memory.





Browsing the Genome

Changing the Position

You can change the position of the displayed window by:

  1. Dragging any track to the left or to the right with the left button of the mouse
  2. Clicking with the middle button of the mouse inside a track and then moving the cursor to the left or to the right of the middle red line
  3. Moving the knob of the position bar on the control panel
  4. Changing the value of the position text field on the control panel
  5. Using the keyboard left and right arrows
  6. Double-clicking on a track where you want to center the view




Changing the Chromosomes

You can switch the selected chromosome by:

  1. Changing the selection in the chromosome box on the control panel
  2. Changing the text of the position text field on the control panel




Changing the Zoom

The level of the zoom can be modified by:

  1. Wheeling up or down inside a track with the mouse wheel
  2. Using the zoom bar on the control panel
  3. Changing the text of the position text field on the control panel





Loading a Layer

Introduction

Layers are the way to display information from files. They can represent information in different manners.
A layer is created inside a track; each track can contain one or several layers.
To load a layer in a track, right-click on its handler (the blue part on the left of the track). This opens a contextual menu with the different actions available on the track. The menu of a track that contains no layer looks like the one in Figure 1.
Clicking "Add Layer" opens a dialog to select one of the layer types GenPlay offers (Figure 2).
Examples of layers that can be loaded in GenPlay are available for download from the GenPlay Library accessible from the GenPlay.net website.




Loading a Sequencing/Microarray Layer

The Sequencing/Microarray layer allows the visualization of windows of variable or fixed size with a score associated with each window. Select the “Sequencing/Microarray Layer” option. This opens a file chooser dialog box. Select the file of your choice from the list of available window files and click the open button.

Please refer to the File formats section if you want to know what kind of file can be loaded as a sequencing/microarray layer.

This opens a new dialog to set different parameters for the new layer (as shown on the figure below). The dialog is separated in 6 sections detailed below.

New Layer Settings Dialog

Layer Name

Gives a name to the layer.

Bin

By default, the windows generated in a sequencing/microarray layer have a variable size, which represents the content of the file very precisely.
For other purposes, users may want fixed-size windows (bins). Binning lowers the resolution but usually reduces memory usage. Fixed windows are useful to represent the results of many types of experiments including, but not limited to, ChIP-seq, RNA-seq and TimEX-seq. Files containing the results of alignments (SAM, bowtie, Eland) and files containing already created bin lists (bed, bgr, etc.) can be loaded using this option. In the case of alignment files, bin lists are created on the fly as described below. Files containing the results of microarray experiments can also be loaded as long as they are in one of the accepted formats.
To use fixed windows, enable the "Bin Data" option. The "Bin Size" field then becomes available to set the size of the windows in base pairs.

Important Note: A bin size of 1 bp will use a lot of memory. Depending on the experiment, it may be more efficient to disable the bin data option and stay in variable window size mode.

Score Calculation

Name and Score Calculation

Files can contain overlapping windows. In this case, GenPlay splits them into smaller, non-overlapping windows using a simple algorithm.
The method used to score the resulting windows can be chosen in this section from the following options:

  • Addition
  • Average
  • Maximum
  • Minimum

Some examples are shown in the sections below for both non-binned and binned layers.

Strand

If your input file contains strand information, you can choose to load the data from both strands or from only one strand.

You can also decide to shift the reads from both strands as shown in the figure on the left. To shift the strands, enter a value in the "Shift" input box.

The value you enter is added to the positions of the data on the 5' strand and subtracted from those on the 3' strand.
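
The shift rule can be written in a few lines. This is a sketch of the rule as stated in the text, not GenPlay's implementation; the function name and the strand labels used as arguments are illustrative assumptions.

```python
def shift_position(position, strand, shift):
    """Apply the strand shift to one read position.

    The shift is added on the 5' strand and subtracted on the
    3' strand, following the rule described in the text.
    """
    if strand == "5'":
        return position + shift
    if strand == "3'":
        return position - shift
    raise ValueError("unknown strand: " + repr(strand))


print(shift_position(1000, "5'", 75), shift_position(1000, "3'", 75))
```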

Fragment Length

Selected Chromosomes

By default all the chromosomes of the project are selected. If you want to change this selection, click on the "modify selection" button and uncheck the undesired chromosomes. Working on fewer chromosomes will save memory and loading time.

Important Note: GenPlay can accelerate the loading if you know that your file is sorted by chromosome. If you press Yes when GenPlay asks whether the file is sorted and the file is actually not sorted, the file may load incompletely, leading to a loss of valuable information. The chromosomes must be ordered in the file the same way as in the chromosome selection combo-box.

Examples of Score Calculations

For non-binned layers
Example 1

Input file

Chr Start Stop Score
Chr1 1125 1126 1
Chr1 1135 1136 1
Chr1 1135 1136 1
Chr1 1149 1150 1
Chr1 1175 1176 1
Chr1 1210 1211 1
Chr1 1230 1231 1
Chr1 1340 1341 1
Chr1 1345 1346 1


Result

Loading of an alignment file as a variable window layer




Example 2
Chr Start Stop Score
Chr1 1020 1120 30
Chr1 1120 1300 120
Chr1 1010 1350 100


Loading of an interval file as a variable window layer


Result

Chr | Start | Stop | Average | Maximum | Sum
Chr1 | 1010 | 1020 | 100 | 100 | 100
Chr1 | 1020 | 1120 | (100 + 30) / 2 = 65 | Max(100, 30) = 100 | 100 + 30 = 130
Chr1 | 1120 | 1300 | (100 + 120) / 2 = 110 | Max(100, 120) = 120 | 100 + 120 = 220
Chr1 | 1300 | 1350 | 100 | 100 | 100
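
The splitting of overlapping windows can be sketched as follows. GenPlay's actual algorithm is not described here, so this is only one simple way to obtain the same result: every start or stop coordinate becomes a window boundary, and each resulting window is scored from the input intervals that cover it. The function name and the method labels are illustrative.

```python
def split_overlaps(intervals, method):
    """Split overlapping (start, stop, score) intervals on one chromosome.

    Returns non-overlapping (start, stop, score) windows, scored with
    "sum", "average", "maximum" or "minimum".
    """
    combine = {
        "sum": sum,
        "average": lambda scores: sum(scores) / len(scores),
        "maximum": max,
        "minimum": min,
    }[method]
    # Every start or stop is a potential boundary of an output window.
    cuts = sorted({p for start, stop, _ in intervals for p in (start, stop)})
    windows = []
    for left, right in zip(cuts, cuts[1:]):
        scores = [score for start, stop, score in intervals
                  if start <= left and right <= stop]
        if scores:  # skip gaps that no input interval covers
            windows.append((left, right, combine(scores)))
    return windows


# Input of Example 2.
example = [(1020, 1120, 30), (1120, 1300, 120), (1010, 1350, 100)]
for window in split_overlaps(example, "average"):
    print(window)
```

This version compares every interval with every boundary, which is fine for small inputs; a sweep over sorted coordinates would be preferable for genome-sized files.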


For binned layers
Example 1

Loading of an alignment file as a fixed window layer with a window size of 100:

(each line represents one read position, score is always one)

Input file

Chr Start Stop Score
Chr1 1125 1126 1
Chr1 1135 1136 1
Chr1 1135 1136 1
Chr1 1149 1150 1
Chr1 1175 1176 1
Chr1 1210 1211 1
Chr1 1230 1231 1
Chr1 1340 1341 1
Chr1 1345 1346 1


Loading of an alignment file as a fixed window layer with a window size of 100


Result

Chr | Start | Stop | Average | Maximum | Sum
Chr1 | 1100 | 1200 | 1 | 1 | 5
Chr1 | 1200 | 1300 | 1 | 1 | 2
Chr1 | 1300 | 1400 | 1 | 1 | 2




Example 2

Loading of an alignment file as a fixed window layer with a window size of 100:

(each line represents one read position, score varies)

Input file

Chr Start Stop Score
Chr1 1125 1126 1
Chr1 1135 1136 3
Chr1 1145 1146 1
Chr1 1149 1150 1
Chr1 1175 1176 1
Chr1 1210 1211 1
Chr1 1230 1231 1
Chr1 1340 1341 6
Chr1 1345 1346 1


Loading of an alignment file as a fixed window layer with a window size of 100


Result

Chr | Start | Stop | Average | Maximum | Sum
Chr1 | 1100 | 1200 | 7 / 5 = 1.4 | 3 | 7
Chr1 | 1200 | 1300 | 1 | 1 | 2
Chr1 | 1300 | 1400 | 7 / 2 = 3.5 | 6 | 7
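
The binning of read positions into fixed windows can be sketched as below. The sketch assumes that a read is assigned to the window that contains its start position and that windows start at multiples of the bin size; the function name is hypothetical and the code is not taken from GenPlay.

```python
from collections import defaultdict


def bin_reads(reads, bin_size):
    """Group (start, score) reads into fixed windows of bin_size bp.

    Returns {window_start: (average, maximum, total)} for every
    window that contains at least one read.
    """
    scores_by_window = defaultdict(list)
    for start, score in reads:
        window_start = (start // bin_size) * bin_size
        scores_by_window[window_start].append(score)
    return {
        window_start: (sum(scores) / len(scores), max(scores), sum(scores))
        for window_start, scores in sorted(scores_by_window.items())
    }


# Read positions and scores of Example 2.
reads = [(1125, 1), (1135, 3), (1145, 1), (1149, 1), (1175, 1),
         (1210, 1), (1230, 1), (1340, 6), (1345, 1)]
for window_start, stats in bin_reads(reads, 100).items():
    print(window_start, window_start + 100, *stats)
```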




Example 3

Loading of an interval file as a fixed window layer with a window size of 100:

Input file

Chr Start Stop Score
Chr1 1020 1120 30
Chr1 1120 1300 120
Chr1 1010 1350 100


Loading of an interval file as a fixed window layer with a window size of 100


Result

Chr | Start | Stop | Average | Maximum | Sum
Chr1 | 1000 | 1100 | (26.47 + 24) / 2 = 25.23 | Max(26.47, 24) = 26.47 | 26.47 + 24 = 50.47
Chr1 | 1100 | 1200 | (29.41 + 6 + 60) / 3 = 31.80 | Max(29.41, 6, 60) = 60 | 29.41 + 6 + 60 = 95.41
Chr1 | 1200 | 1300 | (29.41 + 60) / 2 = 44.70 | Max(29.41, 60) = 60 | 29.41 + 60 = 89.41
Chr1 | 1300 | 1400 | 14.70 | 14.70 | 14.70




Loading a Gene Annotation Layer

A Gene Layer
Score Color

Select the “Gene Layer" option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the File formats section if you want to know what kind of file can be loaded as a gene layer.

Once it's done, just wait until the loading is complete and the gene layer will appear in the track you selected.

Note that the genes on the plus strand are in red and the genes on the minus strand are in blue. If the file contains expression values, the exons are color coded to represent the expression (red = high, blue = low, as shown on the right).


Loading a Repeat Family Layer

Select the "Repeat Layer" option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the File formats section if you want to know what kind of file can be loaded as a repeat layer.

This layer type displays repeats organized by family or class.


Loading a DNA Sequence Layer

Select the “DNA Sequence Layer” option. This opens up a file chooser dialog box that allows you to select the file that you want to load. Please refer to the File formats section if you want to know what kind of file can be loaded as a sequence layer.

A Sequence Layer

Sequence layers show DNA sequences from .2bit files.

The hg18, hg19, mm8 and mm9 sequence files can be downloaded from the library of GenPlay.


Loading a Mask Layer

CPG Islands Shown As Stripes On a Refseq Gene Layer

Select the "Mask Layer" option. The stripes acting as masks can be useful to show regions of interest such as CpG Islands or repeat regions.

Please refer to the File formats section if you want to know what kind of file can be loaded as a mask layer.


Loading a Variant Layer

Add a Variant Layer

Select the "Variant Layer" option. This option is only available in multi-genome projects. It opens a new dialog where the user selects which sample to load and which variation types to display. A variant layer represents only one sample. It is also possible to change the color of each variation type independently by clicking on the colored square next to the variation checkbox.


Loading Data From a DAS Server

The distributed annotation system (DAS) is a client-server system in which a client can retrieve data from one or multiple servers. GenPlay can connect to any server that follows the DAS/1 protocol as specified by BioDAS.

DAS Dialog

The “Add Layer from DAS Server” option from the track handler menu will show the DAS Dialog.

Select the server from which you want to retrieve the data in the "Server" box.

Then select the "Data Source". Most of the time, the Data Source corresponds to the reference genome that you want to work on.

Once that's done you need to select the data that you want to retrieve in the "Data Type" box.

GenPlay can either generate a gene layer or a variable window layer from the retrieved data. You can select what type of output layer you want in the "Generate" option.

Finally, you can also choose to download data on only a part of the genome. This can be useful because retrieving data from a DAS server can be time consuming.

Note: The DAS server section shows how to add new servers to the list of available servers in the DAS dialog.



Main Menu

Main Menu

On GenPlay’s main screen, click on the top left button (shown by a little hammer and wrench) to pop up the main menu.

New Project

This will pop up the welcome screen in order to start a new project. All work not saved will be lost.


Load / Save Project

This menu allows you to load or to save a whole GenPlay project in a space efficient binary compressed format. When you load a GenPlay project, all the tracks and layers of your current project will be replaced by the ones from the loaded project and all the information that hasn't been saved will be lost. Important Note: The GenPlay project files may be dependent on the version of GenPlay you're using. Be sure to remember with which version of GenPlay you saved a project and use the same version next time you load your project.


Full Screen

Click on this item from the main menu to toggle the full screen mode. When the full screen mode is on, the control panel and the status bar are hidden.

You can also toggle the full screen mode by pressing the F11 key.


Warnings report

This option will pop up the Warnings report dialog in order to consult previous and current alerts.


Option

The option menu item allows you to modify the configuration of GenPlay. Please refer to the section Changing the configuration of GenPlay for further information.


RNA To DNA Reference

This option allows you to transform the coordinates of the results of an RNA-Seq experiment that was aligned to a transcriptome (for instance all refseq genes) into a genomic coordinate system.

You need two files in order to use this functionality.

  1. The result of the RNA-Seq experiment, called "Coverage File" in GenPlay. This file must be in bedGraph file format.
  2. An annotation file in bed format.

Two output files can be generated:

  1. A bedGraph file with the positions based on a reference genome
  2. An annotation file in GdpGene format

Here is an example.

Coverage File:

NM_000016	0	413	0
NM_000016	413	456	1
NM_000016	456	471	2
NM_000016	471	488	3
NM_000016	488	494	2
NM_000016	494	504	3

Annotation File:

chr1	76190042	76229353	NM_000016	0	+	76190472	76228448	0	12	460,88,98,70,101,81,131,109,141,96,249,977,	0,4043,8286,8495,9170,10433,15622,21448,25061,26093,36764,38334,

The result as a bedGraph file is:

chr1	76190455	76190498	43.0
chr1	76190498	76190502	8.0
chr1	76194085	76194096	22.0
chr1	76194096	76194113	51.0
chr1	76194113	76194119	12.0
chr1	76194119	76194129	30.0

And the result as a GdpGene file is:

NM_000016	chr1	+	76190042	76229353	76190042,76194085,76198328,76198537,76199212,76200475,76205664,76211490,76215103,76216135,76226806,76228376	76190502,76194173,76198426,76198607,76199313,76200556,76205795,76211599,76215244,76216231,76227055,76229353	667888.95,1506024.1,0,0,0,0,0,0,0,0,0,0
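
The conversion to the bedGraph output can be sketched as follows. The sketch walks the exons listed in the bed annotation (block sizes and block starts) and cuts each coverage interval at exon boundaries. Two details are inferred from the example rather than documented: each output score is the input score multiplied by the length of the segment, and segments with a score of 0 are omitted. Only plus-strand transcripts are handled, and the function name is hypothetical.

```python
def transcript_to_genome(coverage, chrom, tx_start, block_sizes, block_starts):
    """Map (start, stop, score) rows from transcript to genome coordinates.

    Plus strand only. Returns bedGraph-like rows
    (chrom, start, stop, score * segment_length).
    """
    # Each exon: (offset in the transcript, genomic start, size).
    exons = []
    offset = 0
    for size, relative_start in zip(block_sizes, block_starts):
        exons.append((offset, tx_start + relative_start, size))
        offset += size
    rows = []
    for start, stop, score in coverage:
        if score == 0:
            continue  # zero-score segments do not appear in the example output
        for tx_offset, genomic_start, size in exons:
            low = max(start, tx_offset)
            high = min(stop, tx_offset + size)
            if low < high:  # the interval overlaps this exon
                rows.append((chrom,
                             genomic_start + low - tx_offset,
                             genomic_start + high - tx_offset,
                             float(score * (high - low))))
    return rows


coverage = [(0, 413, 0), (413, 456, 1), (456, 471, 2),
            (471, 488, 3), (488, 494, 2), (494, 504, 3)]
sizes = [460, 88, 98, 70, 101, 81, 131, 109, 141, 96, 249, 977]
starts = [0, 4043, 8286, 8495, 9170, 10433, 15622, 21448, 25061, 26093, 36764, 38334]
for row in transcript_to_genome(coverage, "chr1", 76190042, sizes, starts):
    print(*row, sep="\t")
```

Note how the coverage interval 456-471 is split in two because it spans the boundary between the first exon (460 bp long) and the second one.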




Help and About GenPlay

The Help and About GenPlay options open a browser showing, respectively, the documentation and the about page of the GenPlay website.


Exit

This option closes the application after asking for confirmation.



Changing the Configuration of GenPlay

Option Menu

Click on the option item of the main menu to open the configuration screen.


General Options

The following screen lets you set the general options.

The Default Directory lets the user choose which folder is opened by default in any of the file choosers within GenPlay.

From this screen, you can also modify the appearance of the software by changing the look & feel.


Track Option

The Number of Tracks text box defines the maximum number of tracks that can be loaded on GenPlay.

The Default Track Height text box defines the height of each of the tracks.

The Undo Count text box defines the number of operations that can be undone. Note that the higher the number of undos selected, the more memory will be required.

The reset option allows the user to easily reset a layer to the state it was in right after being loaded.

The legend showing the layer names in the upper right of a track can also be enabled or disabled.


DAS Server

The DAS server option shows the list of existing DAS servers along with the URL where these servers are located. It also provides the options to add new servers and remove existing servers.

GenPlay can communicate with and retrieve data from servers implementing the DAS/1 protocol.


Restore Default

The Restore Default configuration restores everything back to the factory settings.





File Formats

The different file formats used in GenPlay are described on this page.



Using Tracks

Track Menu

Handling Tracks

Moving a Track

To move a track up or down in the track list, just click on the track handler (the left part of the track with the track number) and drag the track to the desired position.


Inserting a Track

To insert a track, right click on the track handler of the track right under where you want to insert and choose the "Insert" option.


Deleting a Track

To delete, select a track and click on the delete option of the contextual menu or press Delete on the keyboard.


Copying, Cutting and Pasting a Layer

Track Menu

To copy layers, select the desired track where the layers are and click on the copy option in the contextual menu or press CTRL+C.

To cut layers, select the desired track where the layers are and click on the cut option in the contextual menu or press CTRL+X.

To paste layers, select the track where you want to paste and click on the paste option in the contextual menu or press CTRL+P.
A new window will appear showing all the recently copied or cut layers that can be pasted on the track. Select the layers you want to paste and then click "Ok".


Taking a Screenshot of the Track

To take a screenshot, select a track and choose the "Save as Image" option in the contextual menu.


Using the Undo / Redo / Reset Options

The undo, redo and reset options are only available for the Variable and Fixed Window layers. They are accessible from the contextual menu when you right click on the track handler.

The number of undo and redo operations available can be specified as described in the Track Option section. Note that these operations consume memory, so reducing the number of undo / redo steps available can save memory.

The reset operation restores the track to the way it was right after being loaded. A reset operation can also be undone.


Track/Layer Settings

General

Track Settings - General
Basic Options
  • Name: The name of the track.
  • Height: The height of the track.
Axis Options
  • Show horizontal lines: Split the track horizontally.
  • Horizontal line count: Number of horizontal lines, equally separated.
  • Show vertical lines: Split the track vertically.
  • Vertical line count: Number of vertical lines, equally separated.
Score Options
  • Minimum Score: The minimum score to show.
  • Maximum Score: The maximum score to show.
  • Auto-rescaled: Enable the automatic score rescaling.
  • Score Position: Choose where the score is shown (top/bottom).
  • Score Color: Set the font color of the score.

Layers

Track Settings - Layers
  • Name: Click on the name to edit it.
  • Type: The type of layer.
  • Color: Click to edit the color of the layer.
  • Graph Type: Click to change the graph type:
    • Curve
    • Points
    • Bar
    • Dense
  • Visible: Show/hide the layer.
  • Active: Set the layer as "active". The active layer has direct interaction with the mouse pointer and clicks.
  • Set For Deletion: If set, the layer(s) will be deleted when clicking "Ok".





Operations

Once a layer is loaded, a right click on the location of the track handler opens a popup menu as shown in the figure below.

Operation Menu

The Operation sub-menu of the popup menu contains all the actions that you can use on the selected layer.

Sequencing/Microarray Layer Operations

Bin-ed and non bin-ed layers share most operations, but some operations are specific to one type of layer.

Common operations

Show History

Shows the history of the layer, i.e. every change that has been made since it was loaded.

Constant Operation
Operation With Constant

These operations use one constant in the following ways:

  • Addition: adds the constant to each window (F(x) = x + constant).
  • Subtraction: subtracts the constant from each window (F(x) = x - constant).
  • Multiplication: multiplies the score by the constant (F(x) = x * constant).
  • Division: divides the score by the constant (F(x) = x / constant).
  • Inversion: inverts the score of each window (F(x) = constant / x).
  • Unique Score: sets all windows to a unique score (F(x) = constant).

The function can also be applied to null windows by checking the box.
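
The six functions above can be sketched in a few lines of Python. This is only an illustration of the formulas, not GenPlay code: the names `CONSTANT_OPERATIONS` and `apply_constant` are hypothetical, and the sketch assumes that a null window is a window with a score of 0.

```python
from typing import Callable, Dict, List

# Illustrative names only; these are not GenPlay identifiers.
CONSTANT_OPERATIONS: Dict[str, Callable[[float, float], float]] = {
    "addition": lambda x, c: x + c,
    "subtraction": lambda x, c: x - c,
    "multiplication": lambda x, c: x * c,
    "division": lambda x, c: x / c,
    "inversion": lambda x, c: c / x,
    "unique_score": lambda x, c: c,
}


def apply_constant(scores: List[float], operation: str, constant: float,
                   include_null: bool = False) -> List[float]:
    """Apply F(x) to every window score.

    Assumption: a null window is a window with a score of 0. Such windows
    are left untouched unless include_null is True.
    """
    function = CONSTANT_OPERATIONS[operation]
    result = []
    for score in scores:
        if score == 0 and not include_null:
            result.append(score)
        else:
            # Inverting a 0 score or dividing by a 0 constant raises
            # ZeroDivisionError; the sketch does not handle these cases.
            result.append(function(score, constant))
    return result
```

For example, `apply_constant([1.0, 0.0, 4.0], "addition", 2.0)` leaves the null window alone, while passing `include_null=True` also adds the constant to it.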

Two Layers Operation

This allows operations between two Sequencing/Microarray layers, bin-ed and non bin-ed.
To set up the operation, several dialogs appear in the following order:

  1. A first window appears in order to select the second layer.
  2. The second window asks in which track the resulting layer will be put.
  3. The third and last window offers the algorithms to complete the operation (x1: score first layer; x2: score second layer):
    • Addition: add scores (x = x1 + x2).
    • Subtraction: subtract scores (x = x1 - x2).
    • Multiplication: multiply scores (x = x1 * x2).
    • Division: divide scores (x = x1 / x2).
    • Average: average score (x = (x1 + x2) / 2).
    • Maximum: keeps the highest score.
    • Minimum: keeps the lowest score.

Note: The resulting layer is a bin-ed layer only when the operation is performed between two bin-ed layers having the same bin size. Any other case will result in a non bin-ed layer.

Index

Indexation can be useful to compare multiple layers at the same scale. It "re-scales" existing scores to a new range defined by the user.
If scores go from 10 to 600 but for some reason would need to be observed between 0 and 100, this operation will do the work.
It will first ask for the new minimum and the new maximum. The next dialog asks whether to perform the re-scaling for each chromosome independently or genome wide.
Using the previous example, for a new scale of [0; 100], if the first chromosome has a maximum score of 600 and the second one has a maximum score of 800, then 800 will become the reference value of 100 for both chromosomes if the operation is processed genome wide. If the operation is processed for each chromosome independently, 600 will become the reference value of 100 for the first chromosome, and 800 for the second chromosome.

Since this operation uses the minimum and maximum scores, indexing does not work well in the presence of outliers. It works best if outliers are removed first using a filter (see below).
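
The difference between the two modes can be illustrated with a short Python sketch. It assumes a plain linear (min-max) rescaling, which the text above suggests but does not state explicitly; the function names `index_scores` and `index_genome` are hypothetical.

```python
from typing import Dict, List, Optional


def index_scores(scores: List[float], new_min: float, new_max: float,
                 old_min: Optional[float] = None,
                 old_max: Optional[float] = None) -> List[float]:
    """Linearly map scores from [old_min, old_max] to [new_min, new_max].

    Assumption: indexing is a min-max rescaling. When old_min / old_max are
    not given, they are taken from the scores themselves.
    """
    low = min(scores) if old_min is None else old_min
    high = max(scores) if old_max is None else old_max
    if high == low:
        # All scores are identical: there is no range to stretch.
        return [float(new_min)] * len(scores)
    return [new_min + (x - low) * (new_max - new_min) / (high - low)
            for x in scores]


def index_genome(chromosomes: Dict[str, List[float]], new_min: float,
                 new_max: float, genome_wide: bool) -> Dict[str, List[float]]:
    """Index every chromosome, either with shared or per-chromosome bounds."""
    if genome_wide:
        all_scores = [x for scores in chromosomes.values() for x in scores]
        low, high = min(all_scores), max(all_scores)
        return {name: index_scores(scores, new_min, new_max, low, high)
                for name, scores in chromosomes.items()}
    return {name: index_scores(scores, new_min, new_max)
            for name, scores in chromosomes.items()}
```

With a first chromosome peaking at 600 and a second one at 800, the genome wide mode maps 600 to 75 and 800 to 100, whereas the per-chromosome mode maps both maxima to 100.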

Log
Logarithm Bases

For each window, the log operation applies the function f(x) = log(x), where x is the window score. The base of the logarithm can be 2 (binary log), e (natural log) or 10 (common log).

Normalize
Normalization Coefficient

After a normalize operation, the score of each window is divided by the result of the Score Count operation and multiplied by a specified fixed value. By default, after normalization the scores are expressed per 10 million reads.
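
As a rough illustration, the calculation can be written as follows in Python. The sketch assumes that the Score Count is the sum of all the scores of the layer; the function name `normalize` and its `factor` parameter are hypothetical.

```python
from typing import List


def normalize(scores: List[float], factor: float = 10_000_000) -> List[float]:
    """Divide each score by the score count, then multiply by a fixed value.

    Assumption: the score count is the sum of all scores of the layer.
    """
    score_count = sum(scores)
    if score_count == 0:
        # Nothing to normalize when the layer has no signal.
        return list(scores)
    return [x / score_count * factor for x in scores]
```

After this operation the scores of the layer add up to `factor`, which makes layers with different sequencing depths comparable.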

Standard Score

Calculates the standard score for the selected layer i.e. (x - avg) / stdev; where x is the score, avg is the average score of the layer and stdev is the standard deviation of the scores of the layer.
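
A minimal Python sketch of this formula is given below. Whether the population or the sample standard deviation is used is not stated here; the sketch assumes the population standard deviation, and the function name `standard_scores` is hypothetical.

```python
import statistics
from typing import List


def standard_scores(scores: List[float]) -> List[float]:
    """Return (x - avg) / stdev for every score x.

    Assumption: stdev is the population standard deviation of the scores.
    """
    avg = statistics.fmean(scores)
    stdev = statistics.pstdev(scores)
    if stdev == 0:
        # All scores are equal, so every standard score is 0.
        return [0.0] * len(scores)
    return [(x - avg) / stdev for x in scores]
```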

Filter

GenPlay provides four different filters:

Percentage Filter

This option filters the X% lowest values and the Y% greatest values where X and Y are two decimals and where X + Y <= 100. You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).

Threshold Filter

This option removes the values that are lower than X OR greater than Y, where X and Y are two specified threshold values. You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).
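
The difference between the remove and saturate modes can be sketched in Python as follows. The sketch assumes that a removed value becomes a score of 0; the function name `threshold_filter` is hypothetical.

```python
from typing import List


def threshold_filter(scores: List[float], low: float, high: float,
                     saturate: bool = False) -> List[float]:
    """Filter the scores lower than `low` or greater than `high`.

    Assumption: in remove mode a filtered value is replaced by 0. In
    saturate mode it is replaced by the threshold it crossed.
    """
    result = []
    for x in scores:
        if x < low:
            result.append(low if saturate else 0.0)
        elif x > high:
            result.append(high if saturate else 0.0)
        else:
            result.append(x)
    return result
```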

Band-Stop Filter

This option removes the values between two specified thresholds.

Count Filter

This option filters the X lowest values and the Y greatest values, where X and Y are two specified integers. You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).

Transfrag

This operation aggregates the windows of the selected layer that are separated by a gap smaller than a specified size (in bp).

The score of the new window can be the sum, the average or the maximum of the scores of the aggregated windows.
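
The aggregation can be sketched in Python as shown below. Windows are represented as hypothetical (start, stop, score) tuples and the gap is measured from the stop of one window to the start of the next; both choices are assumptions made for the example, as is the function name `transfrag`.

```python
from typing import List, Tuple

Window = Tuple[int, int, float]  # (start, stop, score), positions in bp


def transfrag(windows: List[Window], max_gap: int,
              method: str = "sum") -> List[Window]:
    """Merge windows separated by a gap smaller than max_gap (in bp)."""
    combine = {
        "sum": sum,
        "average": lambda scores: sum(scores) / len(scores),
        "maximum": max,
    }[method]
    groups = []  # each group is [start, stop, list of scores]
    for start, stop, score in sorted(windows):
        if groups and start - groups[-1][1] < max_gap:
            # Close enough to the previous group: extend it.
            groups[-1][1] = max(groups[-1][1], stop)
            groups[-1][2].append(score)
        else:
            groups.append([start, stop, [score]])
    return [(start, stop, combine(scores)) for start, stop, scores in groups]
```

With a gap of 10 bp, the windows 0-10 and 15-20 are merged into a single window 0-20, while a window starting at position 100 stays separate.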

Score Distribution Histogram

The show repartition operation generates a graph showing the distribution of the scores of the selected layers. The available plot types are score vs. window count and score vs. base pair count.

The user needs to choose a size for the bins of scores. Depending on the selection, the graph will show how many windows or how many base pairs there are for each bin of scores.
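
The two plot types differ only in what is added to each bin of scores, as the following Python sketch shows. Windows are hypothetical (start, stop, score) tuples and the function name `score_histogram` is not a GenPlay identifier.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Window = Tuple[int, int, float]  # (start, stop, score), positions in bp


def score_histogram(windows: List[Window], score_bin_size: float,
                    count_base_pairs: bool = False) -> Dict[float, int]:
    """Count windows (or base pairs) per bin of scores.

    Each key of the result is the lower bound of a bin of scores.
    """
    histogram = Counter()
    for start, stop, score in windows:
        bin_start = math.floor(score / score_bin_size) * score_bin_size
        # A window weighs 1 in window-count mode, its length in bp otherwise.
        histogram[bin_start] += (stop - start) if count_base_pairs else 1
    return dict(sorted(histogram.items()))
```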

Convert Layer

This operation converts the current layer into another layer among the following:

  • Gene Annotation Layer
  • Microarray/Sequencing Layer bin/non-bin
  • Mask Layer

Non Bin-ed Layers Only

CG Methylation Profile

This operation computes the methylation values on CG sequences by combining the value on the C position and the value on the G position.
A sequence layer is used to locate the CG sequences.

Bin-ed Layers Only

Smooth

The smooth operation can be performed using one of the following three algorithms:

Gauss Smoothing
Sigma Value

This operation applies a Gaussian filter to the layer, depending on the sigma value provided by the user.

G(x) = (1 / (√(2π) σ)) * e^(-x^2 / (2 σ^2))

Where x is the score and σ is the standard deviation of the layer.

You can choose the extrapolate option to "fill" the windows with a score of zero.

Loess Smoothing

This operation computes the Loess regression of degree 1 on the selected layer.

For each x value where a y value is to be calculated, the Loess technique performs a regression on points in a moving range around the x value, where the values in the moving range are weighted according to their distance from this x value.

The Loess regression is a smoothing function. You will need to specify the half-size of the moving window on which the regression will be computed.

The weight function of the Loess regression is computed as follows: W(i) = (1 - X(i)^3)^3, where X(i) is the normalized distance: current distance / maximum distance among points in the moving regression.

You can choose the extrapolate option to "fill" the windows with a score of zero.

Moving Average Smoothing

For each window of the layer, this operation computes the average score over a region of a specified size centered on the window and assigns the result to the window. You are prompted for the half-size of the region before the calculation.

You can choose the extrapolate option to "fill" the windows with a score of zero.
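
The moving average can be sketched in Python as follows. The sketch assumes that the region is simply truncated at the ends of the chromosome and it does not model the extrapolate option; the function name `moving_average` is hypothetical.

```python
from typing import List


def moving_average(scores: List[float], half_size: int) -> List[float]:
    """Score each window with the average of the region centered on it.

    The region spans half_size windows on each side of the current window.
    Assumption: near the ends of the list the region is truncated.
    """
    count = len(scores)
    result = []
    for i in range(count):
        region = scores[max(0, i - half_size):min(count, i + half_size + 1)]
        result.append(sum(region) / len(region))
    return result
```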

Find Peaks

The Find Peaks operation offers three different algorithms:

Standard Deviation Peak Finder

The standard deviation peak finder prompts the user to enter two parameters.

The parameter ‘S’ specifies the number of windows to be considered for each window on either side in order to calculate the standard deviation.

For example, if S = 10, it means that for each window we consider 10 windows to the left and 10 windows to the right to calculate the standard deviation.

For a window to be accepted, its standard deviation needs to be at least ‘T’ times greater than the value of the standard deviation of the chromosome.
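
The following Python sketch illustrates the rule described above. It assumes that the population standard deviation is used, that the window itself is part of the local region, and that a rejected window receives a score of 0; the function name `stdev_peaks` is hypothetical.

```python
import statistics
from typing import List


def stdev_peaks(scores: List[float], s: int, t: float) -> List[float]:
    """Keep the windows whose local standard deviation is high enough.

    s: number of windows considered on each side of the current window.
    t: minimum ratio between the local standard deviation and the
       standard deviation of the whole chromosome.
    Assumption: rejected windows are given a score of 0.
    """
    chromosome_stdev = statistics.pstdev(scores)
    result = []
    for i, score in enumerate(scores):
        region = scores[max(0, i - s):i + s + 1]
        local_stdev = statistics.pstdev(region)
        accepted = chromosome_stdev > 0 and local_stdev >= t * chromosome_stdev
        result.append(score if accepted else 0.0)
    return result
```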

Density Peak Finder

The Density Finder works as follows:

The parameter ‘S’ specifies the number of windows to be considered for each window on either side of the window under consideration.

For the window under consideration to be accepted, at least ‘P’ percentage of values must be above the high threshold ‘H’ or at least ‘P’ percentage of values must be below the low threshold ‘L’.

Island Finder

The Island Finder is based on the algorithm described in the paper Zang, C., Schones, D. E., Zeng, C., Cui, K., Zhao, K., and Peng, W. (2009). A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics (Oxford, England), 25(15):1952-1958.

The window value and gap parameters of the Island Finder correspond to the parameters ‘l0’ and ‘g’ of the paper, respectively. The island score parameter lets the user keep only the scores greater than or equal to a particular value. The island length parameter lets the user select islands encompassing at least a specified number of windows. There are three result types:

  • Start values: Depicts only those islands that are selected and removes the ones that are rejected.
  • Island score: Depicts the islands by considering the score.
  • Island Summit: Depicts the island with the summit of the input island as a score.

Correlation
Correlation Report

The correlation operation computes the Pearson’s correlation between the score values of two layers. The two layers need to have the same bin size. The following formula is used to calculate the correlation:

ρ = (Σ xi yi - n x’ y’) / ((n - 1) sx sy)

Where:

  • ρ is Pearson’s correlation
  • xi and yi are the scores of the layers
  • n is the number of values
  • x’ and y’ are the means of the scores of the layers
  • sx and sy are the standard deviations of the scores of the layers

The figure on the right shows a correlation report.

Note: The correlation is computed only on the windows that are different from zero on both layers. If one of the layers has a zero value window, the window of the other layer with the same coordinate will be skipped as well.
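
The formula and the note above can be combined in a short Python sketch. It assumes that sx and sy are sample standard deviations (n - 1 in the denominator), which is consistent with the (n - 1) term of the formula; the function name `pearson_correlation` is hypothetical.

```python
import statistics
from typing import List


def pearson_correlation(xs: List[float], ys: List[float]) -> float:
    """Pearson's correlation on the windows that are non-zero in both layers.

    Both lists must describe the same windows (same bin size, same order).
    Assumption: sx and sy are sample standard deviations.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x != 0 and y != 0]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two non-zero window pairs are required")
    x_values = [x for x, _ in pairs]
    y_values = [y for _, y in pairs]
    mean_x = statistics.fmean(x_values)
    mean_y = statistics.fmean(y_values)
    # A constant layer has a standard deviation of 0 and raises
    # ZeroDivisionError below; the sketch does not handle that case.
    sx = statistics.stdev(x_values)
    sy = statistics.stdev(y_values)
    numerator = sum(x * y for x, y in pairs) - n * mean_x * mean_y
    return numerator / ((n - 1) * sx * sy)
```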

Density

This operation generates a new fixed window layer where the score of each window represents the density of non-null windows in its neighborhood. You first need to enter the size S of the neighborhood. For each window W, the algorithm counts how many of the S windows before W and the S windows after W have a score different from zero. This value is then divided by 2 * S + 1 and the result is the score of W.
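
The calculation can be sketched in Python as follows. The sketch follows the wording above literally: only the S windows before and the S windows after W are counted, and the count is divided by 2 * S + 1. Whether W itself is also counted is not stated, so this is an assumption, as is the function name `density`.

```python
from typing import List


def density(scores: List[float], s: int) -> List[float]:
    """Score each window with the density of non-null windows around it.

    Assumption: the window itself is not counted, only its s neighbors
    on each side, and the count is divided by 2 * s + 1.
    """
    result = []
    for i in range(len(scores)):
        before = scores[max(0, i - s):i]
        after = scores[i + 1:i + s + 1]
        non_null = sum(1 for x in before + after if x != 0)
        result.append(non_null / (2 * s + 1))
    return result
```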

Intervals Scoring

This operation needs two layers:

  • The selected layer that defines the scores
  • A second layer that defines the intervals

This operation generates a new layer containing the intervals of the interval layer. For each interval, the algorithm looks at the corresponding scores in the score layer and computes either the maximum, the average or the sum of all the scores that fall in the interval. This value is the new score value in the result layer.

You can also choose to use only a certain percentage of the greatest scores that fall in the interval.
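
A possible reading of this operation is sketched below in Python. The sketch assumes that a score falls in an interval when its window overlaps the interval, that an interval without any score receives 0, and that the percentage is rounded up to a whole number of scores; the function name `score_intervals` and the tuple layouts are hypothetical.

```python
import math
from typing import List, Tuple

Window = Tuple[int, int, float]  # (start, stop, score) from the score layer
Interval = Tuple[int, int]       # (start, stop) from the interval layer


def score_intervals(intervals: List[Interval], windows: List[Window],
                    method: str = "average",
                    top_percent: float = 100.0) -> List[Window]:
    """Score each interval from the score-layer windows that overlap it."""
    combine = {
        "maximum": max,
        "average": lambda scores: sum(scores) / len(scores),
        "sum": sum,
    }[method]
    result = []
    for i_start, i_stop in intervals:
        inside = sorted(
            (score for w_start, w_stop, score in windows
             if w_start < i_stop and w_stop > i_start),
            reverse=True,
        )
        # Keep only the requested percentage of the greatest scores.
        kept = inside[:math.ceil(len(inside) * top_percent / 100)]
        result.append((i_start, i_stop, combine(kept) if kept else 0.0))
    return result
```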

Concatenate
Select Layers to Concatenate

The concatenate operation allows you to generate a file containing the scores of multiple fixed window layers that have the same bin size. The output file contains the following fields:

  1. chromosome
  2. start position
  3. stop position
  4. score layer 1
  5. score layer 2
  6. score layer 3
  7. ...




Gene Layer Operations

Directly on a gene layer, you can:

  1. Double click on a gene to open a web page describing the gene. Make sure that your input file contains a geneDBURL line as described in the File Formats section in order to enable this option.
  2. Put the mouse over a gene to display its name and score. If the exons of the gene have different scores, you can put the mouse over an exon to display the exon score.

Score Count

This operation computes the sum of all scores.
A dialog first asks you to select the chromosomes to include in the calculation (all by default).

Average

This operation computes the average of all scores.
A dialog first asks you to select the chromosomes to include in the calculation (all by default).

Count Genes

This operation counts the total number of genes.
A dialog first asks you to select the chromosomes to include in the calculation (all by default).

Count Genes with Non-Null Score

This operation counts the total number of genes, excluding the ones with a score of 0.
A dialog first asks you to select the chromosomes to include in the calculation (all by default).

Count Exons

This operation counts the total number of exons.
A dialog first asks you to select the chromosomes to include in the calculation (all by default).

Search Gene

Use this option to search a gene on the selected layer by typing the name of the gene.

Check the Match Case option if you want the search to be case sensitive. Check the whole word option if you want to find only the genes whose whole name matches the input. Press next or previous to go to the next or previous gene found, respectively. You can also open the Find Gene dialog by pressing CTRL+F after selecting a gene layer.

Extract Intervals

This option allows you to extract intervals defined relatively to the beginning, the end or the middle of a gene and to generate a new gene layer showing these intervals.

You can, for example, define promoters as regions that start 100bp before the beginning of genes and end 150bp after the beginning of genes. This option allows you to generate a new layer from these parameters.

Extract Exons

This option generates a new gene layer showing only the exons of the genes of the selected layer.

You can choose between the three following options:

  1. Extract the first exon of the genes
  2. Extract the last exon
  3. Extract all the exons

Unique Score

This operation sets the same score for all exons.

Score Exons

To execute this operation you need to have at least one microarray/sequencing layer loaded. For each exon of each gene of the selected gene layer, this operation computes a new score based on the window scores from the selected layer that fall into the exon. There are three different ways to compute the new score:

  • Base Coverage Sum
  • Maximum coverage
  • RPKM

Filter

This option provides four different filters for gene layers:

Percentage Filter

This option filters the genes with the X% lowest overall scores and the Y% greatest overall scores, where X and Y are two decimals and where X + Y <= 100. You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).

Threshold Filter

This option filters the genes with an overall score that is lower than X OR greater than Y, where X and Y are two specified threshold values. You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).

Band-Stop Filter

This option removes the genes with an overall score between two specified thresholds.

Count Filter

This option filters the X lowest scored genes and the Y greatest scored genes, where X and Y are two specified integers. You can choose between removing the filtered values (remove) or setting the filtered values to the boundary values (saturate).

Filter Strand

You need to select a strand when prompted. At the end of the operation the layer will contain only the genes on the selected strand. All the other genes will have been removed.

Rename Genes

This operation allows you to change the name of the genes. You need to provide a text file where each line contains the current gene name and the new gene name separated by a tab. Every time a gene with a name from the first column is found, this name will be replaced by the new gene name from the second column.

Distance Calculation

Development in progress, coming soon.

Score Repartition Around Start

You first need to select a Fixed window layer containing the scores. After that, you need to select the chromosomes on which you want to execute the operation. You also need to specify a bin size S, a bin count C and a method for the calculation of the scores.

The operation will create C bins on each side of the start position of each gene. The size S of each bin is in base pairs. Depending on the method of calculation chosen, the operation computes the sum, the maximum or the average of the scores for each corresponding bin from each gene and displays a bar graph of the result. The data can be exported by right-clicking on the graph and using the "save as" function.


A multi-curve graph can be generated using the following procedure. To generate a comparison between two fixed-window layers:

  1. Perform an analysis for the first layer as described above.
  2. Save it to your hard drive.
  3. Close the graph window.
  4. Perform the same analysis on the second layer.
  5. Right click on the second graph and choose the load data option.
  6. Load the first analysis.

Colors of the curves, type of graph (bar, points, curve) and scale can be adjusted by right-clicking on the graph. The same procedure can be used to load more than two graphs. To produce more complex graphs we recommend loading the saved data in your favorite spreadsheet software.


Repeat Layer Operations

There is currently no operation available for the repeat layer.


DNA Sequence Layer Operations

There is currently no operation available for the sequence layers.


Mask Layer Operations

Documentation in progress.


Variant Layer Operations

Documentation in progress.