Documentation
From GenPlay, Einstein Genome Analyzer
Contents
- 1 Starting GenPlay
- 2 GUI Overview
- 3 Browsing the genome
- 4 Loading a track
- 5 Main Menu
- 6 Changing the configuration of GenPlay
- 7 File formats
- 8 Manipulating tracks
- 9 Using the operations
Starting GenPlay
GenPlay is freely available at http://www.genplay.net/wiki/index.php/Web_Start. To start the software, click the button corresponding to the amount of memory that you wish to allocate to the Java virtual machine.
This amount of memory determines how many tracks you can load at the same time. The design philosophy behind GenPlay is to provide extremely fast performance once the data are loaded. To achieve that goal, the entire genome can be loaded in memory for multiple tracks at the same time. The cost of this speed is a large memory requirement. The amount of memory needed per track depends on the genome, the track type, the window size, the data precision, etc.
You should generally allocate as much memory as your system can afford (about 70% of the total installed RAM). For mammalian genomes we recommend allocating at least 4 GB of RAM, although you should be able to load a couple of genome-wide tracks with 1 GB or 1.5 GB. Restricting the analysis to one chromosome at a time drastically reduces the memory requirement and should allow you to load many tracks at very high resolution. Tracks loaded in GenPlay can also be compressed (see below).
The amount of RAM available to GenPlay is displayed in the lower right corner of the screen.
GUI Overview
The GenPlay main window is divided into 4 main parts:
- Ruler
- Track List
- Control Panel
- Status Bar
Ruler
The ruler shows the position that is currently displayed.
Absolute positions
The numbers written in red on top of the ruler are the absolute positions on the selected chromosome or scaffold.
The number on the left is the position of the first displayed base. This value can be negative.
The number in the middle is the position of the red line. This value can go from 0 to the length of the current chromosome or scaffold, as specified in the chromosome configuration file.
The number on the right is the last displayed position. This value ranges from 1 to 2*(chromosome length).
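The relationship between the three red numbers can be sketched as follows. This is only an illustration, assuming that the displayed window is centered on the red line; the function name and its parameters are hypothetical and are not part of GenPlay.

```python
def ruler_positions(center, window_width, chromosome_length):
    """Return the (left, middle, right) absolute positions shown on the ruler.

    Assumption: the displayed window is centered on the red line, so half
    of the window lies on each side of `center`.
    """
    if not 0 <= center <= chromosome_length:
        raise ValueError("the red line must stay between 0 and the chromosome length")
    half = window_width // 2
    left = center - half    # can be negative near the start of the chromosome
    right = center + half   # can exceed the chromosome length near its end
    return left, center, right


# A 2000 bp window centered at position 500 starts before the chromosome.
print(ruler_positions(500, 2000, 1_000_000))
```

With these assumptions, a window centered close to position 0 gives a negative left value, which matches the behavior described above.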
Relative position
The numbers written in black on the second line represent the distance from the middle in base pairs.
General Option Button
The button on the left of the ruler opens a pop-up menu with all the general options.
Track List
The track list is the cornerstone of the GUI. It's where you can load your tracks and execute operations.
Each track is divided into two parts. On the left is the track handler, which becomes highlighted when the mouse is over it. A right click on the track handler pops up a contextual menu with all the operations that can be executed on the track.
The right part of the track is where the data can be visualized.
Control Panel
The control panel is divided into 4 parts:
- The position bar: the position bar allows you to change the position of the currently displayed window
- The zoom bar: use the zoom bar to modify the level of zoom
- The chromosome box: set the selected chromosome with the chromosome box
- The position text field: the position text field follows the format of the UCSC genome browser position field so it's easy to copy and paste the position from one browser to the other.
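As an illustration of the position format, here is a minimal Python sketch that parses and formats a UCSC-style position string such as chr1:1,000,000-2,000,000. The function names are hypothetical, and the sketch only covers the common chromosome:start-stop form, not every input accepted by either browser.

```python
import re

# chromosome name, a colon, then start and stop separated by a dash;
# thousands separators (commas) are optional
_POSITION = re.compile(r"^\s*([^:\s]+):(\d[\d,]*)-(\d[\d,]*)\s*$")


def parse_position(text):
    """Parse 'chr1:1,000,000-2,000,000' into ('chr1', 1000000, 2000000)."""
    match = _POSITION.match(text)
    if match is None:
        raise ValueError(f"not a chromosome:start-stop position: {text!r}")
    chromosome = match.group(1)
    start = int(match.group(2).replace(",", ""))
    stop = int(match.group(3).replace(",", ""))
    if start > stop:
        raise ValueError("start must not be greater than stop")
    return chromosome, start, stop


def format_position(chromosome, start, stop):
    """Format a position with thousands separators."""
    return f"{chromosome}:{start:,}-{stop:,}"


print(parse_position("chr1:1,000,000-2,000,000"))
print(format_position("chrX", 15000, 25000))
```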
Status Bar
The status bar helps you to monitor the progress of the current operation as well as the memory usage. It is divided into 4 sub-components:
- Progress bar, shows the level of completion of the current operation
- Stop button, allows you to stop the current operation. If the button is not bright red the operation is not stoppable
- Operation description, displays a short text describing the current operation as well as the elapsed time from the beginning of the operation
- Memory bar, shows the amount of memory used and the amount of memory available. Make sure that you have enough memory before starting a new operation. You can delete tracks to free some memory.
Browsing the genome
Changing the position
You can change the position of the displayed window by:
- Dragging any track to the left or to the right with the left button of the mouse
- Clicking with the middle button of the mouse inside a track and then moving the cursor to the left or to the right of the middle red line
- Changing the position of the position bar of the control panel
- Changing the value of the position text field of the control panel
- Using the keyboard left and right arrows
Changing the chromosomes
Switching the selected chromosome can be done by:
- Changing the selection in the chromosome box of the control panel
- Changing the text of the position text field of the control panel
Changing the zoom
The level of the zoom can be modified by:
- Scrolling up or down inside a track with the mouse wheel
- Using the zoom bar of the control panel
- Changing the text of the position text field of the control panel
Loading a track
To load a track in any row, right click on the handler of any empty track (the blue part with a number on the left of the track). This opens a menu including options to load the various types of tracks that exist in GenPlay.
Examples of tracks that can be loaded in GenPlay can be downloaded from the GenPlay Library, accessible from the GenPlay.net website.
Loading a variable window track
Variable window tracks allow the visualization of windows of variable sizes, each with an associated score.
Select the “Load Variable Window Track” option. This opens up a file chooser dialog box. Load the file of your choice from the list of available variable window files and click the open button.
A new window then appears and asks which chromosomes to extract. By default all the chromosomes of the project are selected. If you want to change this selection, click on the "modify selection" button and uncheck the undesired chromosomes. Working on many chromosomes consumes more memory and makes loading slower.
Once this is done, a last window pops up and asks you to name the track. The default name is the file name without its extension. In the same window, but only if there are overlapping windows in your input file, you have to tell GenPlay what to do with these overlapping windows. Overlapping windows are split into smaller windows using a simple algorithm.
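The splitting of overlapping windows can be pictured with the following sketch. It is not GenPlay's actual algorithm, which is not detailed here. It assumes windows given as (start, stop, score) tuples with an exclusive stop, cuts them at every window boundary, and combines the scores of the windows covering each piece with a function of your choice (maximum by default). All names are hypothetical.

```python
def split_overlaps(windows, combine=max):
    """Split overlapping (start, stop, score) windows into disjoint pieces.

    Every window boundary becomes a cut point. Each piece between two
    consecutive cut points receives `combine(scores)` of the windows
    that cover it; pieces covered by no window are dropped.
    """
    boundaries = sorted({p for start, stop, _ in windows for p in (start, stop)})
    pieces = []
    for left, right in zip(boundaries, boundaries[1:]):
        scores = [score for start, stop, score in windows
                  if start < right and stop > left]
        if scores:
            pieces.append((left, right, combine(scores)))
    return pieces


overlapping = [(0, 100, 1.0), (50, 150, 3.0)]
print(split_overlaps(overlapping))               # keep the highest score
print(split_overlaps(overlapping, combine=sum))  # add the scores instead
```

The quadratic scan over all windows keeps the sketch short; a sweep over sorted boundaries would be preferable for large files.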
Loading Fixed Window Track
Fixed window tracks display bin lists, which are useful to represent the results of many types of experiments including ChIP-seq, RNA-seq, TimEX-seq, etc. Files containing the results of alignment (SAM, bowtie, Eland) and files containing already created bin lists (bed, bgr, etc.) can be loaded using this option. In the case of alignment files, bin lists are created on the fly as described below. Files containing the results of micro-array experiments can also be loaded as long as they are in one of the accepted formats.
Once the menu pops up, select the “Load Fixed Window Track” option. This opens up a file chooser dialog box as shown in the figure below.
Load the track of your choice from the list of available fixed window tracks (CD36-H3K36me3_Cui_2009.bgr in this example) and click the open button.
Window Size
This specifies the size of the genomic windows (the bins) that will be created to summarize the results.
Score Calculation
The figure below illustrates how score calculation is done using different methods for a window size of 1000.
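The idea of summarizing data into fixed windows can also be sketched in code. The example below assumes input given as (position, score) pairs and three possible methods (sum, average, maximum); these method names and the function are hypothetical and may not match the methods offered by GenPlay.

```python
from collections import defaultdict

# Hypothetical score calculation methods; GenPlay's own list may differ.
METHODS = {
    "sum": sum,
    "average": lambda values: sum(values) / len(values),
    "maximum": max,
}


def bin_scores(points, window_size=1000, method="sum"):
    """Summarize (position, score) pairs into fixed windows.

    Returns a dict mapping the start position of each non-empty
    window to its combined score.
    """
    combine = METHODS[method]
    bins = defaultdict(list)
    for position, score in points:
        bins[position // window_size].append(score)
    return {index * window_size: combine(values)
            for index, values in sorted(bins.items())}


points = [(10, 2.0), (900, 4.0), (1500, 1.0)]
for name in METHODS:
    print(name, bin_scores(points, window_size=1000, method=name))
```

In this sketch the first two points fall in the window starting at 0 and the third in the window starting at 1000, so only the first window's score depends on the method.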
Data precision
Because GenPlay requires a lot of RAM, we provide the option of changing the precision at which the score for each bin is stored.
- Scores in 64 bits are stored as double-precision floating-point values (which can represent extremely large numbers unlikely to be useful for genomic experiments).
- Scores in 32 bits are stored as single-precision floating-point values (which can also represent very large numbers).
- Scores stored in 16 bits can range between -3267.8 and +3267.7 (with one decimal digit).
- Scores stored in 8 bits can range between 0 and 255 (with no decimal).
- Scores stored in 1 bit can be equal to 0 or 1 (useful to create masks, for instance).
We recommend storing scores in 32 or 16 bits.
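To get a feeling for how window size and precision affect memory, here is a rough back-of-the-envelope estimate. It assumes one score per bin and ignores compression and any overhead added by GenPlay or the Java virtual machine, so real usage will differ. The function name is hypothetical.

```python
def estimate_track_bytes(sequence_length, window_size, bits_per_score):
    """Rough lower bound on the bytes needed to store one score per bin."""
    bins = (sequence_length + window_size - 1) // window_size  # round up
    return (bins * bits_per_score + 7) // 8                    # round up to whole bytes


CHR1_HG19 = 249_250_621  # length of human chr1 in the hg19 assembly

for bits in (64, 32, 16, 8, 1):
    size_mb = estimate_track_bytes(CHR1_HG19, 10, bits) / 1_000_000
    print(f"{bits:>2}-bit scores, 10 bp windows: about {size_mb:.1f} MB")
```

Under these assumptions, halving the precision or doubling the window size roughly halves the memory needed by a track.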
Chromosome selection
Either the whole genome can be loaded or only specific chromosomes (which saves time and memory).
Important Note: When specific chromosomes are selected, GenPlay works accurately only if the files that are loaded are sorted by chromosomes. Unsorted files will load incompletely, leading to loss of valuable information. In addition to being sorted by chromosomes, BED and BEDGraph files need to be sorted by the chromosome start values too.
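Before loading a large file with only some chromosomes selected, it can be worth checking the sorting requirement above. The sketch below is one possible check, under the assumption that "sorted by chromosomes" means that all the lines of a chromosome are grouped together; it reads whitespace-separated BED or bedGraph lines and skips comment, track and browser lines. The function name is hypothetical.

```python
def is_sorted_for_loading(lines):
    """Return True if lines are grouped by chromosome and sorted by start."""
    finished = set()   # chromosomes whose block of lines has already ended
    current = None
    last_start = -1
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chromosome, start = fields[0], int(fields[1])
        if chromosome != current:
            if chromosome in finished:
                return False          # the chromosome appears in two separate blocks
            if current is not None:
                finished.add(current)
            current, last_start = chromosome, start
        elif start < last_start:
            return False              # start positions go backwards
        else:
            last_start = start
    return True


good = ["chr1\t100\t200\t1.5", "chr1\t300\t400\t2.0", "chr2\t50\t80\t0.3"]
bad = ["chr1\t300\t400\t2.0", "chr1\t100\t200\t1.5"]
print(is_sorted_for_loading(good), is_sorted_for_loading(bad))
```

To check a file, you can pass the open file object directly, since iterating over it yields its lines.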
When the OK button is clicked, the track is loaded as below in the location desired (in this case track 1).
Loading a gene track
Select the “Load Gene Track” option. This opens up a File Chooser Dialog Box as shown in the figure below. Load the track of your choice from the list of available gene tracks (AceView_From_UCSC_04-23-10_From(hg18).bed in this example) and click the open button. This loads the gene track at the desired location (in this case location 1) as shown below. Genes on the plus strand are in red, genes on the minus strand are in blue. If the file contains expression values, the exons are color coded to represent the expression (red = high, blue = low).
Loading a sequence track
These kinds of tracks show a DNA sequence from .2bit files.
Loading a SNP track
Loading a repeat track
This track type displays repeats organized by family or class.
Loading data from a DAS server
Loading stripes
This operation loads stripes that span the start and stop positions of the genes. The stripes can be superimposed on a track to mark where the genes start and stop. As the figure below indicates, the width of a stripe is equal to the difference between the stop position and the start position of the gene.
Main Menu
On GenPlay’s main screen click on the top left button (shown by a little hammer and spanner) to pop up the main menu.
Load / Save Project
This menu allows you to load or save a whole GenPlay project in a compressed binary format that uses very little disk space. When you load a GenPlay project, all the tracks of your current project are replaced by the ones from the project you loaded, and all the information that has not been saved is lost. Important Note: GenPlay project files may depend on the version of GenPlay you are using. Be sure to remember which version of GenPlay you used to save a project, and use the same version the next time you load it.
Full Screen
Click on this item of the main menu to toggle the full screen mode. When the full screen mode is on, the control panel and the status bar are hidden. You can also toggle the full screen mode by pressing the F11 key.
Option
The option menu item allows you to modify the configuration of GenPlay. Please refer to the section Changing the configuration of GenPlay for further information.
RNA To DNA Reference
Help and About GenPlay
The help and the about GenPlay options open a browser showing, respectively, the documentation and about pages of the GenPlay website.
Exit
This option closes the application after asking for confirmation.
Changing the configuration of GenPlay
Click on the option item of the main menu to open the configuration screen.
General options
The following screen lets you set the general options:
The Default Directory lets you specify where the files containing GenPlay tracks will be stored in your file system.
The Log File is a text file that contains a time-stamped history of the files extracted and loaded in GenPlay.
From this screen, you can also modify the appearance of the software by changing the look and feel.
Configuration files
The configuration files screen allows the user to change the zoom file as well as the genome configuration file. GenPlay must be restarted after modifying these options for the changes to take effect.
Zoom file
The Zoom configuration file contains the predefined zoom levels. To change these zoom levels, create a text file with one zoom level (in bp) per line, ordered from the smallest to the greatest. Here is an example:
10
100
1000
10000
100000
1000000
10000000
100000000
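If you prefer to generate the zoom file with a script, the following sketch produces its content from a list of levels. It relies only on the format described above (one level in bp per line, in increasing order); the function name and the output file name are hypothetical.

```python
def format_zoom_file(levels):
    """Return the zoom file content: one level (in bp) per line, ascending."""
    unique_levels = sorted({int(level) for level in levels})
    if not unique_levels or unique_levels[0] <= 0:
        raise ValueError("zoom levels must be positive numbers of base pairs")
    return "\n".join(str(level) for level in unique_levels) + "\n"


content = format_zoom_file([1000, 10, 100, 10000])
print(content, end="")

# Hypothetical file name; save the result wherever you keep your configuration files.
# with open("my_zoom_levels.txt", "w") as handle:
#     handle.write(content)
```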
Genome file
When GenPlay starts, it reads a configuration file describing the genome that you want to analyze (the default is human hg19). Configuration files are simple text files that specify the name and length of each chromosome or scaffold of the current genome. Configuration files for recent human and mouse assemblies can be downloaded from the GenPlay library, accessible from the GenPlay.net web page (please see below). Configuration files for any genome can easily be created in any word processor using the provided examples as a model. Here is an example of a genome file:
chr1 249250621
chr5 180915260
chr13 115169878
chrX 155270560
chrY 59373566
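To check a genome file that you wrote by hand, a few lines of Python are enough. This sketch assumes one chromosome per line, with the name and the length separated by whitespace, as in the example; the function name is hypothetical and the actual parser used by GenPlay may be stricter or more lenient.

```python
def parse_genome_file(text):
    """Parse 'name length' lines into a dict {chromosome name: length}."""
    chromosomes = {}
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected a name and a length")
        name, length = fields[0], int(fields[1])
        if length <= 0:
            raise ValueError(f"line {number}: the length must be positive")
        if name in chromosomes:
            raise ValueError(f"line {number}: duplicate chromosome {name}")
        chromosomes[name] = length
    return chromosomes


example = """chr1 249250621
chr5 180915260
chrX 155270560
"""
print(parse_genome_file(example))
```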
Track option
The Number of Tracks text box defines the maximum number of tracks that can be loaded in GenPlay.
The Default Track Height text box defines the height of each of the tracks.
The Undo Count text box defines the number of operations that can be undone. Note that the higher the undo count you select, the more memory is required.
DAS server
The DAS server option shows the list of existing DAS servers along with the URL where these servers are located. It also provides options to add new servers and remove existing servers.
GenPlay can communicate with and retrieve data from servers implementing the DAS/1 protocol.
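For readers curious about what a DAS/1 request looks like, the sketch below builds the URL of a features query for a genomic segment, following the command form of the DAS/1 specification (server prefix, data source name, then features?segment=reference:start,stop). The server address and data source name are placeholders, the function name is hypothetical, and this is not a description of how GenPlay builds its own requests.

```python
from urllib.parse import quote


def das_features_url(server_url, data_source, reference, start, stop):
    """Build a DAS/1 'features' request URL for one segment.

    `server_url` is assumed to be the DAS prefix of the server (usually
    ending in /das). Reference names depend on the data source: some
    sources expect '1' rather than 'chr1'.
    """
    if start > stop:
        raise ValueError("start must not be greater than stop")
    base = server_url.rstrip("/")
    segment = f"{reference}:{start},{stop}"
    return f"{base}/{quote(data_source)}/features?segment={quote(segment, safe=':,')}"


# Placeholder server and data source, for illustration only.
print(das_features_url("http://example.org/das", "hg19", "chr1", 1000, 2000))
```

The server answers such a request with an XML document listing the features that overlap the segment.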
Restore default
The Restore Default configuration restores everything back to the factory settings.
File formats
The different file formats used in GenPlay are described on this page.
Manipulating tracks
Move a track
To move a track, just click on the track handler (the left part of the track with the track number) and drag the track to the desired position.
