GenomeComb
Genomecomb has moved to GitHub at https://github.com/derijkp/genomecomb, with documentation at https://derijkp.github.io/genomecomb. For up-to-date versions, go there. These pages remain here only for the data on the older scientific application (or in case someone really needs a long obsolete version of the software).
This text gives examples of how to view results in a projectdir using the cg viz GUI.
In this howto, the small example and test data set ori_mixed_yri_mx2, downloadable from the genomecomb website, will be used. This data set was derived from publicly available exome and genome sequencing data by extracting only raw data covering the region of the MX2 gene (on chr21) and a part of the ACO2 gene (on chr22).
This howto expects a processed projectdir in tmp/mixed_yri_mx2. This can be created following the directions in howto_process_project. Alternatively you could also copy it from the expected dir (or adapt the path):
cp -a expected/mixed_yri_mx2 tmp/mixed_yri_mx2
Start up cg viz using
cg viz tmp/mixed_yri_mx2/compar/annot_compar-mixed_yri_mx2.tsv.lz4
This opens the annotated combined variant file (format described in tsv) in cg viz, allowing you to browse through the table (even if it is millions of lines long).
You can use the Fields button to limit the number of fields you want to see. The list on the right of the dialog shows the currently displayed fields. The list on the left shows available fields. Sample specific fields are indicated by having a - followed by a sample suffix. We will select to display only a limited set of sample specific fields:
The sample fields have a specific format, e.g.
When combining sample results, process_project checks, for each variant that is not present in a sample's variant list, whether this is because the sample is actually reference at that position (zyg = r) or because the position is unsequenced (zyg = u), according to the criteria used. Other data, such as the quality of the "variant" call, is also added for the reference calls where possible. You can see that no quality value is available (quality-* = ?) for unsequenced variants (zyg-* = u).
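To make the r/u distinction concrete, here is a minimal sketch on a toy table. The rows and quality numbers are made up for illustration and are not taken from the project file; the quality field name is assumed to follow the same pattern as the zyg field. Plain awk is used only to show the idea, not how genomecomb does it internally.

```shell
# Toy table: made-up rows, NOT taken from the real project file.
# Columns: begin, zyg-<sample>, quality-<sample> (tab-separated).
tsv=$(printf '%s\t%s\t%s\n' \
    begin zyg-gatk-rdsbwa-gilNA19240mx2 quality-gatk-rdsbwa-gilNA19240mx2 \
    100 m 99 \
    200 r 45 \
    300 u '?' \
    400 t 80)

# Count rows per zygosity code (column 2), skipping the header line.
zyg_counts=$(printf '%s\n' "$tsv" |
    awk -F '\t' 'NR > 1 { n[$2]++ } END { for (z in n) print z, n[z] }' |
    sort)
echo "$zyg_counts"

# Show the quality value recorded for unsequenced (zyg = u) rows.
unseq_quality=$(printf '%s\n' "$tsv" |
    awk -F '\t' 'NR > 1 && $2 == "u" { print $3 }')
echo "quality for unsequenced rows: $unseq_quality"
```

In this toy table the reference row (r) still has a quality value, while the unsequenced row (u) only has a ?.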
You can use Query to show only lines that fit a number of criteria. The query language is the same as supported by cg select and the specifics can be found in the cg select help. You can type a query directly into the Query field at the top, e.g. type $zyg-gatk-rdsbwa-gilNA19240mx2 == "m" to select only variants that are homozygous gatk calls for sample gilNA19240mx2 and press Enter.
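Since the query language is shared with cg select, the same filter can presumably also be run from the command line. The following is a sketch, assuming that cg is on your PATH, that the projectdir from above exists, and that cg select reads the lz4-compressed file directly; check the cg select help for the exact options in your version.

```shell
# Path to the combined variant file used in this howto.
file=tmp/mixed_yri_mx2/compar/annot_compar-mixed_yri_mx2.tsv.lz4

# Single quotes keep the shell from expanding $zyg-... as a shell variable.
query='$zyg-gatk-rdsbwa-gilNA19240mx2 == "m"'

# Only run when cg and the file are actually available.
if command -v cg >/dev/null 2>&1 && [ -e "$file" ]; then
    # -q applies the query; show only the first few matching lines.
    cg select -q "$query" "$file" | head
else
    echo "cg or $file not found; skipping"
fi
```

Note the single quotes around the query: within double quotes the shell would try to substitute $zyg before cg ever sees it.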
You can use the "Query" button to get help in building queries or the "EasyQuery" button for adding some common queries in an easier way.
The Summaries button can be used to create summary data. This provides functionality similar to the -g and -gc options in cg select (more info in the cg select help), but you can select fields etc. in the GUI.
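As a rough command-line counterpart, the sketch below groups variants by chromosome with -g. It assumes that cg is installed, that the file has a chromosome field, and that -g with a single field name yields a count per group; the exact argument format for -g and -gc should be taken from the cg select help rather than from this example.

```shell
# Path to the combined variant file used in this howto.
file=tmp/mixed_yri_mx2/compar/annot_compar-mixed_yri_mx2.tsv.lz4

# Field to group by (assumed to exist in the variant file).
groupfield=chromosome

# Only run when cg and the file are actually available.
if command -v cg >/dev/null 2>&1 && [ -e "$file" ]; then
    # Summarise the table per value of the group field.
    cg select -g "$groupfield" "$file"
else
    echo "cg or $file not found; skipping"
fi
```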
For example:
Make the tree view on the left larger by dragging the dividing line to the right. Here you can select other result files to view.