GenomeComb
Genomecomb has moved to github at https://github.com/derijkp/genomecomb, with documentation at https://derijkp.github.io/genomecomb. For up-to-date versions, go there. These pages only remain here for the data on the older scientific application (or in case someone really needs a long-obsolete version of the software).
cg process_reports ?options? sampledir ?dbdir? ?reports?
Calculates a number of statistics on a sample in the reports subdir
The cg process_reports command calculates a number of statistics on a sample and stores them in the reports subdir of the sample. If a type of report cannot be made (e.g. fastqstats if there are no fastqs for the sample), it is skipped. Most reports are functional rather than fancy: a tsv file with sample, source (i.e. the program used to make the report), parameter and value.
The following report types can be selected:
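As an illustration of how such a report could be consumed downstream, here is a minimal Python sketch that loads one report tsv into a dictionary. It assumes the file has a header line with columns named exactly sample, source, parameter and value; the example rows (sample1, exampletool, example_count) are hypothetical and not taken from real genomecomb output.

```python
import csv
import io


def read_report(handle):
    """Return {(sample, source, parameter): value} from a report tsv.

    Assumes a header line naming the columns sample, source,
    parameter and value (an assumption based on the description above).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    stats = {}
    for row in reader:
        key = (row["sample"], row["source"], row["parameter"])
        stats[key] = row["value"]  # values are kept as strings
    return stats


# Hypothetical report content, for illustration only
example = io.StringIO(
    "sample\tsource\tparameter\tvalue\n"
    "sample1\texampletool\texample_count\t100\n"
)
report = read_report(example)
print(report[("sample1", "exampletool", "example_count")])
```

To read a real report, an opened file (e.g. `open(path, newline="")`) can be passed instead of the in-memory example.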
fastqstats: stats about number, length, quality, ... of reads in the fastq files, made using fastq-stats (files report_fastq_fw-source.tsv and report_fastq_rev-sample.tsv)
fastqc: fastqc analysis with graphs etc. per fastq file (in the fastqc subdir)
flagstat_reads: stats about bamfiles in the sampledir, made using samtools flagstat (file report_flagstat_reads-source.tsv), based on primary alignments only (so it counts reads)
flagstat_alignments: stats about bamfiles in the sampledir, made using samtools flagstat (file report_flagstat_alignments-source.tsv). This counts alignments, not reads (it includes secondary alignments)
histodepth: create a histogram of the sequencing depth. If a targetfile is present, histograms for on- and off-target regions will be separated. The report_* version contains coverage statistics at various depth cutoffs
vars: number of variants, quality variants ($coverage >= 20 and $quality >= 50), etc. in the various var files (file report_vars-source.tsv)
hsmetrics: picard hsmetrics analysis of target coverage
covered: how many bases are covered in the various region files (5x coverage, 20x coverage, gatk sequenced, ...), per chromosome and in total (file report_covered-source.tsv)
histo: "histogram" of coverage (file crsbwa-sample.histo)
predictgender: predict gender of a sample
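The "quality variants" criterion mentioned for the vars report ($coverage >= 20 and $quality >= 50) can be sketched in Python as follows. This only illustrates the stated thresholds and is not the genomecomb implementation; the function name, the dictionary layout and the example values are hypothetical.

```python
def is_quality_variant(coverage, quality, min_coverage=20, min_quality=50):
    """True if a variant meets the thresholds quoted for the vars report."""
    return coverage >= min_coverage and quality >= min_quality


# Hypothetical variants, for illustration only
variants = [
    {"coverage": 35, "quality": 80},  # passes both thresholds
    {"coverage": 12, "quality": 90},  # coverage too low
    {"coverage": 20, "quality": 50},  # exactly on both thresholds, passes
    {"coverage": 40, "quality": 10},  # quality too low
]

n_quality = sum(
    is_quality_variant(v["coverage"], v["quality"]) for v in variants
)
print(n_quality)
```

Both comparisons are inclusive, so a variant with coverage 20 and quality 50 is counted.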
This command can be distributed on a cluster or run using multiple threads with job options (more info with cg help joboptions).
Process