alignment
The following tools are grouped under alignment as they take alignment files (.bam) as input.
bam
Given a BAM file as input, compute general alignment statistics and write them as JSON. The output contains the following fields:
| field | description |
|---|---|
| Input reads | total number of reads in the BAM file |
| Mapped | total number of reads mapped to the reference genome |
| Unmapped | total number of reads that are not mapped |
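As an illustration of consuming this output, the sketch below assumes the JSON file is a flat object whose keys are the field names listed above and whose values are integer counts. That layout is an assumption, so check a real output file before relying on it. The helper name `mapping_rate` and the example counts are hypothetical.

```python
import json


def mapping_rate(stats):
    """Return the fraction of input reads that are mapped (0.0 if there are no reads)."""
    total = stats["Input reads"]
    if total == 0:
        return 0.0
    return stats["Mapped"] / total


# Illustrative values only; this is not real tool output.
# Assumed layout: a flat JSON object keyed by the field names in the table.
example = json.loads('{"Input reads": 1000, "Mapped": 950, "Unmapped": 50}')
print(f"mapped fraction: {mapping_rate(example):.3f}")
```

For a real run, replace the `json.loads` call with `json.load` on the file passed to `--out-json`.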
Options
| option | description | required | default value |
|---|---|---|---|
| --bam | Path to the input bam file (MUST be co-ordinate sorted and indexed) | ✓ | |
| --out-json | Output file to write json formatted data | ✓ | |
| --min-q | Minimum alignment quality | ✗ | 0 |
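Because the input must be co-ordinate sorted and indexed, a pre-flight check can catch problems before the tool runs. The sketch below is one possible check, not part of ngs-statter: it reads the `SO` tag from the `@HD` line of the BAM header using only the standard library (BAM files are BGZF-compressed, which Python's `gzip` module can read) and looks for an index file beside the BAM. The function names and the list of index file names are assumptions; which index naming the tool itself accepts is not stated here.

```python
import gzip
import struct
from pathlib import Path


def bam_sort_order(bam_path):
    """Return the SO value from the @HD header line, or None if it is absent."""
    # BGZF blocks are valid gzip members, so gzip.open can decompress them.
    with gzip.open(bam_path, "rb") as handle:
        if handle.read(4) != b"BAM\x01":
            raise ValueError(f"{bam_path} does not look like a BAM file")
        # l_text: length of the plain-text SAM header, little-endian.
        (text_length,) = struct.unpack("<I", handle.read(4))
        header = handle.read(text_length).decode("utf-8", "replace")
    for line in header.rstrip("\x00").splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def has_index(bam_path):
    """Check for an index file next to the BAM under common naming schemes."""
    bam = Path(bam_path)
    candidates = [
        Path(str(bam) + ".bai"),  # sample.bam.bai
        bam.with_suffix(".bai"),  # sample.bai
        Path(str(bam) + ".csi"),  # sample.bam.csi
    ]
    return any(path.is_file() for path in candidates)


def ready_for_ngs_statter(bam_path):
    """True if the header declares coordinate sorting and an index file exists."""
    return bam_sort_order(bam_path) == "coordinate" and has_index(bam_path)
```

If the check fails, `samtools sort` followed by `samtools index` produces a sorted, indexed BAM file.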
Usage
ngs-statter bam --bam path/to/alignment.bam --out-json path/to/output.json --min-q 0
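When the command is run from a pipeline rather than typed by hand, it can help to build the argument list programmatically. The sketch below is a minimal wrapper that assumes `ngs-statter` is on the `PATH`; the function names `build_command` and `run_ngs_statter` are hypothetical, and only the subcommand and the three options documented above are used.

```python
import subprocess


def build_command(subcommand, bam, out_json, min_q=0):
    """Build the ngs-statter argument list from the documented options."""
    return [
        "ngs-statter", subcommand,
        "--bam", str(bam),
        "--out-json", str(out_json),
        "--min-q", str(min_q),
    ]


def run_ngs_statter(subcommand, bam, out_json, min_q=0):
    """Run the tool and raise CalledProcessError on a non-zero exit code."""
    # Assumes the ngs-statter executable is installed and on PATH.
    subprocess.run(build_command(subcommand, bam, out_json, min_q), check=True)


# Print the command that would be run, without executing it.
print(" ".join(build_command("bam", "path/to/alignment.bam", "path/to/output.json")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.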
STAR
Compute alignment statistics for an alignment file generated by the STAR aligner and write them as JSON. The output contains the following fields:
| field | description |
|---|---|
| Reads for mapping | total number of reads in the BAM file |
| Mapped: Total | total number of reads mapped to the reference genome |
| Mapped: Uniquely mapped reads | total number of reads mapped to a unique location in the reference genome |
| Mapped: Multimapped reads | total number of reads mapped to multiple locations in the reference genome |
| Mapped: PCR duplicate reads | total number of mapped reads marked as PCR duplicates in the BAM file, if duplicates have been marked in the BAM file (see samtools markdup) |
| Mapped: Unique reads | total number of reads mapped to a unique location in the reference genome and not marked as PCR duplicates, if duplicates have been marked in the BAM file (see samtools markdup) |
| Unmapped: Total | total number of reads that are not mapped to the reference genome |
| Unmapped: mapped to too many loci | total number of reads that are marked as unmapped because they map to too many locations in the reference genome |
| Unmapped: no seed/windows | total number of reads that are marked as unmapped because they do not have a seed region that can be mapped to the reference genome |
| Unmapped: too many mismatches | total number of reads that are marked as unmapped because they have too many mismatches compared to the reference genome |
| Unmapped: too short | total number of reads that are marked as unmapped because the seed regions are too short to be mapped to the reference genome |
| Unmapped: paired-end mate | for paired-end reads, total number of reads that are marked as unmapped while their paired-end mate is mapped to the reference genome |
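The field names above share `Mapped:` and `Unmapped:` prefixes, so the output is easy to regroup for reporting. The sketch below assumes the JSON is a flat object keyed by these field names with integer values, which is an assumption about the file layout. The function name `group_star_fields` and the example counts are hypothetical.

```python
def group_star_fields(stats):
    """Nest 'Category: name' keys under their category; keep other keys as-is."""
    grouped = {}
    for key, value in stats.items():
        category, separator, name = key.partition(": ")
        if separator:
            grouped.setdefault(category, {})[name] = value
        else:
            # Keys without a prefix, such as "Reads for mapping", stay top-level.
            grouped[key] = value
    return grouped


# Illustrative values only; this is not real tool output.
example = {
    "Reads for mapping": 1000,
    "Mapped: Total": 900,
    "Mapped: Uniquely mapped reads": 800,
    "Unmapped: Total": 100,
    "Unmapped: too short": 60,
}
grouped = group_star_fields(example)
print(grouped["Unmapped"])
```

Grouping this way lets a report iterate over the unmapped reasons without hard-coding each field name.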
Options
| option | description | required | default value |
|---|---|---|---|
| --bam | Path to the input bam file (MUST be co-ordinate sorted and indexed) | ✓ | |
| --out-json | Output file to write json formatted data | ✓ | |
| --min-q | Minimum alignment quality | ✗ | 0 |
Usage
ngs-statter STAR --bam path/to/star_alignment.bam --out-json path/to/output.json --min-q 0