Notes on learning ANGSD
According to its authors, the biggest advantage of ANGSD is that “Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes. This is especially useful for low and medium depth data”. I forked the software to my own repo.
Install from GitHub
You also need to install htslib, which ANGSD uses to read BAM and CRAM files. I have no idea what CRAM is, but I believe it might be a faster or more space-efficient format (like SAM?). I have to say that installation has become so easy with GitHub!
git clone https://github.com/samtools/htslib.git
git clone https://github.com/ANGSD/angsd.git
cd angsd
make
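Before cloning and building, it can help to confirm that the basic tools are on the PATH. The following is only a minimal sketch: `require_tools` is a hypothetical helper of my own, not something shipped with ANGSD or htslib, and which tools to check is an assumption about a typical build environment.

```shell
# require_tools is a hypothetical helper, not part of ANGSD or htslib.
# It reports every named command that is not on PATH
# and returns non-zero if at least one is missing.
require_tools() {
    local status=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `require_tools git make g++ && make` would stop before the build if any of the three is missing. Another C++ compiler would do just as well as `g++`.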
Preparation for BAM input
All of the following code follows ANGSD’s tutorial.
### download data
wget http://popgen.dk/software/download/angsd/bams.tar.gz
tar xf bams.tar.gz
### indexing them
for i in bams/*.bam; do samtools index "$i"; done
### create a list
ls bams/*.bam > bam.filelist
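Since every later command reads `bam.filelist`, a quick sanity check of that list can save a failed run. This is a sketch under assumptions: `check_bam_list` is a hypothetical helper (not part of ANGSD or samtools), and it only looks for `.bai` indexes next to each BAM, which is what `samtools index` writes by default.

```shell
# check_bam_list is a hypothetical helper, not part of ANGSD or samtools.
# It reads a file with one BAM path per line and returns non-zero if any
# BAM is missing/empty or has no .bai index next to it.
check_bam_list() {
    local list="$1" problems=0 bam
    while IFS= read -r bam || [ -n "$bam" ]; do
        [ -z "$bam" ] && continue                 # skip blank lines
        if [ ! -s "$bam" ]; then
            echo "missing or empty BAM: $bam" >&2
            problems=$((problems + 1))
        elif [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
            # assumption: index is named x.bam.bai or x.bai
            echo "no index for: $bam" >&2
            problems=$((problems + 1))
        fi
    done < "$list"
    [ "$problems" -eq 0 ]
}
```

Running `check_bam_list bam.filelist` before calling `angsd` prints any problem paths to stderr and returns a non-zero status if something is off.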
SNP and genotype calling
SNPs are called based on their allele frequencies with -doMaf. Basically, ANGSD calls a SNP if a site has a minor allele frequency significantly different from 0. (Note: what about really rare alleles?)
### MAF for every basepair
angsd -bam bam.filelist -doMajorMinor 2 -doMaf 8 -doCounts 1 -out out
### SNP calling
angsd -bam bam.filelist -GL 1 -out outfile -doMaf 2 -SNP_pval 1e-6 -doMajorMinor 1
### Genotype Likelihoods
angsd -bam bam.filelist -GL 1 -doGlf 2 -doMajorMinor 1 -doMaf 2 -SNP_pval 2e-6 -out genolike -nThreads 10
### Genotype calling in one step
angsd -bam bam.filelist -GL 2 -out gatk_outfile -doMaf 2 -doMajorMinor 1 -SNP_pval 1e-6 -doGeno 5 -doPost 1 -postCutoff 0.95
With -doPlink 1, ANGSD will output PLINK format.
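To get a feel for how rare the called alleles are, the frequency table from the SNP-calling run can be filtered by estimated frequency. This is only a sketch under assumptions: I assume the run writes a gzipped, tab-separated table with a header row (something like `outfile.mafs.gz`), and that the frequency column is called `unknownEM` for `-doMaf 2`. Check the header of your own file first. `filter_by_freq` is a hypothetical helper, not an ANGSD command.

```shell
# filter_by_freq FILE COLUMN MIN_FREQ
# Hypothetical helper, not part of ANGSD.
# Prints the header plus every row whose COLUMN value is >= MIN_FREQ.
# Assumes FILE is gzipped, tab-separated, and has a header row.
filter_by_freq() {
    gzip -dc "$1" | awk -v col="$2" -v min="$3" '
        BEGIN { FS = OFS = "\t" }
        NR == 1 {
            # locate the requested column by its header name
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            print
            next
        }
        $idx + 0 >= min + 0 { print }
    '
}
```

A call such as `filter_by_freq outfile.mafs.gz unknownEM 0.05` would keep only sites with an estimated frequency of at least 5%. The column name is passed in rather than hard-coded because it depends on the `-doMaf` mode.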