10 December 2014

Bedtools contains several handy utilities for genomic data analysis. It is especially useful for feature intersection analysis with the intersect function. I like bedtools because of the simple BED format and the speed of its algorithms. Below are some of my personal usage examples for your reference.

## BED and other formats

BED format

  1. BED3: a BED file in which each feature is described by chrom, start, and end.
For example: chr1 110 120  
Note: start is zero-based, so the first base of a chromosome is numbered 0, while end is one-based. In other words, the interval is half-open: it includes start but excludes end. For example, a SNP at position 10 (counting from 1) should be coded as start=9, end=10.
  1. BED4: chrom, start, end, and name.
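
The coordinate rule above is easy to get wrong by one, so here is a minimal Python sketch of the conversion. The function name `snp_to_bed` and the SNP name `rs_example` are made up for illustration; the only thing taken from the note above is the rule start = position - 1, end = position.

```python
def snp_to_bed(chrom, pos_1based, name=None):
    """Convert a 1-based SNP position into a tab-separated BED3 or BED4 line.

    Hypothetical helper for illustration; it is not part of bedtools.
    """
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    start = pos_1based - 1  # BED start is zero-based
    end = pos_1based        # BED end is one-based (exclusive in 0-based terms)
    fields = [chrom, str(start), str(end)]
    if name is not None:
        fields.append(name)  # the fourth column turns BED3 into BED4
    return "\t".join(fields)


print(snp_to_bed("chr1", 10))                # BED3: chr1, 9, 10
print(snp_to_bed("chr1", 10, "rs_example"))  # BED4: chr1, 9, 10, rs_example
```

A single-base feature always has end - start = 1, which is a quick sanity check on a converted SNP file.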

Other genomic data formats are described on the UCSC Genome Browser website.

## Find SNPs in windows, report SNPs and the windows they belong to

bedtools intersect -a window.bed -b SNP.bed -wb

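For more details, see the documentation for intersect.
* -a: windows in BED format
* -b: SNPs in BED format
* -wb: write the original entry in B for each overlap. Without -wa, the A columns show only the overlapping portion of each window; add -wa to report the full window entry.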
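
To make the output layout concrete, here is a small pure-Python sketch that mimics what the command reports on toy data. It is only an illustration under my own assumptions: the function names, the `write_a` switch (standing in for -wa), and the example windows and SNPs are all invented, and the naive nested loop is not how bedtools finds overlaps.

```python
def overlaps(a, b):
    """True if two BED features (chrom, start, end, ...) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect_wb(a_features, b_features, write_a=False):
    """Naive emulation of the column layout of `intersect -wb` (hypothetical helper).

    write_a=False: the A columns hold the overlapping portion of A.
    write_a=True:  the A columns hold the original A entry (like adding -wa).
    """
    rows = []
    for a in a_features:
        for b in b_features:
            if not overlaps(a, b):
                continue
            if write_a:
                a_cols = tuple(a)
            else:
                # Trim A's coordinates to the overlap, keep A's extra columns
                a_cols = (a[0], max(a[1], b[1]), min(a[2], b[2])) + tuple(a[3:])
            rows.append(a_cols + tuple(b))
    return rows


# Toy data, invented for this example (BED4: chrom, start, end, name)
windows = [("chr1", 0, 100, "win1"), ("chr1", 100, 200, "win2")]
snps = [("chr1", 9, 10, "snpA"), ("chr1", 150, 151, "snpB"), ("chr2", 5, 6, "snpC")]

for row in intersect_wb(windows, snps):
    print("\t".join(map(str, row)))
```

With these toy inputs, each reported line carries the window name in column 4 and the SNP entry in the last four columns; snpC is dropped because no window lies on chr2.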
