Chip seq 분석을 마친 후 narrowPeak 파일을 이용하여 Peak Profiling 을 진행하고 figure를 제작할 수 있습니다.
사용한 library는 ChIPseeker와 ggplot2 입니다.
narrowPeak 파일 format은 다음과 같습니다.
- chrom - Name of the chromosome (or contig, scaffold, etc.).
- chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
- chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature. For example, the first 100 bases of a chromosome are defined aschromStart=0, chromEnd=100, and span the bases numbered 0-99.
- name - Name given to a region (preferably unique). Use '.' if no name is assigned.
- score - Indicates how dark the peak will be displayed in the browser (0-1000). If all scores were '0' when the data were submitted to the DCC, the DCC assigned scores 1-1000 based on signal value. Ideally the average signalValue per base spread is between 100-1000.
- strand - +/- to denote strand or orientation (whenever applicable). Use '.' if no orientation is assigned.
- signalValue - Measurement of overall (usually, average) enrichment for the region.
- pValue - Measurement of statistical significance (-log10). Use -1 if no pValue is assigned.
- qValue - Measurement of statistical significance using false discovery rate (-log10). Use -1 if no qValue is assigned.
- peak - Point-source called for this peak; 0-based offset from chromStart. Use -1 if no point-source called.
1~6 column은 Bedfile format과 동일 합니다.
track type=narrowPeak visibility=3 db=hg19 name="nPk" description="ENCODE narrowPeak Example"
browser position chr1:9356000-9365000
chr1 9356548 9356648 . 0 . 182 5.0945 -1 50
chr1 9358722 9358822 . 0 . 91 4.6052 -1 40
chr1 9361082 9361182 . 0 . 182 9.2103 -1 75
아래 코드들은 R 에서 작성되었습니다.
Usage1
library(ChIPseeker)
library(ggplot2)
peak <- readPeakFile("narrowPeak파일")
peak
## GRanges object with 1331 ranges and 2 metadata columns:
## seqnames ranges strand | V4 V5
## <Rle> <IRanges> <Rle> | <factor> <numeric>
## [1] chr1 815093-817883 * | MACS_peak_1 295.76
## [2] chr1 1243288-1244338 * | MACS_peak_2 63.19
## [3] chr1 2979977-2981228 * | MACS_peak_3 100.16
## [4] chr1 3566182-3567876 * | MACS_peak_4 558.89
## [5] chr1 3816546-3818111 * | MACS_peak_5 57.57
## ... ... ... ... . ... ...
## [1327] chrX 135244783-135245821 * | MACS_peak_1327 55.54
## [1328] chrX 139171964-139173506 * | MACS_peak_1328 270.19
## [1329] chrX 139583954-139586126 * | MACS_peak_1329 918.73
## [1330] chrX 139592002-139593238 * | MACS_peak_1330 210.88
## [1331] chrY 13845134-13845777 * | MACS_peak_1331 58.39
## -------
## seqinfo: 24 sequences from an unspecified genome; no seqlengths
출처 : https://genome.ucsc.edu/FAQ/FAQformat.html#format12
Usage2
peak <- readPeakFile("narrowPeak파일")
covplot("narrowPeak파일")
♣각각의 chr에서의 peak을 profile 할 수 있습니다.
Usage3
peak = GenomicRanges::GRangesList(샘플명=readPeakFile("narrowPeak파일"), 샘플명=readPeakFile("narrowPeak파일")
covplot(peak, weightCol="V5")
♣Control과 Test의 peak의 차이를 한번에 확인할 수 있습니다.
Usage4
peak = GenomicRanges::GRangesList(샘플명=readPeakFile("narrowPeak파일"), 샘플명=readPeakFile("narrowPeak파일")
covplot(peak, weightCol="V5", chrs = c("chr선택","chr선택"), xlim=c(x축 start position, X축 end position))
♣ 원하는 각 chromosome의 size를 제한 하여 peak을 볼 수 있습니다.
Additional Options
xlab -> x축 title
ylab -> y축 title
title -> figure title
----------------------------------
### + facet_grid(chr ~.id) -> 범례를 설정합니다. 또한 y축의 scale이 setting 됩니다.