An Introduction To Exomepeak: Jia Meng, PHD Modified: 18 August, 2013. Compiled: June 24, 2014
1 Introduction
The exomePeak R package is based on the MATLAB “exomePeak” package and
analyzes RNA epitranscriptome sequencing data produced by affinity-based
shotgun sequencing approaches such as MeRIP-Seq or m6A-Seq. The package is
under active development; please don’t hesitate to contact me @
jia.meng@hotmail if you have any questions. The inputs of the main function
“exomepeak” are the IP BAM files and the input control BAM files:
  - From one experimental condition: peak calling to identify the RNA
    methylation sites.
  - From two experimental conditions: peak calling and differential
    analysis to unveil the post-transcriptional regulation of RNA
    modifications.
Gene annotation can be provided as a GTF file or a TranscriptDb object, or it
can be downloaded automatically from UCSC over the internet. Let us first load
the package and get the toy data (which comes with the package) ready.
> library("exomePeak")
> gtf=system.file("extdata", "example.gtf", package="exomePeak")
> f1=system.file("extdata", "IP1.bam", package="exomePeak")
> f2=system.file("extdata", "IP2.bam", package="exomePeak")
> f3=system.file("extdata", "IP3.bam", package="exomePeak")
> f4=system.file("extdata", "IP4.bam", package="exomePeak")
> f5=system.file("extdata", "Input1.bam", package="exomePeak")
> f6=system.file("extdata", "Input2.bam", package="exomePeak")
> f7=system.file("extdata", "Input3.bam", package="exomePeak")
> f8=system.file("extdata", "treated_IP1.bam", package="exomePeak")
> f9=system.file("extdata", "treated_Input1.bam", package="exomePeak")
Next, we will see how each of the two main functions can be run with a
single command.
The first main function of the exomePeak R package is peak calling: detecting
RNA methylation sites on the exome as enriched binding sites. The inputs are
the gene annotation GTF file and the IP and Input control samples in BAM
format. This mode is used when data from only one condition are available.
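The peak calling step described above can be sketched as a single call. This is a minimal sketch rather than the package's documented example: the helper name `run_peak_calling` is hypothetical, and the argument names `GENE_ANNO_GTF`, `IP_BAM`, `INPUT_BAM` and `OUTPUT_DIR` are given as recalled from the package manual and should be confirmed with `?exomepeak`.

```r
# Hypothetical helper: single-condition peak calling with exomepeak().
# Argument names are assumptions to verify against ?exomepeak.
run_peak_calling <- function(gtf, ip_bams, input_bams, out_dir = tempdir()) {
  # Fail early if any input file is missing.
  files <- c(gtf, ip_bams, input_bams)
  missing_files <- files[!file.exists(files)]
  if (length(missing_files) > 0) {
    stop("file(s) not found: ", paste(missing_files, collapse = ", "))
  }
  if (!requireNamespace("exomePeak", quietly = TRUE)) {
    stop("the exomePeak package is not installed")
  }
  # Attach the package so that its dependencies are on the search path.
  library("exomePeak")
  exomepeak(
    GENE_ANNO_GTF = gtf,
    IP_BAM        = ip_bams,
    INPUT_BAM     = input_bams,
    OUTPUT_DIR    = out_dir
  )
}

# With the toy files loaded earlier (f1-f4 are IP samples, f5-f7 are Inputs):
# result <- run_peak_calling(gtf, c(f1, f2, f3, f4), c(f5, f6, f7))
# names(result)
```

Writing to a temporary directory by default keeps the BED and table output out of the working directory; pass `out_dir` explicitly to keep the files.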
> names(result)
The results are saved in the specified output directory, including the
identified (consistent) peaks in BED and table formats. The BED file can be
visualized in a genome browser directly; a peak may span one or more
introns. The function also returns two GRangesList objects, which hold the
called peaks and the consistent peaks, respectively.
The consistent peaks appear in all the IP replicates when compared with the
merged Input control sample, and are thus recommended. The log p-value, log
FDR and fold enrichment of the identified peaks are stored as metadata, which
can be extracted with the mcols command.
       lg.p    lg.fdr fold_enrchment
1     -47.8     -46.5           8.05
2     -15.1     -14.0           9.55
3     -15.0     -13.9           3.78
4    -221.0    -219.0          15.50
5     -14.6     -13.6           5.81
6    -163.0    -161.0          17.50
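Once the metadata have been extracted, they can be filtered and ranked like any other table. The sketch below re-enters the six rows printed above as a plain data frame so that it runs without the package. The column names are taken from the differential-analysis table later in this document, and the reading of the log columns as base-10 logarithms is an assumption; the thresholds are arbitrary illustrations.

```r
# Values copied from the printed peak metadata; column names are assumed
# to match the later table (lg.p, lg.fdr, fold_enrchment).
peaks_info <- data.frame(
  lg.p           = c(-47.8, -15.1, -15.0, -221.0, -14.6, -163.0),
  lg.fdr         = c(-46.5, -14.0, -13.9, -219.0, -13.6, -161.0),
  fold_enrchment = c(8.05, 9.55, 3.78, 15.50, 5.81, 17.50)
)

# Keep peaks with lg.fdr below -20 (FDR < 1e-20 if the logs are base 10)
# and at least 8-fold enrichment. Both cutoffs are illustrative only.
strong <- subset(peaks_info, lg.fdr < -20 & fold_enrchment >= 8)

# Rank the surviving peaks from most to least significant.
ranked <- strong[order(strong$lg.fdr), ]
print(ranked)
```

On a real result the same filter can be applied to the object returned by mcols after converting it with `as.data.frame`.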
Alternatively, all the detected peaks can be retrieved; some of them do not
appear consistently in all replicates.
When MeRIP-Seq data are available from two experimental conditions, the
“exomepeak” function can unveil the dynamics in post-transcriptional
regulation of the RNA methylome. In the following example, the function
reports the sites that are differentially methylated at the
post-transcriptional level between the two tested conditions (TREATED vs.
UNTREATED).
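A two-condition run can be sketched in the same way as single-condition peak calling, with the treated samples passed in addition to the untreated ones. As before, this is a hedged sketch: `run_differential` is a hypothetical helper, and the argument names `TREATED_IP_BAM` and `TREATED_INPUT_BAM` (like the others) are given as recalled from the package manual and should be checked with `?exomepeak`.

```r
# Hypothetical helper: peak calling plus differential analysis.
# Argument names are assumptions to verify against ?exomepeak.
run_differential <- function(gtf, ip_bams, input_bams,
                             treated_ip_bams, treated_input_bams,
                             out_dir = tempdir()) {
  # Fail early if any input file is missing.
  files <- c(gtf, ip_bams, input_bams, treated_ip_bams, treated_input_bams)
  missing_files <- files[!file.exists(files)]
  if (length(missing_files) > 0) {
    stop("file(s) not found: ", paste(missing_files, collapse = ", "))
  }
  if (!requireNamespace("exomePeak", quietly = TRUE)) {
    stop("the exomePeak package is not installed")
  }
  # Attach the package so that its dependencies are on the search path.
  library("exomePeak")
  exomepeak(
    GENE_ANNO_GTF     = gtf,
    IP_BAM            = ip_bams,
    INPUT_BAM         = input_bams,
    TREATED_IP_BAM    = treated_ip_bams,
    TREATED_INPUT_BAM = treated_input_bams,
    OUTPUT_DIR        = out_dir
  )
}

# With the toy files loaded earlier (f8 and f9 are the treated samples):
# result <- run_differential(gtf, c(f1, f2, f3, f4), c(f5, f6, f7), f8, f9)
```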
[1] "1 TREATED IP replicate(s)"
[1] "1 TREATED Input replicate(s)"
[1] "---------------------------------"
[1] "Peak calling and differential analysis result: "
[1] "13 peaks detected."
[1] "Please check 'diff_peak.bed/xls' under /tmp/Rtmp3hNLoZ/Rbuildd643f522fcd/exomePeak/vign
[1] "---------------------------------"
[1] "0 significantly differential methylated peaks are detected."
[1] "Please check 'sig_diff_peak.bed/xls' under /tmp/Rtmp3hNLoZ/Rbuildd643f522fcd/exomePeak/
[1] "---------------------------------"
[1] "0 consistent significantly differential methylated peaks are detected.(Recommended list
[1] "Please check 'con_sig_diff_peak.bed/xls' under /tmp/Rtmp3hNLoZ/Rbuildd643f522fcd/exomeP
[1] "---------------------------------"
The algorithm first identifies read-enriched binding sites (peaks) and then
checks whether the sites are differentially methylated between the two
experimental conditions. The results are saved in the specified output
directory, including the identified (consistent) peaks in BED and table
formats, along with differential information indicating whether each site is
hyper- or hypo-methylated under the treated condition. As in the peak calling
case, the BED file can be visualized in a genome browser directly, and a peak
may span one or more introns.
As in the peak calling case, the function reports a set of consistent
differentially methylated peaks, saved in the specified folder, which is the
recommended set. The function also returns three GRangesList objects,
containing all the peaks, the peaks that are differentially methylated at the
given threshold on the merged data, and the consistently differentially
methylated peaks. The last set contains the peaks that are differential in
all the replicates and is thus recommended. The information on the identified
peaks and the differential analysis is stored as metadata, which can be
extracted with the mcols command.
> names(result)
[1] TRUE
DataFrame with 6 rows and 3 columns
       lg.p    lg.fdr fold_enrchment
  <numeric> <numeric>      <numeric>
1     -5.91     -5.05           2.47
2    -50.70    -49.50           9.19
3    -24.70    -23.50           5.00
4   -222.00   -220.00          14.40
5    -15.40    -14.40           5.97
6   -171.00   -170.00          14.70
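Because the toy run reports 0 significantly differential peaks, it can help to check how many peaks each returned element holds before extracting metadata from it. The helper below is a sketch with a hypothetical name, `count_peaks`. It assumes that an empty peak set is returned as a single NA, which is an assumption about the package's behaviour rather than something this document states, and that `length()` on a GRangesList gives the number of peaks.

```r
# Hypothetical helper: number of peaks in each element of an exomepeak()
# result. Assumes an empty set is represented by a single NA.
count_peaks <- function(result) {
  vapply(result, function(x) {
    if (is.atomic(x) && length(x) == 1L && is.na(x)) 0L else length(x)
  }, integer(1))
}

# Mock result with made-up element names, so the sketch runs without the
# package: one element with three entries and one empty (NA) element.
mock <- list(a = list(1, 2, 3), b = NA)
counts <- count_peaks(mock)
print(counts)
```

On a real result, `count_peaks(result)` would show which of the returned sets are safe to pass to mcols.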