PEPPRO is a pipeline designed to process PRO-seq data. It is optimized for the unique features of PRO-seq to be fast and accurate. It performs adapter removal (including removal of UMIs of variable length), read deduplication, trimming, and mapping, and it produces signal tracks (bigWig) for the plus and minus strands using either scaled (based on mappability information) or unscaled read counts.
PEPPRO produces quality control plots, summary statistics, and several data formats to set the stage for project-specific analysis.
- PEPPRO produces an easily navigable HTML report when used with Looper: View this HTML Summary report demo
- We have produced an interactive display of the output folder structure, which includes:
  - Easily parsable summary statistics file
  - BigWig signal tracks (plus and minus stranded):
    - nucleotide-resolution, exact RNA polymerase position signal
    - smoothed signal
    - nucleotide-resolution signal corrected for enzymatic sequence bias
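
To illustrate the difference between the nucleotide-resolution and smoothed tracks listed above, here is a minimal sketch in plain Python. It is not PEPPRO's implementation: the function names `polymerase_signal` and `smooth`, the tuple layout of the input, and the moving-average smoothing are all assumptions made for illustration. It also assumes each fragment is already expressed in the orientation of the nascent RNA, so that the RNA 3' end marks the polymerase position.

```python
from collections import Counter


def polymerase_signal(fragments):
    """Count RNA 3' ends per (chrom, position), separately for each strand.

    `fragments` is an iterable of (chrom, start, end, strand) tuples in
    0-based, half-open coordinates. This layout is hypothetical and is
    assumed to describe the aligned nascent RNA, not the raw read.
    """
    signal = {"+": Counter(), "-": Counter()}
    for chrom, start, end, strand in fragments:
        # The 3' end is the last base of a plus-strand fragment and the
        # first base of a minus-strand fragment.
        pos = end - 1 if strand == "+" else start
        signal[strand][(chrom, pos)] += 1
    return signal


def smooth(track, chrom, start, end, window=3):
    """Centered moving average of one strand's track over [start, end)."""
    half = window // 2
    values = []
    for pos in range(start, end):
        total = sum(track[(chrom, p)] for p in range(pos - half, pos + half + 1))
        values.append(total / window)
    return values


# Toy input (made up for illustration only).
fragments = [
    ("chr1", 100, 130, "+"),
    ("chr1", 105, 130, "+"),
    ("chr1", 200, 240, "-"),
]
signal = polymerase_signal(fragments)
plus_smoothed = smooth(signal["+"], "chr1", 128, 131, window=3)
```

Keeping the two strands in separate counters mirrors the separate plus and minus tracks; a real pipeline would write each one to its own bigWig file rather than hold it in memory.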
PEPPRO is a Python script that runs on the command line (see usage). It can also read projects in PEP format. This means that PEPPRO projects are also compatible with other PEP tools, and output can be conveniently read into R using the pepr package or into Python using the peppy package. The pipeline itself is customizable, enabling a user to adjust individual command settings or even swap out specific software by editing a few lines of human-readable configuration files.