1: Clone the pipeline
Clone the pipeline:
git clone https://github.com/databio/peppro.git
2: Install required software
PEPPRO uses several packages under the hood. Make sure you're up-to-date with a user-specific install:
cd peppro
pip install --user -r requirements.txt
PEPPRO uses R to produce QC plots, and we include an R package for these functions. From the
Rscript -e 'install.packages("PEPPROr", repos=NULL, type="source")'
PEPPRO can mix and match tools for adapter removal, read trimming, deduplication, and reverse complementation. fqdedup, in particular, is useful if you wish to minimize memory use at the expense of speed. We suggest using the default tools because fastx toolkit has not been supported since 2012.
seqOutBias can be used to take into account the mappability at a given read length to filter the sample signal.
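Before choosing among the optional tools, it can help to see which of them are already on your PATH. The following is a minimal sketch and not part of PEPPRO: the helper name report_tools is hypothetical, and it assumes the executables are named exactly as the tools are named above.

```shell
# report_tools: hypothetical helper, not part of PEPPRO.
# Prints one line per executable name: "<name><TAB>found" or "<name><TAB>missing".
report_tools() {
  local tool
  for tool in "$@"; do
    # command -v succeeds only if the name resolves to something runnable.
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\tfound\n' "$tool"
    else
      printf '%s\tmissing\n' "$tool"
    fi
  done
}

# Assumption: the executables share the names of the tools mentioned above.
report_tools fqdedup seqOutBias
```

A tool reported as missing is only a problem if you intend to select it in place of the defaults.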
The pipeline relies on refgenie assemblies for alignment. First, initialize a folder for genome indexes and the refgenie config file.
export REFGENIE=your_genome_folder/genome_config.yaml
refgenie init -c $REFGENIE
Then, just pull the assets you need.
refgenie pull -g hg38 -a bowtie2
refgenie pull -g rCRSd -a bowtie2
refgenie pull -g human_repeats -a bowtie2
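If you pull the same asset for several genomes, the repeated commands can be wrapped in a loop. This is a sketch under assumptions: pull_bowtie2_assets and the DRY_RUN variable are hypothetical names, and the refgenie flags are copied unchanged from the pull commands above, so they may differ in other refgenie versions.

```shell
# pull_bowtie2_assets: hypothetical wrapper, not part of refgenie or PEPPRO.
# Pulls the bowtie2 asset for each genome name given as an argument.
# With DRY_RUN=1 it only prints the commands it would run.
pull_bowtie2_assets() {
  local genome
  for genome in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "refgenie pull -g $genome -a bowtie2"
    else
      # Stop at the first failed pull so the error is not buried.
      refgenie pull -g "$genome" -a bowtie2 || return 1
    fi
  done
}

# Example: preview the three pulls from this step without running them.
# DRY_RUN=1 pull_bowtie2_assets hg38 rCRSd human_repeats
```

Previewing with DRY_RUN=1 first makes it easy to confirm the genome names before starting large downloads.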
Add the export of REFGENIE to your .bashrc or .profile to ensure it persists. Alternatively, you can skip the REFGENIE variable and simply change the value of the resources.genome_config option in the pipeline_config.yaml file to point to the folder where you stored the assemblies.
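One way to make the REFGENIE variable persist is to append the export line to your shell startup file exactly once. The sketch below is an illustration only: persist_refgenie is a hypothetical helper, the startup file defaults to ~/.bashrc as an assumption, and the config path you pass should be absolute so it resolves from any working directory.

```shell
# persist_refgenie: hypothetical helper, not part of PEPPRO or refgenie.
# Usage: persist_refgenie <config_path> [startup_file]
# Appends "export REFGENIE=<config_path>" to the startup file
# (default: ~/.bashrc) unless that exact line is already present.
persist_refgenie() {
  local config_path="$1"
  local rc_file="${2:-$HOME/.bashrc}"
  local line="export REFGENIE=$config_path"
  # Create the startup file if it does not exist yet.
  touch "$rc_file"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  if ! grep -qxF "$line" "$rc_file"; then
    printf '%s\n' "$line" >> "$rc_file"
  fi
}

# Example (replace with the absolute path to your own config file):
# persist_refgenie /path/to/your_genome_folder/genome_config.yaml
```

Because the helper checks for the line first, running it repeatedly does not add duplicate entries.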
4: Run the pipeline script directly
At its core, the pipeline is just a Python script that you can run on the command line for a single sample (see command-line usage). You can also print the usage on the command line by running pipelines/peppro.py --help. You just need to pass a few command-line parameters to specify the sample name, reference genome, input files, etc. Here's the basic command to run the included small test example through the pipeline:
./pipelines/peppro.py \
  --sample-name test \
  --genome hg38 \
  --input examples/data/test_r1.fq.gz \
  --single-or-paired single \
  -O $HOME/peppro_example/
5: Next steps
This is just the beginning. For your next step, take a look at one of these user guides: