Installing Iso-Seq3 with conda
conda create -n anaCogent5.2 python=2.7 anaconda
source activate anaCogent5.2
conda install -n anaCogent5.2 biopython
conda install -n anaCogent5.2 -c http://conda.anaconda.org/cgat bx-python
conda install -n anaCogent5.2 -c bioconda isoseq3
conda install -n anaCogent5.2 -c bioconda pbccs
conda install -n anaCogent5.2 -c bioconda lima
# The packages below are optional:
conda install -n anaCogent5.2 -c bioconda pbcoretools  # for manipulating PacBio datasets
conda install -n anaCogent5.2 -c bioconda bamtools     # for converting BAM to fasta
conda install -n anaCogent5.2 -c bioconda pysam        # for making CSV reports
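Once the installs finish, it can be worth confirming that the command-line tools are visible on the PATH of the activated environment. The following is a minimal sketch of such a check. `check_tools` is a hypothetical helper name, not something shipped with isoseq3, pbccs, or lima, and it relies only on the POSIX `command -v` builtin.

```shell
# check_tools: hypothetical helper, not part of isoseq3, pbccs, or lima.
# Prints "missing: NAME" for every tool that is not found on PATH and
# returns 1 if at least one tool is missing, 0 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, to be run inside the activated environment:
#   check_tools ccs lima isoseq3
```

A silent return with exit status 0 means every listed tool was found. Any "missing:" line points at a package whose install step should be repeated.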
Running IsoSeq
Typical workflow:
1. Generate consensus sequences from your raw subread data
$ ccs movie.subreads.bam movie.ccs.bam --noPolish --minPasses 1
2. Generate full-length reads by primer removal and demultiplexing
$ cat primers.fasta
>primer_5p
AAGCAGTGGTATCAACGCAGAGTACATGGGG
>primer_3p
AAGCAGTGGTATCAACGCAGAGTAC
$ lima movie.ccs.bam primers.fasta movie.fl.bam --isoseq --no-pbi
3. Remove noise from full-length (FL) reads
$ isoseq3 refine movie.fl.P5--P3.bam primers.fasta movie.flnc.bam
4. Cluster consensus sequences to generate unpolished transcripts
$ isoseq3 cluster movie.flnc.bam unpolished.bam --verbose
5. Optionally, polish transcripts using subreads
$ isoseq3 polish unpolished.bam movie.subreads.bam polished.bam
6. Map unpolished or polished transcripts to the genome and collapse transcripts based on genomic mapping
$ pbmm2 align unpolished.bam reference.fasta aligned.sorted.bam --preset ISOSEQ --sort
$ isoseq3 collapse aligned.sorted.bam out.gff
or
$ isoseq3 collapse aligned.sorted.bam movie.ccs.bam out.gff
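The steps above can be chained in a small wrapper so that the whole sequence can be reviewed before anything runs. The sketch below is only an illustration: `run` and `isoseq_pipeline` are hypothetical names, the optional polish step is left out, and the commands are copied from the workflow as written. It assumes that lima's output file is called movie.fl.P5--P3.bam, as in step 3. If lima names its output after the primer records in primers.fasta, that file name needs adjusting. Note also that pbmm2 is not among the packages installed earlier and would have to be installed separately.

```shell
# Hypothetical dry-run wrapper around the workflow steps.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# isoseq_pipeline MOVIE PRIMERS REFERENCE
# MOVIE is the file prefix, e.g. "movie" for movie.subreads.bam.
isoseq_pipeline() {
    movie=$1
    primers=$2
    ref=$3
    run ccs "$movie.subreads.bam" "$movie.ccs.bam" --noPolish --minPasses 1 || return 1
    run lima "$movie.ccs.bam" "$primers" "$movie.fl.bam" --isoseq --no-pbi || return 1
    # Assumption: the lima output name follows step 3 of the workflow.
    run isoseq3 refine "$movie.fl.P5--P3.bam" "$primers" "$movie.flnc.bam" || return 1
    run isoseq3 cluster "$movie.flnc.bam" unpolished.bam --verbose || return 1
    run pbmm2 align unpolished.bam "$ref" aligned.sorted.bam --preset ISOSEQ --sort || return 1
    run isoseq3 collapse aligned.sorted.bam out.gff || return 1
}

# Example call (prints the six commands without running them):
#   isoseq_pipeline movie primers.fasta reference.fasta
```

Setting DRY_RUN=0 before the call executes the commands, stopping at the first step that fails.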