wget -c ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_vertebrates.dat.gz
wget -c ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_trembl_vertebrates.dat.gz
zcat uniprot_sprot_vertebrates.dat.gz uniprot_trembl_vertebrates.dat.gz > uniprot_vertebrates.dat
awk '{if (/^ /) {gsub(/ /, ""); print} else if (/^AC/) print ">" $2}' uniprot_vertebrates.dat > uniprot_vertebrates.fasta
diamond makedb --in uniprot_vertebrates.fasta -d uniprot_vertebrates
diamond blastp -d uniprot_vertebrates.dmnd -q grass_carp.pep.fasta --evalue 1e-5 > blastp.outfmt6
python -m jcvi.formats.blast best -n 1 blastp.outfmt6
python add_annotation_from_dat.py blastp.outfmt6.best /data/database/UniProt-Plant/uniprot_plants.dat
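To see what the awk step in the commands above produces, it can be run on a tiny hand-written record first. The following is a minimal sketch: the record, its accession `P00001`, and the file names `toy.dat` and `toy.fasta` are made up for illustration and only mimic the layout of a UniProt flat file (an `AC` line, then sequence lines that start with spaces).

```shell
# Toy record imitating the UniProt .dat layout (hypothetical content).
cat > toy.dat <<'EOF'
ID   TEST_DANRE              Reviewed;          20 AA.
AC   P00001;
SQ   SEQUENCE   20 AA;
     MKTAYIAKQR QISFVKSHFS
//
EOF

# Same awk program as in the pipeline:
#   lines starting with a space are sequence lines -> strip spaces and print
#   lines starting with AC -> print ">" plus the second field as the header
awk '{if (/^ /) {gsub(/ /, ""); print} else if (/^AC/) print ">" $2}' toy.dat > toy.fasta

cat toy.fasta
```

On this toy input the header line comes out as `>P00001;`, because the second field of the `AC` line keeps its trailing semicolon. Whether that semicolon matters depends on how the downstream script parses the accession, so it is worth checking on real data.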
The add_annotation_from_dat.py script can be obtained from GitHub.
It writes swiss_annotation.tsv, which contains the following columns:
gene id
uniprot accession
identity
homologous species
EnsemblPlants
GO ID
GO aspect (CC, MF, or BP)
evidence
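Once the table exists, a quick summary helps confirm that the annotation looks sensible. The sketch below works on a toy file rather than the real output, and it rests on assumptions that should be checked against the actual swiss_annotation.tsv: that the file is tab-separated, has no header line, and stores the columns in the order listed above. The rows and the file names `toy_annotation.tsv` and `aspect_counts.txt` are made up for illustration.

```shell
# Toy stand-in for swiss_annotation.tsv (made-up rows, assumed column order:
# gene id, accession, identity, species, Ensembl, GO ID, GO aspect, evidence).
printf 'gene1\tP00001\t85.2\tDanio rerio\t-\tGO:0005634\tCC\tIEA\n' >  toy_annotation.tsv
printf 'gene1\tP00001\t85.2\tDanio rerio\t-\tGO:0003677\tMF\tIEA\n' >> toy_annotation.tsv
printf 'gene2\tP00002\t61.0\tHomo sapiens\t-\tGO:0006355\tBP\tIDA\n' >> toy_annotation.tsv

# Number of GO annotations per aspect (column 7).
cut -f7 toy_annotation.tsv | sort | uniq -c > aspect_counts.txt
cat aspect_counts.txt

# Number of distinct genes that received at least one annotation (column 1).
cut -f1 toy_annotation.tsv | sort -u | wc -l
```

If the real file does have a header line, it can be skipped with `tail -n +2` before the `cut` step.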