This script encapsulates what used to be 3 widely used perl scripts in our lab. It automatically detects the correct function to be called and writes the respective output file of this function.
python3 fetch.py [TXT FILE] [FASTA FILE] --jobs [optional number of cores, default 1]
TXT FILE: .txt file containing sequences in a pattern respective to one of the 3 functions (fetch_fam, fetch_seqs or fetch_seqs_coords)
FASTA FILE: FASTA file to search for sequences
Fetches sequences named in the .txt file from the FASTA file and adds them to an output FASTA named according to family name on txt file.
.txt file follows Orthofinder/OrthoMCL pattern for orthogroups, with the family name tab-separated from sequences). Example:
OG0001 Seq1 Seq2 Seq3 Seq4
OG0002 Seq5 Seq6 Seq6 Seq7 Seq8
Fetches sequences named in the .txt file from the FASTA file and adds them to the same output FASTA.
Sequence names should be one on each line. Example:
Seq1
Seq2
Seq3
Seq4
Fetches sequences named in the .txt file at their estabilished coordinates from the FASTA file and adds them to the same output FASTA
Sequences/scaffolds should be one on each line, followed by tab separated starting and ending coords. Example:
Scaffold1 50 200
Scaffold2 10 40
Scaffold3 10000 10500