Non-standard inputs
The rRNADif pipeline consists of three steps: 16S annotation, MSA generation, and phylogeny computation. Each step can be omitted if the proper input is supplied.
Use a set of 16S rRNAs
A set of 16S sequences in FASTA format can be used as input in place of the genome sequence. In that case, the barrnap annotation step is skipped. The only flag that needs to be passed is --step 2.
Example
warning
Please do not pass sequences with identical names - the program will crash. Please also use the following names of sequences:
In other words, one of the sequence names must be shorter than the rest (if all names are the same length, the program will crash). This is a known bug, and we are working to resolve it as soon as possible!
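The two naming constraints above can be checked before running the pipeline. The following is a minimal sketch, not part of rRNADif: the helper names `read_fasta_names` and `check_names` are hypothetical, and it assumes that a sequence name is the first whitespace-delimited word of each `>` header line, which may differ from how the program reads names.

```python
from collections import Counter


def read_fasta_names(text):
    """Return sequence names, assumed to be the first word of each '>' header."""
    names = []
    for line in text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            names.append(header.split()[0] if header else "")
    return names


def check_names(names):
    """Return a list of naming problems described in the warning above."""
    problems = []
    # Problem 1: identical names.
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        problems.append("duplicate names: " + ", ".join(duplicates))
    # Problem 2: every name has the same length.
    if len(names) > 1 and len({len(n) for n in names}) == 1:
        problems.append("all names have the same length")
    return problems


example = ">seq_1\nACGT\n>seq_2\nACGT\n>s3\nACGT\n"
print(check_names(read_fasta_names(example)))  # prints []
```

An empty list means neither problem was detected. To check a real file, read its contents with `open(path).read()` and pass the text to `read_fasta_names`.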
tip
You can also provide a FASTA file with a set of unrelated 16S sequences from different genomes. In this case, the rRNADif results on intragenomic variability would be meaningless, but the program itself can be used as a quick pipeline to build high-quality 16S phylogenetic trees. More information is in the tree building guide.
Use multiple sequence alignment results
The results of the multiple sequence alignment should be in FASTA format. The --step 3 flag should also be supplied.
If the goal of the analysis is to build the final phylogenetic tree, the 16S rRNA sequences should also be passed via the -s flag.
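In an aligned FASTA file, every sequence has the same length once gaps are counted, so a quick sanity check can catch an unaligned file before it is passed with --step 3. This is a minimal sketch that is independent of rRNADif; the helper names `read_fasta` and `is_aligned` are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def is_aligned(records):
    """True if there is at least one record and all sequences have equal length."""
    lengths = {len(seq) for _, seq in records}
    return len(lengths) == 1


msa = ">a\nAC-GT\n>bb\nACGGT\n"
print(is_aligned(read_fasta(msa)))  # prints True
```

The check only compares lengths; it does not assess the quality of the alignment itself.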
warning
Please do not pass sequences with identical names - the program will crash. Please also use the following names of sequences:
In other words, one of the sequence names must be shorter than the rest (if all names are the same length, the program will crash). This is a known bug, and we are working to resolve it as soon as possible! This warning applies to both the MSA file and the 16S sequences file.
Use a phylogenetic tree in Newick format
You can provide a precomputed phylogenetic tree as input in Newick format (.nwk files):
If you want to compute the final phylogenetic tree, please provide the sequences in FASTA format via the -s flag:
warning
Please do not pass sequences with identical names - the program will crash. Please also use the following names of sequences:
In other words, one of the sequence names must be shorter than the rest (if all names are the same length, the program will crash). This is a known bug, and we are working to resolve it as soon as possible! This warning applies to both the .nwk file and the 16S sequences file.
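Since the naming warning also covers the .nwk file, it can help to list the leaf names of a tree before using it. The sketch below is a rough illustration and not part of rRNADif: `newick_leaf_names` is a hypothetical helper, and it assumes unquoted leaf labels, treating any label that directly follows `(` or `,` as a leaf.

```python
import re


def newick_leaf_names(newick):
    """Extract unquoted leaf names: labels that directly follow '(' or ','."""
    # Internal node labels follow ')' and are therefore not matched.
    return re.findall(r"(?<=[(,])\s*([^(),:;\s]+)", newick)


tree = "(seq_1:0.1,seq_2:0.2,(s3:0.3,seq_4:0.4):0.5);"
names = newick_leaf_names(tree)
print(names)  # prints ['seq_1', 'seq_2', 's3', 'seq_4']

# Report the two conditions named in the warning above.
print(len(names) != len(set(names)))       # True if any name is duplicated
print(len({len(n) for n in names}) == 1)   # True if all names share one length
```

Quoted labels and other less common Newick features are not handled; a dedicated tree-parsing library is a safer choice for such files.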
Choose a database to analyse against
By default, rRNADif uses the pre-computed database of 21000+ complete bacterial genomes (Bacteria_full). Custom databases can also be constructed and used in the analysis. To change the database, use the -n flag followed by the database name.
note
For more, see the database creation guide.