Understand the output
How to interpret the results?
The results depend heavily on the database subset that the sequences are analyzed against, so the raw numbers for the provided input carry very little information on their own. The outlier/non-outlier classification is more informative (and it also depends on the database subset). The logic is that if the analyzed set of 16S sequences is an outlier against the chosen database, it is unlikely that these sequences come from the same genome. To be sure, we suggest constructing a phylogenetic tree with the --tree flag (the guide is here) for visual inspection.
Where is the main analysis output?
The main results are under the <project-name>_results/results folder. It contains four files named Results_...csv:
- Results_all.csv -> holds all the data used for the analysis combined (database subset + input genome)
- Results_mean_outliers.csv -> holds the genomes that are outliers by mean (values outside the 1.5 × IQR range)
- Results_median_outliers.csv -> holds the genomes that are outliers by median (values outside the 1.5 × IQR range)
- Results_no_ouliers.csv -> holds values for the genomes that are not outliers
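The 1.5 × IQR rule mentioned above can be sketched as follows. This is only a minimal illustration, not the tool's own code: the function name `iqr_outliers`, the quartile method, and the genome names and values are all assumptions made up for the example.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return names whose value lies outside [Q1 - k*IQR, Q3 + k*IQR].

    `values` maps a genome name to a number (e.g. its mean branch length).
    The quartile method ("inclusive") is an assumption; the tool may
    compute quartiles differently.
    """
    q1, _, q3 = statistics.quantiles(values.values(), n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(name for name, v in values.items() if v < low or v > high)

# Made-up mean branch lengths, for illustration only.
means = {
    "genome_A": 0.002,
    "genome_B": 0.003,
    "genome_C": 0.004,
    "genome_D": 0.005,
    "genome_E": 0.006,
    "input_genome": 0.050,
}
outliers = iqr_outliers(means)
print(outliers)
```

Because the quartiles are computed from the whole set, the same input value can be an outlier against one database subset and not against another.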
Each of these files holds a dataframe that looks like this:
Names | Mean | Median |
---|---|---|
Streptomyces_actuosus_ATCC_25421 | 0.0062216838 | 0.010051875 |
Streptomyces_albulus_ZPM | 0.00274316988888889 | 0.002422559 |
Streptomyces_antibioticus_DSM_41481 | 0.0185091656666667 | 0.016785757 |
The first column gives the organism species and strain; the second and third give the mean and median branch length values for the 16S sequences in that genome.
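To see how a particular genome was classified, one option is to check which of the Results_...csv files lists it in the Names column. The sketch below assumes the files are comma-separated with a `Names` header, as in the example above; the helper `find_genome` and the way the input genome is named are hypothetical and not part of the tool.

```python
import csv
from pathlib import Path

def find_genome(results_dir, genome_name):
    """Return the Results_*.csv file names whose Names column lists genome_name."""
    hits = []
    for path in sorted(Path(results_dir).glob("Results_*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                # Assumes a "Names" column; rows without it are skipped.
                if row.get("Names") == genome_name:
                    hits.append(path.name)
                    break
    return hits

# Example call (path and genome name are placeholders):
# find_genome("<project-name>_results/results", "Streptomyces_albulus_ZPM")
```

A genome that shows up in one of the outlier files is flagged by the mean or the median, respectively; one that shows up only in the combined file and the non-outlier file was not flagged.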
Where are the results for my input sequence only?
Results for the input sequence alone are under the results/input_sequence_individual results folder. The directory contains all intermediate files, but the main one is the .csv file: a one-row dataframe with the results for the input genome.
note
The number of files can vary depending on the input. For example, if --step 2 was used and a set of 16S rRNAs was provided, there will be no ..._all_rrna.fasta file, and so on.
What's the structure of the results directory?
Here is the structure of the main results directory -> <project-name>_results/results