Create trans_diff object for the differential analysis on the taxonomic abundance.
This class is a wrapper for a series of differential abundance tests and indicator analysis methods, including LEfSe based on Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>, random forest <doi:10.1016/j.geoderma.2018.09.035>, metastat based on White et al. (2009) <doi:10.1371/journal.pcbi.1000352>, the method in the R package metagenomeSeq from Paulson et al. (2013) <doi:10.1038/nmeth.2658>, the non-parametric Kruskal-Wallis Rank Sum Test, Dunn's Kruskal-Wallis Multiple Comparisons based on the FSA package, Wilcoxon Rank Sum and Signed Rank Tests, the t-test and anova.
Authors: Chi Liu, Yang Cao, Chenhao Li
new()
trans_diff$new( dataset = NULL, method = c("lefse", "rf", "metastat", "mseq", "KW", "KW_dunn", "wilcox", "t.test", "anova")[1], group = NULL, taxa_level = "all", filter_thres = 0, alpha = 0.05, p_adjust_method = "fdr", lefse_subgroup = NULL, lefse_min_subsam = 10, lefse_norm = 1e+06, nresam = 0.6667, boots = 30, rf_ntree = 1000, group_choose_paired = NULL, mseq_count = 1, ... )
dataset
an object of the microtable class.
method
default "lefse"; see the following available options:
lefse: LEfSe method based on Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>
rf: random forest and non-parametric test method based on An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035>
metastat: Metastat method for all paired groups based on White et al. (2009) <doi:10.1371/journal.pcbi.1000352>
mseq: zero-inflated log-normal model-based differential test method from the metagenomeSeq package
KW: Kruskal-Wallis Rank Sum Test for all groups (>= 2)
KW_dunn: Dunn's Kruskal-Wallis Multiple Comparisons when group number > 2; see the dunnTest function in the FSA package
wilcox: Wilcoxon Rank Sum and Signed Rank Tests for all paired groups
t.test: Student's t-Test for all paired groups
anova: Duncan's multiple range test for anova
group
default NULL; sample group used for the comparison; a column name of microtable$sample_table.
taxa_level
default "all"; "all" means using abundance data at all taxonomic ranks; to test at a specific rank, provide the taxonomic rank name, such as "Genus"; this parameter applies when method != "mseq"; the "mseq" method is performed on the feature abundance, i.e. microtable$otu_table.
filter_thres
default 0; the relative abundance threshold, used when method is neither "metastat" nor "mseq".
alpha
default 0.05; differential significance threshold for method = "lefse" or "rf"; used to select taxa with significance across groups.
p_adjust_method
default "fdr"; p value adjustment method; see the method parameter of the p.adjust function for other available options; NULL disables the p value adjustment, so when p_adjust_method = NULL, P.adj is the same as P.unadj.
lefse_subgroup
default NULL; sample subgroup used for the sub-comparison in lefse; see Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>.
lefse_min_subsam
default 10; sample numbers required in the subgroup test.
lefse_norm
default 1000000; scale value in lefse.
nresam
default 0.6667; sample number ratio used in each bootstrap for method = "lefse" or "rf".
boots
default 30; bootstrap test number for method = "lefse" or "rf".
rf_ntree
default 1000; see ntree in randomForest function of randomForest package when method = "rf".
group_choose_paired
default NULL; a vector used for selecting the required groups for paired testing, only used for method = "metastat" or "mseq".
mseq_count
default 1; filter features to have at least 'counts' counts; see the counts parameter in the MRcoefs function of the metagenomeSeq package.
...
parameters passed to cal_diff function of trans_alpha class when method is one of "KW", "KW_dunn", "wilcox", "t.test" and "anova".
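The effect of p_adjust_method can be illustrated with the base R p.adjust function that the parameter description points to. The following is a minimal sketch on made-up p values (the vector p_unadj is hypothetical, not package output); it shows the "fdr" adjustment and mirrors the documented NULL behaviour by leaving the values unchanged. It is not the package's internal code.

```r
# Hypothetical raw p values for four taxa (illustrative only).
p_unadj <- c(0.001, 0.01, 0.03, 0.5)

# "fdr" is an alias of the Benjamini-Hochberg method in stats::p.adjust.
p_adj_fdr <- p.adjust(p_unadj, method = "fdr")

# Mimic p_adjust_method = NULL: no adjustment, so P.adj equals P.unadj.
p_adjust_method <- NULL
p_adj_null <- if (is.null(p_adjust_method)) {
  p_unadj
} else {
  p.adjust(p_unadj, method = p_adjust_method)
}

print(p_adj_fdr)
print(p_adj_null)
```

Adjusted values are never smaller than the raw ones, so fewer taxa may pass the alpha threshold after adjustment.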
Returns res_diff and res_abund.
res_abund includes the mean abundance of each taxon (Mean), standard deviation (SD), standard error (SE) and sample number (N) in the group (Group).
res_diff is the detailed differential test result, containing:
"Comparison": the groups used in the comparison, either all groups or paired groups. If this column is absent, all groups were used;
"Group": the group with the maximum median or mean value across the tested groups;
the median is used for non-parametric methods and the mean for t.test;
"Taxa": the taxon used in this comparison;
"Method": the test method used in the analysis, depending on the method input;
"LDA" or "MeanDecreaseGini": LDA: linear discriminant score in LEfSe; MeanDecreaseGini: mean decrease in Gini index in random forest;
"P.unadj" and "P.adj": P.unadj: raw p value; P.adj: adjusted p value;
"qvalue": qvalue for metastat analysis.
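To make the res_abund columns concrete, here is a minimal base R sketch that computes Mean, SD, SE and N per group for a single taxon, and then picks the group with the largest mean in the way the "Group" column is described for t.test. The abundance values and group labels are invented for illustration, and the code is an assumption about the arithmetic, not the package's implementation.

```r
# Hypothetical relative abundances of one taxon in two groups (illustrative only).
abund <- c(2, 4, 6, 1, 1, 4)
group <- c("A", "A", "A", "B", "B", "B")

# Per-group summary with the same column meanings as res_abund.
res <- data.frame(
  Group = sort(unique(group)),
  N     = as.vector(tapply(abund, group, length)),
  Mean  = as.vector(tapply(abund, group, mean)),
  SD    = as.vector(tapply(abund, group, sd)),
  stringsAsFactors = FALSE
)
res$SE <- res$SD / sqrt(res$N)  # standard error = SD / sqrt(N)

# The group with the largest mean, analogous to the "Group" column for t.test.
max_group <- res$Group[which.max(res$Mean)]

print(res)
print(max_group)
```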
data(dataset)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")
t1 <- trans_diff$new(dataset = dataset, method = "rf", group = "Group")
t1 <- trans_diff$new(dataset = dataset, method = "metastat", group = "Group", taxa_level = "Genus")
t1 <- trans_diff$new(dataset = dataset, method = "wilcox", group = "Group")
t1 <- trans_diff$new(dataset = dataset, method = "KW_dunn", group = "Group", taxa_level = "Phylum")
plot_diff_abund()
Plotting the abundance of differential taxa.
trans_diff$plot_diff_abund( use_number = 1:20, color_values = RColorBrewer::brewer.pal(8, "Dark2"), select_group = NULL, select_taxa = NULL, simplify_names = TRUE, keep_prefix = TRUE, group_order = NULL, barwidth = 0.9, use_se = TRUE, add_sig = FALSE, add_sig_label = "Significance", add_sig_label_color = "black", add_sig_tip_length = 0.01, y_start = 1.01, y_increase = 0.05, text_y_size = 10, coord_flip = TRUE, ... )
use_number
default 1:20; numeric vector; the taxa numbers (1:n) used in the plot; if n is larger than the total number of significant taxa, all the taxa are used automatically.
color_values
default RColorBrewer::brewer.pal(8, "Dark2"); color palette.
select_group
default NULL; used to select the paired groups. This parameter is especially useful when the comparison method is applied to paired groups; the input select_group must be one of object$res_diff$Comparison.
select_taxa
default NULL; character vector of taxa names. The taxa names should be the same as the names shown in the plot, not the names in object$res_diff$Taxa.
simplify_names
default TRUE; whether to use the simplified taxonomic name.
keep_prefix
default TRUE; whether to retain the taxonomic prefix.
group_order
default NULL; a vector to order groups, i.e. reorder the legend and colors in the plot; if NULL, the function first checks whether the group column of sample_table is a factor and, if so, uses its levels; if provided, the levels in the group column of sample_table are ignored.
barwidth
default 0.9; the bar width in the plot.
use_se
default TRUE; whether to use SE in the plot; if FALSE, SD is used.
add_sig
default FALSE; whether to add the significance label to the plot.
add_sig_label
default "Significance"; a column name of object$res_diff used for the label text, such as 'P.adj' or 'Significance'.
add_sig_label_color
default "black"; the color for the label text when add_sig = TRUE.
add_sig_tip_length
default 0.01; the tip length for the added line when add_sig = TRUE.
y_start
default 1.01; the y axis position from which to add the label; the default 1.01 means 1.01 * Value; for method != "anova", all the start positions are the same, i.e. Value = max(Mean+SD or Mean+SE); for method = "anova", the start position is calculated for each point, i.e. Value = Mean+SD or Mean+SE.
y_increase
default 0.05; the increment of y axis space used to add labels for paired groups; the default 0.05 means 0.05 * y_start * Value; in addition, this parameter is also used to place the letters of the anova result at the fixed position (1 + y_increase) * y_start * Value.
text_y_size
default 10; the size for the y axis text.
coord_flip
default TRUE; whether flip cartesian coordinates so that horizontal becomes vertical, and vertical, horizontal.
...
parameters passed to ggsignif::stat_signif when add_sig = TRUE.
Returns a ggplot object.
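The arithmetic behind y_start and y_increase can be worked through by hand. The sketch below is one reading of the parameter descriptions, using invented group means and standard errors; the assumption that successive paired comparisons are stacked by a constant step is mine and is not confirmed by the package source.

```r
# Hypothetical group means and standard errors for one taxon (illustrative only).
group_mean <- c(10, 20, 15)
group_se   <- c(1, 2, 1.5)

y_start    <- 1.01  # default in plot_diff_abund
y_increase <- 0.05  # default in plot_diff_abund

# For method != "anova": one shared start value, Value = max(Mean + SE).
value <- max(group_mean + group_se)
first_label_y <- y_start * value

# Assumed stacking: each further paired comparison is raised by
# y_increase * y_start * Value.
n_comparisons <- 3
step <- y_increase * y_start * value
label_y <- first_label_y + (seq_len(n_comparisons) - 1) * step

# For method = "anova": a per-group letter position, with the fixed factor
# (1 + y_increase) * y_start applied to each group's own Mean + SE.
letter_y <- (1 + y_increase) * y_start * (group_mean + group_se)

print(label_y)
print(letter_y)
```

Raising y_start moves every label up proportionally, while raising y_increase only widens the spacing between stacked labels.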
t1 <- trans_diff$new(dataset = dataset, method = "anova", group = "Group", taxa_level = "Genus")
t1$plot_diff_abund(use_number = 1:10)
t1$plot_diff_abund(use_number = 1:10, add_sig = TRUE)
t1 <- trans_diff$new(dataset = dataset, method = "wilcox", group = "Group")
t1$plot_diff_abund(use_number = 1:20)
t1$plot_diff_abund(use_number = 1:20, add_sig = TRUE)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")
t1$plot_diff_abund(use_number = 1:20)
t1$plot_diff_abund(use_number = 1:20, add_sig = TRUE)
plot_diff_bar()
Bar plot for LDA score.
trans_diff$plot_diff_bar( color_values = RColorBrewer::brewer.pal(8, "Dark2"), use_number = 1:10, threshold = NULL, select_group = NULL, simplify_names = TRUE, keep_prefix = TRUE, group_order = NULL, axis_text_y = 12, plot_vertical = TRUE, ... )
color_values
default RColorBrewer::brewer.pal(8, "Dark2"); color palette for different groups.
use_number
default 1:10; numeric vector; the taxa numbers used in the plot, i.e. 1:n.
threshold
default NULL; threshold value for selecting taxa, such as 3 for LDA score of LEfSe.
select_group
default NULL; used to select the paired group when multiple comparisons are generated; the input select_group must be one of object$res_diff$Comparison.
simplify_names
default TRUE; whether to use the simplified taxonomic name.
keep_prefix
default TRUE; whether to retain the taxonomic prefix.
group_order
default NULL; a vector to order the legend and colors in the plot; if NULL, the function first checks whether the group column of sample_table is a factor and, if so, uses its levels; if provided, this parameter overrides the levels in the group column of sample_table.
axis_text_y
default 12; the size for the y axis text.
plot_vertical
default TRUE; whether to use a vertical bar plot (TRUE) or a horizontal one (FALSE).
...
parameters passed to geom_bar.
Returns a ggplot object.
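The group_order rule described above (an explicit vector wins, otherwise factor levels are used) can be sketched in a few lines of base R. The helper resolve_group_order is hypothetical and not part of the package, the group labels are invented, and the fallback for non-factor columns (order of first appearance) is an assumption, since the documentation does not state it.

```r
# Resolve the plotting order of groups (hypothetical helper, not a package function).
resolve_group_order <- function(group_column, group_order = NULL) {
  if (!is.null(group_order)) {
    # A user-supplied order takes precedence over any factor levels.
    return(as.character(group_order))
  }
  if (is.factor(group_column)) {
    # Fall back to the factor levels of the sample_table column.
    return(levels(group_column))
  }
  # Assumption: for non-factor columns, use the order of first appearance.
  unique(as.character(group_column))
}

# Hypothetical group column, similar to a column of sample_table.
grp <- factor(c("TW", "CW", "IW", "CW"), levels = c("IW", "CW", "TW"))

order_from_levels <- resolve_group_order(grp)
order_from_user   <- resolve_group_order(grp, c("CW", "IW", "TW"))
order_from_chars  <- resolve_group_order(as.character(grp))

print(order_from_levels)
print(order_from_user)
print(order_from_chars)
```

In practice this means that converting the group column of sample_table to a factor with the desired levels fixes the legend order for every plot, while group_order changes it for a single call.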
t1$plot_diff_bar(use_number = 1:20)
plot_diff_cladogram()
Plot the cladogram using taxa with significant differences.
trans_diff$plot_diff_cladogram( color = RColorBrewer::brewer.pal(8, "Dark2"), use_taxa_num = 200, filter_taxa = NULL, use_feature_num = NULL, group_order = NULL, clade_label_level = 4, select_show_labels = NULL, only_select_show = FALSE, sep = "|", branch_size = 0.2, alpha = 0.2, clade_label_size = 2, clade_label_size_add = 5, clade_label_size_log = exp(1), node_size_scale = 1, node_size_offset = 1, annotation_shape = 22, annotation_shape_size = 5 )
color
default RColorBrewer::brewer.pal(8, "Dark2"); color palette used in the plot.
use_taxa_num
default 200; integer; the number of taxa used in the background tree plot; taxa are selected according to the mean abundance.
filter_taxa
default NULL; The mean relative abundance used to filter the taxa with low abundance.
use_feature_num
default NULL; integer; The feature number used in the plot; select the features according to the LDA score (method = "lefse") or MeanDecreaseGini (method = "rf") from high to low.
group_order
default NULL; a vector to order the legend and colors in the plot; if NULL, the function first checks whether the group column of sample_table is a factor and, if so, uses its levels; if provided, this parameter overrides the levels in the group column of sample_table.
clade_label_level
default 4; the taxonomic level for marking the label with letters, root is the largest.
select_show_labels
default NULL; character vector; The features to show in the plot with full label names, not the letters.
only_select_show
default FALSE; whether to use only the features selected in the select_show_labels parameter.
sep
default "|"; the separator character in the taxonomic information.
branch_size
default 0.2; numeric; size of branch.
alpha
default 0.2; shading of the color.
clade_label_size
default 2; basic size for the clade label; see also clade_label_size_add and clade_label_size_log.
clade_label_size_add
default 5; size added to the basic size of the clade label; see the formula in the clade_label_size_log parameter.
clade_label_size_log
default exp(1); the base of the log function for the added size of the clade label; the size formula is clade_label_size + log(clade_label_level + clade_label_size_add, base = clade_label_size_log); together, clade_label_size_log, clade_label_size_add and clade_label_size fully control the label size for different taxonomic levels.
node_size_scale
default 1; scale for the node size.
node_size_offset
default 1; offset for the node size.
annotation_shape
default 22; shape used in the annotation legend.
annotation_shape_size
default 5; size used in the annotation legend.
Returns a ggplot object.
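The clade label size formula given for clade_label_size_log can be evaluated directly to see how the three parameters interact. The helper clade_label_size_for below is hypothetical; it only restates the documented formula, with the level argument standing in for the taxonomic level of a label.

```r
# Documented formula:
# clade_label_size + log(level + clade_label_size_add, base = clade_label_size_log)
clade_label_size_for <- function(level,
                                 clade_label_size = 2,
                                 clade_label_size_add = 5,
                                 clade_label_size_log = exp(1)) {
  clade_label_size + log(level + clade_label_size_add, base = clade_label_size_log)
}

# Label sizes for levels 1 to 6 with the default parameters.
sizes_default <- clade_label_size_for(1:6)

# A larger log base shrinks the level-dependent part of the size.
sizes_base10 <- clade_label_size_for(1:6, clade_label_size_log = 10)

print(round(sizes_default, 3))
print(round(sizes_base10, 3))
```

With the defaults, a label at level 4 gets size 2 + log(9); increasing clade_label_size shifts all labels equally, whereas clade_label_size_add and clade_label_size_log change how strongly the size varies between levels.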
t1$plot_diff_cladogram(use_taxa_num = 100, use_feature_num = 30, select_show_labels = NULL)
print()
Print the trans_diff object.
trans_diff$print()
clone()
The objects of this class are cloneable with this method.
trans_diff$clone(deep = FALSE)
deep
Whether to make a deep clone.