microeco: trans_diff – R documentation


trans_diff

Create trans_diff object for the differential analysis on the taxonomic abundance.

Description

This class is a wrapper for a series of differential abundance tests and indicator analysis methods, including LEfSe based on Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>, random forest <doi:10.1016/j.geoderma.2018.09.035>, metastat based on White et al. (2009) <doi:10.1371/journal.pcbi.1000352>, the method in the R package metagenomeSeq, Paulson et al. (2013) <doi:10.1038/nmeth.2658>, the non-parametric Kruskal-Wallis Rank Sum Test, Dunn's Kruskal-Wallis Multiple Comparisons based on the FSA package, Wilcoxon Rank Sum and Signed Rank Tests, t-test and ANOVA.

Authors: Chi Liu, Yang Cao, Chenhao Li

Methods

Method `new()`

Usage

trans_diff$new(
  dataset = NULL,
  method = c("lefse", "rf", "metastat", "mseq", "KW", "KW_dunn", "wilcox", "t.test",
    "anova")[1],
  group = NULL,
  taxa_level = "all",
  filter_thres = 0,
  alpha = 0.05,
  p_adjust_method = "fdr",
  lefse_subgroup = NULL,
  lefse_min_subsam = 10,
  lefse_norm = 1e+06,
  nresam = 0.6667,
  boots = 30,
  rf_ntree = 1000,
  group_choose_paired = NULL,
  mseq_count = 1,
  ...
)

Arguments

dataset

an object of the microtable class.

method

default "lefse"; see the following available options:

'lefse': LEfSe method based on Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>
'rf': random forest and non-parametric test method based on An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035>
'metastat': Metastat method for all paired groups based on White et al. (2009) <doi:10.1371/journal.pcbi.1000352>
'mseq': zero-inflated log-normal model-based differential test method from metagenomeSeq package.
'KW': Kruskal-Wallis Rank Sum Test for all groups (>= 2)
'KW_dunn': Dunn's Kruskal-Wallis Multiple Comparisons when group number > 2; see dunnTest function in FSA package
'wilcox': Wilcoxon Rank Sum and Signed Rank Tests for all paired groups
't.test': Student's t-Test for all paired groups
'anova': Duncan's multiple range test for anova
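
The classical tests named in the list above are available in base R. The following is a minimal sketch on simulated abundances for a single taxon, not the code that trans_diff runs internally; the data, group names and object names are illustrative assumptions.

```r
# Illustrative only: simulated relative abundances of one taxon in three groups.
set.seed(1)
d <- data.frame(
  abund = c(rnorm(6, mean = 0.10, sd = 0.02),
            rnorm(6, mean = 0.15, sd = 0.02),
            rnorm(6, mean = 0.30, sd = 0.02)),
  group = factor(rep(c("A", "B", "C"), each = 6))
)

# Kruskal-Wallis rank sum test across all groups (the test named by 'KW').
kw <- kruskal.test(abund ~ group, data = d)

# Wilcoxon rank sum test and t test for one pair of groups (A vs. C).
d_pair <- droplevels(d[d$group %in% c("A", "C"), ])
wx <- wilcox.test(abund ~ group, data = d_pair)
tt <- t.test(abund ~ group, data = d_pair)

# One-way ANOVA across all groups.
av <- summary(aov(abund ~ group, data = d))

# Collect the p values of the three tests side by side.
c(KW = kw$p.value, wilcox = wx$p.value, t.test = tt$p.value)
```

In trans_diff these tests are applied per taxon and, for the pairwise methods, per pair of groups, which is why p value adjustment matters.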

group

default NULL; sample group used for the comparison; a column name of microtable$sample_table.

taxa_level

default "all"; 'all' means using abundance data at all taxonomic ranks; to test at a specific rank, provide the taxonomic rank name, such as "Genus"; this parameter applies when method != "mseq"; the 'mseq' method is performed on the feature abundance, i.e. microtable$otu_table.

filter_thres

default 0; the relative abundance threshold; used when method is neither "metastat" nor "mseq".

alpha

default 0.05; differential significance threshold for method = "lefse" or "rf"; used to select taxa with significance across groups.

p_adjust_method

default "fdr"; p value adjustment method; see the method parameter of the p.adjust function for other available options; NULL disables the p value adjustment, so when p_adjust_method = NULL, P.adj is the same as P.unadj.
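
As a small illustration of what the "fdr" option does, the sketch below applies the base R p.adjust function to a made-up vector of raw p values. The values and the column names of the resulting table are illustrative; only p.adjust itself is taken from the text above.

```r
# Made-up raw p values for five taxa.
p_unadj <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# "fdr" is an alias for the Benjamini-Hochberg ("BH") adjustment in p.adjust.
p_adj <- p.adjust(p_unadj, method = "fdr")

# method = "none" returns the input unchanged, which mirrors the documented
# behaviour of P.adj being identical to P.unadj when no adjustment is applied.
p_none <- p.adjust(p_unadj, method = "none")

data.frame(P.unadj = p_unadj, P.adj = p_adj)
```

Note that two taxa with raw p values below 0.05 end up slightly above it after adjustment, so the choice of p_adjust_method can change which taxa are reported as significant.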

lefse_subgroup

default NULL; sample subgroup used for the sub-comparison in lefse; see Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>.

lefse_min_subsam

default 10; sample numbers required in the subgroup test.

lefse_norm

default 1000000; scale value in lefse.

nresam

default 0.6667; sample number ratio used in each bootstrap for method = "lefse" or "rf".

boots

default 30; bootstrap test number for method = "lefse" or "rf".

rf_ntree

default 1000; see ntree in randomForest function of randomForest package when method = "rf".

group_choose_paired

default NULL; a vector used for selecting the required groups for paired testing, only used for method = "metastat" or "mseq".

mseq_count

default 1; filter features to have at least 'counts' counts; see the count parameter in MRcoefs function of metagenomeSeq package.

...

parameters passed to cal_diff function of trans_alpha class when method is one of "KW", "KW_dunn", "wilcox", "t.test" and "anova".

Returns

res_diff and res_abund.
res_abund includes the mean abundance of each taxon (Mean), standard deviation (SD), standard error (SE) and sample number (N) in the group (Group).
res_diff is the detailed differential test result, containing:
"Comparison": the groups used in the comparison, either all groups or paired groups; if this column is absent, all groups were used;
"Group": the group with the maximum median or mean value across the tested groups; the median is used for non-parametric methods and the mean for t.test;
"Taxa": the taxon used in this comparison;
"Method": the test method used in the analysis, depending on the method input;
"LDA" or "MeanDecreaseGini": LDA: linear discriminant score in LEfSe; MeanDecreaseGini: mean decrease in Gini index in random forest;
"P.unadj" and "P.adj": P.unadj: raw p value; P.adj: adjusted p value;
"qvalue": q value for metastat analysis.
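
The per-group summary stored in res_abund (Mean, SD, SE, N) can be reproduced for a single taxon with base R. This is a sketch on made-up numbers, assuming SE is the usual SD / sqrt(N); the helper summarise_one is hypothetical and not part of microeco.

```r
# Made-up relative abundances of one taxon in two groups of three samples.
abund <- c(0.10, 0.12, 0.14, 0.20, 0.24, 0.28)
group <- rep(c("A", "B"), each = 3)

# Hypothetical helper: summary statistics for one group.
summarise_one <- function(x) {
  c(Mean = mean(x), SD = sd(x), SE = sd(x) / sqrt(length(x)), N = length(x))
}

# Split by group, summarise each part, and bind into one table.
res <- do.call(rbind, lapply(split(abund, group), summarise_one))
res <- data.frame(Group = rownames(res), res, row.names = NULL)
res
```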

Examples

library(microeco)
data(dataset)
# each call below overwrites t1; run only the method you need
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")
t1 <- trans_diff$new(dataset = dataset, method = "rf", group = "Group")
t1 <- trans_diff$new(dataset = dataset, method = "metastat", group = "Group", taxa_level = "Genus")
t1 <- trans_diff$new(dataset = dataset, method = "wilcox", group = "Group")
t1 <- trans_diff$new(dataset = dataset, method = "KW_dunn", group = "Group", taxa_level = "Phylum")

Method `plot_diff_abund()`

Plotting the abundance of differential taxa.

Usage

trans_diff$plot_diff_abund(
  use_number = 1:20,
  color_values = RColorBrewer::brewer.pal(8, "Dark2"),
  select_group = NULL,
  select_taxa = NULL,
  simplify_names = TRUE,
  keep_prefix = TRUE,
  group_order = NULL,
  barwidth = 0.9,
  use_se = TRUE,
  add_sig = FALSE,
  add_sig_label = "Significance",
  add_sig_label_color = "black",
  add_sig_tip_length = 0.01,
  y_start = 1.01,
  y_increase = 0.05,
  text_y_size = 10,
  coord_flip = TRUE,
  ...
)

Arguments

use_number: default 1:20; numeric vector; the taxa numbers (1:n) used in the plot; if n is larger than the total number of significant taxa, all the taxa are used automatically.
color_values: default RColorBrewer::brewer.pal(8, "Dark2"); color palette.
select_group: default NULL; used to select the paired groups. This parameter is especially useful when the comparison method is applied to paired groups; the input select_group must be one of object$res_diff$Comparison.
select_taxa: default NULL; character vector of taxa names. The taxa names should be the same as the names shown in the plot, not the names in object$res_diff$Taxa.
simplify_names: default TRUE; whether to use the simplified taxonomic name.
keep_prefix: default TRUE; whether to retain the taxonomic prefix.
group_order: default NULL; a vector to order groups, i.e. reorder the legend and colors in the plot; if NULL, the function first checks whether the group column of sample_table is a factor and, if so, uses its levels; if provided, the levels in the group column of sample_table are ignored.
barwidth: default 0.9; the bar width in the plot.
use_se: default TRUE; whether to use SE in the plot; if FALSE, use SD.
add_sig: default FALSE; whether to add the significance label to the plot.
add_sig_label: default "Significance"; select a colname of object$res_diff for the label text, such as 'P.adj' or 'Significance'.
add_sig_label_color: default "black"; the color for the label text when add_sig = TRUE.
add_sig_tip_length: default 0.01; the tip length for the added line when add_sig = TRUE.
y_start: default 1.01; the y axis position from which to add the label; the default 1.01 means 1.01 * Value; for method != "anova", all start positions are the same, i.e. Value = max(Mean+SD or Mean+SE); for method = "anova", the start position is calculated for each point, i.e. Value = Mean+SD or Mean+SE.
y_increase: default 0.05; the increment in y axis space used to add labels for paired groups; the default 0.05 means 0.05 * y_start * Value; this parameter is also used to place the letters of the anova result at the fixed position (1 + y_increase) * y_start * Value.
text_y_size: default 10; the size for the y axis text.
coord_flip: default TRUE; whether to flip cartesian coordinates so that horizontal becomes vertical, and vertical, horizontal.
...: parameters passed to ggsignif::stat_signif when add_sig = TRUE.
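
The y_start and y_increase formulas above can be worked through with plain arithmetic. The sketch below uses made-up means and standard errors for three groups; stacking successive pairwise labels by a constant step is an assumption about how the increment is applied, not something confirmed by the text.

```r
# Made-up per-group means and standard errors for one taxon.
mean_abund <- c(A = 0.10, B = 0.16, C = 0.25)
se_abund   <- c(A = 0.01, B = 0.02, C = 0.03)

y_start    <- 1.01
y_increase <- 0.05

# method != "anova": one shared Value = max(Mean + SE).
value         <- max(mean_abund + se_abund)
first_label_y <- y_start * value
step          <- y_increase * y_start * value

# Assumed stacking of three pairwise labels, each one step higher.
label_y <- first_label_y + step * (0:2)

# method = "anova": Value = Mean + SE per group, letters placed at
# (1 + y_increase) * y_start * Value.
anova_letter_y <- (1 + y_increase) * y_start * (mean_abund + se_abund)

label_y
anova_letter_y
```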

Returns

ggplot.

Examples

t1 <- trans_diff$new(dataset = dataset, method = "anova", group = "Group", taxa_level = "Genus")
t1$plot_diff_abund(use_number = 1:10)
t1$plot_diff_abund(use_number = 1:10, add_sig = TRUE)
t1 <- trans_diff$new(dataset = dataset, method = "wilcox", group = "Group")
t1$plot_diff_abund(use_number = 1:20)
t1$plot_diff_abund(use_number = 1:20, add_sig = TRUE)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")
t1$plot_diff_abund(use_number = 1:20)
t1$plot_diff_abund(use_number = 1:20, add_sig = TRUE)

Method `plot_diff_bar()`

Bar plot for LDA score.

Usage

trans_diff$plot_diff_bar(
  color_values = RColorBrewer::brewer.pal(8, "Dark2"),
  use_number = 1:10,
  threshold = NULL,
  select_group = NULL,
  simplify_names = TRUE,
  keep_prefix = TRUE,
  group_order = NULL,
  axis_text_y = 12,
  plot_vertical = TRUE,
  ...
)

Arguments

color_values: default RColorBrewer::brewer.pal(8, "Dark2"); color palette for different groups.
use_number: default 1:10; numeric vector; the taxa numbers used in the plot, i.e. 1:n.
threshold: default NULL; threshold value for selecting taxa, such as 3 for LDA score of LEfSe.
select_group: default NULL; used to select the paired group when multiple comparisons are generated; the input select_group must be one of object$res_diff$Comparison.
simplify_names: default TRUE; whether to use the simplified taxonomic name.
keep_prefix: default TRUE; whether to retain the taxonomic prefix.
group_order: default NULL; a vector to order the legend and colors in the plot; if NULL, the function first checks whether the group column of sample_table is a factor and, if so, uses its levels; if provided, this parameter overrides the levels in the group column of sample_table.
axis_text_y: default 12; the size for the y axis text.
plot_vertical: default TRUE; whether to use a vertical or a horizontal bar plot.
...: parameters passed to geom_bar.
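
The precedence described for group_order (an explicit vector first, otherwise the factor levels of the group column) can be sketched in base R. The helper resolve_group_order is hypothetical and not part of microeco, the group names are illustrative, and the alphabetical fallback for a non-factor column is an assumption.

```r
# Hypothetical helper mirroring the documented precedence for group_order.
resolve_group_order <- function(groups, group_order = NULL) {
  # An explicit group_order wins over anything stored in the data.
  if (!is.null(group_order)) return(group_order)
  # Otherwise use the factor levels when the column is a factor.
  if (is.factor(groups)) return(levels(groups))
  # Assumed fallback: alphabetical order of the observed groups.
  sort(unique(as.character(groups)))
}

# Illustrative group column stored as a factor with custom levels.
groups <- factor(c("CW", "IW", "TW", "CW"), levels = c("TW", "CW", "IW"))

order_from_levels <- resolve_group_order(groups)
order_explicit    <- resolve_group_order(groups, group_order = c("IW", "TW", "CW"))
order_fallback    <- resolve_group_order(as.character(groups))
```

In practice this means the legend order can be fixed once by converting the group column of sample_table to a factor, or per plot by passing group_order.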

Returns

ggplot.

Examples

# t1 is the lefse result created at the end of the plot_diff_abund() example
t1$plot_diff_bar(use_number = 1:20)

Method `plot_diff_cladogram()`

Plot the cladogram using taxa with significant difference.

Usage

trans_diff$plot_diff_cladogram(
  color = RColorBrewer::brewer.pal(8, "Dark2"),
  use_taxa_num = 200,
  filter_taxa = NULL,
  use_feature_num = NULL,
  group_order = NULL,
  clade_label_level = 4,
  select_show_labels = NULL,
  only_select_show = FALSE,
  sep = "|",
  branch_size = 0.2,
  alpha = 0.2,
  clade_label_size = 2,
  clade_label_size_add = 5,
  clade_label_size_log = exp(1),
  node_size_scale = 1,
  node_size_offset = 1,
  annotation_shape = 22,
  annotation_shape_size = 5
)

Arguments

color: default RColorBrewer::brewer.pal(8, "Dark2"); color palette used in the plot.
use_taxa_num: default 200; integer; the number of taxa used in the background tree plot; taxa are selected according to the mean abundance.
filter_taxa: default NULL; the mean relative abundance used to filter out taxa with low abundance.
use_feature_num: default NULL; integer; the number of features used in the plot; features are selected according to the LDA score (method = "lefse") or MeanDecreaseGini (method = "rf") from high to low.
group_order: default NULL; a vector to order the legend and colors in the plot; if NULL, the function first checks whether the group column of sample_table is a factor and, if so, uses its levels; if provided, this parameter overrides the levels in the group column of sample_table.
clade_label_level: default 4; the taxonomic level for marking the label with letters; the root is the largest.
select_show_labels: default NULL; character vector; the features to show in the plot with full label names instead of letters.
only_select_show: default FALSE; whether to use only the features selected in the parameter select_show_labels.
sep: default "|"; the separator character in the taxonomic information.
branch_size: default 0.2; numeric; size of the branches.
alpha: default 0.2; shading of the color.
clade_label_size: default 2; basic size for the clade label; please also see clade_label_size_add and clade_label_size_log.
clade_label_size_add: default 5; added basic size for the clade label; see the formula in the clade_label_size_log parameter.
clade_label_size_log: default exp(1); the base of the log function for the added size of the clade label; the size formula is clade_label_size + log(clade_label_level + clade_label_size_add, base = clade_label_size_log); together, clade_label_size_log, clade_label_size_add and clade_label_size fully control the label size for different taxonomic levels.
node_size_scale: default 1; scale for the node size.
node_size_offset: default 1; offset for the node size.
annotation_shape: default 22; shape used in the annotation legend.
annotation_shape_size: default 5; size used in the annotation legend.
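
The clade label size formula given for clade_label_size_log can be evaluated directly to see how the three parameters interact. The sketch below uses the documented defaults; the range of levels 1 to 6 is an illustrative assumption.

```r
# Documented defaults.
clade_label_size     <- 2
clade_label_size_add <- 5
clade_label_size_log <- exp(1)

# Illustrative taxonomic levels to evaluate the formula at.
clade_label_level <- 1:6

# Size formula from the argument description.
label_size <- clade_label_size +
  log(clade_label_level + clade_label_size_add, base = clade_label_size_log)

data.frame(level = clade_label_level, size = round(label_size, 3))
```

A larger log base flattens the differences between levels, while clade_label_size shifts all labels by the same amount.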

Returns

ggplot.

Examples

# t1 is the lefse result created at the end of the plot_diff_abund() example
t1$plot_diff_cladogram(use_taxa_num = 100, use_feature_num = 30, select_show_labels = NULL)

Method `print()`

Print the trans_diff object.

Usage

trans_diff$print()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

trans_diff$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

microeco

Microbial Community Ecology Data Analysis

v0.10.0

GPL-3

Authors

Chi Liu [aut, cre], Felipe R. P. Mansoldo [ctb], Umer Zeeshan Ijaz [ctb], Chenhao Li [ctb], Yang Cao [ctb], Minjie Yao [ctb], Xiangzhen Li [ctb]

Initial release
