Package 'ClusterGVis' reference manual

Title:	One-Step to Cluster and Visualize Gene Expression Data
Description:	Streamlining the clustering and visualization of time-series gene expression data from RNA-Seq experiments, this tool supports fuzzy c-means and k-means clustering algorithms. It is compatible with outputs from widely-used packages such as 'Seurat', 'Monocle', and 'WGCNA', enabling seamless downstream visualization and analysis. See Lokesh Kumar and Matthias E Futschik (2007) <doi:10.6026/97320630002005> for more details.
Authors:	Jun Zhang [aut, cre]
Maintainer:	Jun Zhang <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.2
Built:	2025-03-10 02:16:35 UTC
Source:	https://github.com/junjunlab/clustergvis

Test data for this package

Description

Test data for this package.

Usage

BEAM_res

Format

An object of class data.frame with 47192 rows and 8 columns.

Author(s)

JunZhang

Cluster Data Based on Different Methods

Description

Cluster Data Based on Different Methods

Usage

clusterData(
  obj = NULL,
  scaleData = TRUE,
  cluster.method = c("mfuzz", "TCseq", "kmeans", "wgcna"),
  TCseq_params_list = list(),
  object = NULL,
  min.std = 0,
  cluster.num = NULL,
  subcluster = NULL,
  seed = 5201314,
  ...
)

Arguments

`obj`	An input object of one of two types: a cell_data_set object for trajectory analysis, or a matrix or data.frame containing expression data.
`scaleData`	Logical. Whether to scale the data (e.g., z-score normalization).
`cluster.method`	Character. Clustering method to use. Options are one of `"mfuzz"`, `"TCseq"`, `"kmeans"`, or `"wgcna"`.
`TCseq_params_list`	A list of additional parameters passed to the `TCseq::timeclust` function.
`object`	A pre-calculated object required when using `"wgcna"` as the clustering method.
`min.std`	Numeric. Minimum standard deviation for filtering expression data.
`cluster.num`	Integer. The number of clusters to identify.
`subcluster`	A numeric vector of specific cluster IDs to include in the results. If `NULL`, all clusters are included.
`seed`	An integer seed for reproducibility in clustering operations.
`...`	Additional arguments passed to internal functions such as `pre_pseudotime_matrix`.

Details

Depending on the selected cluster.method, different clustering algorithms are used:

"mfuzz": Applies Mfuzz soft clustering method, suitable for identifying overlapping clusters.
"TCseq": Uses TCseq clustering for time-series expression data with support for additional parameters.
"kmeans": Employs standard k-means clustering via base R's stats::kmeans.
"wgcna": Leverages pre-calculated WGCNA (Weighted Gene Co-expression Network Analysis) networks.

The function is designed to be flexible, allowing preprocessing (e.g., filtering by min.std), scaling the data (scaleData = TRUE), and generating results compatible with data visualization pipelines.
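
As a rough illustration of the scaling and k-means steps described above, the following base R sketch z-scores each row of a toy matrix and clusters the rows with stats::kmeans. It is not the package's internal code: the toy data, the object names (`expr`, `z`, `cluster_list`), and the choice of four clusters are assumptions made for the example.

```r
# Toy expression matrix: 60 genes (rows) x 6 time points (columns).
set.seed(1)
expr <- matrix(rnorm(60 * 6), nrow = 60,
               dimnames = list(paste0("gene", 1:60), paste0("t", 1:6)))

# Row-wise z-score: scale() works on columns, so transpose before and after.
z <- t(scale(t(expr)))

# k-means on the scaled rows, with a fixed seed for reproducibility.
set.seed(5201314)
km <- stats::kmeans(z, centers = 4)

# Genes grouped by cluster, similar in spirit to the cluster.list element.
cluster_list <- split(rownames(z), km$cluster)
lengths(cluster_list)
```

Fixing the seed immediately before the clustering call is what makes repeated runs return the same assignment, since k-means starts from randomly chosen centers.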

Value

A list containing the following clustering results:

wide.res: A wide-format data frame with clusters and normalized expression levels.
long.res: A long-format data frame for visualizations, containing cluster information, normalized values, cluster names, and memberships.
cluster.list: A list where each element contains genes belonging to a specific cluster.
type: The clustering method used ("mfuzz", "TCseq", "kmeans", or "wgcna").
geneMode: Currently set to "none" (reserved for future use).
geneType: Currently set to "none" (reserved for future use).

WGCNA Clustering

If the WGCNA method is selected, the object parameter must contain a pre-calculated WGCNA network object. This is typically obtained using the WGCNA package functions.

Subsetting Clusters

Use the subcluster parameter to focus on specific clusters. Cluster IDs not included in the subcluster vector will be excluded from the final results.

Author(s)

JunZhang

This function performs clustering on input data using one of four methods: mfuzz, TCseq, kmeans, or wgcna. The clustering results include metadata, normalized data, and cluster memberships.

Examples


data("exps")

# kmeans
ck <- clusterData(obj = exps,
                  cluster.method = "kmeans",
                  cluster.num = 8)


Perform GO/KEGG Enrichment Analysis for Multiple Clusters

Description

Perform GO/KEGG Enrichment Analysis for Multiple Clusters

Usage

enrichCluster(
  object = NULL,
  type = c("BP", "MF", "CC", "KEGG", "ownSet"),
  TERM2GENE = NULL,
  TERM2NAME = NULL,
  OrgDb = NULL,
  id.trans = TRUE,
  fromType = "SYMBOL",
  toType = c("ENTREZID"),
  readable = TRUE,
  organism = "hsa",
  pvalueCutoff = 0.05,
  topn = 5,
  seed = 5201314,
  add.gene = FALSE,
  use_internal_data = FALSE,
  heatmap.type = c("plot_pseudotime_heatmap2", "plot_genes_branched_heatmap2",
    "plot_multiple_branches_heatmap2"),
  ...
)

Arguments

`object`	An object containing clustering results, typically the output of `clusterData`. Alternatively, it can be a `CellDataSet` object, in which case the function can also visualize pseudotime data.
`type`	Character. The type of enrichment analysis to perform. Options are: `"BP"` (GO Biological Process), `"MF"` (GO Molecular Function), `"CC"` (GO Cellular Component), `"KEGG"` (KEGG pathway analysis), and `"ownSet"` (custom gene set enrichment, which requires `TERM2GENE` and optionally `TERM2NAME`).
`TERM2GENE`	A data frame containing mappings of terms to genes. Required when `type = "ownSet"`. This must be a two-column data frame, where the first column is the term and the second column is the gene.
`TERM2NAME`	A data frame containing term-to-name mappings. Optional when `type = "ownSet"`. This must also be a two-column data frame, where the first column is the term and the second column is the name.
`OrgDb`	An organism database object (e.g., `org.Hs.eg.db` for human or `org.Mm.eg.db` for mouse), used for GO or KEGG enrichment analysis.
`id.trans`	Logical. Whether to perform gene ID transformation. Default is `TRUE`.
`fromType`	Character. The type of the input gene IDs (e.g., `"SYMBOL"`, `"ENSEMBL"`). Default is `"SYMBOL"`.
`toType`	Character. The target ID type for transformation using `clusterProfiler::bitr` (e.g., `"ENTREZID"`). Default is `"ENTREZID"`.
`readable`	Logical. Whether to convert the enrichment result IDs back to a readable format (e.g., SYMBOL). Only applicable for GO and KEGG analysis. Default is `TRUE`.
`organism`	Character. The KEGG organism code (e.g., `"hsa"` for human, `"mmu"` for mouse). Required when performing KEGG enrichment. Default is `"hsa"`.
`pvalueCutoff`	Numeric. The p-value cutoff for enriched terms to be included in the results. Default is `0.05`.
`topn`	Integer or vector. The number of top enrichment results to extract. If a single value, it is applied to all clusters. Otherwise, it should match the number of clusters. Default is `5`.
`seed`	Numeric. Seed for random operations to ensure reproducibility. Default is `5201314`.
`add.gene`	Logical. Whether to include the list of genes associated with each enriched term in the results. Default is `FALSE`.
`use_internal_data`	Logical. Whether the `enrichKEGG` function should use the KEGG.db package (`TRUE`) or the latest online KEGG data (`FALSE`). Default is `FALSE`.
`heatmap.type`	Character. The type of heatmap visualization to use when the input is a `CellDataSet` object. Options are `"plot_pseudotime_heatmap2"`, `"plot_genes_branched_heatmap2"`, and `"plot_multiple_branches_heatmap2"`.
`...`	Additional arguments passed to plot_pseudotime_heatmap2/plot_genes_branched_heatmap2/plot_multiple_branches_heatmap2 functions.
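
The two-column layout expected for `TERM2GENE` and `TERM2NAME` can be sketched in base R as follows. The term IDs, gene names, and the number of clusters are hypothetical, and the last step only illustrates how a single `topn` value could be expanded to one value per cluster; it is not taken from the package source.

```r
# Hypothetical term-to-gene mapping: column 1 = term, column 2 = gene.
term2gene <- data.frame(
  term = c("SET_A", "SET_A", "SET_B", "SET_B", "SET_B"),
  gene = c("Gene1", "Gene2", "Gene2", "Gene3", "Gene4"),
  stringsAsFactors = FALSE
)

# Hypothetical term-to-name mapping: column 1 = term, column 2 = name.
term2name <- data.frame(
  term = c("SET_A", "SET_B"),
  name = c("Example set A", "Example set B"),
  stringsAsFactors = FALSE
)

# Basic shape checks before passing the tables on.
stopifnot(ncol(term2gene) == 2, ncol(term2name) == 2)
stopifnot(all(term2gene$term %in% term2name$term))

# Expand a single topn value to one value per cluster (3 clusters assumed).
n_clusters <- 3
topn <- 5
topn_per_cluster <- if (length(topn) == 1) rep(topn, n_clusters) else topn
```

Tables built this way would be supplied together with `type = "ownSet"`.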

Value

a data.frame.

Author(s)

JunZhang

This function performs Gene Ontology (GO) or KEGG enrichment analysis, or custom gene set enrichment, on clustered genes. It supports multiple clusters, incorporating cluster-specific results into its analysis.

Generic to access cds count matrix

Description

Generic to access cds count matrix

Usage

exprs(x)

Arguments

`x`	A cell_data_set object.

Value

Count matrix.

Author(s)

https://github.com/cole-trapnell-lab/monocle3

Method to access cds count matrix

Description

Method to access cds count matrix

Usage

## S4 method for signature 'cell_data_set'
exprs(x)

Arguments

`x`	A cell_data_set object.

Value

Count matrix.
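
The split between a generic and a class-specific method can be illustrated with a toy S4 class. The sketch below is an assumption-laden stand-in: `toy_cds` and `toy_exprs` are hypothetical names used so that the example runs without monocle3, and the real cell_data_set class stores its counts differently.

```r
library(methods)

# Toy class standing in for a cell_data_set (hypothetical, for illustration).
setClass("toy_cds", representation(counts = "matrix"))

# Generic: declares the accessor without fixing how any class implements it.
setGeneric("toy_exprs", function(x) standardGeneric("toy_exprs"))

# Method: the implementation dispatched when x is a toy_cds.
setMethod("toy_exprs", "toy_cds", function(x) x@counts)

obj <- new("toy_cds", counts = matrix(1:6, nrow = 2))
toy_exprs(obj)
```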

Test data for this package

Description

Test data for this package.

Usage

exps

Format

An object of class data.frame with 3767 rows and 6 columns.

Author(s)

Junjun Lao

Use filter.std to filter low-expression genes

Description

Use filter.std to filter low-expression genes by their standard deviation.

Usage

filter.std(eset, min.std, visu = TRUE, verbose = TRUE)

Arguments

`eset`	Expression matrix.
`min.std`	Minimum standard deviation used as the filtering threshold.
`visu`	Whether to plot the result. Default is TRUE.
`verbose`	Whether to show filtering information. Default is TRUE.

Value

matrix.
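
A minimal base R sketch of filtering by standard deviation is shown below. It assumes that rows are kept when their standard deviation is strictly greater than the threshold; whether the package uses a strict or non-strict comparison is not stated here, and the toy matrix and object names are hypothetical.

```r
set.seed(42)
# Toy expression matrix: 20 genes (rows) x 5 samples (columns).
eset <- matrix(rnorm(20 * 5), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("s", 1:5)))
eset[1:3, ] <- 7   # three constant rows, so their standard deviation is 0

min_std <- 0
row_sd <- apply(eset, 1, stats::sd)

# Assumption: keep rows whose standard deviation exceeds the threshold.
keep <- row_sd > min_std
filtered <- eset[keep, , drop = FALSE]
dim(filtered)
```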

Determine Optimal Clusters for Gene Expression or Pseudotime Data

Description

Determine Optimal Clusters for Gene Expression or Pseudotime Data

Usage

getClusters(obj = NULL, ...)

Arguments

`obj`	A data object representing the gene expression data or pseudotime data. If the input is a cell_data_set object (e.g., from Monocle3), the function preprocesses the data using pre_pseudotime_matrix. If the input is a numeric matrix or a data.frame, it is used directly. Default is NULL.
`...`	Additional arguments passed to the preprocessing function pre_pseudotime_matrix (e.g., assays, normalize, etc.).

Value

A ggplot object visualizing the Elbow plot, where:

The x-axis represents the number of clusters tested.
The y-axis represents the WSS for each cluster number.

The optimal cluster number can be identified visually at the "elbow point," beyond which adding more clusters yields only small further reductions in WSS.

Author(s)

JunZhang

The getClusters function identifies the optimal number of clusters for a given data object. It supports multiple input types, including gene expression matrices and objects such as cell_data_set. The function implements the Elbow method to evaluate within-cluster sum of squares (WSS) across a range of cluster numbers and visualizes the results.
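
The Elbow method itself can be sketched in a few lines of base R: run k-means for a range of cluster numbers and record the total within-cluster sum of squares for each. The toy data, the range 1 to 8, and the use of `nstart = 10` are assumptions for this example and do not describe the internals of getClusters.

```r
set.seed(5201314)
# Toy data: three well-separated groups of 30 rows each, 4 columns.
mat <- rbind(
  matrix(rnorm(30 * 4, mean = 0), ncol = 4),
  matrix(rnorm(30 * 4, mean = 5), ncol = 4),
  matrix(rnorm(30 * 4, mean = 10), ncol = 4)
)

# Total within-cluster sum of squares (WSS) for each candidate k.
k_range <- 1:8
wss <- sapply(k_range, function(k) {
  stats::kmeans(mat, centers = k, nstart = 10)$tot.withinss
})

elbow <- data.frame(k = k_range, wss = wss)
print(elbow)
```

Plotting `wss` against `k` gives the elbow curve; with data like this the drop in WSS should flatten after three clusters.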

Test data for this package

Description

Test data for this package.

Usage

net

Format

An object of class list of length 10.

Author(s)

Junjun Lao

Return a size-factor normalized and (optionally) log-transformed expression matrix

Description

Return a size-factor normalized and (optionally) log-transformed expression matrix.

Usage

normalized_counts(
  cds,
  norm_method = c("log", "binary", "size_only"),
  pseudocount = 1
)

Arguments

`cds`	A CDS object to calculate normalized expression matrix from.
`norm_method`	String indicating the normalization method. Options are "log" (Default), "binary" and "size_only".
`pseudocount`	A pseudocount to add before log transformation. Ignored if norm_method is not "log". Default is 1.

Value

Size-factor normalized, and optionally log-transformed, expression matrix.
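
The three `norm_method` options can be illustrated on a small dense matrix. In this sketch the size factors are made up, the logarithm is taken to base 10, and "binary" is read as "count greater than zero"; these are assumptions for illustration, not a statement of what the function does internally.

```r
# Toy count matrix: 3 genes (rows) x 4 cells (columns), filled column-wise.
counts <- matrix(c(10, 0, 5,
                   20, 2, 0,
                    5, 1, 1,
                   40, 0, 8),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Hypothetical per-cell size factors.
size_factor <- c(cell1 = 1, cell2 = 2, cell3 = 0.5, cell4 = 4)

# "size_only": divide each column (cell) by its size factor.
size_only <- sweep(counts, 2, size_factor, "/")

# "log": add the pseudocount, then log-transform (base 10 assumed here).
pseudocount <- 1
log_norm <- log10(size_only + pseudocount)

# "binary": whether a gene is detected in a cell at all.
binary <- counts > 0
```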

Author(s)

https://github.com/cole-trapnell-lab/monocle3


Create a heatmap to demonstrate the bifurcation of gene expression along two branches (slightly modified from monocle2)

Description

Returns a heatmap that shows changes in both lineages at the same time. It also requires that you choose a branch point to inspect. Columns are points in pseudotime, rows are genes, and the beginning of pseudotime is in the middle of the heatmap. As you read from the middle of the heatmap to the right, you are following one lineage through pseudotime. As you read left, you follow the other. The genes are clustered hierarchically, so you can visualize modules of genes that have similar lineage-dependent expression patterns.

Usage

plot_genes_branched_heatmap2(
  cds_subset = NULL,
  branch_point = 1,
  branch_states = NULL,
  branch_labels = c("Cell fate 1", "Cell fate 2"),
  cluster_rows = TRUE,
  hclust_method = "ward.D2",
  num_clusters = 6,
  hmcols = NULL,
  branch_colors = c("#979797", "#F05662", "#7990C8"),
  add_annotation_row = NULL,
  add_annotation_col = NULL,
  show_rownames = FALSE,
  use_gene_short_name = TRUE,
  scale_max = 3,
  scale_min = -3,
  norm_method = c("log", "vstExprs"),
  trend_formula = "~sm.ns(Pseudotime, df=3) * Branch",
  return_heatmap = FALSE,
  cores = 1,
  ...
)

Arguments

`cds_subset`	CellDataSet for the experiment (normally only the branching genes detected with branchTest)
`branch_point`	The ID of the branch point to visualize. Can only be used when reduceDimension is called with method = "DDRTree".
`branch_states`	The two states to compare in the heatmap. Mutually exclusive with branch_point.
`branch_labels`	The labels for the branches.
`cluster_rows`	Whether to cluster the rows of the heatmap.
`hclust_method`	The method used by pheatmap to perform hierarchical clustering of the rows.
`num_clusters`	Number of clusters for the heatmap of branch genes.
`hmcols`	The color scheme for drawing the heatmap.
`branch_colors`	The colors used in the annotation strip indicating the pre- and post-branch cells.
`add_annotation_row`	Additional annotations to show for each row in the heatmap. Must be a dataframe with one row for each row in the fData table of cds_subset, with matching IDs.
`add_annotation_col`	Additional annotations to show for each column in the heatmap. Must be a dataframe with one row for each cell in the pData table of cds_subset, with matching IDs.
`show_rownames`	Whether to show the names for each row in the table.
`use_gene_short_name`	Whether to use the short names for each row. If FALSE, uses row IDs from the fData table.
`scale_max`	The maximum value (in standard deviations) to show in the heatmap. Values larger than this are set to the max.
`scale_min`	The minimum value (in standard deviations) to show in the heatmap. Values smaller than this are set to the min.
`norm_method`	Determines how to transform expression values prior to rendering
`trend_formula`	A formula string specifying the model used in fitting the spline curve for each gene/feature.
`return_heatmap`	Whether to return the pheatmap object to the user.
`cores`	Number of cores to use when smoothing the expression curves shown in the heatmap.
`...`	Additional arguments passed to buildBranchCellDataSet

Value

A list of heatmap_matrix (expression matrix for the branch commitment), ph (pheatmap heatmap object), annotation_row (annotation data.frame for the rows), and annotation_col (annotation data.frame for the columns).

Create a heatmap to demonstrate the bifurcation of gene expression along multiple branches

Description

Create a heatmap to demonstrate the bifurcation of gene expression along multiple branches

Usage

plot_multiple_branches_heatmap2(
  cds = NULL,
  branches,
  branches_name = NULL,
  cluster_rows = TRUE,
  hclust_method = "ward.D2",
  num_clusters = 6,
  hmcols = NULL,
  add_annotation_row = NULL,
  add_annotation_col = NULL,
  show_rownames = FALSE,
  use_gene_short_name = TRUE,
  norm_method = c("vstExprs", "log"),
  scale_max = 3,
  scale_min = -3,
  trend_formula = "~sm.ns(Pseudotime, df=3)",
  return_heatmap = FALSE,
  cores = 1
)

Arguments

`cds`	CellDataSet for the experiment (normally only the branching genes detected with BEAM)
`branches`	The terminal branches (states) on the developmental tree you want to investigate.
`branches_name`	Name (for example, cell type) of branches you believe the cells on the branches are associated with.
`cluster_rows`	Whether to cluster the rows of the heatmap.
`hclust_method`	The method used by pheatmap to perform hierarchical clustering of the rows.
`num_clusters`	Number of clusters for the heatmap of branch genes.
`hmcols`	The color scheme for drawing the heatmap.
`add_annotation_row`	Additional annotations to show for each row in the heatmap. Must be a dataframe with one row for each row in the fData table of cds, with matching IDs.
`add_annotation_col`	Additional annotations to show for each column in the heatmap. Must be a dataframe with one row for each cell in the pData table of cds, with matching IDs.
`show_rownames`	Whether to show the names for each row in the table.
`use_gene_short_name`	Whether to use the short names for each row. If FALSE, uses row IDs from the fData table.
`norm_method`	Determines how to transform expression values prior to rendering
`scale_max`	The maximum value (in standard deviations) to show in the heatmap. Values larger than this are set to the max.
`scale_min`	The minimum value (in standard deviations) to show in the heatmap. Values smaller than this are set to the min.
`trend_formula`	A formula string specifying the model used in fitting the spline curve for each gene/feature.
`return_heatmap`	Whether to return the pheatmap object to the user.
`cores`	Number of cores to use when smoothing the expression curves shown in the heatmap.

Value

Plot a pseudotime-ordered, row-centered heatmap (slightly modified from monocle2)

Description

The function plot_pseudotime_heatmap2 takes a CellDataSet object (usually containing only a subset of significant genes) and generates smooth expression curves much like plot_genes_in_pseudotime. Then, it clusters these genes and plots them using the pheatmap package. This allows you to visualize modules of genes that co-vary across pseudotime.

Usage

plot_pseudotime_heatmap2(
  cds_subset,
  cluster_rows = TRUE,
  hclust_method = "ward.D2",
  num_clusters = 6,
  hmcols = NULL,
  add_annotation_row = NULL,
  add_annotation_col = NULL,
  show_rownames = FALSE,
  use_gene_short_name = TRUE,
  norm_method = c("log", "vstExprs"),
  scale_max = 3,
  scale_min = -3,
  trend_formula = "~sm.ns(Pseudotime, df=3)",
  return_heatmap = FALSE,
  cores = 1
)

Arguments

`cds_subset`	CellDataSet for the experiment (normally only the branching genes detected with branchTest)
`cluster_rows`	Whether to cluster the rows of the heatmap.
`hclust_method`	The method used by pheatmap to perform hierarchical clustering of the rows.
`num_clusters`	Number of clusters for the heatmap of branch genes.
`hmcols`	The color scheme for drawing the heatmap.
`add_annotation_row`	Additional annotations to show for each row in the heatmap. Must be a dataframe with one row for each row in the fData table of cds_subset, with matching IDs.
`add_annotation_col`	Additional annotations to show for each column in the heatmap. Must be a dataframe with one row for each cell in the pData table of cds_subset, with matching IDs.
`show_rownames`	Whether to show the names for each row in the table.
`use_gene_short_name`	Whether to use the short names for each row. If FALSE, uses row IDs from the fData table.
`norm_method`	Determines how to transform expression values prior to rendering
`scale_max`	The maximum value (in standard deviations) to show in the heatmap. Values larger than this are set to the max.
`scale_min`	The minimum value (in standard deviations) to show in the heatmap. Values smaller than this are set to the min.
`trend_formula`	A formula string specifying the model used in fitting the spline curve for each gene/feature.
`return_heatmap`	Whether to return the pheatmap object to the user.
`cores`	Number of cores to use when smoothing the expression curves shown in the heatmap.
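
The roles of `scale_min`, `scale_max`, `hclust_method`, and `num_clusters` can be illustrated with a base R sketch: z-score each row, clip the values to the chosen range, cluster the rows hierarchically, and cut the tree into the requested number of groups. The toy matrix and the use of Euclidean distance are assumptions for this example; the distance actually used by the function may differ.

```r
set.seed(7)
# Toy matrix of smoothed expression: 40 genes (rows) x 25 pseudotime bins.
m <- matrix(rnorm(40 * 25), nrow = 40,
            dimnames = list(paste0("gene", 1:40), NULL))

# Row-center and scale, then clip values outside [scale_min, scale_max].
scale_min <- -3
scale_max <- 3
z <- t(scale(t(m)))
z[z > scale_max] <- scale_max
z[z < scale_min] <- scale_min

# Hierarchical clustering of rows with Ward's method, cut into num_clusters.
num_clusters <- 6
hc <- stats::hclust(stats::dist(z), method = "ward.D2")
gene_cluster <- stats::cutree(hc, k = num_clusters)
table(gene_cluster)
```

Clipping keeps a few extreme values from dominating the color scale, which is why the limits are expressed in standard deviations.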

Value

Calculate and return a smoothed pseudotime matrix for the given gene list

Description

This function takes in a monocle3 object and returns a smoothed pseudotime matrix for the given gene list, either in counts or normalized form. The function first matches the gene list with the rownames of the SummarizedExperiment object, and then orders the pseudotime information. The function then uses smooth.spline to apply smoothing to the data. Finally, the function normalizes the data by subtracting the mean and dividing by the standard deviation for each row.

Usage

pre_pseudotime_matrix(
  cds_obj = NULL,
  assays = c("counts", "normalized"),
  gene_list = NULL
)

Arguments

`cds_obj`	A monocle3 object
`assays`	Type of assay to be used for the analysis, either "counts" or "normalized"
`gene_list`	A vector of gene names

Value

A smoothed pseudotime matrix for the given gene list
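
The sequence described above (order cells by pseudotime, smooth each gene with smooth.spline, then z-score each row) can be sketched on simulated data as follows. The simulated pseudotime values, the two toy genes, and the choice of `df = 3` are assumptions for this example rather than the package's actual settings.

```r
set.seed(3)
n_cells <- 50
pt <- runif(n_cells, 0, 10)   # hypothetical pseudotime value for each cell
expr <- rbind(
  geneA = sin(pt / 2) + rnorm(n_cells, sd = 0.2),
  geneB = pt / 10 + rnorm(n_cells, sd = 0.2)
)

# Order the cells (columns) by pseudotime.
ord <- order(pt)
pt_ord <- pt[ord]
expr_ord <- expr[, ord, drop = FALSE]

# Smooth each gene along pseudotime; df = 3 is an assumption.
smoothed <- t(apply(expr_ord, 1, function(y) {
  fit <- stats::smooth.spline(pt_ord, y, df = 3)
  stats::predict(fit, pt_ord)$y
}))

# Row-wise z-score: subtract the row mean and divide by the row SD.
z <- t(scale(t(smoothed)))
dim(z)
```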

Prepare scRNA Data for ClusterGVis Analysis

Description

This function prepares single-cell RNA sequencing (scRNA-seq) data for differential gene expression analysis. It extracts the expression data for the specified cells and genes, and organizes them into a dataframe format suitable for downstream analysis.

Usage

prepareDataFromscRNA(
  object = NULL,
  diffData = NULL,
  showAverage = TRUE,
  cells = NULL,
  group.by = "ident",
  assays = "RNA",
  slot = "data",
  scale.data = TRUE,
  cluster.order = NULL,
  keep.uniqGene = TRUE,
  sep = "_"
)

Arguments

`object`	an object of class Seurat containing the scRNA-seq data.
`diffData`	a dataframe containing the differential expression results, such as the output of the function FindAllMarkers.
`showAverage`	a logical indicating whether to show the average gene expression across all cells.
`cells`	a vector of cell names to extract from the Seurat object. If NULL, all cells will be used.
`group.by`	a string specifying the grouping variable for differential expression analysis. Default is 'ident', which groups cells by their assigned clusters.
`assays`	a string or vector of strings specifying the assay(s) to extract from the Seurat object. Default is 'RNA'.
`slot`	a string specifying the slot name where the assay data is stored in the Seurat object. Default is 'data'.
`scale.data`	a logical indicating whether to z-score the expression data. Default is TRUE.
`cluster.order`	the order of the cell types.
`keep.uniqGene`	a logical indicating whether to keep only unique gene names. Default is TRUE.
`sep`	a character string to separate gene and cell names in the output dataframe. Default is "_".

Value

a dataframe containing the expression data for the specified genes and cells, organized in a format suitable for differential gene expression analysis.
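
The idea of averaging expression within each group of cells and then z-scoring the result can be sketched without a Seurat object. Everything in the block below is hypothetical (the toy matrix, the `ident` vector, and the object names); it illustrates the concept only, not the function's implementation.

```r
set.seed(11)
# Toy normalized expression: 4 genes (rows) x 12 cells (columns).
expr <- matrix(rexp(4 * 12), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("cell", 1:12)))

# Hypothetical cluster identity for each cell.
ident <- rep(c("A", "B", "C"), each = 4)

# Average each gene within each cluster: result is genes x clusters.
avg <- sapply(split(seq_len(ncol(expr)), ident), function(idx) {
  rowMeans(expr[, idx, drop = FALSE])
})

# Row-wise z-score of the averaged matrix.
avg_z <- t(scale(t(avg)))
avg_z
```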

Generic to extract pseudotime from CDS object

Description

Generic to extract pseudotime from CDS object

Usage

pseudotime(x, reduction_method = "UMAP")

Arguments

`x`	A cell_data_set object.
`reduction_method`	Reduced dimension to extract pseudotime for.

Value

Pseudotime values.

Author(s)

https://github.com/cole-trapnell-lab/monocle3

Method to extract pseudotime from CDS object

Description

Method to extract pseudotime from CDS object

Usage

## S4 method for signature 'cell_data_set'
pseudotime(x, reduction_method = "UMAP")

Arguments

`x`	A cell_data_set object.
`reduction_method`	Reduced dimension to extract pseudotime for.

Value

Pseudotime values.

Test data for this package

Description

Test data for this package.

Usage

sig_gene_names

Format

An object of class character of length 1331.

Author(s)

JunZhang

Get the size factors from a cds object.

Description

A wrapper around colData(cds)$Size_Factor

Usage

size_factors(cds)

Arguments

`cds`	A cell_data_set object.

Value

The size factors stored in colData(cds)$Size_Factor.

Test data for this package

Description

Test data for this package.

Usage

termanno

Format

An object of class data.frame with 24 rows and 2 columns.

Author(s)

Junjun Lao

Test data for this package

Description

Test data for this package.

Usage

termanno2

Format

An object of class data.frame with 24 rows and 3 columns.

Author(s)

Junjun Lao

traverseTree function

Description

traverseTree function

Usage

traverseTree(g, starting_cell, end_cells)

Arguments

`g`	NULL
`starting_cell`	NULL
`end_cells`	NULL

Use visCluster to visualize clustering results from clusterData and enrichCluster output

Description

Visualize Clustered Gene Data Using Line Plots and Heatmaps

Usage

visCluster(
  object = NULL,
  ht.col.list = list(col_range = c(-2, 0, 2), col_color = c("#08519C", "white",
    "#A50F15")),
  border = TRUE,
  plot.type = c("line", "heatmap", "both"),
  ms.col = c("#0099CC", "grey90", "#CC3333"),
  line.size = 0.1,
  line.col = "grey90",
  add.mline = TRUE,
  mline.size = 2,
  mline.col = "#CC3333",
  ncol = 4,
  ctAnno.col = NULL,
  set.md = "median",
  textbox.pos = c(0.5, 0.8),
  textbox.size = 8,
  panel.arg = c(2, 0.25, 4, "grey90", NA),
  ggplot.panel.arg = c(2, 0.25, 4, "grey90", NA),
  annoTerm.data = NULL,
  annoTerm.mside = "right",
  termAnno.arg = c("grey95", "grey50"),
  add.bar = FALSE,
  bar.width = 8,
  textbar.pos = c(0.8, 0.8),
  go.col = NULL,
  go.size = NULL,
  by.go = "anno_link",
  annoKegg.data = NULL,
  annoKegg.mside = "right",
  keggAnno.arg = c("grey95", "grey50"),
  add.kegg.bar = FALSE,
  kegg.col = NULL,
  kegg.size = NULL,
  by.kegg = "anno_link",
  word_wrap = TRUE,
  add_new_line = TRUE,
  add.box = FALSE,
  boxcol = NULL,
  box.arg = c(0.1, "grey50"),
  add.point = FALSE,
  point.arg = c(19, "orange", "orange", 1),
  add.line = TRUE,
  line.side = "right",
  markGenes = NULL,
  markGenes.side = "right",
  genes.gp = c("italic", 10, NA),
  term.text.limit = c(10, 18),
  mulGroup = NULL,
  lgd.label = NULL,
  show_row_names = FALSE,
  subgroup.anno = NULL,
  annnoblock.text = TRUE,
  annnoblock.gp = c("white", 8),
  add.sampleanno = TRUE,
  sample.group = NULL,
  sample.col = NULL,
  sample.order = NULL,
  cluster.order = NULL,
  sample.cell.order = NULL,
  HeatmapAnnotation = NULL,
  column.split = NULL,
  cluster_columns = FALSE,
  pseudotime_col = NULL,
  gglist = NULL,
  row_annotation_obj = NULL,
  ...
)

Arguments

`object`	clusterData object, default NULL.
`ht.col.list`	list of heatmap col_range and col_color, default list(col_range = c(-2, 0, 2),col_color = c("#08519C", "white", "#A50F15")).
`border`	whether to add a border to the heatmap, default TRUE.
`plot.type`	the plot type to draw, one of "line", "heatmap", or "both".
`ms.col`	membership line colors for Mfuzz clustering results, default c('#0099CC','grey90','#CC3333').
`line.size`	line size for line plot, default 0.1.
`line.col`	line color for line plot, default "grey90".
`add.mline`	whether to add a median line to the plot, default TRUE.
`mline.size`	median line size, default 2.
`mline.col`	median line color, default "#CC3333".
`ncol`	the columns for facet plot with line plot, default 4.
`ctAnno.col`	the heatmap cluster annotation bar colors, default NULL.
`set.md`	the method used for the representative line on the heatmap-line plot ("mean" or "median"), default "median".
`textbox.pos`	the relative position of text in left-line plot, default c(0.5,0.8).
`textbox.size`	the text size of the text in left-line plot, default 8.
`panel.arg`	the settings for the left-line panel, which are panel size, gap, width, fill, and col, default c(2,0.25,4,"grey90",NA).
`ggplot.panel.arg`	the settings for the ggplot2 object plot panel, which are panel size, gap, width, fill, and col, default c(2,0.25,4,"grey90",NA).
`annoTerm.data`	the GO term annotation for the clusters, default NULL.
`annoTerm.mside`	the wider GO term annotation box side, default "right".
`termAnno.arg`	the settings for GO term panel annotations which are fill and col, default c("grey95","grey50").
`add.bar`	whether add bar plot for GO enrichment, default FALSE.
`bar.width`	the GO enrichment bar width, default 8.
`textbar.pos`	the barplot text relative position, default c(0.8,0.8).
`go.col`	the GO term text colors, default NULL.
`go.size`	the GO term text size (numeric or "pval"), default NULL.
`by.go`	the GO term text box style ("anno_link" or "anno_block"), default "anno_link".
`annoKegg.data`	the KEGG term annotation for the clusters, default NULL.
`annoKegg.mside`	the wider KEGG term annotation box side, default "right".
`keggAnno.arg`	the settings for KEGG term panel annotations which are fill and col, default c("grey95","grey50").
`add.kegg.bar`	whether to add a bar plot for KEGG enrichment, default FALSE.
`kegg.col`	the KEGG term text colors, default NULL.
`kegg.size`	the KEGG term text size (numeric or "pval"), default NULL.
`by.kegg`	the KEGG term text box style ("anno_link" or "anno_block"), default "anno_link".
`word_wrap`	whether to wrap the text, default TRUE.
`add_new_line`	whether to add a new line when the text is long, default TRUE.
`add.box`	whether to add a boxplot, default FALSE.
`boxcol`	the box fill colors, default NULL.
`box.arg`	the boxplot width and border color, default c(0.1,"grey50").
`add.point`	whether to add points, default FALSE.
`point.arg`	the point shape, fill, color and size, default c(19,"orange","orange",1).
`add.line`	whether to add a line, default TRUE.
`line.side`	the line annotation side, default "right".
`markGenes`	the gene names to be labeled on the plot, default NULL.
`markGenes.side`	the gene label side, default "right".
`genes.gp`	gene labels graphics settings, default c('italic',10,NA).
`term.text.limit`	the GO term text size limit, default c(10,18).
`mulGroup`	to draw multiple line annotations, supply the group numbers as a vector, default NULL.
`lgd.label`	the line annotation legend labels, default NULL.
`show_row_names`	whether to show row names, default FALSE.
`subgroup.anno`	the sub-clusters to annotate, supplied as sub-cluster ids, default NULL.
`annnoblock.text`	whether to add cluster numbers to the right block annotation, default TRUE.
`annnoblock.gp`	right block annotation text color and size, default c("white",8).
`add.sampleanno`	whether to add a column annotation, default TRUE.
`sample.group`	the column sample groups, default NULL.
`sample.col`	column annotation colors, default NULL.
`sample.order`	the order of the column samples, default NULL.
`cluster.order`	a user-defined order for the row clusters, default NULL.
`sample.cell.order`	the cell type order when the input is scRNA data and "showAverage = FALSE" is set in prepareDataFromscRNA.
`HeatmapAnnotation`	the 'HeatmapAnnotation' object from 'ComplexHeatmap' when you have multiple annotations, default NULL.
`column.split`	how to split the columns when multiple column annotations are supplied, default NULL.
`cluster_columns`	whether to cluster the columns, default FALSE.
`pseudotime_col`	the branch colors for monocle input data.
`gglist`	a list of ggplot objects used to annotate each cluster, default NULL.
`row_annotation_obj`	row annotation for the heatmap: a `ComplexHeatmap::rowAnnotation()` object when "markGenes.side" or "line.side" is "right", otherwise a list of named vectors.
`...`	other arguments passed to the Heatmap function.

Details

This function visualizes clustered gene expression data as line plots, heatmaps, or a combination of both, using the ComplexHeatmap and ggplot2 frameworks. Gene annotations, sample annotations, and additional features like custom color schemes and annotations for GO/KEGG terms are supported for visualization.
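
As a rough illustration of the heatmap mode, the sketch below passes a custom color scale through `ht.col.list`, using the same `col_range` / `col_color` structure as the documented default. The color names are arbitrary choices made for this example, and the `requireNamespace()` guard is only there so the snippet is skipped when the package is not installed.

```r
# Illustrative color scale: same structure as the documented default,
# but the three colors are arbitrary choices for this sketch.
ht_cols <- list(
  col_range = c(-2, 0, 2),
  col_color = c("navy", "white", "firebrick")
)

if (requireNamespace("ClusterGVis", quietly = TRUE)) {
  library(ClusterGVis)
  data("exps", package = "ClusterGVis")

  # cluster the example data with k-means, as in the Examples section
  cm <- clusterData(obj = exps,
                    cluster.method = "kmeans",
                    cluster.num = 8)

  # draw the heatmap only, with the custom color scale and a border
  visCluster(object = cm,
             plot.type = "heatmap",
             ht.col.list = ht_cols,
             border = TRUE)
}
```

`col_range` and `col_color` should have the same length, since each break point is paired with one color.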

Value

a ggplot2 or Heatmap object.

Author(s)

JunZhang

Examples


data("exps")

# kmeans
cm <- clusterData(obj = exps,
                  cluster.method = "kmeans",
                  cluster.num = 8)

# plot
visCluster(object = cm,
           plot.type = "line")
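
The combined heatmap-and-line layout can be sketched in the same way. The example below is a minimal sketch, assuming that `exps` carries gene names as row names; `pick_mark_genes()` is a hypothetical helper written for this example (it is not part of the package) that draws a reproducible random set of genes to label through `markGenes`.

```r
# Hypothetical helper (not part of ClusterGVis): pick up to n row names
# at random, reproducibly, to use as labeled genes.
pick_mark_genes <- function(mat, n = 10, seed = 1) {
  set.seed(seed)
  sample(rownames(mat), size = min(n, nrow(mat)))
}

if (requireNamespace("ClusterGVis", quietly = TRUE)) {
  library(ClusterGVis)
  data("exps", package = "ClusterGVis")

  cm <- clusterData(obj = exps,
                    cluster.method = "kmeans",
                    cluster.num = 8)

  # assumes exps has gene names as row names
  mark <- pick_mark_genes(exps, n = 10)

  # heatmap plus line annotation, with the selected genes labeled
  visCluster(object = cm,
             plot.type = "both",
             markGenes = mark)
}
```

In practice `markGenes` would usually be a curated vector of genes of interest rather than a random sample.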

