R/phenograph.R
phenograph.Rd
Phenograph
Clusters high-dimensional data. An R wrapper around the Python PhenoGraph module found at https://github.com/jacoblevine/PhenoGraph.
phenograph(
rdf,
k = 30,
directed = FALSE,
prune = FALSE,
min_cluster_size = 10,
jaccard = TRUE,
primary_metric = "euclidean",
n_jobs = NULL,
q_tol = 0.001,
louvain_time_limit = 2000,
nn_method = "kdtree"
)
rdf: Data to cluster, or a sparse matrix of the k-nearest-neighbor graph. If an ndarray, an n-by-d array of n cells in d dimensions; if a sparse matrix, an n-by-n adjacency matrix.
k: Number of nearest neighbors to use in the first step of graph construction (default = 30).
directed: Whether to use a symmetric (default) or asymmetric ("directed") graph. The graph construction process produces a directed graph, which is symmetrized by one of two methods (see prune below).
prune: Whether to symmetrize by taking the average (prune = FALSE) or product (prune = TRUE) of the graph and its transpose (see the sketch after this argument list).
min_cluster_size: Cells that end up in a cluster smaller than min_cluster_size are considered outliers and are assigned -1 in the cluster labels.
jaccard: If TRUE, use the Jaccard metric between k-neighborhoods to build the graph. If FALSE, use a Gaussian kernel.
primary_metric: Distance metric used to define nearest neighbors. Options include "euclidean", "manhattan", "correlation", and "cosine". Note that performance will be slower for correlation and cosine.
n_jobs: Nearest neighbors and Jaccard coefficients are computed in parallel using n_jobs jobs. If n_jobs = NULL, the number of jobs is determined automatically.
q_tol: Tolerance (i.e., precision) for monitoring modularity optimization.
louvain_time_limit: Maximum number of seconds to run modularity optimization. If exceeded, the best result found so far is returned.
nn_method: Whether to use brute force or kdtree for nearest-neighbor search. For very large, high-dimensional data sets, brute force (with parallel computation) performs faster than kdtree.
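The two symmetrization modes controlled by prune can be pictured with a small R sketch (illustrative only; A is a hypothetical directed affinity matrix, and the actual symmetrization happens inside the Python PhenoGraph module):

A <- matrix(runif(25), nrow = 5)   # hypothetical directed kNN affinity matrix
sym_avg  <- (A + t(A)) / 2         # prune = FALSE: average of graph and its transpose
sym_prod <- A * t(A)               # prune = TRUE: element-wise product, dropping one-way edges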
A data.frame with community membership information.
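Example (a minimal sketch on simulated data; the object names are hypothetical, the Python PhenoGraph module is assumed to be available to the wrapper, and the membership column name in the returned data.frame should be checked on your installation):

set.seed(1)
expr <- as.data.frame(matrix(rnorm(1000 * 20), nrow = 1000))  # 1000 cells x 20 markers

res <- phenograph(
  rdf = expr,
  k = 30,                        # nearest neighbors for graph construction
  min_cluster_size = 10,         # clusters smaller than this are labeled -1
  primary_metric = "euclidean",
  nn_method = "kdtree"
)

str(res)   # inspect the returned community membership data.frame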