Phenograph

Clusters high-dimensional data. An R wrapper around the Python PhenoGraph module found at https://github.com/jacoblevine/PhenoGraph

phenograph(
  rdf,
  k = 30,
  directed = FALSE,
  prune = FALSE,
  min_cluster_size = 10,
  jaccard = TRUE,
  primary_metric = "euclidean",
  n_jobs = NULL,
  q_tol = 0.001,
  louvain_time_limit = 2000,
  nn_method = "kdtree"
)

Arguments

rdf: Data to cluster, or a sparse matrix of a k-nearest-neighbor graph. If an ndarray, an n-by-d array of n cells in d dimensions. If a sparse matrix, an n-by-n adjacency matrix.
k: Number of nearest neighbors to use in first step of graph construction (default = 30)
directed: Whether to use a symmetric (default) or asymmetric ("directed") graph. The graph construction process produces a directed graph, which is symmetrized by one of two methods (see prune).
prune: Whether to symmetrize by taking the average (prune=FALSE) or product (prune=TRUE) between the graph and its transpose
min_cluster_size: Cells that end up in a cluster smaller than min_cluster_size are considered outliers and are assigned to -1 in the cluster labels
jaccard: If TRUE, use Jaccard metric between k-neighborhoods to build graph. If FALSE, use a Gaussian kernel.
primary_metric: Distance metric used to define nearest neighbors. Options include 'euclidean', 'manhattan', 'correlation', and 'cosine'. Note that performance will be slower for correlation and cosine.
n_jobs: Nearest Neighbors and Jaccard coefficients will be computed in parallel using n_jobs. If n_jobs=NULL, the number of jobs is determined automatically
q_tol: Tolerance (i.e., precision) for monitoring modularity optimization
louvain_time_limit: Maximum number of seconds to run modularity optimization. If exceeded, the best result so far is returned.
nn_method: Whether to use brute force or kdtree for nearest neighbor search. For very large high-dimensional data sets, brute force (with parallel computation) performs faster than kdtree.
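
The two symmetrization options described under directed and prune can be illustrated on a toy graph. The following is a minimal base R sketch, not the module's own implementation; the 3-node weight matrix W and the names W_avg and W_prod are made up for illustration.

```r
# Toy directed graph on 3 nodes: W[i, j] is the weight of the edge i -> j.
# (Hypothetical values, chosen only to show the difference between methods.)
W <- matrix(c(0,   0.5, 0,
              0.3, 0,   0.2,
              0,   0,   0),
            nrow = 3, byrow = TRUE)

# prune = FALSE: average the graph with its transpose.
# An edge is kept if it exists in at least one direction.
W_avg <- (W + t(W)) / 2

# prune = TRUE: elementwise product of the graph with its transpose.
# An edge is kept only if it exists in both directions.
W_prod <- W * t(W)

print(W_avg)
print(W_prod)
```

In this toy case the one-way edge from node 2 to node 3 keeps a nonzero weight under averaging but drops to zero under the product, which is why the product variant acts as pruning.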

Value

data.frame with community membership information
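
Examples

A hedged usage sketch on simulated data. It assumes that the package providing phenograph() is already attached, that its Python dependency is installed, and that rdf accepts a numeric data frame; none of these are confirmed on this page. The object names x and res are illustrative.

```r
# Simulated input: 200 cells in 5 dimensions, drawn from two shifted groups.
set.seed(1)
x <- as.data.frame(rbind(
  matrix(rnorm(100 * 5, mean = 0), ncol = 5),
  matrix(rnorm(100 * 5, mean = 4), ncol = 5)
))

# Assumption: phenograph() is available in the session; otherwise skip the call.
if (exists("phenograph", mode = "function")) {
  res <- phenograph(x, k = 30, min_cluster_size = 10)
  # The column layout of the returned data.frame is not documented here,
  # so inspect it before relying on particular column names.
  str(res)
}
```

Cells placed in clusters smaller than min_cluster_size are labeled -1, so it may be worth checking for that value before downstream analysis.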