Cluster mapping data into mineral species
cluster_xmap(
xmap,
centers,
elements = intersect(names(xmap), colnames(centers)),
saving = TRUE,
suffix = "_.*",
...
)
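As a minimal sketch of how the default for elements in the usage above evaluates, the snippet below uses a plain named list and a small matrix as hypothetical stand-ins for a qm_xmap object and a centers matrix. The element names, cluster names, and values are invented for illustration and do not come from the package.

```r
# Hypothetical stand-in for a qm_xmap object: a named list of element maps.
xmap <- list(Si = c(120, 80), Mg = c(30, 95), Fe = c(10, 40))

# Hypothetical initial centers: 2 clusters (rows) by 3 features (columns).
centers <- matrix(
  c(110, 12, 5,
     85, 38, 7),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Qtz", "Ol"), c("Si", "Fe", "Ca"))
)

# Same expression as the default of `elements`:
# keep only the elements present in both inputs.
elements <- intersect(names(xmap), colnames(centers))
print(elements)
#> [1] "Si" "Fe"
```

Here Mg is dropped because it has no column in centers, and Ca is dropped because it has no map in xmap.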
Arguments
xmap |
A qm_xmap class object returned by read_xmap() |
centers |
A c-by-p matrix or data.frame (c clusters and p features) giving initial
guesses of the cluster centers (centroids), either returned by
find_centers() or prepared manually.
See find_centers(). |
elements |
A character vector choosing the elements to be used in cluster analysis.
The default selects as many elements as possible,
i.e. those present in both xmap and centers. |
saving |
Whether to save the result (TRUE or FALSE).
Specifying xte forces saving to FALSE.
|
suffix |
A regular expression matching the suffix of cluster names.
Clusters sharing the same prefix form a super cluster.
For example, "Pl_NaRich" and "Pl_NaPoor" are merged into a "Pl" cluster if
suffix = "_.*" (default). |
... |
Arguments passed on to PoiClaClu::Classify
xte An m-by-p data matrix: m test observations and p features. The classifier
fit on the training data set x will be tested on this data set. If NULL,
then testing will be performed on the training set.
rho Tuning parameter controlling the amount of soft thresholding performed,
i.e. the level of sparsity (the number of nonzero features in the
classifier). rho = 0 means that there is no soft thresholding, i.e. all
features are used in the classifier. Larger rho means that fewer features will
be used.
beta A smoothing term. A Gamma(beta, beta) prior is used to fit the
Poisson model. It is recommended to leave it at 1, the default value.
rhos A vector of tuning parameters that control the amount of soft thresholding
performed. If "rhos" is provided then a number of
models will be fit (one for each element of "rhos"), and a number of
predicted class labels will be output (one for each element of "rhos").
type How should the observations be normalized within the
Poisson model, i.e. how should the size factors be estimated?
Options are "quantile" or "deseq" (more robust) or "mle" (less
robust). In greater detail: "quantile" is the quantile normalization approach
of Bullard et al. 2010 BMC Bioinformatics, "deseq" is the median of the
ratio of an observation to a pseudoreference obtained by taking the
geometric mean, described in Anders and Huber 2010 Genome Biology and
implemented in Bioconductor package "DESeq", and "mle" is the sum of
counts for each sample; this is the maximum likelihood estimate
under a simple Poisson model.
prior Vector of length equal to the number of classes, representing prior
probabilities for each class. If NULL then uniform priors are used (i.e. each
class is equally likely).
transform Should data matrices x and xte first be power transformed so that
they more closely fit the Poisson model? TRUE or FALSE. Power transformation
is especially useful if the data are overdispersed relative to the Poisson
model.
alpha If transform = TRUE, this determines the power to which the data
matrices x and xte are transformed. If alpha = NULL then the transformation
that makes the Poisson model best fit the data matrix x is computed. (Note
that alpha is computed based on x, not based on xte.) Alternatively, the user
can supply a value of alpha with 0 < alpha <= 1.
|
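The effect of the suffix argument described above can be illustrated with base R alone. This is a sketch of the naming rule only, not the package's implementation: the cluster names are illustrative, and anchoring the pattern at the end of the name with "$" is an assumption.

```r
# Cluster names as they might appear in `centers` (illustrative values).
clusters <- c("Pl_NaRich", "Pl_NaPoor", "Qtz", "Ol_1", "Ol_2")

suffix <- "_.*"  # default value of the `suffix` argument

# Assumption: the suffix is stripped where it matches at the end of a name;
# names without a match (e.g. "Qtz") are left unchanged.
super_clusters <- sub(paste0(suffix, "$"), "", clusters)

print(super_clusters)
#> [1] "Pl"  "Pl"  "Qtz" "Ol"  "Ol"

# Distinct super clusters after merging
print(unique(super_clusters))
#> [1] "Pl"  "Qtz" "Ol"
```

Because "_.*" matches from the first underscore onward, a name such as "Pl_Na_Rich" would also collapse to "Pl"; a narrower pattern would be needed if underscores are meaningful inside the prefix.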