R/zeitzeiger_predict.R
zeitzeigerBatch.Rd
Train and test a predictor on multiple datasets independently, using sva::ComBat() to correct for batch effects prior to running zeitzeiger().
This function requires the metapredict package.
zeitzeigerBatch(
  ematList,
  trainStudyNames,
  sampleMetadata,
  studyColname,
  batchColname,
  timeColname,
  nKnots = 3,
  nTime = 10,
  useSpc = TRUE,
  sumabsv = 2,
  orth = TRUE,
  nSpc = 2,
  timeRange = seq(0, 1 - 0.01, 0.01),
  covariateName = NA,
  featuresExclude = NULL,
  dopar = TRUE
)
Argument | Description |
---|---|
ematList | Named list of matrices of measurements, one for each dataset, some of which will be for training, others for testing. Each matrix should have rownames corresponding to sample names and colnames corresponding to feature names. |
trainStudyNames | Character vector of names in |
sampleMetadata | data.frame containing relevant information for each sample across all datasets. Must have a column named |
studyColname | Name of column in |
batchColname | Name of column in |
timeColname | Name of column in |
nKnots | Number of internal knots to use for the periodic smoothing spline. |
nTime | Number of time-points by which to discretize the time-dependent behavior of each feature. Corresponds to the number of rows in the matrix for which the SPCs will be calculated. |
useSpc | Logical indicating whether to use |
sumabsv | L1-constraint on the SPCs, passed to |
orth | Logical indicating whether to require left singular vectors be orthogonal to each other, passed to |
nSpc | Vector of the number of SPCs to use for prediction. If |
timeRange | Vector of values of the periodic variable at which to calculate likelihood. The time with the highest likelihood is used as the initial value for the MLE optimizer. |
covariateName | Name of column(s) in |
featuresExclude | Named list of character vectors corresponding to features to exclude from being used for prediction for the respective test datasets. |
dopar | Logical indicating whether to process the folds in parallel. Use |
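
As a rough illustration of the input shapes described in the arguments above, the following base R sketch builds a toy ematList and a matching sampleMetadata. The study names, feature names, and the column names sample, study, batch, and time are assumptions made for this example and are not taken from this page. Using one batch per study and expressing time on a scale from 0 to 1 are likewise assumptions, and the measurements are random numbers.

```r
set.seed(1)

# Hypothetical study and feature names, chosen only for illustration.
studyNames = c('studyA', 'studyB', 'studyC')
featureNames = paste0('gene', 1:5)
nSamplesPerStudy = 4

# One matrix per dataset: rows are samples, columns are features.
ematList = lapply(studyNames, function(study) {
  emat = matrix(rnorm(nSamplesPerStudy * length(featureNames)),
                nrow = nSamplesPerStudy)
  rownames(emat) = paste0(study, '_s', seq_len(nSamplesPerStudy))
  colnames(emat) = featureNames
  emat
})
names(ematList) = studyNames

# One row per sample across all datasets. The column names are assumptions,
# as is treating each study as its own batch.
sampleMetadata = data.frame(
  sample = unlist(lapply(ematList, rownames), use.names = FALSE),
  study = rep(studyNames, each = nSamplesPerStudy),
  batch = rep(studyNames, each = nSamplesPerStudy),
  time = runif(length(studyNames) * nSamplesPerStudy),
  stringsAsFactors = FALSE
)

# Datasets named here would be used for training; the rest for testing.
trainStudyNames = c('studyA', 'studyB')

# Default grid from the usage: 100 values from 0 to 0.99.
timeRange = seq(0, 1 - 0.01, 0.01)
```

With objects like these, every sample name in the matrices appears exactly once in sampleMetadata, which is the correspondence the argument descriptions rely on.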
List of output from zeitzeigerSpc(), one for each test dataset.
3-D array of likelihood, with dimensions for each test observation (across all datasets), each element of nSpc, and each element of timeRange.
List (for each element in nSpc) of lists (for each test observation) of mle2 objects.
Matrix of predicted times for test observations by values of nSpc.
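
To make the shape of these outputs more concrete, here is a small base R sketch that works on a simulated likelihood array rather than real zeitzeigerBatch() output. For each observation and each value of nSpc, it picks the element of timeRange with the highest likelihood, which the timeRange description identifies as the initial value for the MLE optimizer. The object names likeArray, idxMax, and timeInit, the dimension sizes, and the random values are all assumptions for illustration.

```r
set.seed(2)

timeRange = seq(0, 1 - 0.01, 0.01)
nSpc = c(1, 2)
nObs = 3

# Simulated stand-in for the 3-D likelihood array:
# observations x elements of nSpc x elements of timeRange.
likeArray = array(runif(nObs * length(nSpc) * length(timeRange)),
                  dim = c(nObs, length(nSpc), length(timeRange)))

# For each observation and nSpc value, index of the most likely time.
idxMax = apply(likeArray, c(1, 2), which.max)

# Map indices back to times; the result is observations x nSpc values.
timeInit = matrix(timeRange[idxMax], nrow = nObs,
                  dimnames = list(NULL, paste0('nSpc_', nSpc)))
```

The resulting matrix has the same layout as the matrix of predicted times (observations by values of nSpc), although in the real output the predicted times come from the fitted mle2 objects and not from this grid search alone.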