Title: | Closed Testing with Globaltest for Pathway Analysis |
---|---|
Description: | A shortcut procedure for implementing closed testing in large-scale multiple testing problems, especially with the global test. The shortcut is asymptotically equivalent to closed testing and supports post hoc inference: users can test any set of features or pathways of interest while the family-wise error rate remains controlled. The global test is powerful for detecting associations between a group of features and an outcome of interest. |
Authors: | Ningning Xu |
Maintainer: | Ningning Xu <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0.1 |
Built: | 2025-03-13 04:00:41 UTC |
Source: | https://github.com/cran/ctgt |
A shortcut procedure for closed testing with the global test is presented.
See the examples for the actgt function.
Ningning Xu
Maintainer: Ningning Xu <[email protected]; [email protected]>
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
Jelle J. Goeman, Sara A. van de Geer, Floor de Kort, Hans C. van Houwelingen, A global test for groups of genes: testing association with a clinical outcome, Bioinformatics, Volume 20, Issue 1, 1 January 2004, Pages 93-99, https://doi.org/10.1093/bioinformatics/btg382
Robbins and Pitman algorithm to calculate the critical value, given the eigenvalue vector and the alpha level.
criticalvalue(lam, alpha = 0.05)
lam |
The numeric vector with eigenvalues as elements. |
alpha |
The type I error rate allowed. The default is 0.05. |
Returns a real number.
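For intuition, the critical value can be approximated by simulation. The sketch below assumes that the null distribution is a weighted sum of independent chi-square(1) variables with weights lam, which is the setting the Robbins and Pitman algorithm addresses. The names mc_criticalvalue, nsim, and seed are hypothetical, and Monte Carlo is used here as a stand-in for the package's computation, not as a description of it.

```r
# Monte Carlo stand-in for criticalvalue(); not the package's algorithm.
# Assumption: the null statistic is sum_i lam[i] * Z_i^2 with Z_i iid N(0, 1).
mc_criticalvalue <- function(lam, alpha = 0.05, nsim = 1e5, seed = 1) {
  set.seed(seed)
  # nsim x length(lam) matrix of chi-square(1) draws
  z2 <- matrix(rchisq(nsim * length(lam), df = 1), nrow = nsim)
  q <- as.vector(z2 %*% lam)               # simulated weighted sums
  unname(quantile(q, probs = 1 - alpha))   # upper-alpha quantile
}

# With a single unit eigenvalue the result should be close to qchisq(0.95, 1).
cv_one <- mc_criticalvalue(lam = 1)
cv_three <- mc_criticalvalue(lam = c(3, 2, 1), alpha = 0.05)
```

The simulated value carries Monte Carlo error, so it is only useful as a sanity check against the exact routine.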
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
Tests the significance of a set of features after correcting for multiple global tests, with the family-wise error rate controlled.
actgt (y, X, xs, hyps, maxit = 0, alpha = 0.05)
y |
The response vector (numeric vector). |
X |
The full design matrix, whose columns are named by the covariates. |
xs |
The name vector of all covariates (character vector). |
hyps |
The name vector of the covariates in the pathway of interest (character vector). |
maxit |
An optional integer giving the maximum number of iterations of the branch and bound method. The default value 0 means the single-step shortcut without branch and bound. Note that larger values are more time-consuming. |
alpha |
The type I error rate allowed. The default is 0.05. |
Returns a list containing the rejection indicator and the number of iterations.
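The closed testing principle behind the rejection indicator can be illustrated by brute force: a set of hypotheses is rejected only if every superset of it is rejected by a local test. The sketch below is not the package's shortcut. It swaps in a Bonferroni local test on p-values in place of the global test, and the names local_bonferroni and closed_reject are hypothetical. It enumerates all supersets, so it is feasible only for a handful of hypotheses, which is the cost the shortcut is designed to avoid.

```r
# Brute-force closed testing; feasible only for small m (up to 2^m intersections).
# Assumption: a Bonferroni local test stands in for the global test.
local_bonferroni <- function(idx, p, alpha) min(p[idx]) <= alpha / length(idx)

closed_reject <- function(S, p, alpha = 0.05) {
  rest <- setdiff(seq_along(p), S)
  k <- length(rest)
  for (mask in 0:(2^k - 1)) {
    # decode the mask into the subset of 'rest' that is added to S
    extra <- rest[(mask %/% 2^(seq_len(k) - 1)) %% 2 == 1]
    if (!local_bonferroni(c(S, extra), p, alpha)) return(FALSE)
  }
  TRUE
}

p <- c(0.001, 0.02, 0.04, 0.30)
closed_reject(1, p)        # TRUE
closed_reject(2, p)        # FALSE: the intersection {2, 3, 4} is not rejected
closed_reject(c(1, 2), p)  # TRUE
```

With a Bonferroni local test, the singleton decisions coincide with Holm's procedure.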
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
# Generate the design matrix and response vector for logistic regression models
n = 100
m = 5
X = matrix(data = 0, nrow = n, ncol = m, byrow = TRUE)
for (i in 1:n) {
  set.seed(1234 + i)
  X[i, ] = as.vector(arima.sim(model = list(order = c(1, 0, 0), ar = 0.2), n = m))
}
y = rbinom(n, 1, 0.6)
X[which(y == 1), 1:3] = X[which(y == 1), 1:3] + 0.8
xs = paste("x", seq(1, m, 1), sep = "")
colnames(X) = xs
hyps = xs[1]

# The single-step ctgt procedure
actgt(y = y, X = X, xs = xs, hyps = hyps, maxit = 0, alpha = 0.05)
#  Result Iterations
#"unsure"        "0"

# The iterative ctgt procedure with more iterations
# (maxit must be positive to enable branch and bound)
actgt(y = y, X = X, xs = xs, hyps = hyps, maxit = 10, alpha = 0.05)
#  Result Iterations
#"reject"        "2"
# which means that x1 is rejected by closed testing within two more iterations of the shortcut

# For a group of feature sets: single-step shortcut
mysets = list(xs[1:5], xs[c(1,4)], xs[c(1,4,5)])
sapply(mysets, function(i) actgt(y = y, X = X, xs = xs, hyps = i, maxit = 0, alpha = 0.05))
#Result     "reject" "unsure" "reject"
#Iterations "0"      "0"      "0"

# For a group of feature sets: iterative shortcut
sapply(mysets, function(i) actgt(y = y, X = X, xs = xs, hyps = i, maxit = 10, alpha = 0.05))
#Result     "reject" "reject" "reject"
#Iterations "0"      "2"      "0"
Internal functions of ctgt.
## iterative shortcut with branch and bound
actgt_it(y, Tmatrix, Cmatrix, fxs, sxs, Tf, Lamf, Cf, Ts, Lams, Cs, count = 1, maxIt = 1, a = 0.05)
## to check whether tmin is above cmax
tacmax(tmins, levels, tw, cf, lf, ls, alp)
## to check whether tmin is above ctrue
tactrue(tmins, hyxs, cfull, Wmatrix, alp)
y |
The response vector (numeric vector). |
Tmatrix |
The matrix used to calculate the test statistics. |
Cmatrix |
The matrix used to calculate the critical values. |
fxs |
The name vector of upper model (character vector). |
sxs |
The name vector of lower model (character vector). |
Tf, Lamf, Cf |
Test statistic, eigenvalues, and critical value of fxs. |
Ts, Lams, Cs |
Test statistic, eigenvalues, and critical value of sxs. |
count |
Counter for the branches; the default is 1. |
maxIt |
Maximal number of branches chosen by the user; the default is 1. |
a, alp |
Alpha level. |
tmins |
Minimum test statistics. |
levels |
levels |
tw, cf, lf, ls |
Sorted weights, critical values, and levels for fxs and sxs. |
hyxs |
The name vector of the covariates of interest (character vector). |
cfull |
critical value of full model. |
Wmatrix |
matrix to calculate majorizing vector. |
Computes the majorizing vector at a specific level, given the upper and lower bounds.
getL (ub, lb, level)
ub |
upper bound. |
lb |
lower bound. |
level |
level of interest. |
Returns a numeric vector with the same length as ub and lb.
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
This is the second version of the globaltest: the non-standardized globaltest.
## a powerful variant of globaltest
gt2 (y, X, hyps, alpha = 0.05)
y |
The response vector (numeric vector). |
X |
The full design matrix, whose columns are named by the covariates. |
hyps |
The name vector of the covariates in the pathway of interest (character vector). |
alpha |
The type I error rate allowed. The default is 0.05. |
Returns the p-value, the observed and expected test statistics and the number of covariates.
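To make the idea of a global test concrete, the sketch below computes a score-type quadratic form r' X_S X_S' r, where r holds the centered responses and X_S the columns in hyps, and calibrates it by permuting y. This is only an illustration in the spirit of the global test of Goeman et al. (2004): gt2 may scale the statistic differently and obtains its p-value from the eigenvalue-based null distribution rather than from permutations. The names gt_perm, B, and seed are hypothetical.

```r
# Permutation-calibrated global-test-style statistic; not gt2's implementation.
gt_perm <- function(y, X, hyps, B = 999, seed = 1) {
  Xs <- X[, hyps, drop = FALSE]
  q_stat <- function(yy) {
    r <- yy - mean(yy)           # residuals under the intercept-only null model
    sum(crossprod(Xs, r)^2)      # r' Xs Xs' r
  }
  set.seed(seed)
  obs <- q_stat(y)
  perm <- replicate(B, q_stat(sample(y)))   # null distribution by permuting y
  (1 + sum(perm >= obs)) / (B + 1)          # permutation p-value
}

# Toy data with a strong signal in x1 and x2 (simulated for illustration only)
set.seed(42)
n <- 60
X_toy <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("x", 1:4)))
y_toy <- rep(c(0, 1), each = n / 2)
X_toy[y_toy == 1, 1:2] <- X_toy[y_toy == 1, 1:2] + 1.5
p_toy <- gt_perm(y_toy, X_toy, hyps = c("x1", "x2"))
```

A permutation null is slower than an analytic one but makes no distributional assumption beyond exchangeability of the responses under the null.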
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
# Generate the design matrix and response vector for logistic regression models
n = 100
m = 5
X = matrix(data = 0, nrow = n, ncol = m, byrow = TRUE)
for (i in 1:n) {
  set.seed(1234 + i)
  X[i, ] = as.vector(arima.sim(model = list(order = c(1, 0, 0), ar = 0.2), n = m))
}
y = rbinom(n, 1, 0.6)
X[which(y == 1), 1:3] = X[which(y == 1), 1:3] + 0.8
xs = paste("x", seq(1, m, 1), sep = "")
colnames(X) = xs
hyps = xs[1]

# The raw p-value of globaltest
gt2(y = y, X = X, hyps = hyps, alpha = 0.05)
# p-value Statistic  Expected      #Cov
#7.64e-03  2.30e+02  1.24e+02  1.00e+00
Robbins and Pitman algorithm to calculate the p-value, given the observed value and the eigenvalue vector.
pv(x, lam)
x |
The observed value for which the corresponding p-value is calculated. |
lam |
The numeric vector with eigenvalues as elements. |
Returns a value between 0 and 1.
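As a rough cross-check, the p-value can be approximated by simulating the tail probability directly. The sketch assumes the null statistic is a weighted sum of independent chi-square(1) variables with weights lam. The names mc_pv, nsim, and seed are hypothetical, and simulation replaces the Robbins and Pitman computation used by the package.

```r
# Monte Carlo stand-in for pv(); not the package's algorithm.
# Assumption: the null statistic is sum_i lam[i] * Z_i^2 with Z_i iid N(0, 1).
mc_pv <- function(x, lam, nsim = 1e5, seed = 1) {
  set.seed(seed)
  z2 <- matrix(rchisq(nsim * length(lam), df = 1), nrow = nsim)
  q <- as.vector(z2 %*% lam)   # simulated null statistics
  mean(q >= x)                 # upper-tail probability at the observed value
}

# At the 95% quantile of a chi-square(1), the estimate should be near 0.05.
p_at_q95 <- mc_pv(x = qchisq(0.95, 1), lam = 1)
p_at_zero <- mc_pv(x = 0, lam = c(1, 2))
```

Under this assumption, an observed value has a p-value at or below alpha whenever it is at or above the level-alpha critical value for the same lam.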
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
Counts the number of true discoveries within a given pathway or feature set of interest.
discoveries (y, X, xs, hyps, maxit = 0, alpha = 0.05)
y |
The response vector (numeric vector). |
X |
The full design matrix, whose columns are named by the covariates. |
xs |
The name vector of all covariates (character vector). |
hyps |
The name vector of the covariates in the pathway of interest (character vector). |
maxit |
An optional integer giving the maximum number of iterations of the branch and bound method. The default value 0 means the single-step shortcut without branch and bound. Note that larger values are more time-consuming. |
alpha |
The type I error rate allowed. The default is 0.05. |
Returns a non-negative integer.
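One standard way to turn closed testing into a count of true discoveries is to subtract, from the size of the set of interest, the size of its largest subset that closed testing fails to reject. The brute-force sketch below illustrates that rule for a handful of hypotheses. It assumes a Bonferroni local test on p-values in place of the global test, and all function names are hypothetical; it does not reproduce the shortcut used by discoveries.

```r
# Brute-force lower bound on the number of true discoveries in a set S.
# Assumption: a Bonferroni local test stands in for the global test.
local_bonferroni <- function(idx, p, alpha) min(p[idx]) <= alpha / length(idx)

# All subsets of v (including the empty set), decoded from bit masks
subsets_of <- function(v) {
  k <- length(v)
  lapply(0:(2^k - 1), function(mask) v[(mask %/% 2^(seq_len(k) - 1)) %% 2 == 1])
}

# S is rejected by closed testing if every superset is rejected locally
closed_reject <- function(S, p, alpha) {
  rest <- setdiff(seq_along(p), S)
  all(vapply(subsets_of(rest),
             function(extra) local_bonferroni(c(S, extra), p, alpha),
             logical(1)))
}

true_discoveries <- function(S, p, alpha = 0.05) {
  subs <- Filter(length, subsets_of(S))   # non-empty subsets of S
  not_rej <- Filter(function(I) !closed_reject(I, p, alpha), subs)
  largest <- if (length(not_rej)) max(lengths(not_rej)) else 0
  length(S) - largest
}

p <- c(0.001, 0.02, 0.04, 0.30)
true_discoveries(1:4, p)       # 1
true_discoveries(c(3, 4), p)   # 0
```

The enumeration is exponential in the number of hypotheses, so this is a teaching device rather than something to run on real pathway data.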
Ningning Xu, Aldo Solari, Jelle Goeman, Closed testing with global test, with applications on metabolomics data, arXiv:2001.01541, https://arxiv.org/abs/2001.01541
# Generate the design matrix and response vector for logistic regression models
n = 100
m = 5
X = matrix(data = 0, nrow = n, ncol = m, byrow = TRUE)
for (i in 1:n) {
  set.seed(1234 + i)
  X[i, ] = as.vector(arima.sim(model = list(order = c(1, 0, 0), ar = 0.2), n = m))
}
y = rbinom(n, 1, 0.6)
X[which(y == 1), 1:3] = X[which(y == 1), 1:3] + 0.8
xs = paste("x", seq(1, m, 1), sep = "")
colnames(X) = xs

# For standardized data
X = scale(x = X, center = FALSE, scale = TRUE) / sqrt(n - 1)
interest = xs
discoveries(y = y, X = X, xs = xs, hyps = interest)
#2
discoveries(y = y, X = X, xs = xs, hyps = interest, maxit = 10)
#2