Package 'pamr' reference manual

Title:	Pam: Prediction Analysis for Microarrays
Description:	Some functions for sample classification in microarrays.
Authors:	Trevor Hastie [aut], Rob Tibshirani [aut], Balasubramanian Narasimhan [aut, cre], Gilbert Chu [aut]
Maintainer:	Balasubramanian Narasimhan <[email protected]>
License:	GPL-2
Version:	1.57
Built:	2025-02-26 04:14:48 UTC
Source:	https://github.com/bnaras/pamr

Khan microarray data

Description

The khan data frame has 2308 rows and 65 columns. These are one of the datasets data used in the Tibshirani et al paper in PNAS on nearest shrunken centroids.

Details

The first two columns of gene ids and names and the remaining columns are gene expression values for 63 samples. An attribute cancer_type contains the cancer type for each sample.

A function to adaptive choose threshold scales, for use in pamr.train

Description

A function to adaptive choose threshold scales, for use in pamr.train

Usage

pamr.adaptthresh(object, ntries = 10, reduction.factor = 0.9, full.out = FALSE)
pamr.adaptthresh(object, ntries = 10, reduction.factor = 0.9, full.out = FALSE)

Arguments

`object`	The result of a call to pamr.train
`ntries`	Number of iterations to use in algorithm
`reduction.factor`	Amount by which a scaling is reduced in one step of the algorithm
`full.out`	Should full output be returned? Default FALSE

Details

pamr.adaptthresh Adaptively searches for set of good threshold scales. The baseline (default) scale is 1 for each class. The idea is that for easy to classify classes, the threshold scale can be increased without increasing the error rate for that class, and resulting in fewer genes needed for the classification rule. The scalings from pamr.adaptthresh are then used in pamr.train, and pamr.cv. The results may be better than those obtained with the default values of threshold.scale.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

References

Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, and Gilbert Chu. "Diagnosis of multiple cancer types by shrunken centroids of gene expression" PNAS 2002 99:6567-6572 (May 14).

Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, and Gilbert Chu (2002). Class prediction by nearest shrunken centroids,with applications to DNA microarrays. Stanford tech report.

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
new.scales <- pamr.adaptthresh(mytrain)

 
mytrain2 <- pamr.train(mydata, threshold.scale=new.scales)

myresults2 <- pamr.cv(mytrain2, mydata)


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
new.scales <- pamr.adaptthresh(mytrain)

 
mytrain2 <- pamr.train(mydata, threshold.scale=new.scales)

myresults2 <- pamr.cv(mytrain2, mydata)

A function to mean-adjust microarray data by batches

Description

A function to mean-adjust microarray data by batches

Usage

pamr.batchadjust(data)
pamr.batchadjust(data)

Arguments

data

The input data. A list with components: x- an expression genes in the rows, samples in the columns, and y- a vector of the class labels for each sample, and batchlabels- a vector of batch labels for each sample.

Details

pamr.batchadjust does a genewise one-way ANOVA adjustment for expression values. Let $x(i,j)$ be the expression for gene $i$ in sample $j$ . Suppose sample $j$ in in batch $b$ , and let $B$ be the set of all samples in batch $b$ . Then pamr.batchadjust adjusts $x(i,j)$ to $x(i,j) - mean[x(i,j)]$ where the mean is taken over all samples $j$ in $B$ .

Value

A data object of the same form as the input data, with x replaced by the adjusted x

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
#generate some data
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
batchlabels <- sample(c(1:5),size=20,replace=TRUE)
mydata <- list(x=x,y=factor(y),batchlabels=factor(batchlabels))

mydata2 <- pamr.batchadjust(mydata)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
#generate some data
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
batchlabels <- sample(c(1:5),size=20,replace=TRUE)
mydata <- list(x=x,y=factor(y),batchlabels=factor(batchlabels))

mydata2 <- pamr.batchadjust(mydata)

A function giving a table of true versus predicted values, from a nearest shrunken centroid fit.

Description

A function giving a table of true versus predicted values, from a nearest shrunken centroid fit.

Usage

pamr.confusion(fit, threshold, extra = TRUE)
pamr.confusion(fit, threshold, extra = TRUE)

Arguments

`fit`	The result of a call to pamr.train or pamr.cv
`threshold`	The desired threshold value
`extra`	Should the classwise and overall error rates be returned? Default TRUE

Details

pamr.confusion Gives a cross-tabulation of true versus predicted classes for the fit returned by pamr.train or pamr.cv, at the specified threshold.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)
pamr.confusion(mytrain,  threshold=2)
pamr.confusion(mycv,  threshold=2)
 

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)
pamr.confusion(mytrain,  threshold=2)
pamr.confusion(mycv,  threshold=2)

Compute confusin matrix from pamr survival fit

Description

computes confusion matrix for (survival.time,censoring) outcome based on fit object "fit" and class predictions "yhat" soft response probabilities for (survival.time,censoring) are first estimated using Kaplan-Meier method applied to training data

Usage

pamr.confusion.survival(fit, survival.time, censoring.status, yhat)
pamr.confusion.survival(fit, survival.time, censoring.status, yhat)

Arguments

`fit`	The result of a call to pamr.train or pamr.cv
`survival.time`	Survival time
`censoring.status`	censoring status
`yhat`	class predictions

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

A function to cross-validate the nearest shrunken centroid classifier

Description

A function to cross-validate the nearest shrunken centroid classifier produced by pamr.train

Usage

pamr.cv(fit, data, nfold = NULL, folds = NULL, ...)
pamr.cv(fit, data, nfold = NULL, folds = NULL, ...)

Arguments

`fit`	The result of a call to pamr.train
`data`	A list with at least two components: x- an expression genes in the rows, samples in the columns), and y- a vector of the class labels for each sample. Same form as data object used by pamr.train.
`nfold`	Number of cross-validation folds. Default is the smallest class size
`folds`	A list with nfold components, each component a vector of indices of the samples in that fold. By default a (random) balanced cross-validation is used
`...`	Any additional arguments that are to be passed to pamr.train

Details

pamr.cv carries out cross-validation for a nearest shrunken centroid classifier.

Value

A list with components

`threshold`	A vector of the thresholds tried in the shrinkage
`errors`	The number of cross-validation errors for each threshold value
`loglik`	The cross-validated multinomial log-likelihood value for each threshold value
`size`	A vector of the number of genes that survived the thresholding, for each threshold value tried.

`yhat`	A matrix of size n by nthreshold, containing the cross-validated class predictions for each threshold value, in each column
`prob`	A matrix of size n by nthreshold, containing the cross-validated class probabilities for each threshold value, in each column
`folds`	The cross-validation folds used
`cv.objects`	Train objects (output of pamr.train), from each of the CV folds
`call`	The calling sequence used

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)

mydata <- list(x=x,y=factor(y), geneid=as.character(1:nrow(x)),
 genenames=paste("g",as.character(1:nrow(x)),sep=""))

mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)

mydata <- list(x=x,y=factor(y), geneid=as.character(1:nrow(x)),
 genenames=paste("g",as.character(1:nrow(x)),sep=""))

mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)

A function to decorrelate (adjust) the feature matrix with respect to some additional predictors

Description

A function to decorrelate (adjust) the feature matrix with respect to some additional predictors

Usage

pamr.decorrelate(
  x,
  adjusting.predictors,
  xtest = NULL,
  adjusting.predictors.test = NULL
)
pamr.decorrelate(
  x,
  adjusting.predictors,
  xtest = NULL,
  adjusting.predictors.test = NULL
)

Arguments

`x`	Matrix of training set feature values, with genes in the rows, samples in the columns
`adjusting.predictors`	List of training set predictors to be used for adjustment
`xtest`	Optional matrix of test set feature values, to be adjusted in the same way as the training set
`adjusting.predictors.test`	Optional list of test set predictors to be used for adjustment

Details

pamr.decorrelate Does a least squares regression of each row of x on the adjusting predictors, and returns the residuals. If xtest is provided, it also returns the adjusted version of xtest, using the training set least squares regression model for adjustment

Value

A list with components

`x.adj`	Adjusted x matrix
`xtest.adj`	Adjusted xtest matrix, if xtest we provided

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

References

Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, and Gilbert Chu Diagnosis of multiple cancer types by shrunken centroids of gene expression PNAS 99: 6567-6572. Available at www.pnas.org

Examples


#generate some data
suppressWarnings(RNGversion("3.5.0"))
set.seed(120)

x<-matrix(rnorm(1000*20),ncol=20)
y<-c(rep(1,10),rep(2,10))
adjusting.predictors=list(pred1=rnorm(20), pred2=as.factor(sample(c(1,2),replace
=TRUE,size=20)))
xtest=matrix(rnorm(1000*10),ncol=10)
adjusting.predictors.test=list(pred1=rnorm(10), pred2=as.factor(sample(c(1,2),replace
=TRUE,size=10)))

# decorrelate training x wrt adjusting predictors

x.adj=pamr.decorrelate(x,adjusting.predictors)$x.adj
# train classifier with adjusted x

d=list(x=x.adj,y=y)
a<-pamr.train(d)

# decorrelate training and test x wrt adjusting predictors, then make
#predictions for test set

temp <- pamr.decorrelate(x,adjusting.predictors, xtest=xtest,
                         adjusting.predictors.test=adjusting.predictors.test)

d=list(x=temp$x.adj,y=y)
a<-pamr.train(d)
aa<-pamr.predict(a,temp$xtest.adj, threshold=.5)

#generate some data
suppressWarnings(RNGversion("3.5.0"))
set.seed(120)

x<-matrix(rnorm(1000*20),ncol=20)
y<-c(rep(1,10),rep(2,10))
adjusting.predictors=list(pred1=rnorm(20), pred2=as.factor(sample(c(1,2),replace
=TRUE,size=20)))
xtest=matrix(rnorm(1000*10),ncol=10)
adjusting.predictors.test=list(pred1=rnorm(10), pred2=as.factor(sample(c(1,2),replace
=TRUE,size=10)))

# decorrelate training x wrt adjusting predictors

x.adj=pamr.decorrelate(x,adjusting.predictors)$x.adj
# train classifier with adjusted x

d=list(x=x.adj,y=y)
a<-pamr.train(d)

# decorrelate training and test x wrt adjusting predictors, then make
#predictions for test set

temp <- pamr.decorrelate(x,adjusting.predictors, xtest=xtest,
                         adjusting.predictors.test=adjusting.predictors.test)

d=list(x=temp$x.adj,y=y)
a<-pamr.train(d)
aa<-pamr.predict(a,temp$xtest.adj, threshold=.5)

A function to estimate false discovery rates for the nearest shrunken centroid classifier

Description

A function to estimate false discovery rates for the nearest shrunken centroid classifier

Usage

pamr.fdr(
  trained.obj,
  data,
  nperms = 100,
  xl.mode = c("regular", "firsttime", "onetime", "lasttime"),
  xl.time = NULL,
  xl.prevfit = NULL
)
pamr.fdr(
  trained.obj,
  data,
  nperms = 100,
  xl.mode = c("regular", "firsttime", "onetime", "lasttime"),
  xl.time = NULL,
  xl.prevfit = NULL
)

Arguments

`trained.obj`	The result of a call to pamr.train
`data`	Data object; same as the one passed to pamr.train
`nperms`	Number of permutations for estimation of FDRs. Default is 100
`xl.mode`	Used by Excel interface
`xl.time`	Used by Excel interface
`xl.prevfit`	Used by Excel interface

Details

pamr.fdr estimates false discovery rates for a nearest shrunken centroid classifier

Value

A list with components:

`results`	Matrix of estimates FDRs for various various threshold values. Reported are both the median and 90th percentile of the FDR over permutations
`pi0`	The estimated proportion of genes that are null, i.e. not significantly different

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)

mydata <- list(x=x,y=factor(y), geneid=as.character(1:nrow(x)),
               genenames=paste("g",as.character(1:nrow(x)),sep=""))

mytrain <-   pamr.train(mydata)
myfdr <- pamr.fdr(mytrain, mydata)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)

mydata <- list(x=x,y=factor(y), geneid=as.character(1:nrow(x)),
               genenames=paste("g",as.character(1:nrow(x)),sep=""))

mytrain <-   pamr.train(mydata)
myfdr <- pamr.fdr(mytrain, mydata)

A function to plot the genes that surive the thresholding from the nearest shrunken centroid classifier

Description

A function to plot the genes that survive the thresholding, from the nearest shrunken centroid classifier produced by pamr.train

Usage

pamr.geneplot(fit, data, threshold)
pamr.geneplot(fit, data, threshold)

Arguments

`fit`	The result of a call to pamr.train
`data`	The input data. In the same format as the input data for pamr.train
`threshold`	The desired threshold value

Details

pamr.geneplot Plots the raw gene expression for genes that survive the specified threshold. Plot is stratified by class. Plot is set up to display only up to about 20 or 25 genes, otherwise it gets too crowded. Hence threshold should be chosen to yield at most about 20 or 25 genes.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
pamr.geneplot(mytrain, mydata, threshold=1.6)
 

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
pamr.geneplot(mytrain, mydata, threshold=1.6)

A function that takes estimate class probabilities and produces a class prediction or indeterminate prediction

Description

A function that takes estimate class probabilities and produces a class prediction or indeterminate prediction

Usage

pamr.indeterminate(prob, mingap = 0)
pamr.indeterminate(prob, mingap = 0)

Arguments

`prob`	Estimated class probabilities, from pamr.predict with type="posterior")
`mingap`	Minimum difference between highest and second highest probability. If difference is < mingap, prediction is set to indeterminate (NA)

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
prob<- pamr.predict(mytrain, mydata$x , threshold=1, type="posterior")
pamr.indeterminate(prob,mingap=.75)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
prob<- pamr.predict(mytrain, mydata$x , threshold=1, type="posterior")
pamr.indeterminate(prob,mingap=.75)

A function to list the genes that survive the thresholding, from the nearest shrunken centroid classifier

Description

A function to list the genes that survive the thresholding, from the nearest shrunken centroid classifier produced by pamr.train

Usage

pamr.listgenes(fit, data, threshold, fitcv = NULL, genenames = FALSE)
pamr.listgenes(fit, data, threshold, fitcv = NULL, genenames = FALSE)

Arguments

`fit`	The result of a call to pamr.train
`data`	The input data. In the same format as the input data for pamr.train
`threshold`	The desired threshold value
`fitcv`	Optional object, result of a call to pamr.cv
`genenames`	Include genenames in the list? If yes, they are taken from "data". Default is false (geneid is always included in the list).

Details

pamr.listgenes List the geneids, and standardized centroids for each class, for genes surviving at the given threshold. If fitcv is provided, the function also reports the average rank of the gene in the cross-validation folds, and the proportion of times that the gene is chosen (at the given threshold) in the cross-validation folds.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples



#generate some data
suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)

mydata <- list(x=x,y=factor(y), geneid=as.character(1:nrow(x)),
               genenames=paste("g",as.character(1:nrow(x)),sep=""))


#train classifier
mytrain<-   pamr.train(mydata)

pamr.listgenes(mytrain, mydata, threshold=1.6)
 

#generate some data
suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)

mydata <- list(x=x,y=factor(y), geneid=as.character(1:nrow(x)),
               genenames=paste("g",as.character(1:nrow(x)),sep=""))


#train classifier
mytrain<-   pamr.train(mydata)

pamr.listgenes(mytrain, mydata, threshold=1.6)

A function to interactively define classes from a clustering tree

Description

function to interactively define classes from a clustering tree

Usage

pamr.makeclasses(data, sort.by.class = FALSE, ...)
pamr.makeclasses(data, sort.by.class = FALSE, ...)

Arguments

`data`	The input data. A list with components: x- an expression genes in the rows, samples in the columns, and y- a vector of the class labels for each sample, and batchlabels- a vector of batch labels for each sample. This object if the same form as that produced by pamr.from.excel.
`sort.by.class`	Optional argument. If true, the clustering tree is forced to put all samples in the same class (as defined by the class labels y in ‘data’) together in the tree. This is useful if a regrouping of classes is desired. Eg: given classes 1,2,3,4 you want to define new classes (1,3) vs (2,4) or 2 vs (1,3)
`...`	Any additional arguments to be passed to hclust

Details

pamr.makeclasses Using this function the user interactively defines a new set of classes, to be used in pamr.train, pamr.cv etc. After invoking pamr.makeclasses, a clustering tree is drawn. This callss the R function hclust, and any arguments for hclust can be passed to it. Using the left button, the user clicks at the junction point defining the subgroup 1. More groups can be added to class 1 by clicking on further junction points. The user ends the definition of class 1 by clicking on the rightmost button (in Windows, an additional menu appears and he chooses Stop). This process is continued for classes 2,3 etc. Note that some sample may be left out of the new classes. Two consecutive clicks of the right button ends the definition for all classes.

At the end, the clustering is redrawn, with the new class labels shown.

Note: this function is "fragile". The user must click close to the junction point, to avoid confusion with other junction points. Classes 1,2,3.. cannot have samples in common (if they do, an Error message will appear). If the function is confused about the desired choices, it will complain and ask the user to rerun pamr.makeclasses. The user should also check that the labels on the final redrawn cluster tree agrees with the desired classes.

Value

A vector of class labels 1,2,3... If a component is NA (missing), then the sample is not assigned to any class. This vector should be assigned to the newy component of data, for use in pamr.train etc. Note that pamr.train uses the class labels in the component ⁠newy'' if it is present. Otherwise it uses the data labels ⁠y”.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
#generate some data
x <- matrix(rnorm(1000*40),ncol=40)
y <- sample(c(1:4),size=40,replace=TRUE)
batchlabels <- sample(c(1:5),size=40,replace=TRUE)

mydata <- list(x=x,y=factor(y),batchlabels=factor(batchlabels),
               geneid=as.character(1:nrow(x)),
               genenames=paste("g",as.character(1:nrow(x)),sep=""))

# mydata$newy <- pamr.makeclasses(mydata) Run this and define some new classes

train <- pamr.train(mydata)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
#generate some data
x <- matrix(rnorm(1000*40),ncol=40)
y <- sample(c(1:4),size=40,replace=TRUE)
batchlabels <- sample(c(1:5),size=40,replace=TRUE)

mydata <- list(x=x,y=factor(y),batchlabels=factor(batchlabels),
               geneid=as.character(1:nrow(x)),
               genenames=paste("g",as.character(1:nrow(x)),sep=""))

# mydata$newy <- pamr.makeclasses(mydata) Run this and define some new classes

train <- pamr.train(mydata)

A function that interactively leads the user through a PAM analysis

Description

A function that interactively leads the user through a PAM analysis

Usage

pamr.menu(data)
pamr.menu(data)

Arguments

data

A list with at least two components: x- an expression genes in the rows, samples in the columns), and y- a vector of the class labels for each sample. Same form as data object used by pamr.train.

Details

pamr.menu provides a menu for training, cross-validating and plotting a nearest shrunken centroid analysis.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
#  pamr.menu(mydata)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
#  pamr.menu(mydata)

A function to plot the shrunken class centroids, from the nearest shrunken centroid classifier

Description

A function to plot the shrunken class centroids, from the nearest shrunken centroid classifier produced by pamr.train

Usage

pamr.plotcen(fit, data, threshold)
pamr.plotcen(fit, data, threshold)

Arguments

`fit`	The result of a call to pamr.train
`data`	The input data, in the same form as that used by pamr.train
`threshold`	The desired threshold value

Details

pamr.plotcen plots the shrunken class centroids for each class, for genes surviving the threshold for at least once class. If genenames are included in "data", they are added to the plot. Note: for many classes and long gene names, this plot may need some manual prettying.

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y,genenames=as.character(1:1000))
mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)
pamr.plotcen(mytrain, mydata,threshold=1.6)
 

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y,genenames=as.character(1:1000))
mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)
pamr.plotcen(mytrain, mydata,threshold=1.6)

A function to plot the cross-validated error curves from the nearest shrunken centroid classifier

Description

A function to plot the cross-validated error curves the nearest shrunken centroid classifier

Usage

pamr.plotcv(fit)
pamr.plotcv(fit)

Arguments

fit

The result of a call to pamr.cv

Details

pamr.plotcv plots the cross-validated misclassification error curves, from nearest shrunken centroid classifier. An overall plot, and a plot by class, are produced.

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <-  pamr.cv(mytrain, mydata)
pamr.plotcv(mycv)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <-  pamr.cv(mytrain, mydata)
pamr.plotcv(mycv)

A function to plot the cross-validated sample probabilities from the nearest shrunken centroid classifier

Description

A function to plot the cross-validated sample probabilities from the nearest shrunken centroid classifier

Usage

pamr.plotcvprob(fit, data, threshold)
pamr.plotcvprob(fit, data, threshold)

Arguments

`fit`	The result of a call to pamr.cv
`data`	A list with at least two components: x- an expression genes in the rows, samples in the columns), and y- a vector of the class labels for each sample. Same form as data object used by pamr.train.
`threshold`	Threshold value to be used

Details

pamr.plotcvprob plots the cross-validated sample probabilities the from nearest shrunken centroid classifier, stratified by the true classses.

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <-  pamr.cv(mytrain,mydata)
pamr.plotcvprob(mycv,mydata,threshold=1.6)




suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <-  pamr.cv(mytrain,mydata)
pamr.plotcvprob(mycv,mydata,threshold=1.6)

A function to plot the FDR curve from the nearest shrunken centroid classifier

Description

A function to plot the FDR curve the nearest shrunken centroid classifier

Usage

pamr.plotfdr(fdrfit, call.win.metafile = FALSE)
pamr.plotfdr(fdrfit, call.win.metafile = FALSE)

Arguments

`fdrfit`	The result of a call to pamr.fdr
`call.win.metafile`	Used by Excel interface

Details

pamr.plotfdr plots the FDR curves from nearest shrunken centroid classifier. The median FDR (solid line) and upper 90 percentile (broken line) are shown

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:2),size=20,replace=TRUE)
x[1:50,y==2]=x[1:50,y==2]+3
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
myfdr <-  pamr.fdr(mytrain, mydata)
pamr.plotfdr(myfdr)

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:2),size=20,replace=TRUE)
x[1:50,y==2]=x[1:50,y==2]+3
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
myfdr <-  pamr.fdr(mytrain, mydata)
pamr.plotfdr(myfdr)

A function to plot the survival curves in each Kaplan Meier stratum

Description

A function to plot the survival curves in each Kaplan Meier stratum

Usage

pamr.plotstrata(fit, survival.time, censoring.status)
pamr.plotstrata(fit, survival.time, censoring.status)

Arguments

`fit`	The result of a call to pamr.train
`survival.time`	Vector of survival times
`censoring.status`	Vector of censoring status values

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples



gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic,
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)


pamr.plotstrata(a3, d$survival.time, d$censoring.status)

gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic,
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)


pamr.plotstrata(a3, d$survival.time, d$censoring.status)

A function to plots Kaplan-Meier curves stratified by a group variable

Description

A function to plots Kaplan-Meier curves stratified by a group variable

Usage

pamr.plotsurvival(group, survival.time, censoring.status)
pamr.plotsurvival(group, survival.time, censoring.status)

Arguments

`group`	A grouping factor
`survival.time`	Vector of survival times
`censoring.status`	Vector of censoring status values: 1=died, 0=censored

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples



gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic,
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)

#make class predictions

yhat <- pamr.predict(a3,d$x, threshold=1.0)

pamr.plotsurvival(yhat, d$survival.time, d$censoring.status)


gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic,
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)

#make class predictions

yhat <- pamr.predict(a3,d$x, threshold=1.0)

pamr.plotsurvival(yhat, d$survival.time, d$censoring.status)

A function giving prediction information, from a nearest shrunken centroid fit.

Description

A function giving prediction information, from a nearest shrunken centroid fit

Usage

pamr.predict(
  fit,
  newx,
  threshold,
  type = c("class", "posterior", "centroid", "nonzero"),
  prior = fit$prior,
  threshold.scale = fit$threshold.scale
)
pamr.predict(
  fit,
  newx,
  threshold,
  type = c("class", "posterior", "centroid", "nonzero"),
  prior = fit$prior,
  threshold.scale = fit$threshold.scale
)

Arguments

`fit`	The result of a call to pamr.train
`newx`	Matrix of features at which predictions are to be made
`threshold`	The desired threshold value
`type`	Type of prediction desired: class predictions, posterior probabilities, (unshrunken) class centroids, vector of genes surviving the threshold
`prior`	Prior probabilities for each class. Default is that specified in "fit"
`threshold.scale`	Additional scaling factors to be applied to the thresholds. Vector of length equal to the number of classes. Default is that specified in "fit".

Details

pamr.predict Give a cross-tabulation of true versus predicted classes for the fit returned by pamr.train or pamr.cv, at the specified threshold

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)
pamr.predict(mytrain, mydata$x , threshold=1)
 

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)
mycv <- pamr.cv(mytrain,mydata)
pamr.predict(mytrain, mydata$x , threshold=1)

A function giving prediction information for many threshold values, from a nearest shrunken centroid fit.

Description

A function giving prediction information for many threshold values, from a nearest shrunken centroid fit

Usage

pamr.predictmany(
  fit,
  newx,
  threshold = fit$threshold,
  prior = fit$prior,
  threshold.scale = fit$threshold.scale,
  ...
)
pamr.predictmany(
  fit,
  newx,
  threshold = fit$threshold,
  prior = fit$prior,
  threshold.scale = fit$threshold.scale,
  ...
)

Arguments

`fit`	The result of a call to pamr.train
`newx`	Matrix of features at which predictions are to be made
`threshold`	The desired threshold values
`prior`	Prior probabilities for each class. Default is that specified in "fit"
`threshold.scale`	Additional scaling factors to be applied to the thresholds. Vector of length equal to the number of classes. Default is that specified in "fit".
`...`	Additional arguments to be passed to pamr.predict

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)

pamr.predictmany(mytrain, mydata$x)
 

suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=y)
mytrain <-   pamr.train(mydata)

pamr.predictmany(mytrain, mydata$x)

A function to assign observations to categories, based on their survival times.

Description

A function to assign observations to categories, based on their survival times.

Usage

pamr.surv.to.class2(
  y,
  icens,
  cutoffs = NULL,
  n.class = NULL,
  class.names = NULL,
  newy = y,
  newic = icens
)
pamr.surv.to.class2(
  y,
  icens,
  cutoffs = NULL,
  n.class = NULL,
  class.names = NULL,
  newy = y,
  newic = icens
)

Arguments

`y`	vector of survival times
`icens`	Vector of censorng status values: 1=died, 0=censored
`cutoffs`	Survival time cutoffs for categories. Default NULL
`n.class`	Number of classes to create: if cutoffs is NULL, n.class equal classes are created.
`class.names`	Character names for classes
`newy`	New set of survival times, for which probabilities are computed (see below). Default is y
`newic`	New set of censoring statuses, for which probabilities are computed (see below). Default is icens

Details

pamr.pamr.surv.to.class2 splits observations into categories based on their survival times and the Kaplan-Meier estimates. For example if n.class=2, it makes two categories, one below the median survival, the other above. For each observation (newy, ic), it then computes the probability of that observation falling in each category. For an uncensored observation that probability is just 1 or 0 depending on when the death occurred. For a censored observation, the probabilities are based on the Kaplan Meier and are typically between 0 and 1.

Value

`class`	The category labels
`prob`	The estimates class probabilities
`cutoffs`	The cutoffs used

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples



gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic,
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)

# generate test data
junkk<- gendata(n=500)

dd <- list(x=junkk$x, survival.time=junkk$y, censoring.status=junkk$ic)

# compute soft labels
proby <-  pamr.surv.to.class2(dd$survival.time, dd$censoring.status,
             n.class=a3$ngroup.survival)$prob


gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic,
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)

# generate test data
junkk<- gendata(n=500)

dd <- list(x=junkk$x, survival.time=junkk$y, censoring.status=junkk$ic)

# compute soft labels
proby <-  pamr.surv.to.class2(dd$survival.time, dd$censoring.status,
             n.class=a3$ngroup.survival)$prob

A function giving a table of true versus predicted values, from a nearest shrunken centroid fit from survival data.

Description

A function giving a table of true versus predicted values, from a nearest shrunken centroid fit from survival data.

Usage

pamr.test.errors.surv.compute(proby, yhat)
pamr.test.errors.surv.compute(proby, yhat)

Arguments

`proby`	Survival class probabilities, from pamr.surv.to.class2
`yhat`	Estimated class labels, from pamr.predict

Details

pamr.test.errors.surv.compute computes the erros between the true 'soft" class labels proby and the estimated ones "yhat"

Author(s)

Trevor Hastie, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

Examples


 
gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic, 
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)

# generate test data
junkk<- gendata(n=500)

dd <- list(x=junkk$x, survival.time=junkk$y, censoring.status=junkk$ic)

# compute soft labels
proby <-  pamr.surv.to.class2(dd$survival.time, dd$censoring.status,
             n.class=a3$ngroup.survival)$prob


# make class predictions for test data
yhat <- pamr.predict(a3,dd$x, threshold=1.0)

# compute test errors

pamr.test.errors.surv.compute(proby, yhat)


gendata<-function(n=100, p=2000){
  tim <- 3*abs(rnorm(n))
  u<-runif(n,min(tim),max(tim))
  y<-pmin(tim,u)
   ic<-1*(tim<u)
m <- median(tim)
x<-matrix(rnorm(p*n),ncol=n)
  x[1:100, tim>m] <-  x[1:100, tim>m]+3
  return(list(x=x,y=y,ic=ic))
}

# generate training data; 2000 genes, 100 samples

junk<-gendata(n=100)
y<-junk$y
ic<-junk$ic
x<-junk$x
d <- list(x=x,survival.time=y, censoring.status=ic, 
geneid=as.character(1:nrow(x)), genenames=paste("g",as.character(1:nrow(x)),sep=
""))

# train model
a3<- pamr.train(d, ngroup.survival=2)

# generate test data
junkk<- gendata(n=500)

dd <- list(x=junkk$x, survival.time=junkk$y, censoring.status=junkk$ic)

# compute soft labels
proby <-  pamr.surv.to.class2(dd$survival.time, dd$censoring.status,
             n.class=a3$ngroup.survival)$prob


# make class predictions for test data
yhat <- pamr.predict(a3,dd$x, threshold=1.0)

# compute test errors

pamr.test.errors.surv.compute(proby, yhat)

A function to train a nearest shrunken centroid classifier

Description

A function that computes a nearest shrunken centroid for gene expression (microarray) data

Usage

pamr.train(
  data,
  gene.subset = NULL,
  sample.subset = NULL,
  threshold = NULL,
  n.threshold = 30,
  scale.sd = TRUE,
  threshold.scale = NULL,
  se.scale = NULL,
  offset.percent = 50,
  hetero = NULL,
  prior = NULL,
  remove.zeros = TRUE,
  sign.contrast = "both",
  ngroup.survival = 2
)
pamr.train(
  data,
  gene.subset = NULL,
  sample.subset = NULL,
  threshold = NULL,
  n.threshold = 30,
  scale.sd = TRUE,
  threshold.scale = NULL,
  se.scale = NULL,
  offset.percent = 50,
  hetero = NULL,
  prior = NULL,
  remove.zeros = TRUE,
  sign.contrast = "both",
  ngroup.survival = 2
)

Arguments

`data`	The input data. A list with components: x- an expression genes in the rows, samples in the columns), and y- a vector of the class labels for each sample. Optional components- genenames, a vector of gene names, and geneid- a vector of gene identifiers.
`gene.subset`	Subset of genes to be used. Can be either a logical vector of length total number of genes, or a list of integers of the row numbers of the genes to be used
`sample.subset`	Subset of samples to be used. Can be either a logical vector of length total number of samples, or a list of integers of the column numbers of the samples to be used.
`threshold`	A vector of threshold values for the centroid shrinkage.Default is a set of 30 values chosen by the software
`n.threshold`	Number of threshold values desired (default 30)
`scale.sd`	Scale each threshold by the wthin class standard deviations? Default: true
`threshold.scale`	Additional scaling factors to be applied to the thresholds. Vector of length equal to the number of classes. Default- a vectors of ones.
`se.scale`	Vector of scaling factors for the within class standard errors. Default is sqrt(1/n.class-1/n), where n is the overall sample size and n.class is the sample sizes in each class. This default adjusts for different class sizes.
`offset.percent`	Fudge factor added to the denominator of each t-statistic, expressed as a percentile of the gene standard deviation values. This is a small positive quantity to penalize genes with expression values near zero, which can result in very large ratios. This factor is expecially impotant for Affy data. Default is the median of the standard deviations of each gene.
`hetero`	Should a heterogeneity transformation be done? If yes, hetero must be set to one of the class labels (see Details below). Default is no (hetero=NULL)
`prior`	Vector of length the number of classes, representing prior probabilities for each of the classes. The prior is used in Bayes rule for making class prediction. Default is NULL, and prior is then taken to be n.class/n, where n is the overall sample size and n.class is the sample sizes in each class.
`remove.zeros`	Remove threshold values yielding zero genes? Default TRUE
`sign.contrast`	Directions of allowed deviations of class-wise average gene expression from the overall average gene expression. Default is `⁠both'' (positive or negative). Can also be set to ⁠`positive” or “negative”.
`ngroup.survival`	Number of groups formed for survival data. Default 2

Details

pamr.train fits a nearest shrunken centroid classifier to gene expression data. Details may be found in the PNAS paper referenced below. One feature not described there is "heterogeneity analysis". Suppose there are two classes labelled "A" and "B". CLass "A" is considered a normal class, and "B" an abnormal class. Setting hetero="A" transforms expression values $x[i,j]$ to $|x[i,j]- mean(x[i,j])|$ where the mean is taken only over samples in class "A". The transformed feature values are then used in Pam. This is useful when the abnormal class "B" is heterogeneous, i.e. a given gene might have higher expresion than normal for some class "B" samples, and lower for others. With more than 2 classes, each class is centered on the class specified by hetero.

Value

A list with components

`y`	The outcome classes.
`yhat`	A matrix of predicted classes, each column representing the results from one threshold.

`prob`	A array of predicted class probabilities. of dimension n by nclass by n.threshold. n is the number samples, nclass is the number of classes, n.threshold is the number of thresholds tried
`centroids`	A matrix of (unshrunken) class centroids, n by nclass
`hetero`	Value of hetero used in call to pamr.train
`norm.cent`	Centroid of "normal" group, if hetero was specified
`centroid.overall`	A vector containing the (unshrunken) overall centroid (all classes together)
`sd`	A vector of the standard deviations for each gene
`threshold`	A vector of the threshold tried in the shrinkage
`nonzero`	A vector of the number of genes that survived the thresholding, for each threshold value tried
`threshold.scale`	A vector of threshold scale factors that were used
`se.scale`	A vector of standard error scale factors that were used
`call`	The calling sequence used
`prior`	The prior probabilities used
`errors`	The number of trainin errors for each threshold value

Author(s)

Trevor Hastie,Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu

References

Robert Tibshirani, Trevor Hastie, Balasubramanian Narasimhan, and Gilbert Chu Diagnosis of multiple cancer types by shrunken centroids of gene expression PNAS 99: 6567-6572. Available at www.pnas.org

Examples


#generate some data
suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=factor(y))

#train classifier
results<-   pamr.train(mydata)

# train classifier on all  data except class 4
results2 <- pamr.train(mydata,sample.subset=(mydata$y!=4))
 
# train classifier on  only the first 500 genes
results3 <- pamr.train(mydata,gene.subset=1:500)


#generate some data
suppressWarnings(RNGversion("3.5.0"))
set.seed(120)
x <- matrix(rnorm(1000*20),ncol=20)
y <- sample(c(1:4),size=20,replace=TRUE)
mydata <- list(x=x,y=factor(y))

#train classifier
results<-   pamr.train(mydata)

# train classifier on all  data except class 4
results2 <- pamr.train(mydata,sample.subset=(mydata$y!=4))
 
# train classifier on  only the first 500 genes
results3 <- pamr.train(mydata,gene.subset=1:500)

Package 'pamr'

Help Index

Khan microarray data

Description

Details

A function to adaptive choose threshold scales, for use in pamr.train

Description

Usage

Arguments

Details

Author(s)

References

Examples

A function to mean-adjust microarray data by batches

Description

Usage

Arguments

Details

Value

Author(s)

Examples

A function giving a table of true versus predicted values, from a nearest shrunken centroid fit.

Description

Usage

Arguments

Details

Author(s)

Examples

Compute confusin matrix from pamr survival fit

Description

Usage

Arguments

Author(s)

A function to cross-validate the nearest shrunken centroid classifier

Description

Usage

Arguments

Details

Value

Author(s)

Examples

A function to decorrelate (adjust) the feature matrix with respect to some additional predictors

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

A function to estimate false discovery rates for the nearest shrunken centroid classifier

Description

Usage

Arguments

Details

Value

Author(s)

Examples

A function to plot the genes that surive the thresholding from the nearest shrunken centroid classifier

Description

Usage

Arguments

Details

Author(s)

Examples

A function that takes estimate class probabilities and produces a class prediction or indeterminate prediction

Description

Usage

Arguments

Author(s)

Examples

A function to list the genes that survive the thresholding, from the nearest shrunken centroid classifier

Description

Usage

Arguments

Details

Author(s)

Examples

A function to interactively define classes from a clustering tree

Description