Hi Spencer,
The an1 syntax is adding regression coefficients (or NAs where a
regression could not be done) to the downloaded and processed data, which
ends up a matrix. The cbind function adds the regression coefficients to
the last column of the matrix (i.e. bind the columns of the inputs in the
order given). Simple example below. Not actually any need for the separate
cbind commands, could have just used an1=cbind(an,p,t). The cbind function
expects all the columns to be of the same length, hence the use of the
tryCatch function to capture NA's for failed regression attempts, ensuring
that p and t correspond row by row with the matrix.
x=seq(1,5)
y=seq(6,10)
z=seq(1,5)
xyz=cbind(x,y,z)
xyz
x y z
[1,] 1 6 1
[2,] 2 7 2
[3,] 3 8 3
[4,] 4 9 4
[5,] 5 10 5
dangs=rep(NA,5)
xyzd=cbind(xyz,dangs)
xyzd
x y z dangs
[1,] 1 6 1 NA
[2,] 2 7 2 NA
[3,] 3 8 3 NA
[4,] 4 9 4 NA
[5,] 5 10 5 NA
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Spencer Brackett
Sent: February 14, 2019 12:32 AM
To: R-help <r-help at r-project.org>; Sarah Goslee <sarah.goslee at gmail.com>;
Caitlin Gibbons <bioprogrammer at gmail.com>; Jeff Newmiller <
jdnewmil at dcn.davis.ca.us>
Subject: [R] R Data
Hello everyone,
The following is a portion of coding that a colleague sent. Given my lack
of experience in R, I am not quite sure what the significance of the
following arguments. Could anyone help me translate? For context, I am
aware of the downloading portion of the script... library(data.table) etc.,
but am not familiar with the portion pertaining to an1 .
library(data.table)
anno = as.data.frame(fread(file =
"/rsrch1/bcb/kchen_group/v_mohanty/data/TCGA/450K/mapper.txt", sep ="\t",
header = T)) meth = read.table(file = "/rsrch1/bcb/kchen_group/v_
mohanty/data/TCGA/27K/GBM.txt", sep ="\t", header = T, row.names = 1)
meth = as.matrix(meth) """ the loop just formats the methylation column
names to match format"""
colnames(meth) = sapply(colnames(meth), function(i){
c1 = strsplit(i,split = '.', fixed = T)[[1]]
c1[4] = paste(strsplit(c1[4],split = "",fixed = T)[[1]][1:2],collapse =
"")
paste(c1,collapse = ".")
})
exp = read.table(file =
"/rsrch1/bcb/kchen_group/v_mohanty/data/TCGA/RNAseq/GBM.txt", sep = "\t",
header = T, row.names = 1) exp = as.matrix(exp) c = intersect(colnames(exp),
colnames(meth))
exp = exp[,c]
meth = meth[,c]
m = apply(meth, 1, function(i){
log2(i/(1-i))
})
m = t(as.matrix(m))
an = anno[anno$probe %in% rownames(m),]
an = an[an$gene %in% rownames(exp),]
an = an[an$location %in% c("TSS200","TSS1500"),]
p = apply(an,1,function(i){
tryCatch(summary(lm(exp[as.character(i[2]),] ~ m[as.character(i[1]),]))$coefficient[2,4],
error= function(e)NA)
})
t = apply(an,1,function(i){
tryCatch(summary(lm(exp[as.character(i[2]),] ~ m[as.character(i[1]),]))$coefficient[2,3],
error= function(e)NA)
})
an1 =cbind(an,p)
an1 = cbind(an1,t)
an1$q = p.adjust(as.numeric(an1$p))
summary(lm(exp["MAOB",] ~ m["cg00121904",]$coefficient[2,c(3:4)]
###############################################
[[alternative HTML version deleted]]