Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com.
transforming a .csv file column names as per a particular column rows using R code
8 messages · siddu479, Jeff Newmiller, Rui Barradas +2 more
Not clear to me what your successive rows indicate in your output.
However, you may get something similar to what you want using the reshape2
library:
library(reshape2)
dta1 <- read.csv( text=
"Tool,Step_Number,Data1,Data2
A,1,0,1
A,2,3,2
A,3,2,3
B,1,3,2
B,2,1,2
B,3,3,2
")
# put into long form
dta1l <- melt(dta1,c("Tool","Step_Number"))
# re-cast to wide form
# assuming first row is for tool A, second row is for tool B, etc.
dta2 <- dcast(dta1l,Tool~variable+Step_Number)
On Sun, 14 Oct 2012, siddu479 wrote:
Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
Hi, If you don't want the column "Tool" to be repeated, try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 A 1 0 1 A 2 3 1 A 3 2 1 B 1 3 2 B 2 1 2 B 3 3 2 ",sep="",header=TRUE,stringsAsFactors=FALSE)??? ?with(dat1,aggregate(cbind(Data1,Data2),list(Tool=Tool),function(x) x)) ? Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 #1??? A?????? 0?????? 3?????? 2?????? 1?????? 1?????? 1 #2??? B?????? 3?????? 1?????? 3?????? 2?????? 2?????? 2A.K. ----- Original Message ----- From: siddu479 <onlyfordigitalstuff at gmail.com> To: r-help at r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hello, Try the following. dta1 <- read.csv( text= "Tool,Step_Number,Data1,Data2 A,1,0,1 A,2,3,2 A,3,2,3 B,1,3,2 B,2,1,2 B,3,3,2 ") sp <- split(dta1[-2], dta1$Tool) dta2 <- do.call(rbind, lapply(sp, function(x) as.vector(unlist(t(x))))) dta2 <- data.frame(dta2, stringsAsFactors = FALSE) idx <- rep(c(FALSE, TRUE, TRUE), ncol(dta2) %/% 3) dta2[, idx] <- sapply(dta2[, idx], as.integer) Hope this helps, Rui Barradas Em 14-10-2012 17:13, siddu479 escreveu:
Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 ...... so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
HI,
If you need the "Tool" column repeated,
Try this:
dat1<-read.table(text="
Tool Step_Number Data1 Data2 Data3
A 1 0 1 2
A 2 3 1 4?
A 3 2 1 3
B 1 3 2 1
B 2 1 2 3
B 3 3 2 0
",sep="",header=TRUE,stringsAsFactors=FALSE)??
?
dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x))
dat3
#Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 Data3.1 Data3.2 Data3.3
#1??? A?????? 0?????? 3?????? 2?????? 1?????? 1?????? 1?????? 2?????? 4?????? 3
#2??? B?????? 3?????? 1?????? 3?????? 2?????? 2?????? 2?????? 1?????? 3?????? 0
dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x)))
colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4)))
dat4[grepl("Data.*",colnames(dat4))]<-sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.character(x)))
?dat4
#? Tool Data1.2 Data1.3 Data1.4 Tool Data2.2 Data2.3 Data2.4 Tool Data3.2
#1??? A?????? 0?????? 3?????? 2??? A?????? 1?????? 1?????? 1??? A?????? 2
#2??? B?????? 3?????? 1?????? 3??? B?????? 2?????? 2?????? 2??? B?????? 1
? Data3.3 Data3.4
1?????? 4?????? 3
2?????? 3?????? 0
?str(dat4)
#'data.frame':??? 2 obs. of? 12 variables:
# $ Tool?? : Factor w/ 2 levels "A","B": 1 2
# $ Data1.2: num? 0 3
# $ Data1.3: num? 3 1
?#$ Data1.4: num? 2 3
?#$ Tool?? : Factor w/ 2 levels "A","B": 1 2
?#$ Data2.2: num? 1 2
?#$ Data2.3: num? 1 2
?#$ Data2.4: num? 1 2
# $ Tool?? : Factor w/ 2 levels "A","B": 1 2
# $ Data3.2: num? 2 1
# $ Data3.3: num? 4 3
# $ Data3.4: num? 3 0
A.K.
----- Original Message -----
From: siddu479 <onlyfordigitalstuff at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Sunday, October 14, 2012 12:13 PM
Subject: [R] transforming a .csv file column names as per a particular column rows using R code
Hello all,
I have a .csv file like below.
Tool,Step_Number,Data1,Data2... etc up to 100 columns.
A,1,0,1
A,2,3,1
A,3,2,1
.
.
B,1,3,2
B,2,1,2
B,3,3,2
.
.
...... so on upto 50 rows
where the column "*Tool*" has distinct steps in second column
"*Step_Number*",but both have same entries in Step_Number column.
I want the output like below.
Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so
on
A,0,1,A,3,1,A,2,1
B,3,2,B,1,2,B,3,2
......
so on.
basically I am transposing entire data based on a specific column row values
and renaming the column headers.
I have a shell script based on awk which can do this task, but the script is
taking exceptionally higher processing time.
So I am looking for a script in R which can save the time.
"Please revert to me if the problem description is not clear."
Regards
Sidda
-----
Sidda
Business Analyst Lead
Applied Materials Inc.
--
View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
HI,
Just a modification of the column names.? I noticed that the "Data." column starts with "2".?
Try this:
dat1<-read.table(text="
Tool Step_Number Data1 Data2 Data3
A 1 0 1 2
A 2 3 1 4
A 3 2 1 3
B 1 3 2 1
B 2 1 2 3
B 3 3 2 0
",sep="",header=TRUE,stringsAsFactors=FALSE)?
?
dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x))
dat3
dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x)))
colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4)))
dat4[grepl("Data.*",colnames(dat4))]<-sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.character(x)))
colnames(dat4)[colnames(dat4)%in%colnames(dat4[grep("Data",colnames(dat4))])]<-paste(rep(names(lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))),each=length(unique(dat1$Step_Number))),"_",1:length(unique(dat1$Step_Number)),sep="")
dat4
# Tool Data1_1 Data1_2 Data1_3 Tool Data2_1 Data2_2 Data2_3 Tool Data3_1
#1??? A?????? 0?????? 3?????? 2??? A?????? 1?????? 1?????? 1??? A?????? 2
#2??? B?????? 3?????? 1?????? 3??? B?????? 2?????? 2?????? 2??? B?????? 1
?# Data3_2 Data3_3
#1?????? 4?????? 3
#2?????? 3?????? 0
A.K.
----- Original Message -----
From: siddu479 <onlyfordigitalstuff at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Sunday, October 14, 2012 12:13 PM
Subject: [R] transforming a .csv file column names as per a particular column rows using R code
Hello all,
I have a .csv file like below.
Tool,Step_Number,Data1,Data2... etc up to 100 columns.
A,1,0,1
A,2,3,1
A,3,2,1
.
.
B,1,3,2
B,2,1,2
B,3,3,2
.
.
...... so on upto 50 rows
where the column "*Tool*" has distinct steps in second column
"*Step_Number*",but both have same entries in Step_Number column.
I want the output like below.
Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so
on
A,0,1,A,3,1,A,2,1
B,3,2,B,1,2,B,3,2
......
so on.
basically I am transposing entire data based on a specific column row values
and renaming the column headers.
I have a shell script based on awk which can do this task, but the script is
taking exceptionally higher processing time.
So I am looking for a script in R which can save the time.
"Please revert to me if the problem description is not clear."
Regards
Sidda
-----
Sidda
Business Analyst Lead
Applied Materials Inc.
--
View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
You can also get the Tool column repeated by tricking reshape() into letting you use Tool in the v.name and idvar arguments:
dta1 <- read.csv( text=
+ "Tool,Step_Number,Data1,Data2 + A,1,0,1 + A,2,3,2 + A,3,2,3 + B,1,3,2 + B,2,1,2 + B,3,3,2 + ") # Repeat the first column. R will append .1 to the second copy.
dta2 <- reshape(dta1[, c(1, 1:4)], v.names=c("Tool", "Data1", "Data2"),
timevar="Step_Number", idvar="Tool.1", direction="wide")
rownames(dta2) <- NULL
colnames(dta2) <- gsub("\\.", "_", colnames(dta2))
dta2
Tool_1 Data1_1 Data2_1 Tool_2 Data1_2 Data2_2 Tool_3 Data1_3 Data2_3 1 A 0 1 A 3 2 A 2 3 2 B 3 2 B 1 2 B 3 2 ---------------------------------------------- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
project.org] On Behalf Of arun
Sent: Sunday, October 14, 2012 1:29 PM
To: siddu479
Cc: R help
Subject: Re: [R] transforming a .csv file column names as per a
particular column rows using R code
HI,
If you need the "Tool" column repeated,
Try this:
dat1<-read.table(text="
Tool Step_Number Data1 Data2 Data3
A 1 0 1 2
A 2 3 1 4
A 3 2 1 3
B 1 3 2 1
B 2 1 2 3
B 3 3 2 0
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x))
dat3
#Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 Data3.1 Data3.2
Data3.3
#1??? A?????? 0?????? 3?????? 2?????? 1?????? 1?????? 1?????? 2
4?????? 3
#2??? B?????? 3?????? 1?????? 3?????? 2?????? 2?????? 2?????? 1
3?????? 0
dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x)
x),function(x) cbind(Tool=dat3[,1],x)))
colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4)))
dat4[grepl("Data.*",colnames(dat4))]<-
sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.ch
aracter(x)))
?dat4
#? Tool Data1.2 Data1.3 Data1.4 Tool Data2.2 Data2.3 Data2.4 Tool
Data3.2
#1??? A?????? 0?????? 3?????? 2??? A?????? 1?????? 1?????? 1??? A
2
#2??? B?????? 3?????? 1?????? 3??? B?????? 2?????? 2?????? 2??? B
1
? Data3.3 Data3.4
1?????? 4?????? 3
2?????? 3?????? 0
?str(dat4)
#'data.frame':??? 2 obs. of? 12 variables:
# $ Tool?? : Factor w/ 2 levels "A","B": 1 2
# $ Data1.2: num? 0 3
# $ Data1.3: num? 3 1
?#$ Data1.4: num? 2 3
?#$ Tool?? : Factor w/ 2 levels "A","B": 1 2
?#$ Data2.2: num? 1 2
?#$ Data2.3: num? 1 2
?#$ Data2.4: num? 1 2
# $ Tool?? : Factor w/ 2 levels "A","B": 1 2
# $ Data3.2: num? 2 1
# $ Data3.3: num? 4 3
# $ Data3.4: num? 3 0
A.K.
----- Original Message -----
From: siddu479 <onlyfordigitalstuff at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Sunday, October 14, 2012 12:13 PM
Subject: [R] transforming a .csv file column names as per a particular
column rows using R code
Hello all,
I have a .csv file like below.
Tool,Step_Number,Data1,Data2... etc up to 100 columns.
A,1,0,1
A,2,3,1
A,3,2,1
.
.
B,1,3,2
B,2,1,2
B,3,3,2
.
.
...... so on upto 50 rows
where the column "*Tool*" has distinct steps in second column
"*Step_Number*",but both have same entries in Step_Number column.
I want the output like below.
Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3...
so
on
A,0,1,A,3,1,A,2,1
B,3,2,B,1,2,B,3,2
......
so on.
basically I am transposing entire data based on a specific column row
values
and renaming the column headers.
I have a shell script based on awk which can do this task, but the
script is
taking exceptionally higher processing time.
So I am looking for a script in R which can save the time.
"Please revert to me if the problem description is not clear."
Regards
Sidda
-----
Sidda
Business Analyst Lead
Applied Materials Inc.
--
View this message in context:
http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-
per-a-particular-column-rows-using-R-code-tp4646137.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Jeff Newmiller, The script given by you is perfect for my requirement. The rest of the scripts given by other authors are failing to format the file header which are giving like eachdatacolumn.1 or eachdatacolumn.2 etc, whereas I need them as eachdatacolumn_firstactualstepnumber, eachdatacolumn_secondactualstepnumber etc. in my output. ----- Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137p4646188.html Sent from the R help mailing list archive at Nabble.com.