Data reshaping with conditions
Hi sri,
As your problem involves a few logical steps, I found it easier to
approach it in a stepwise way. Perhaps there are more elegant ways to
accomplish this.
svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351 lee A
114 351 lee A
84 351 lee A
80 351 lee A
19 351 lee A
8 351 lee A
21 351 lee A
88 351 lee B
111 351 lee B
46 351 lee B
108 351 lee B",header=TRUE)
# you can also do this with other reshape functions
library(prettyR)
svdatstr<-stretch_df(svdat,"id",c("Count","type"))
count_ind<-grep("Count",names(svdatstr))
type_ind<-grep("type",names(svdatstr))
svdatstr$maxA<-NA
svdatstr$maxB<-NA
svdatstr$x<-NA
svdatstr$y<-NA
for(row in 1:nrow(svdatstr)) {
svdatstr[row,"maxA"]<-
max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"A",0))]])
svdatstr[row,"maxB"]<-
max(svdatstr[row,count_ind[as.logical(match(svdatstr[1,type_ind],"B",0))]])
svdatstr[row,"x"]<-svdatstr[row,"maxA"] < svdatstr[row,"maxB"]
svdatstr[row,"y"]<-!svdatstr[row,"x"]
}
svdatstr
You can then just extract the columns that you need.
Jim
On Wed, Apr 20, 2016 at 3:03 PM, sri vathsan <srivibish at gmail.com> wrote:
Dear All, I am trying to reshape the data with some conditions. A small part of the data looks like below. Like this there will be more data with repeating ID. Count id name type 117 335 sally A 19 335 sally A 167 335 sally B 18 340 susan A 56 340 susan A 22 340 susan B 53 340 susan B 135 351 lee A 114 351 lee A 84 351 lee A 80 351 lee A 19 351 lee A 8 351 lee A 21 351 lee A 88 351 lee B 111 351 lee B 46 351 lee B 108 351 lee B
From the above data I am expecting an output like below.
id name type count_of_B Max of count B x y
335 sally B 167 167 117,19 NA
340 susan B 22,53 53 18 56
351 lee B 88,111,46,108 111 84,80,19,8,2 135,114
Where, the column x and column y are:
x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B).
*1)* I tried with dplyr with the following code for the initial step to get
the values for each column.
*2)* I thought to transpose the columns which has the unique ID alone.
I tried with the following code and I am struck with the intial step
itself. The code is executed but higher and lower value of A is not coming.
Expected_output= data %>%
group_by(id, Type) %>%
mutate(Count_of_B = paste(unlist(count[Type=="B"]), collapse = ","))%>%
mutate(Max_of_count_B = ifelse(Type == "B", max(count[Type ==
"B"]),max(count[Type == "A"]))) %>%
mutate(count_type_A_lesser = ifelse
(Type=="B",(paste(unlist(count[Type=="A"]) < Max_of_count_B[Type=="B"],
collapse = ",")), "NA"))%>%
mutate(count_type_A_higher =
ifelse(Type=="B",(paste(unlist(count[Type=="A"]) >
Max_of_count_B[Type=="B"], collapse = ",")), "NA"))
I hope I make my point clear. Please bare with the code, as I am new to
this.
Regards,
sri
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.