conditional Dataframe filling

10 messages · Blaser Nello, arun, Camilo Mora

#
Hi everyone:

This may be trivial but I just have not been able to figure it out.

Imagine the following dataframe:
a     b     c     d
TRUE  TRUE  TRUE  TRUE
FALSE FALSE FALSE TRUE
FALSE TRUE  FALSE FALSE

I would like to create a new data frame in which each TRUE becomes 0
and each FALSE becomes 1 plus the value of the cell to its left. The
result for the example above should be:

a     b     c     d
0     0     0     0
1     2     3     0
1     0     1     2

Does anyone know how to do this?

Thanks,

Camilo




Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Currently available in Colombia
Phone:   Country code: 57
          Provider code: 313
          Phone 776 2282
          From the USA or Canada you have to dial 011 57 313 776 2282
http://www.soc.hawaii.edu/mora/
#
Here's a possible solution. 

dd <- structure(list(a = c(TRUE, FALSE, FALSE),
                     b = c(TRUE, FALSE, TRUE),
                     c = c(TRUE, FALSE, FALSE),
                     d = c(TRUE, TRUE, FALSE)), 
                .Names = c("a", "b", "c", "d"), 
                row.names = c(NA, -3L), 
                class = "data.frame")

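# Row-wise: running count of FALSE minus running count of TRUE
ds <- as.data.frame(t(apply(!dd, 1, cumsum)-apply(dd, 1, cumsum)))
# Cells that were TRUE are set to 0.
# Note: the count is not reset after a TRUE, so this matches the example,
# but a row like FALSE FALSE TRUE FALSE gives 1 2 0 2 rather than 1 2 0 1.
ds[as.matrix(dd)] <- 0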
ds

Best,
Nello
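
The cumulative difference above is never reset after a TRUE, so it reproduces the example but would turn a row such as FALSE FALSE TRUE FALSE into 1 2 0 2 instead of 1 2 0 1. Below is a minimal sketch of a per-row count that restarts after each TRUE. It assumes there are no NAs, and `count_false_runs` is a made-up helper name, not something from this thread.

```r
# Hypothetical helper: position of each FALSE within its run of FALSEs, 0 for TRUE.
count_false_runs <- function(x) {
  is_false   <- !x                        # TRUE where the original cell is FALSE
  runs       <- rle(is_false)             # lengths of runs of equal values
  pos_in_run <- sequence(runs$lengths)    # 1, 2, 3, ... restarting at every run
  pos_in_run * is_false                   # keep the count for FALSE cells only
}

dd <- data.frame(a = c(TRUE, FALSE, FALSE),
                 b = c(TRUE, FALSE, TRUE),
                 c = c(TRUE, FALSE, FALSE),
                 d = c(TRUE, TRUE, FALSE))

ds <- dd
ds[] <- t(apply(as.matrix(dd), 1, count_false_runs))  # apply() returns rows as columns
ds
#   a b c d
# 1 0 0 0 0
# 2 1 2 3 0
# 3 1 0 1 2

count_false_runs(c(FALSE, FALSE, TRUE, FALSE))
# [1] 1 2 0 1
```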

#
Hi,
You could try:
dat1 <- read.table(text="
a     b     c     d
TRUE  TRUE  TRUE  TRUE
FALSE FALSE FALSE TRUE
FALSE TRUE  FALSE FALSE
",sep="",header=TRUE)
dat2 <- dat1
dat2[] <- t(apply(1*!dat1,1,function(x) unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))
dat2
#  a b c d
#1 0 0 0 0
#2 1 2 3 0
#3 1 0 1 2
A.K.
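
For readers unpacking the one-liner, here is a step-by-step sketch of what it does to a single row (the third row of the example). The intermediate variable names are only for illustration.

```r
x <- 1 * !c(FALSE, TRUE, FALSE, FALSE)  # FALSE -> 1, TRUE -> 0
changes <- abs(diff(x))                 # 1 wherever the value changes between neighbours
run_id  <- cumsum(c(0, changes))        # one id per run of equal values: 0 1 2 2
pieces  <- split(x, run_id)             # one vector per run
counts  <- unlist(lapply(pieces, cumsum), use.names = FALSE)  # running count inside each run
counts
# [1] 1 0 1 2
```

Runs of 0 (original TRUE) stay 0 under `cumsum`, while runs of 1 (original FALSE) count up 1, 2, 3, which is why the count restarts after every TRUE.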


#
Hi,

Just a correction:

dat2[] <- t(apply(!dat1,1,function(x) unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))  # should also work
A.K.



#
Dear Arun,

Thank you very much for your help with this. I did not know where to
start looking to solve this problem, so I truly appreciate your input.

The line of code you sent seems to work, but it duplicates the results.
Do you know why that may happen?
Below is a larger data set, to which I applied your line of code.

Thank you very much again,
Camilo


dat1 <- structure(list(
w = c(TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE),
x = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
y =  
c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE),
z = c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE)),
row.names = c(NA, -13L),
class = "data.frame")

dat1<-t(dat1)
colnames(dat1)<-c("a","b","c","d","e","f","g","h","i","j","k", "l","m")

dat2<-dat1

dat2[]<-t(apply(!dat1,1,function(x)  
unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))













#
Dear Camilo,

How do you want to deal with the NAs?

If I remove the NAs:
dat1 <- structure(list(
w = c(TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE),
x = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
y = c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE),
z = c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE)),
row.names = c(NA, -13L),
class = "data.frame")

dat1<-t(dat1)
colnames(dat1)<-c("a","b","c","d","e","f","g","h","i","j","k", "l","m")
dat1<- as.data.frame(na.omit(dat1))
dat2<-dat1
dat2[]<-t(apply(!dat1,1,function(x) unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))
dat2
#  a b c d e f g h i j k l m
#w 0 0 0 0 0 1 2 3 4 0 0 0 0
#y 1 2 3 4 5 0 0 1 2 0 0 0 1
#z 0 0 0 0 1 0 0 0 1 0 0 0 1


dat1
#      a     b     c     d     e     f     g     h     i    j    k    l     m
#w  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE  TRUE
#y FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE TRUE TRUE TRUE FALSE
#z  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE TRUE TRUE TRUE FALSE


A.K.
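
A possible explanation for the odd results when the all-NA row is left in (a guess, not confirmed in this thread): `split()` drops elements whose grouping value is NA, so the per-row function returns fewer values for an NA row than for a complete row, and `apply()` can then no longer simplify to a matrix. A small sketch, with `run_counts` as a made-up name for the anonymous function used above:

```r
run_counts <- function(x) unlist(lapply(split(x, cumsum(c(0, abs(diff(x))))), cumsum))

complete_row <- !c(TRUE, FALSE, FALSE, TRUE)
na_row       <- !c(NA, NA, NA, NA)

length(run_counts(complete_row))  # 4: one value per cell
length(run_counts(na_row))        # 1: cells whose run id is NA are dropped by split()

# With rows of unequal length, apply() returns a list instead of a matrix
m <- rbind(w = c(TRUE, FALSE, FALSE, TRUE), x = NA)
is.list(apply(!m, 1, run_counts))
# [1] TRUE
```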





#
Thanks Arun,

That is interesting. My intention was to get a data frame with the
same number of rows as the original data, with NA returned for the
rows that contain NAs (when there are NAs, often the entire row is NA).
Interestingly, when the NAs are left in, your code returns NAs for the
row that has NAs, which is what I am looking for.

I guess one solution is to subset the complete rows and then run your
line of code. Or is there an alternative, such as telling cumsum to
leave NAs as NAs?

Thanks again,

Camilo


#
Dear Camilo,

You can do this:
dat1 <- structure(list(
w = c(TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE),
x = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
y = c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE),
z = c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE)),
row.names = c(NA, -13L),
class = "data.frame")

dat1<-t(dat1)
colnames(dat1)<-c("a","b","c","d","e","f","g","h","i","j","k", "l","m")
dat1<- as.data.frame(dat1)
dat2<-dat1
dat2[rowSums(is.na(dat2))==0,]<- t(apply(!dat1[rowSums(is.na(dat1))==0,],1,function(x) unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))

dat2
#   a  b  c  d  e  f  g  h  i  j  k  l  m
#w  0  0  0  0  0  1  2  3  4  0  0  0  0
#x NA NA NA NA NA NA NA NA NA NA NA NA NA
#y  1  2  3  4  5  0  0  1  2  0  0  0  1
#z  0  0  0  0  1  0  0  0  1  0  0  0  1


If NAs are present in only part of a row, then (if I understand correctly) you want the whole row to be NA, right?

datNew<- structure(list(a = c(TRUE, NA, FALSE, TRUE, TRUE), b = c(TRUE, 
NA, FALSE, TRUE, TRUE), c = c(TRUE, NA, FALSE, TRUE, FALSE), 
    d = c(TRUE, NA, FALSE, TRUE, FALSE), e = c(TRUE, NA, FALSE, 
    FALSE, NA), f = c(FALSE, NA, TRUE, TRUE, NA), g = c(FALSE, 
    NA, TRUE, TRUE, TRUE), h = c(FALSE, NA, FALSE, TRUE, FALSE
    ), i = c(FALSE, NA, FALSE, FALSE, NA), j = c(TRUE, NA, TRUE, 
    TRUE, TRUE), k = c(TRUE, NA, TRUE, TRUE, FALSE), l = c(TRUE, 
    NA, TRUE, TRUE, FALSE), m = c(TRUE, NA, FALSE, FALSE, TRUE
    )), .Names = c("a", "b", "c", "d", "e", "f", "g", "h", "i", 
"j", "k", "l", "m"), row.names = c("w", "x", "y", "z", "u"), class = "data.frame")

datNew
#      a     b     c     d     e     f     g     h     i    j     k     l     m
#w  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE TRUE  TRUE  TRUE  TRUE
#x    NA    NA    NA    NA    NA    NA    NA    NA    NA   NA    NA    NA    NA
#y FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE TRUE  TRUE  TRUE FALSE
#z  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE TRUE  TRUE  TRUE FALSE
#u  TRUE  TRUE FALSE FALSE    NA    NA  TRUE FALSE    NA TRUE FALSE FALSE  TRUE

dat2New<- datNew
dat2New[rowSums(is.na(dat2New))==0,]<-t(apply(!datNew[rowSums(is.na(datNew))==0,],1,function(x) unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))
dat2New[rowSums(is.na(dat2New))!=0 & rowSums(is.na(dat2New))!=ncol(dat2New),]<-NA
dat2New
#   a  b  c  d  e  f  g  h  i  j  k  l  m
#w  0  0  0  0  0  1  2  3  4  0  0  0  0
#x NA NA NA NA NA NA NA NA NA NA NA NA NA
#y  1  2  3  4  5  0  0  1  2  0  0  0  1
#z  0  0  0  0  1  0  0  0  1  0  0  0  1
#u NA NA NA NA NA NA NA NA NA NA NA NA NA
A.K.
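
The "any NA makes the whole row NA" policy above could also be wrapped in a small function. This is only a sketch: `fill_complete_rows` and `run_counts` are hypothetical names, and it uses `complete.cases()` in place of the `rowSums(is.na(...)) == 0` test.

```r
run_counts <- function(x) {
  unlist(lapply(split(x, cumsum(c(0, abs(diff(x))))), cumsum), use.names = FALSE)
}

# Hypothetical wrapper: complete rows get run counts, every other row is all NA.
fill_complete_rows <- function(df) {
  m   <- as.matrix(df)
  out <- matrix(NA_integer_, nrow(m), ncol(m), dimnames = dimnames(m))
  ok  <- complete.cases(m)                 # TRUE for rows without any NA
  if (any(ok)) {
    out[ok, ] <- t(apply(!m[ok, , drop = FALSE], 1, run_counts))
  }
  as.data.frame(out)
}

toy <- data.frame(a = c(TRUE, NA, FALSE, TRUE),
                  b = c(FALSE, NA, FALSE, NA),
                  c = c(FALSE, NA, TRUE, FALSE),
                  row.names = c("w", "x", "y", "u"))
fill_complete_rows(toy)
#    a  b  c
# w  0  1  2
# x NA NA NA
# y  1  2  0
# u NA NA NA
```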






#
Nice!

Thanks,

Camilo

#
Hi Camilo,
No problem.

In case you want to process the rows that are only partly NA, this could help:
datNew<- structure(list(a = c(TRUE, NA, FALSE, TRUE, TRUE), b = c(TRUE,
NA, FALSE, TRUE, TRUE), c = c(TRUE, NA, FALSE, TRUE, FALSE),
    d = c(TRUE, NA, FALSE, TRUE, FALSE), e = c(TRUE, NA, FALSE,
    FALSE, NA), f = c(FALSE, NA, TRUE, TRUE, NA), g = c(FALSE,
    NA, TRUE, TRUE, TRUE), h = c(FALSE, NA, FALSE, TRUE, FALSE
    ), i = c(FALSE, NA, FALSE, FALSE, NA), j = c(TRUE, NA, TRUE,
    TRUE, TRUE), k = c(TRUE, NA, TRUE, TRUE, FALSE), l = c(TRUE,
    NA, TRUE, TRUE, FALSE), m = c(TRUE, NA, FALSE, FALSE, TRUE
    )), .Names = c("a", "b", "c", "d", "e", "f", "g", "h", "i",
"j", "k", "l", "m"), row.names = c("w", "x", "y", "z", "u"), class = "data.frame")
dat2New<- datNew

dat2New[rowSums(is.na(dat2New))==0 | rowSums(is.na(dat2New))!=ncol(dat2New),]<- t(apply(!datNew[rowSums(is.na(datNew))==0 | rowSums(is.na(datNew))!=ncol(datNew),],1,function(x) {x[!is.na(x)]<- unlist(lapply(split(x[!is.na(x)],cumsum(c(0,abs(diff(x[!is.na(x)]))))),cumsum));x}))
dat2New
#   a  b  c  d  e  f  g  h  i  j  k  l  m
#w  0  0  0  0  0  1  2  3  4  0  0  0  0
#x NA NA NA NA NA NA NA NA NA NA NA NA NA
#y  1  2  3  4  5  0  0  1  2  0  0  0  1
#z  0  0  0  0  1  0  0  0  1  0  0  0  1
#u  0  0  1  2 NA NA  0  1 NA  0  1  2  0
datNew
#      a     b     c     d     e     f     g     h     i    j     k     l     m
#w  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE TRUE  TRUE  TRUE  TRUE
#x    NA    NA    NA    NA    NA    NA    NA    NA    NA   NA    NA    NA    NA
#y FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE TRUE  TRUE  TRUE FALSE
#z  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE TRUE  TRUE  TRUE FALSE
#u  TRUE  TRUE FALSE FALSE    NA    NA  TRUE FALSE    NA TRUE FALSE FALSE  TRUE
A.K.
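
One thing worth knowing about the partial-NA version: because the NAs are removed before the runs are found, a run of FALSEs is counted straight across an NA gap. The sketch below pulls the per-row step out into a function (`count_skipping_na` is a made-up name) to make that visible. Whether continuing the count across an NA is the desired behaviour depends on the data.

```r
# Hypothetical per-row helper: expects the negated row (TRUE where the original is FALSE).
count_skipping_na <- function(x) {
  keep <- !is.na(x)
  out  <- rep(NA_integer_, length(x))
  if (!any(keep)) return(out)             # all-NA row: nothing to count
  v <- x[keep]                            # non-NA cells only
  out[keep] <- unlist(lapply(split(v, cumsum(c(0, abs(diff(v))))), cumsum),
                      use.names = FALSE)
  out
}

count_skipping_na(!c(TRUE, TRUE, FALSE, FALSE, NA, NA, TRUE, FALSE))
# [1]  0  0  1  2 NA NA  0  1

count_skipping_na(!c(FALSE, NA, FALSE))   # the count continues across the NA
# [1]  1 NA  2
```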





