how can I convert a long to wide matrix?
Here is a stab in the dark. I agree with Jim that the description of the
problem is hard to follow. The original posting being in HTML format did
not help.
#########
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
# indenting was just a side-effect of me cleaning up the HTML mess
dat <- structure( list( ID = structure( c( 1L, 1L, 1L, 2L, 2L)
                                      , .Label = c("id_X","id_Y")
                                      , class = "factor"
                                      )
                      , EventDate = structure( c( 4L, 5L, 2L
                                                , 3L, 1L )
                                             , .Label = c( "9/15/16"
                                                         , "9/15/17"
                                                         , "9/7/16"
                                                         , "9/8/16"
                                                         , "9/9/16"
                                                         )
                                             , class = "factor"
                                             )
                      , timeGroup = structure( c( 1L, 1L, 2L, 1L, 2L)
                                             , .Label = c("B1", "B2")
                                             , class = "factor"
                                             )
                      , SITE = structure( c( 1L, 1L, 2L, 1L, 2L)
                                        , .Label = c("A", "B" )
                                        , class = "factor"
                                        )
                      )
                , .Names = c( "ID", "EventDate"
                            , "timeGroup", "SITE")
                , class = "data.frame"
                , row.names = c(NA, -5L)
                )
# top_n( 1, EventDate ) keeps the latest EventDate within each ID/timeGroup;
# top_n( -1, EventDate ) would keep the earliest instead
dat2 <- ( dat
        %>% mutate( EventDate = as.Date( as.character( EventDate )
                                       , format = "%m/%d/%y"
                                       )
                  )
        %>% arrange( ID, timeGroup, EventDate )
        %>% group_by( ID, timeGroup )
        %>% top_n( 1, EventDate )
        %>% ungroup
        )
dat2
#> # A tibble: 4 x 4
#>   ID    EventDate  timeGroup SITE
#>   <fct> <date>     <fct>     <fct>
#> 1 id_X  2016-09-09 B1        A
#> 2 id_X  2017-09-15 B2        B
#> 3 id_Y  2016-09-07 B1        A
#> 4 id_Y  2016-09-15 B2        B
dat3a <- ( dat2
         %>% mutate( timeGroup = paste( "EventDate"
                                      , timeGroup
                                      , sep="_"
                                      )
                   )
         %>% select( ID, timeGroup, EventDate )
         %>% spread( timeGroup, EventDate )
         )
dat3a
#> # A tibble: 2 x 3
#>   ID    EventDate_B1 EventDate_B2
#>   <fct> <date>       <date>
#> 1 id_X  2016-09-09   2017-09-15
#> 2 id_Y  2016-09-07   2016-09-15
dat3b <- ( dat2
         %>% mutate( timeGroup = paste( "SITE"
                                      , timeGroup
                                      , sep = "_"
                                      )
                   )
         %>% select( ID, timeGroup, SITE )
         %>% spread( timeGroup, SITE )
         )
dat3b
#> # A tibble: 2 x 3
#>   ID    SITE_B1 SITE_B2
#>   <fct> <fct>   <fct>
#> 1 id_X  A       B
#> 2 id_Y  A       B
dat4 <- ( dat3a
        %>% left_join( dat3b, by = "ID" )
        )
dat4
#> # A tibble: 2 x 5
#>   ID    EventDate_B1 EventDate_B2 SITE_B1 SITE_B2
#>   <fct> <date>       <date>       <fct>   <fct>
#> 1 id_X  2016-09-09   2017-09-15   A       B
#> 2 id_Y  2016-09-07   2016-09-15   A       B
#########
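A possible variation, offered only as a sketch: the request was for the first reading within each ID/timeGroup, whereas top_n( 1, EventDate ) keeps the latest one. The version below keeps the earliest date instead and reshapes both columns in a single step. It assumes dplyr >= 1.0.0 (for slice_min) and tidyr >= 1.0.0 (for pivot_wider), both of which are newer than this thread, and it rebuilds the example data as plain character columns rather than factors. The name `wide` is just an illustrative choice.

```r
library(dplyr)
library(tidyr)

# Same five rows as the example data, stored as character instead of factor
dat <- data.frame(
  ID        = c("id_X", "id_X", "id_X", "id_Y", "id_Y"),
  EventDate = c("9/8/16", "9/9/16", "9/15/17", "9/7/16", "9/15/16"),
  timeGroup = c("B1", "B1", "B2", "B1", "B2"),
  SITE      = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

wide <- dat %>%
  # parse the m/d/y strings so that dates sort chronologically
  mutate(EventDate = as.Date(EventDate, format = "%m/%d/%y")) %>%
  group_by(ID, timeGroup) %>%
  # keep the earliest reading in each ID/timeGroup combination
  slice_min(EventDate, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  # one row per ID; columns are named <value>_<timeGroup>
  pivot_wider(names_from = timeGroup, values_from = c(EventDate, SITE))

wide
```

This should give one row per ID with columns EventDate_B1, EventDate_B2, SITE_B1 and SITE_B2, so the separate spread() calls and the join are not needed. Whether "first" means earliest date or first row in the file is a guess on my part.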
On Wed, 2 May 2018, Jim Lemon wrote:
Hi Marna,
This is a condition that the function cannot handle. It would be
possible to reformat the result based on the time intervals, but the
stretch_df function doesn't try to interpret the values, just stretches
them out to a wide format.
Jim
On Wed, May 2, 2018 at 9:16 AM, Marna Wagley <marna.wagley at gmail.com> wrote:
Hi Jim,
The data set is correct. I took two readings from "SITE A" within a
short time interval, so I want to keep only the first value when values
are repeated within the same "timeGroup".
So I wanted the following:
FinalData1
B1 B2
id_X "A" "B"
id_Y "A" "B"
thanks,
On Tue, May 1, 2018 at 4:05 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
Hi Marna,
I think this is due to having three rows for id_X and only two for
id_Y. The function creates a data frame with enough columns to hold the
greatest number of values for each ID variable. Notice that the SITE_n
columns contain three values for id_X (A, A, B) and two for id_Y
(A, B, NA) as there was no third occasion of measurement for the latter.
Even though there are only two _values_ for SITE, there must be enough
space for three. In your desired output, SITE for the second occasion of
measurement is wrong (it should be "A"), and for the third occasion it is
unknown. Even if there was only one value for SITE in the original data
frame, it should be repeated for the correct number of observations. I
think you may be mixing up case ID with location of observation.
Jim
On Wed, May 2, 2018 at 8:48 AM, Marna Wagley <marna.wagley at gmail.com> wrote:
Hi Jim,
Thank you very much for your suggestions. I used it, but it gave me three
sites. Actually I have only two sites, "id_X" and "id_Y". In fact
"A" is repeated twice for "id_X". When a value is repeated, I would like
to take the first of the repeated values.
dat <- structure(list(
  ID = structure(c(1L, 1L, 1L, 2L, 2L),
    .Label = c("id_X", "id_Y"), class = "factor"),
  EventDate = structure(c(4L, 5L, 2L, 3L, 1L),
    .Label = c("9/15/16", "9/15/17", "9/7/16", "9/8/16", "9/9/16"),
    class = "factor"),
  timeGroup = structure(c(1L, 1L, 2L, 1L, 2L),
    .Label = c("B1", "B2"), class = "factor"),
  SITE = structure(c(1L, 1L, 2L, 1L, 2L),
    .Label = c("A", "B"), class = "factor")),
  .Names = c("ID", "EventDate", "timeGroup", "SITE"),
  class = "data.frame", row.names = c(NA, -5L))
library(prettyR)
stretch_df(dat,idvar="ID",to.stretch=c("EventDate","SITE"))
    ID timeGroup EventDate_1 EventDate_2 EventDate_3 SITE_1 SITE_2 SITE_3
1 id_X        B1      9/8/16      9/9/16     9/15/17      A      A      B
2 id_Y        B1      9/7/16     9/15/16        <NA>      A      B   <NA>
Basically I am looking for a table like the following:
    ID timeGroup EventDate_1 EventDate_2 EventDate_3 SITE_1 SITE_2
1 id_X        B1      9/8/16      9/9/16     9/15/17      A      B
2 id_Y        B1      9/7/16     9/15/16        <NA>      A      B
Thanks
On Tue, May 1, 2018 at 3:32 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
Hi Marna,
Try this:
library(prettyR)
stretch_df(dat,idvar="ID",to.stretch=c("EventDate","SITE"))
Jim
On Wed, May 2, 2018 at 8:24 AM, Marna Wagley <marna.wagley at gmail.com> wrote:
Hi R users,
I am trying to convert a long matrix to wide. I have an example and
would like to get this table (FinalData1):
FinalData1
B1 B2
id_X "A" "B"
id_Y "A" "B"
but I got the following table using the following code.
FinalData1
B1 B2
id_X "A" "A"
id_Y "A" "B"
The code and the example data I used are given below. Are there any
suggestions for fixing the problem?
dat <- structure(list(
  ID = structure(c(1L, 1L, 1L, 2L, 2L),
    .Label = c("id_X", "id_Y"), class = "factor"),
  EventDate = structure(c(4L, 5L, 2L, 3L, 1L),
    .Label = c("9/15/16", "9/15/17", "9/7/16", "9/8/16", "9/9/16"),
    class = "factor"),
  timeGroup = structure(c(1L, 1L, 2L, 1L, 2L),
    .Label = c("B1", "B2"), class = "factor"),
  SITE = structure(c(1L, 1L, 2L, 1L, 2L),
    .Label = c("A", "B"), class = "factor")),
  .Names = c("ID", "EventDate", "timeGroup", "SITE"),
  class = "data.frame", row.names = c(NA, -5L))
tmp <- split(dat, dat$ID)
tmp1 <- do.call(rbind, lapply(tmp, function(dat){
  tb <- table(dat$timeGroup)
  idx <- which(tb>0)
  tb1 <- replace(tb, idx, as.character(dat$SITE))
}))
tmp1
FinalData<-print(tmp1, quote=FALSE)
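A note on why the code above returns "A" "A" for id_X, with a possible base R fix as a sketch. For id_X, `idx` has two positions (B1 and B2) but `as.character(dat$SITE)` has three values (A, A, B), so replace() fills the two slots with the first two values and drops the third, with a warning about the replacement length. One way around this, assuming that "first" means the first row encountered within each ID/timeGroup, is to let tapply() pick the first SITE in each cell. The data are rebuilt here with character columns for brevity.

```r
# Same five rows as the example data, stored as character instead of factor
dat <- data.frame(
  ID        = c("id_X", "id_X", "id_X", "id_Y", "id_Y"),
  EventDate = c("9/8/16", "9/9/16", "9/15/17", "9/7/16", "9/15/16"),
  timeGroup = c("B1", "B1", "B2", "B1", "B2"),
  SITE      = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Rows are IDs, columns are time groups; each cell holds the first SITE
# value found for that ID/timeGroup combination, in row order
FinalData1 <- tapply(dat$SITE, list(dat$ID, dat$timeGroup),
                     function(x) x[1])

print(FinalData1, quote = FALSE)
```

If the rows are not already in date order within each ID, the data frame would need to be sorted by a parsed EventDate first.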
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k