
Some days missing using xtabs

7 messages · Pascal Oettli, arun, Rui Barradas +1 more

#
Hello,

Why should those numbers show up in the final result? They are missing 
in the original data frame. A hack could be


fac <- factor(hospital_2004$d_release,
              levels = seq_len(max(hospital_2004$d_release)))

as.data.frame(xtabs( ~ fac + m_release + y_release, data=hospital_2004))


Note that there would still be a day 31 in m_release 6, which you call June
but which, to R, is just a number.
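
To illustrate the idea on something small, here is a minimal sketch using a made-up data frame `toy` (hypothetical, not the poster's hospital_2004). It assumes only base R and shows that days absent from the data appear with a count of 0 once the factor levels span the full range.

```r
# Hypothetical toy data: releases on days 1, 3 and 5 only (days 2 and 4 never occur).
toy <- data.frame(d_release = c(1, 3, 3, 5),
                  m_release = c(5, 5, 5, 5))

# Without explicit levels, xtabs() only tabulates the days that are present.
tab_observed <- as.data.frame(xtabs(~ d_release + m_release, data = toy))

# With a factor whose levels span 1..max, the absent days appear with Freq 0.
toy$fac <- factor(toy$d_release, levels = seq_len(max(toy$d_release)))
tab_full <- as.data.frame(xtabs(~ fac + m_release, data = toy))

nrow(tab_observed)  # 3 rows: days 1, 3, 5
nrow(tab_full)      # 5 rows: days 1 to 5
tab_full$Freq       # 1 0 2 0 1
```

The same crossing of levels is why a day 31 shows up for every month in the table, whether or not that month has 31 days.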


Hope this helps,

Rui Barradas

On 23-07-2013 10:33, Stefano Sofia wrote:
#
Hello,

As for your second question, before merge(), try the following.

release_freq$d_release <- as.integer(as.character(release_freq$d_release))
release_freq$m_release <- as.integer(as.character(release_freq$m_release))
release_freq$y_release <- as.integer(as.character(release_freq$y_release))


And the warning is gone.
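
The detour through as.character() matters because as.integer() applied directly to a factor returns the internal level codes rather than the numbers shown as labels. A short sketch with a hypothetical factor `f` (not taken from the poster's data):

```r
# Hypothetical factor whose labels are numbers; levels are sorted as strings.
f <- factor(c("10", "2", "31"))

levels(f)                      # "10" "2" "31"
as.integer(f)                  # 1 2 3: the internal codes, not the labels
as.integer(as.character(f))    # 10 2 31: the values the labels represent
```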

Hope this helps,

Rui Barradas

On 23-07-2013 10:33, Stefano Sofia wrote:
#
Hi,

I tried this without changing the class, but there was no warning.

str(release_freq)
#'data.frame':    62 obs. of  4 variables:
# $ d_release: Factor w/ 31 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ m_release: Factor w/ 2 levels "5","6": 1 1 1 1 1 1 1 1 1 1 ...
# $ y_release: Factor w/ 1 level "2004": 1 1 1 1 1 1 1 1 1 1 ...
# $ Freq     : num  0 0 0 0 1 1 1 0 0 1 ...
str(temp_h12)
#'data.frame':    31 obs. of  4 variables:
# $ y_temp: int  2004 2004 2004 2004 2004 2004 2004 2004 2004 2004 ...
# $ m_temp: int  5 5 5 5 5 5 5 5 5 5 ...
# $ d_temp: int  1 2 3 4 5 6 7 8 9 10 ...
# $ temp  : num  16.9 18 17.4 19.7 105.7 ...


res<-merge(release_freq, temp_h12, by.x=c("y_release","m_release","d_release"), by.y=c("y_temp","m_temp","d_temp"), stringsAsFactors=FALSE)

head(res)
#  y_release m_release d_release Freq temp
#1      2004         5         1    0 16.9
#2      2004         5        10    1 16.1
#3      2004         5        11    1 15.8
#4      2004         5        12    1 15.1
#5      2004         5        13    0 17.8
#6      2004         5        14    0 17.4

# changing the class
release_freq$d_release <- as.integer(as.character(release_freq$d_release))
release_freq$m_release <- as.integer(as.character(release_freq$m_release))
release_freq$y_release <- as.integer(as.character(release_freq$y_release))
res1<- merge(release_freq, temp_h12, 
by.x=c("y_release","m_release","d_release"), 
by.y=c("y_temp","m_temp","d_temp"), stringsAsFactors=FALSE)

head(res1)
#  y_release m_release d_release Freq temp
#1      2004         5         1    0 16.9
#2      2004         5        10    1 16.1
#3      2004         5        11    1 15.8
#4      2004         5        12    1 15.1
#5      2004         5        13    0 17.8
#6      2004         5        14    0 17.4

The results are not identical.
identical(res,res1)
#[1] FALSE
str(res)
#'data.frame':    31 obs. of  5 variables:
# $ y_release: Factor w/ 1 level "2004": 1 1 1 1 1 1 1 1 1 1 ...
# $ m_release: Factor w/ 2 levels "5","6": 1 1 1 1 1 1 1 1 1 1 ...
# $ d_release: Factor w/ 31 levels "1","2","3","4",..: 1 10 11 12 13 14 15 16 17 18 ...
# $ Freq     : num  0 1 1 1 0 0 1 1 0 1 ...
# $ temp     : num  16.9 16.1 15.8 15.1 17.8 17.4 16 17.7 17.3 22.3 ...
str(res1)
#'data.frame':    31 obs. of  5 variables:
# $ y_release: int  2004 2004 2004 2004 2004 2004 2004 2004 2004 2004 ...
# $ m_release: int  5 5 5 5 5 5 5 5 5 5 ...
# $ d_release: int  1 10 11 12 13 14 15 16 17 18 ...
# $ Freq     : num  0 1 1 1 0 0 1 1 0 1 ...
# $ temp     : num  16.9 16.1 15.8 15.1 17.8 17.4 16 17.7 17.3 22.3 ...
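
The two str() listings differ only in the class of the key columns, which is enough for identical() to return FALSE. A minimal sketch with hypothetical stand-ins `a` and `b` (not the actual res and res1) shows one way to confirm that the values agree once the classes match:

```r
# Hypothetical stand-ins for res and res1: same values, different key class.
a <- data.frame(key = factor(c("1", "10")), temp = c(16.9, 16.1))
b <- data.frame(key = c(1L, 10L),           temp = c(16.9, 16.1))

identical(a, b)      # FALSE: the key columns have different classes
sapply(a, class)     # key "factor",  temp "numeric"
sapply(b, class)     # key "integer", temp "numeric"

# After converting the factor key, the two data frames compare equal.
a$key <- as.integer(as.character(a$key))
identical(a, b)      # TRUE
```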


sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] stringr_0.6.2  reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8

A.K.

----- Original Message -----
From: Rui Barradas <ruipbarradas at sapo.pt>
To: Stefano Sofia <stefano.sofia at regione.marche.it>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Tuesday, July 23, 2013 6:50 AM
Subject: Re: [R] Some days missing using xtabs

#
Hello,

Something I've just noticed, stringsAsFactors is not an argument to merge().
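
A quick way to check this from the console is to inspect the formal arguments of the data-frame method. This is only a sketch; it assumes the call dispatches to merge.data.frame, as it does for two data frames:

```r
# List the named arguments of the data-frame method of merge().
merge_args <- names(formals(merge.data.frame))

"stringsAsFactors" %in% merge_args  # FALSE
"..." %in% merge_args               # TRUE: unrecognised arguments end up here
```

Because the method has a `...` argument, passing stringsAsFactors=FALSE raises no error; it simply appears to have no effect on the result.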

And, without changing the class, I get a warning:

Warning message:
In `[<-.factor`(`*tmp*`, ri, value = 1:31) :
   invalid factor level, NA generated
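
The same kind of warning can be reproduced outside merge() by assigning a value that is not an existing level into a factor. The sketch below uses a hypothetical factor `f` and captures the warning text so that it can be inspected; it is an illustration of the mechanism, not a trace of what merge() does internally.

```r
# Hypothetical factor with levels "5" and "6" only.
f <- factor(c("5", "6", "5"))

# Assign a value that is not a level, recording the warning and then continuing.
msg <- NULL
withCallingHandlers(
  f[2] <- 31,
  warning = function(w) {
    msg <<- conditionMessage(w)      # keep the warning text
    invokeRestart("muffleWarning")   # let the assignment finish
  }
)

msg          # the "invalid factor level" warning text
is.na(f[2])  # TRUE: the unknown value was stored as NA
```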


Rui Barradas

On 23-07-2013 13:36, arun wrote:
#
Thanks to Rui Barradas and to Arun.
Rui's considerations are very sensible, and they solved all my doubts.

Thank you
Stefano