
two apparent anomalies

6 messages · analyst41 at hotmail.com, Sarah Goslee, Berwin A Turlach +2 more

#
(1)
[1] "character"
[1] "numeric"
[1] "numeric"

(2)
[1] "a" "b" "c"
[1] "a" "b"
[1] "a" "b" "c"
[1] "a" "b" "c"

Any explanation would be helpful.  Thanks.
#
(1)
chr [1:2] "a" "b"
num [1:2] 1 2
'data.frame':	2 obs. of  2 variables:
 $ a: Factor w/ 2 levels "a","b": 1 2
 $ b: num  1 2
[1] "numeric"
'data.frame':	2 obs. of  2 variables:
 $ a: chr  "a" "b"
 $ b: num  1 2
[1] "character"


(2)
[1] "a" "b" "c"
[1] "a" "a" "b"
[1] "a" "b"
[1] a a b b c
Levels: a b c
[1] a a b
Levels: a b c
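
The commands that printed the output in this message did not survive in the archive. A sketch of calls that would print output of this shape, assuming the inputs implied by the str() lines. The names df_factor, df_char and x are hypothetical, and stringsAsFactors = TRUE is passed explicitly because R 4.0.0 and later default to FALSE:

```r
# (1) Hypothetical inputs chosen to match the str() output above.
a <- c("a", "b")
b <- c(1, 2)
str(a)
str(b)

# stringsAsFactors = TRUE reproduces the default data.frame() had in 2011.
df_factor <- data.frame(a, b, stringsAsFactors = TRUE)
str(df_factor)
mode(df_factor$a)   # "numeric": the column is a factor

df_char <- data.frame(a, b, stringsAsFactors = FALSE)
str(df_char)
mode(df_char$a)     # "character"

# (2) A longer hypothetical vector with repeated values.
x <- c("a", "a", "b", "b", "c")
levels(as.factor(x))        # "a" "b" "c"
x[1:3]                      # "a" "a" "b"
levels(as.factor(x[1:3]))   # "a" "b": subset first, then convert
as.factor(x)
as.factor(x)[1:3]           # convert first, then subset: all three levels kept
```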


On Sat, Jan 22, 2011 at 9:16 AM, analyst41 at hotmail.com
<analyst41 at hotmail.com> wrote:
#
On Sat, 22 Jan 2011 06:16:43 -0800 (PST)
"analyst41 at hotmail.com" <analyst41 at hotmail.com> wrote:

            
R> str(c)
'data.frame':	2 obs. of  2 variables:
 $ a: Factor w/ 2 levels "a","b": 1 2
 $ b: num  1 2

Character vectors are turned into factors by default by data.frame().

OTOH:

R> c = data.frame(a,b, stringsAsFactors=FALSE)
R> mode(c$a)  
[1] "character"
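
Why mode() reports "numeric" for a factor column can be seen by inspecting a factor directly. A minimal sketch, assuming a two-element character vector like the one in the thread (the name f is hypothetical):

```r
a <- c("a", "b")
f <- factor(a)

typeof(f)         # "integer": a factor is stored as integer codes
class(f)          # "factor"
mode(f)           # "numeric": mode() describes the storage, not the class
as.integer(f)     # 1 2: the codes themselves
levels(f)         # "a" "b": the labels the codes point to
as.character(f)   # "a" "b": recovers the original strings
```

To test whether a column holds text or a factor, is.factor() and is.character() are more direct than mode().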

Subsetting a factor does not, by default, drop levels that are no longer used.

OTOH:

R> levels(a[1:3, drop=TRUE])
[1] "a" "b"

or

R> levels(factor(a[1:3]))
[1] "a" "b"
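
A third option, not mentioned in the thread, is base R's droplevels(), which also works on a whole data frame. A small sketch, assuming a five-element factor (the names sub, df and df_sub are hypothetical):

```r
a <- factor(c("a", "a", "b", "b", "c"))
sub <- a[1:3]

levels(sub)               # "a" "b" "c": unused level "c" is kept
levels(droplevels(sub))   # "a" "b"

# The same applies to factor columns of a subsetted data frame.
df <- data.frame(g = a, v = 1:5)
df_sub <- df[1:3, ]
levels(df_sub$g)               # "a" "b" "c"
levels(droplevels(df_sub)$g)   # "a" "b"
```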


HTH.

Cheers,

	Berwin

#
On Jan 22, 9:50 am, Berwin A Turlach <ber... at maths.uwa.edu.au> wrote:
Thanks for both responses.

Is there a difference between the "as.factor" and "factor" functions,
and also between "as.data.frame" and "data.frame"?
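
This question is not answered in the archived thread. One difference that bears directly on anomaly (2) can be sketched as follows, under the assumption of the same five-element factor used elsewhere in the thread (all object names are hypothetical). It is an illustration, not a complete comparison of the functions:

```r
f <- factor(c("a", "a", "b", "b", "c"))
sub <- f[1:3]

# as.factor() returns an existing factor unchanged; factor() rebuilds it
# and recomputes the levels from the values actually present.
levels(as.factor(sub))   # "a" "b" "c"
levels(factor(sub))      # "a" "b"

# as.data.frame() coerces one existing object (here a matrix), while
# data.frame() assembles a data frame from one or more column arguments.
m <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("x", "y")))
df1 <- as.data.frame(m)
df2 <- data.frame(x = 1:2, y = 3:4)
names(df1)
names(df2)
```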
#
My explanation for (2):

When a character vector is coerced to a factor, its levels are stored with
it. Taking a subvector of the factor does not change those levels, so
levels(a[1:3]) is still [1] "a" "b" "c" in the last line ...

If you want to reduce the levels, you need to tell R explicitly.
[1] "a" "b"
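
The retained level is easiest to see in a frequency table, where it shows up with a count of zero. A short sketch, assuming the same hypothetical five-element factor (the names f and sub are not from the original post):

```r
f <- factor(c("a", "a", "b", "b", "c"))
sub <- f[1:3]

nlevels(sub)         # 3: "c" is still a level of the subset
table(sub)           # counts for a, b, c are 2, 1, 0
table(factor(sub))   # after rebuilding the factor, only a and b remain
```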

________________
Moritz Grenke
http://www.360mix.de
