On 17-Nov-10 00:02:39, José Fernando Zea Castro wrote:
Hello.
First, thank you for your wonderful project.
However, I have serious worries about the reliability of R.
I found the following bug, which I consider important because in
my job we work all the time with data names like the ones below.
Please see:
b=data.frame(matrix(1:9,ncol=3))
names(b)=c("q99","r88","s77")
b
  q99 r88 s77
1   1   4   7
2   2   5   8
3   3   6   9
b$q9
[1] 1 2 3
Please note that the variable q9 does not exist in the data frame,
but you can see that R shows q9 (as q99).
Thanks in advance
Cordially
José Fernando Zea Castro
Statistician Universidad Nacional Colombiana
What you see here is a case of "partial matching": You ask for
'b$q9', and R sees that 'q9' matches the beginning of 'q99'
and nothing else. Therefore it responds with the value of 'b$q99',
since there is no ambiguity.
You would have got the same result if you had asked for
b$q
since there is no component name in b which matches 'q' except 'q99'.
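To see this for yourself, here is a minimal sketch that rebuilds the data frame from the original post and compares the abbreviated lookups with the full name. The helper variables via_q9 and via_q are names I have made up for illustration; everything else comes from the post.

```r
# Rebuild the data frame from the original post.
b <- data.frame(matrix(1:9, ncol = 3))
names(b) <- c("q99", "r88", "s77")

# 'q9' and 'q' are both unique prefixes of 'q99', so $ resolves
# each of them to the q99 column by partial matching.
via_q9 <- b$q9
via_q  <- b$q

identical(via_q9, b$q99)   # TRUE
identical(via_q,  b$q99)   # TRUE
```

Both comparisons come out TRUE because no other column name begins with 'q'.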
If there had been two components which matched 'q9', say both
b$q99 and b$q98, then you would have got a NULL result, since
there is not a unique match.
However, if b contained both b$q9 and b$q99, then R would find that
b$q9 was an *exact* (not partial) match, and would return that one.
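The two cases just described can be checked with a short sketch. The data frames two_partial and with_exact, and the values I have put in the q98 and q9 columns, are invented purely for illustration; they are not from the original post.

```r
# Case 1: two columns both start with 'q9', so 'q9' is ambiguous.
two_partial <- data.frame(q99 = 1:3, q98 = 4:6)
ambiguous <- two_partial$q9
is.null(ambiguous)            # TRUE: no unique partial match

# Case 2: a column named exactly 'q9' exists alongside 'q99'.
with_exact <- data.frame(q9 = 10:12, q99 = 1:3)
exact <- with_exact$q9
identical(exact, 10:12)       # TRUE: the exact match wins
```

So an ambiguous abbreviation gives NULL rather than an error, and an exact name always takes priority over a partial one.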
Normally, this should not cause problems. However, if you have
written code which must take special action if a name is not
present in a list, then there could be problems.
For example, suppose b might (depending on what has happened) contain
b$q9 only, or b$q99 only, or *both* b$q9 and b$q99, and you want
to execute special actions if a name is not present in b. Then,
in the case where b contained only b$q99 and you asked for b$q9,
you would get the wrong result because of partial matching: R would
silently return b$q99 instead of telling you that q9 is absent.
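If your code depends on knowing whether a name is really present, one way round this (a sketch, not the only approach) is to test the names directly, or to use [[ ]], which matches exactly by default. The variable names looks_present, really_present and exact_lookup are my own, chosen for illustration.

```r
# The data frame from the original post: it has q99 but no q9.
b <- data.frame(matrix(1:9, ncol = 3))
names(b) <- c("q99", "r88", "s77")

# Testing via $ is fooled by partial matching.
looks_present <- !is.null(b$q9)        # TRUE, although q9 is absent

# Testing the names themselves is an exact comparison.
really_present <- "q9" %in% names(b)   # FALSE

# [[ ]] matches exactly by default, so a missing name gives NULL.
exact_lookup <- b[["q9"]]
is.null(exact_lookup)                  # TRUE
```

Setting options(warnPartialMatchDollar = TRUE) should additionally make R warn whenever $ resolves a name by partial matching, which can help to find such places in existing code.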
This is one of those cases, in my opinion, where R's documentation
drops you into a flat landscape, in the middle of nowhere, in a
thick mist.