Message-ID: <1354206468.57229.BPMail_low_carrier@web125105.mail.ne1.yahoo.com>
Date: 2012-11-29T16:27:48Z
From: Suharto Anggono Suharto Anggono
Subject: Use of 'match' in end part of 'levels<-.factor'

match(xlevs[x], nlevs)
is equivalent to
match(xlevs, nlevs)[x]
The latter has an advantage: each element of 'xlevs' is matched against 'nlevs' only once. In the former, the same element is matched repeatedly if 'x' selects it multiple times.
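
As a small illustration of the equivalence, here is a sketch with made-up data. The values of 'xlevs', 'nlevs' and 'x' are assumptions chosen for the example, and 'nlevs' is assumed to contain no NA.

```r
# Toy data for illustration only (not taken from 'levels<-.factor').
xlevs <- c("a", "b", "c")         # one label per old level
nlevs <- c("c", "a", "b")         # unique new levels, assumed NA-free
x <- c(1L, 3L, 3L, 2L, 1L, 3L)    # integer codes indexing into xlevs

y1 <- match(xlevs[x], nlevs)      # matches length(x) = 6 strings
y2 <- match(xlevs, nlevs)[x]      # matches length(xlevs) = 3 strings, then indexes

stopifnot(identical(y1, y2))      # both forms give the same integer codes
```

With many repeated codes in 'x', the second form does the string matching on the short vector 'xlevs' and leaves only cheap integer indexing for the long vector.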

Near the end of the code of the function 'levels<-.factor', there is
y <- match(xlevs[x], nlevs)
It is still there in R 2.15.2. I suggest changing it to
y <- match(xlevs, nlevs)[x]

However,
match(xlevs[x], nlevs)
is more efficient than
match(xlevs, nlevs)
if xlevs[x] is short compared to xlevs, that is, if 'x' has fewer elements than 'xlevs'. In 'levels<-.factor', a compromise may be to use something like
y <- if (length(x) <= length(xlevs))
    match(xlevs[x], nlevs) else
    match(xlevs, nlevs)[x]
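
To try the compromise outside 'levels<-.factor', it can be wrapped in a small function. This is only a sketch: 'remap_codes' is a hypothetical name, not something in base R, and the data below are made up.

```r
# remap_codes() is a hypothetical helper for illustration, not part of base R.
remap_codes <- function(x, xlevs, nlevs) {
    if (length(x) <= length(xlevs))
        match(xlevs[x], nlevs)    # few codes: match only the selected labels
    else
        match(xlevs, nlevs)[x]    # many codes: match each level once, then index
}

# Toy data: three old levels merged into two new ones.
xlevs <- c("small", "small", "big")
nlevs <- unique(xlevs)            # c("small", "big")

short_x <- c(3L, 1L)              # takes the first branch
long_x  <- c(1L, 2L, 3L, 3L, 2L)  # takes the second branch

y_short <- remap_codes(short_x, xlevs, nlevs)
y_long  <- remap_codes(long_x, xlevs, nlevs)
```

Both branches return the same codes for the same input; the length test only chooses which vector is passed to 'match'.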