In that case, I think that using a subscript of NA is the
best way to go. It works for both matrices and data.frames
(unlike an integer larger than nrow(data)) and its meaning
is pretty clear.
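To make the difference concrete, here is a small sketch. The matrix `m` and data.frame `d` are made up for illustration and are not from your data. Note that the subscript should be an integer NA (`NA_integer_`), which is what you get inside an integer index vector; a bare logical `NA` is recycled and selects as many NA rows as the object has rows.

```r
# Hypothetical toy objects, for illustration only
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
d <- as.data.frame(m)

# An integer NA subscript gives one row of NAs for both classes
naRowMatrix <- m[NA_integer_, , drop = FALSE]
naRowDf <- d[NA_integer_, ]

# An out-of-range integer works for the data.frame ...
pastEndDf <- d[nrow(d) + 1L, ]
# ... but is an error ("subscript out of bounds") for the matrix
pastEndMatrix <- try(m[nrow(m) + 1L, ], silent = TRUE)

# A logical NA is recycled over all rows, which is rarely what you want
recycled <- d[NA, ]
```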
Also, you will probably get better results if the function
in your call to apply() returns the index (perhaps NA) of a row
of a data.frame instead of the row itself. Then subscript that data.frame
once with the output of apply rather than subscripting it many
times and rbinding the results back together. This is natural
if you use match(), as it returns NA for no match (merge() does
this sort of thing).
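A minimal sketch of the "compute the indices, then subscript once" idea with match(). The `lookup` table and `queries` vector are hypothetical stand-ins for your matrix and df$string:

```r
# Hypothetical lookup table and query strings, for illustration only
lookup <- data.frame(key = c("a", "b", "c"),
                     value = c(10, 20, 30),
                     stringsAsFactors = FALSE)
queries <- c("b", "z", "a")

# match() returns the row index of each query, or NA where there is no match
idx <- match(queries, lookup$key)

# One subscripting operation; unmatched queries become rows of NAs,
# so the result has one row per query and no rbind() is needed
result <- lookup[idx, ]
```

The result keeps the row order and row count of the queries, so it can be placed next to the original data with cbind().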
Here is an example of this approach with a non-standard kind of
match. The following function matches a long/lat pair to the nearest
city in the table (using plain Euclidean distance in degrees), but
returns NA if the point is farther than 'limit' from every city:
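nearestTo <- function (x, table, limit = 1)
{
    # x: named vector with "long" and "lat"; table: data.frame with those columns
    stopifnot(all(is.element(c("long", "lat"), names(x))),
              all(is.element(c("long", "lat"), names(table))))
    # Euclidean distance (in degrees) from x to every row of table
    dists <- sqrt((x["lat"] - table[, "lat"])^2 +
                  (x["long"] - table[, "long"])^2)
    retval <- which.min(dists)
    # Nothing within 'limit': return an integer NA so the result stays an index
    if (dists[retval] > limit) {
        retval <- NA_integer_
    }
    retval
}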
cities <- data.frame(
    long = c(-117.833, -116.217, -123.083, -123.9, -121.733,
             -117.033, -122.683, -122.333, -117.433),
    lat = c(44.7833, 43.6, 44.05, 46.9833, 42.1667,
            46.4, 45.5167, 47.6167, 47.6667),
    row.names = c("Baker", "Boise", "Eugene", "Hoquiam",
                  "Klamath Falls", "Lewiston", "Portland",
                  "Seattle", "Spokane")
)
df <- data.frame(
    long = c(-116.77, -123.68, -122.96, -120.81, -116.26,
             -123.54, -121.22, -115.12),
    lat = c(47.3, 44.53, 44.35, 45.99, 46.75, 43.78,
            42.71, 46.66))
whichCity <- apply(df, 1, nearestTo, cities, limit=1)
whichCity
# [1]  9  3  3 NA  6  3  5 NA
cbind(df, nearbyCity = rownames(cities)[whichCity])
#      long   lat    nearbyCity
# 1 -116.77 47.30       Spokane
# 2 -123.68 44.53        Eugene
# 3 -122.96 44.35        Eugene
# 4 -120.81 45.99          <NA>
# 5 -116.26 46.75      Lewiston
# 6 -123.54 43.78        Eugene
# 7 -121.22 42.71 Klamath Falls
# 8 -115.12 46.66          <NA>
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: Liviu Andronic [mailto:landronimirc at gmail.com]
Sent: Wednesday, July 11, 2012 2:19 PM
To: William Dunlap
Cc: arun; R help
Subject: Re: [R] fill 0-row data.frame with 1 line of NAs
On Wed, Jul 11, 2012 at 9:56 PM, William Dunlap <wdunlap at tibco.com> wrote:
> Why does one want to replace a zero-row data.frame
> with a one-row data.frame of NA's? Unless this is for
> an external program that cannot handle zero-row inputs,
> this suggests that there is an unnecessary limitation (i.e.,
> a bug) in the R code that uses this data.frame.
I'm running an apply(df, 1, f) function, where f() matches a df$string
in another matrix and fetches data associated with this string. When
no match is made I do not need a zero-row data frame, but to preserve
the structure of the original df I need a data frame with 1 row of
NAs. There may be a nicer approach, but I'm not aware of any.
Regards
Liviu