fill 0-row data.frame with 1 line of NAs

13 messages · Rui Barradas, Brian Diggs, Peter Ehlers +5 more

#
Dear all
Is there a simpler method to achieve the following: When I obtain an
empty data.frame after subsetting, I need for it to contain one line
of NAs. Here's a dummy example:
[1] Sepal.Length Sepal.Width  Petal.Length Petal.Width  Species
<0 rows> (or 0-length row.names)
[1] 0 5
X1 X2 X3 X4 X5
1 NA NA NA NA NA
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           NA          NA           NA          NA      NA


The solution I came up with is way too convoluted. Anything simpler? Regards
Liviu
#
Hello,

If you write a function, it becomes less convoluted...


empty <- function(x){
	if(NROW(x) == 0){
		y <- rep(NA, NCOL(x))
		names(y) <- names(x)
		y
	}else x
}

(.xb <- iris[ iris$Species=='zz', ])
empty(.xb)


Hope this helps,

Rui Barradas

On 10-07-2012 14:15, Liviu Andronic wrote:
#
On 2012-07-10 06:57, Rui Barradas wrote:
Both this and Liviu's original solution destroy the
factor nature of 'Species' (which may not matter, of
course). How about

   (.xb <- iris[ iris$Species=='zz', ])
   .xb <- .xb[1, ]   # this probably shouldn't work, but it does.

?

Peter Ehlers
#
On 7/10/2012 7:53 AM, Peter Ehlers wrote:
Using NA subscripting seems even better

empty <- function(x) {
   if(NROW(x) == 0) {
     x[NA,]
   } else {
     x
   }
}

It even preserves the factor nature of things:

 > empty(iris[iris$Species=='zz',])
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
NA           NA          NA           NA          NA    <NA>
 > str(empty(iris[iris$Species=='zz',]))
'data.frame':   1 obs. of  5 variables:
  $ Sepal.Length: num NA
  $ Sepal.Width : num NA
  $ Petal.Length: num NA
  $ Petal.Width : num NA
  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: NA

  
    
#
On 2012-07-10 08:50, Brian Diggs wrote:
Yes, you can subset with NA or with any number greater than or equal to 1
(non-integer values are truncated towards zero before indexing).
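
A minimal sketch of that indexing behaviour on a zero-row subset of iris. The names `emptyDf`, `r1`, `r2` and `r0` are invented for this illustration and do not come from the thread:

```r
emptyDf <- iris[iris$Species == "zz", ]   # zero-row data.frame

r1 <- emptyDf[1, ]     # index past the last row: one row of NAs
r2 <- emptyDf[2.7, ]   # 2.7 is truncated to 2: again one row of NAs
r0 <- emptyDf[0.5, ]   # 0.5 is truncated to 0: still zero rows

nrow(r1)
nrow(r2)
nrow(r0)
levels(r1$Species)     # the factor levels survive the subscripting
```

Values strictly between 0 and 1 truncate to 0 and so select nothing, which is why the lower bound is 1 itself.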

Peter Ehlers
#
Hello,

On 10-07-2012 18:59, Peter Ehlers wrote:
Good to know, I was completely unaware of this indexing possibility.

Rui Barradas
#
On Tue, Jul 10, 2012 at 4:53 PM, Peter Ehlers <ehlers at ucalgary.ca> wrote:
This one is an excellent solution, but it is yet another example of
what I call quirky behaviour from R.

Thanks all! Regards
Liviu
#
On Jul 10, 2012, at 2:05 PM, Rui Barradas wrote:

            
It would be difficult to be more compact than this:

 > iris[1, ][NA,]
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
NA           NA          NA           NA          NA    <NA>

--  
David
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
#
On Tue, Jul 10, 2012 at 9:15 AM, Liviu Andronic <landronimirc at gmail.com> wrote:
Try this:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
NA           NA          NA           NA          NA    <NA>
#
Hi,

Try this:
# rep(NA, length(iris), 1) has length.out = 1, so it evaluates to a single NA
.xa <- iris[1, ][rep(NA, length(iris), 1), ]
.xa
#    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# NA           NA          NA           NA          NA    <NA>
# or

.xb <- iris[1, ][rep(NA, ncol(iris), 1), ]
.xb
#    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# NA           NA          NA           NA          NA    <NA>


A.K.


#
Why does one want to replace a zero-row data.frame
with a one-row data.frame of NA's?  Unless this is for
an external program that cannot handle zero-row inputs,
this suggests that there is an unnecessary limitation (i.e.,
a bug) in the R code that uses this data.frame.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On Wed, Jul 11, 2012 at 9:56 PM, William Dunlap <wdunlap at tibco.com> wrote:
I'm running an apply(df, 1, f) function, where f() matches a df$string
in another matrix and fetches data associated with this string. When
no match is made I do not need a zero-row data frame, but to preserve
the structure of the original df I need a data frame with 1 row of
NAs. There may be a nicer approach, but I'm not aware of any.

Regards
Liviu
#
In that case, I think that using a subscript of NA is the
best way to go.  It works for both matrices and data.frames
(unlike an integer larger than nrow(data)) and its meaning
is pretty clear.
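
A small sketch of that difference, using a toy matrix and the matching data.frame. The objects `m` and `d` and the result names are made up here for illustration only:

```r
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
d <- as.data.frame(m)

# An NA subscript gives one all-NA row for both classes.
naRowM <- m[NA_integer_, , drop = FALSE]
naRowD <- d[NA_integer_, ]

# An integer larger than nrow() gives an NA row for the data.frame ...
beyondD <- d[nrow(d) + 1, ]

# ... but is an error ("subscript out of bounds") for the matrix.
beyondM <- tryCatch(m[nrow(m) + 1, , drop = FALSE],
                    error = function(e) conditionMessage(e))
```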

Also, you will probably get better results if the function
in your call to apply() returns the index (perhaps NA) of a row
of a data.frame instead of the row itself.  Then subscript that data.frame
once with the output of apply rather than subscripting it many
times and rbinding the results back together.  This is natural
if you use match(), as it returns NA for no match (merge() does
this sort of thing).
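
As a rough sketch of the match() idiom just described (the `lookup` table and the `wanted` keys are hypothetical data invented for this example):

```r
# Hypothetical lookup table and keys, invented for this sketch.
lookup <- data.frame(key = c("a", "b", "c"),
                     value = c(10, 20, 30),
                     stringsAsFactors = FALSE)
wanted <- c("b", "x", "a")

idx <- match(wanted, lookup$key)   # NA where a key has no match
result <- lookup[idx, ]            # subscript once; unmatched keys become all-NA rows
rownames(result) <- NULL
result
```

Because the subscripting happens once on the whole index vector, there is no need to build one-row data frames and rbind them afterwards.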

Here is an example of this sort of thing when using a non-standard
sort of match.  The following matches a long/lat pair to that of the
nearest city in the table, but returns NA if the point is too far from
any city:

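nearestTo <- function(x, table, limit = 1)
{
    # Both the point and the lookup table must carry "long" and "lat".
    stopifnot(all(is.element(c("long", "lat"), names(x))),
              all(is.element(c("long", "lat"), names(table))))
    # Euclidean distance (in degrees) from the point to every row of the table.
    dists <- sqrt((x["lat"] - table[, "lat"])^2 +
                  (x["long"] - table[, "long"])^2)
    retval <- which.min(dists)
    # Report no match when even the nearest row is farther than 'limit'.
    if (dists[retval] > limit) {
        retval <- NA_integer_
    }
    retval
}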

cities <- data.frame(
     long = c(-117.833, -116.217, -123.083, -123.9, -121.733, 
        -117.033, -122.683, -122.333, -117.433),
     lat = c(44.7833, 43.6, 44.05, 46.9833, 42.1667, 
        46.4, 45.5167, 47.6167, 47.6667),
     row.names = c("Baker", "Boise", "Eugene", "Hoquiam", 
        "Klamath Falls", "Lewiston", "Portland", 
        "Seattle", "Spokane")
)

df <- data.frame(
     long = c(-116.77, -123.68, -122.96, -120.81, -116.26, 
        -123.54, -121.22, -115.12),
     lat = c(47.3, 44.53, 44.35, 45.99, 46.75, 43.78, 
        42.71, 46.66))

whichCity <- apply(df, 1, nearestTo, cities, limit=1)
whichCity
# [1]  9  3  3 NA  6  3  5 NA
cbind(df, nearbyCity = rownames(cities)[whichCity])
#      long   lat    nearbyCity
# 1 -116.77 47.30       Spokane
# 2 -123.68 44.53        Eugene
# 3 -122.96 44.35        Eugene
# 4 -120.81 45.99          <NA>
# 5 -116.26 46.75      Lewiston
# 6 -123.54 43.78        Eugene
# 7 -121.22 42.71 Klamath Falls
# 8 -115.12 46.66          <NA>


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com