How to remove all rows that have a numeric in the first (or any) column

My apologies. My reply was to Andrew, not Gregg.

Enough damage for one night. Here is hoping we finally understood a question that could have been better phrased. List columns are not normally considered a common data structure, but quite possibly will be more so as time goes on and the tools to handle them become better, or at least better understood.


-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Avi Gross via R-help
Sent: Wednesday, September 15, 2021 1:23 AM
To: R-help at r-project.org
Subject: Re: [R] How to remove all rows that have a numeric in the first (or any) column

You are correct, Gregg, I am aware of that trick of asking something to not be evaluated in certain ways.

 

And you can indeed use base R to play with contents of beta as defined above.  Here is a sort of incremental demo:
[1] FALSE  TRUE  TRUE FALSE

[1]  TRUE FALSE FALSE  TRUE

# A tibble: 2 x 2
  alpha beta     
  <int> <list>   
1     1 <chr [1]>
2     4 <chr [1]>

> str(mydf[keeping, ])
tibble [2 x 2] (S3: tbl_df/tbl/data.frame)
 $ alpha: int [1:2] 1 4
 $ beta :List of 2
  ..$ : chr "Hello"
  ..$ : chr "bye"
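
The commands behind that output did not survive in the archive, so here is a minimal base R sketch of what such an incremental demo could look like. It is an assumption, not the original code: it rebuilds the data with data.frame() and I() rather than a tibble, and only the names mydf and keeping are taken from the surviving output.

```r
# Assumed setup: the list from Andrew's example, kept as one list column via I()
temp <- list("Hello", 1, 1.1, "bye")
mydf <- data.frame(alpha = 1:4, beta = I(temp))

# One logical per list element: is this element numeric?
is_num <- sapply(mydf$beta, is.numeric)
is_num

# Negate it to mark the rows worth keeping
keeping <- !is_num
keeping

# Ordinary row subsetting then drops the rows holding a numeric
mydf[keeping, ]
str(mydf[keeping, ])
```

Because this version is a plain data.frame, str() will also report the "AsIs" class on beta, unlike the tibble output shown above.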

 

Now for the bad news. The original request was for ANY column. Presumably one way to do it, neither the most efficient nor the best, would be to loop over the names of all the columns and, starting with the original data.frame, whittle away at it column by column, adjusting which column you search each time, until what is left has nothing numeric anywhere.
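
A rough base R sketch of that column-by-column whittling might look like the following. The function name drop_numeric_rows is made up for illustration, and the data is the same two-column example used further down in this message.

```r
# Hypothetical helper: drop every row that holds a numeric in any column
drop_numeric_rows <- function(df) {
  for (nm in names(df)) {
    # Test each cell of the current column, one element at a time
    is_num <- vapply(df[[nm]], is.numeric, logical(1))
    # Keep only the rows whose cell in this column is not numeric
    df <- df[!is_num, , drop = FALSE]
  }
  df
}

mydf <- data.frame(alpha = I(list("first", 2, 3.3, "Last")),
                   beta  = I(list(1, "second", 3.3, "Lasting")))

result <- drop_numeric_rows(mydf)
result
```

Each pass only looks at the rows that survived the earlier passes, which is the "whittling" part. An ordinary numeric column would lose all of its rows under this test, so the sketch really only makes sense for list columns with mixed contents.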

 

Now if I was using dplyr, I wonder if there is a nice way to use rowwise() to evaluate across a row.

 

Using your technique I made the following data.frame:

 

mydf <- data.frame(alpha=I(list("first", 2, 3.3, "Last")), 
                   beta=I(list(1, "second", 3.3, "Lasting")))

  alpha    beta
1 first       1
2     2  second
3   3.3     3.3
4  Last Lasting

 

Do we agree only the fourth row should be kept as the others have one or two numeric values?

 

Here is some code I cobbled together that seems to work:

 

 

library(dplyr)  # rowwise(), mutate(), filter() and %>%

rowwise(mydf) %>% 
  mutate(alphazoid=!is.numeric(unlist(alpha)), 
         betazoid=!is.numeric(unlist(beta))) %>%
  filter(alphazoid & betazoid) -> result

str(result)  
print(result)
result[[1,1]]
result[[1,2]]

as.data.frame(result)

 

The results below show that only the fourth row was kept:
+   mutate(alphazoid=!is.numeric(unlist(alpha)), 
+          betazoid=!is.numeric(unlist(beta))) %>%
+   filter(alphazoid & betazoid) -> result
> str(result)  
rowwise_df [1 x 4] (S3: rowwise_df/tbl_df/tbl/data.frame)
 $ alpha    :List of 1
  ..$ : chr "Last"
  ..- attr(*, "class")= chr "AsIs"
 $ beta     :List of 1
  ..$ : chr "Lasting"
  ..- attr(*, "class")= chr "AsIs"
 $ alphazoid: logi TRUE
 $ betazoid : logi TRUE
 - attr(*, "groups")= tibble [1 x 1] (S3: tbl_df/tbl/data.frame)
  ..$ .rows: list<int> [1:1] 
  .. ..$ : int 1
  .. ..@ ptype: int(0)

# A tibble: 1 x 4
# Rowwise: 
  alpha     beta      alphazoid betazoid
  <I<list>> <I<list>> <lgl>     <lgl>   
1 <chr [1]> <chr [1]> TRUE      TRUE

[[1]]
[1] "Last"

[[1]]
[1] "Lasting"

  alpha    beta alphazoid betazoid
1  Last Lasting      TRUE     TRUE

 

Of course, the temporary columns for alphazoid and betazoid can trivially be removed.
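
For completeness, one possible way to drop those helper columns is to finish the same pipeline with ungroup() and select(). This is only a sketch assuming dplyr is installed; it rebuilds the example data so it runs on its own.

```r
library(dplyr)

mydf <- data.frame(alpha = I(list("first", 2, 3.3, "Last")),
                   beta  = I(list(1, "second", 3.3, "Lasting")))

result <- rowwise(mydf) %>%
  mutate(alphazoid = !is.numeric(unlist(alpha)),
         betazoid  = !is.numeric(unlist(beta))) %>%
  filter(alphazoid & betazoid) %>%
  ungroup() %>%                    # leave rowwise mode
  select(-alphazoid, -betazoid)    # drop the temporary helper columns

as.data.frame(result)
```

The ungroup() step is optional for the select(), but it avoids carrying the rowwise grouping into whatever is done with the result next.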

 

 

 

 

From: Andrew Simmons <akwsimmo at gmail.com>
Sent: Wednesday, September 15, 2021 12:44 AM
To: Avi Gross <avigross at verizon.net>
Cc: Gregg Powell via R-help <r-help at r-project.org>
Subject: Re: [R] How to remove all rows that have a numeric in the first (or any) column

 

I'd like to point out that base R can handle a list as a data frame column, it's just that you have to make the list of class "AsIs". So in your example

 

temp <- list("Hello", 1, 1.1, "bye")

 

data.frame(alpha = 1:4, beta = I(temp)) 

 

means that column "beta" will still be a list.
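
A quick way to see the difference is to build the data frame both ways and compare. This is just an illustrative sketch; the names spread and kept are made up for the comparison.

```r
temp <- list("Hello", 1, 1.1, "bye")

# Without I(): each list element becomes its own column, recycled down the rows
spread <- data.frame(alpha = 1:4, beta = temp)
ncol(spread)

# With I(): the list survives as a single list column
kept <- data.frame(alpha = 1:4, beta = I(temp))
ncol(kept)
is.list(kept$beta)
class(kept$beta)
```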

On Wed, Sep 15, 2021, 00:40 Avi Gross via R-help <r-help at r-project.org> wrote:
Calling something a data.frame does not make it a data.frame.

The abbreviated object shown below is a list of singletons. If it is a column in a larger object that is a data.frame, then it is a list column which is valid but can be ticklish to handle within base R but less so in the tidyverse.

For example, if I try to make a data.frame the normal way, the list gets made into multiple columns and copied to each row. Not what was expected. I think some tidyverse functionality does better.

Like this:

library(tidyverse)
temp=list("Hello", 1, 1.1, "bye")

Now making a data.frame has an odd result:
  alpha beta..Hello. beta.1 beta.1.1 beta..bye.
1     1        Hello      1      1.1        bye
2     2        Hello      1      1.1        bye
3     3        Hello      1      1.1        bye
4     4        Hello      1      1.1        bye

But a tibble handles it:
# A tibble: 4 x 2
  alpha beta     
  <int> <list>   
1     1 <chr [1]>
2     2 <dbl [1]>
3     3 <dbl [1]>
4     4 <chr [1]>

So if the data does look like this, with a list column, access can be tricky, as subsetting a list with [] returns a list and you need [[]] to get at the element itself.

I found a somewhat odd solution like this:

mydf %>%
   filter(!map_lgl(beta, is.numeric)) -> mydf2

# A tibble: 2 x 2
  alpha beta     
  <int> <list>   
1     1 <chr [1]>
2     4 <chr [1]>

When I saved that result into mydf2, I got this.

Original:

> str(mydf)
tibble [4 x 2] (S3: tbl_df/tbl/data.frame)
 $ alpha: int [1:4] 1 2 3 4
 $ beta :List of 4
  ..$ : chr "Hello"
  ..$ : num 1
  ..$ : num 1.1
  ..$ : chr "bye"

Output when any row with a numeric is removed:
tibble [2 x 2] (S3: tbl_df/tbl/data.frame)
 $ alpha: int [1:2] 1 4
 $ beta :List of 2
  ..$ : chr "Hello"
  ..$ : chr "bye"

So if you try variations on your code motivated by what I show, good luck. I am sure there are many better ways but I repeat, it can be tricky.

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Jeff Newmiller
Sent: Tuesday, September 14, 2021 11:54 PM
To: Gregg Powell <g.a.powell at protonmail.com>
Cc: Gregg Powell via R-help <r-help at r-project.org>
Subject: Re: [R] How to remove all rows that have a numeric in the first (or any) column

You cannot apply vectorized operators to list columns... you have to use a map function like sapply or purrr::map_lgl to obtain a logical vector by running the function once for each list element:

sapply( VPN_Sheet1$HVA, is.numeric )
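
To illustrate why the per-element call is needed, here is a small sketch. The list hva is a made-up stand-in for a list column such as VPN_Sheet1$HVA, whose real contents are not shown in this thread.

```r
# Made-up stand-in for a list column such as VPN_Sheet1$HVA
hva <- list("Hello", 1, 1.1, "bye")

# Applied to the whole list, is.numeric() answers once, about the list itself
is.numeric(hva)

# Applied per element, it yields one logical per row
per_row <- sapply(hva, is.numeric)
per_row

# vapply() is a stricter variant that insists on exactly one logical each time
vapply(hva, is.numeric, logical(1))
```

The per-element vector is what can then be negated and used to subset rows.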

On September 14, 2021 8:38:35 PM PDT, Gregg Powell <g.a.powell at protonmail.com> wrote:
--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
