wrangling errors

filtering a list by type

Charles T. Gray https://twitter.com/cantabile
02-19-2019

# packages used in this post
library(tidyverse)

I want to wrangle some errors out of the results of a function.

At the moment, my function returns NULL if a warning or an error is thrown, which gets me the results that ran, but I’d like to have more information about the trials that didn’t run.

I think I can get my function to return a dataframe of results if the function works as intended, and a character string detailing the error or warning if the function fails. This will give me a list of dataframes intermixed with character strings.
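Here is a minimal sketch of that idea in base R, using `tryCatch`. The trial function `run_trial`, its failure conditions, and the wrapper name `run_trial_safely` are all made up for illustration; they are not the function from my actual project.

```r
# Hypothetical stand-in for a trial function; the failure modes are invented.
run_trial <- function(x) {
  if (x < 0) stop("negative input")
  if (x == 0) warning("zero input")
  data.frame(input = x, result = sqrt(x))
}

# Return the data frame on success, or the condition message as a
# character string if an error or a warning is raised.
run_trial_safely <- function(x) {
  tryCatch(
    run_trial(x),
    error = function(e) paste("error:", conditionMessage(e)),
    warning = function(w) paste("warning:", conditionMessage(w))
  )
}

# A list of data frames intermixed with character strings.
trials <- lapply(c(4, -1, 0, 9), run_trial_safely)
str(trials)
```

Note that catching the warning in `tryCatch` abandons the trial at that point, so a trial that warns yields a string rather than a partial result.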

So, my question of the day: how do I filter a list by type?

First, I’ll create a dummy list [1].


# create a list of dataframes and character strings
playlist <- list(
  "Beanie 'Legs' McGraw",
  "Peug the Door-opening Cat",
  iris %>% select(Sepal.Length, Species) %>% filter(Species == "setosa") %>% head(),
  iris %>% select(Sepal.Length, Species) %>% filter(Species == "versicolor") %>% head(),
  iris %>% select(Sepal.Length, Species) %>% filter(Species == "virginica") %>% head(),
  "Lord Euclid of the Fluffy Butt"
)

playlist %>% str()

List of 6
 $ : chr "Beanie 'Legs' McGraw"
 $ : chr "Peug the Door-opening Cat"
 $ :'data.frame':   6 obs. of  2 variables:
  ..$ Sepal.Length: num [1:6] 5.1 4.9 4.7 4.6 5 5.4
  ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1
 $ :'data.frame':   6 obs. of  2 variables:
  ..$ Sepal.Length: num [1:6] 7 6.4 6.9 5.5 6.5 5.7
  ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2
 $ :'data.frame':   6 obs. of  2 variables:
  ..$ Sepal.Length: num [1:6] 6.3 5.8 7.1 6.3 6.5 7.6
  ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 3 3 3 3 3 3
 $ : chr "Lord Euclid of the Fluffy Butt"

So, how do I get the dataframes out into one list and the errors into another? Once separated, I’ll be able to bind_rows into two dataframes, results and errors.

Off the top of my head, I can see how to do this with map, at least for the dataframes.


# extract elements that are dataframes
playlist %>% 
  map(.f = function(x){
    if (is.data.frame(x)) return(x)
  }) %>% bind_rows()

   Sepal.Length    Species
1           5.1     setosa
2           4.9     setosa
3           4.7     setosa
4           4.6     setosa
5           5.0     setosa
6           5.4     setosa
7           7.0 versicolor
8           6.4 versicolor
9           6.9 versicolor
10          5.5 versicolor
11          6.5 versicolor
12          5.7 versicolor
13          6.3  virginica
14          5.8  virginica
15          7.1  virginica
16          6.3  virginica
17          6.5  virginica
18          7.6  virginica

So that seems to work and didn’t take too much code. Good enough. (I checked, and the function defaults to returning NULL if the condition is not met.)
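The NULL default is easy to check directly. This is a small sketch in which `keep_if_df` is a hypothetical name for the anonymous function passed to map.

```r
# Hypothetical named version of the anonymous function used with map().
keep_if_df <- function(x) {
  if (is.data.frame(x)) return(x)
}

# An `if` with no `else` evaluates to NULL when the condition is FALSE,
# so that is what the function returns for anything that is not a data frame.
returns_null <- is.null(keep_if_df("Peug the Door-opening Cat"))
returns_null

# Data frames pass through unchanged.
passes_df <- identical(keep_if_df(head(iris)), head(iris))
passes_df
```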

But applying the same logic to pull out the character strings doesn’t work as cleanly:


# extract elements that are character strings
playlist %>% 
  map(.f = function(x){
    if (is.character(x)) return(x) 
  }) %>% as.character()

[1] "Beanie 'Legs' McGraw"           "Peug the Door-opening Cat"     
[3] "NULL"                           "NULL"                          
[5] "NULL"                           "Lord Euclid of the Fluffy Butt"

How do I get rid of the NULL elements? I suppose I could do this in base R, but there’s probably a nifty tidy way around this.
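For the record, a base R version might look something like the sketch below. The list `mapped` is a shortened stand-in for the output of the map call, not the real thing.

```r
# A small stand-in for the mapped list: strings with NULL placeholders.
mapped <- list("Beanie", NULL, NULL, "Peug")

# as.character() on a list turns each NULL element into the string "NULL".
as.character(mapped)

# Base R: drop the NULL elements first, then flatten to a character vector.
non_null <- Filter(Negate(is.null), mapped)
strings <- unlist(non_null)
strings
```

The catch is that the NULLs have to be dropped while the object is still a list; once `as.character()` has run, they are just strings that happen to read "NULL".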

I suspect, however, that there is a better way to filter lists with purrr:: combined with dplyr::. Possibly scary lambda functions? I need to level up my purrr::.

tweeps to the rescue

Thanks to James, Ken, and Francois, I found the purrr::keep function. Just what I wanted for Christmas! Cheers.

Hey, purrr:: tweeps. Anyone know offhand about filtering a list by type? Here's a quick post outlining what I'd like to do, and my hack. @samclifford @rensa_co, perchance? https://t.co/BxpXXfHUBc

— Charles T. Gray (@cantabile) February 19, 2019

playlist %>% keep(is.data.frame) %>% bind_rows()

   Sepal.Length    Species
1           5.1     setosa
2           4.9     setosa
3           4.7     setosa
4           4.6     setosa
5           5.0     setosa
6           5.4     setosa
7           7.0 versicolor
8           6.4 versicolor
9           6.9 versicolor
10          5.5 versicolor
11          6.5 versicolor
12          5.7 versicolor
13          6.3  virginica
14          5.8  virginica
15          7.1  virginica
16          6.3  virginica
17          6.5  virginica
18          7.6  virginica

playlist %>% keep(is.character) %>% as.character()

[1] "Beanie 'Legs' McGraw"           "Peug the Door-opening Cat"     
[3] "Lord Euclid of the Fluffy Butt"

w00t
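To close the loop on the original goal of two dataframes, results and errors, here is a rough sketch. The `outcomes` list, its column names, and the `position` column are assumptions for illustration only; `keep` and `map_lgl` come from purrr, and `bind_rows` from dplyr.

```r
library(purrr)
library(dplyr)

# Stand-in list: data frames for trials that ran, strings for those that failed.
outcomes <- list(
  "error: negative input",
  data.frame(trial = 1, estimate = 0.5),
  data.frame(trial = 2, estimate = 0.7),
  "warning: zero input"
)

# Data frames go into one table of results.
results <- outcomes %>% keep(is.data.frame) %>% bind_rows()

# Strings go into a table of errors, along with each one's position
# in the original list so failed trials can be traced back.
errors <- data.frame(
  position = which(map_lgl(outcomes, is.character)),
  message = unlist(keep(outcomes, is.character)),
  stringsAsFactors = FALSE
)

results
errors
```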


  1. I know, I know, I should find a more interesting dataset than iris.

Citation

For attribution, please cite this work as

Gray (2019, Feb. 19). measured.: wrangling errors. Retrieved from https://fervent-hypatia-7b7343.netlify.com/posts/2019-02-19-wrangling-errors/

BibTeX citation

@misc{gray2019wrangling,
  author = {Gray, Charles T.},
  title = {measured.: wrangling errors},
  url = {https://fervent-hypatia-7b7343.netlify.com/posts/2019-02-19-wrangling-errors/},
  year = {2019}
}