# good enough testing coverage

What defines good testing coverage in an R analysis?

Charles T. Gray
12-30-2018

I’ve previously written about how cool I think testing is, as well as debugging. It feels natural to ask next: when is one done? When has one achieved testing coverage? What does covr (Hester 2018) measure? What does the literature say about testing? And my ultimate question, pedant that I am: what is good enough (Wilson et al. 2017) testing?

I want to test a function, g_lnorm, from a package I’m working on.

## what to test

Each function has a number of arguments. Each argument may have finitely or infinitely many valid input values, but it is unlikely to have infinitely many valid input data structures.

In addition to data types, there is also the question of dimensionality. Some arguments must be length 1, and so forth.
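Such dimensionality assumptions can be made explicit with a guard at the top of the function. A minimal base-R sketch, where the helper check_scalar is hypothetical and not from the package:

```r
# hypothetical guard: insist an argument is a length-1 numeric
check_scalar <- function(x, name = "x") {
  if (!(is.numeric(x) && length(x) == 1)) {
    stop(name, " must be a length-1 numeric")
  }
  invisible(x)
}

check_scalar(3)           # passes silently
# check_scalar(c(1, 2))   # would error: "x must be a length-1 numeric"
```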

So, we need to check that the function behaves as it should.

We also need to check that the function misbehaves as it should when passed something that it shouldn’t.
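With testthat this is what expect_error() is for; stripped down to base R with a hypothetical stand-in function g_demo (not the package’s g_lnorm), the idea is:

```r
# hypothetical stand-in for a function under test
g_demo <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  log(x + 1)
}

# behaves as it should on valid input
g_demo(1)

# misbehaves as it should on invalid input: the error is raised and caught
msg <- tryCatch(g_demo("a"), error = function(e) conditionMessage(e))
msg  # "x must be numeric"
```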

But what, exactly, is “behave as it should”?

Well, we need to consider every possible valid input, and every combination thereof. That sounds overwhelming, however. So, what is a minimum testing standard? Where to start?

### what to test

The function g_lnorm takes some arguments with specified defaults and some without.

#### check equivalence classes

Numerics are a bit tricky, because we need to consider a few cases.

| argument type | equivalence classes |
|---------------|---------------------|
| logical       | `TRUE` \| `FALSE`   |
| numeric       | $$(-\infty, -1], (-1, 0), [0,0], (0,1), [1, \infty)$$ |
| character     | context dependent   |
| function      | runs, and $$\dots$$? |

Checking that each equivalence class produces an output of the expected type seems like an excellent way to begin.

Often I check and check the mathematics, only to find it was the code that was broken. I believe there’s much that statisticians can learn here from simply checking, in turn, that the function runs for an arbitrary choice of each argument.

I like drawing a sample at the start to represent each of the equivalence classes, if it’s not too unwieldy. That way my test takes an arbitrary representative of each equivalence class, and I’m explicit about what I’m testing. It won’t cover all possible cases if I don’t set up my equivalence classes correctly, but it’s a start. It’s enough to see if the code is running as I go.
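A sketch of that sampling step, assuming the numeric equivalence classes from the table above and a stand-in function f (the real function under test would replace it):

```r
set.seed(1)

# one arbitrary representative per numeric equivalence class
representatives <- list(
  big_negative   = runif(1, min = -10, max = -1),  # (-inf, -1]
  small_negative = runif(1, min = -1,  max = 0),   # (-1, 0)
  zero           = 0,                              # [0, 0]
  small_positive = runif(1, min = 0,   max = 1),   # (0, 1)
  big_positive   = runif(1, min = 1,   max = 10)   # [1, inf)
)

f <- function(x) x^2  # stand-in for the function under test

# check the function runs and returns a length-1 numeric for every class
results <- vapply(representatives, f, numeric(1))
stopifnot(is.numeric(results), length(results) == length(representatives))
```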

## more tools for testing

In Parker’s rumination on opinionated analysis development, assertr and validate are noted as tools that can help with testing. How?

### assertr

I’m unclear how to integrate assertr (Fischetti 2018) into testthat, and not sure I’m meant to. But it does seem like it could help me assert that my assumptions about my pipe are correct.

> If any of these assertions were violated, an error would have been raised and the pipeline would have been terminated early (Fischetti 2018).

I’m always doing this type of testing in the console: truncating my %>% pipe and adding dim or nrow or something to the end. From my understanding of assertr, I can build these checks into my pipes, and functionally they act as the identity if the data passes the check.

```
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
```

```
# A tibble: 6 x 12
  dist  par_1 par_2 sample_size rsample rquantile rdensity rprob
  <chr> <dbl> <lis>       <dbl> <list>  <list>    <list>   <lis>
1 cauc…     2 <dbl…          15 <fn>    <fn>      <fn>     <fn>
2 cauc…     2 <dbl…          30 <fn>    <fn>      <fn>     <fn>
3 cauc…     2 <dbl…         100 <fn>    <fn>      <fn>     <fn>
4 cauc…     2 <dbl…          15 <fn>    <fn>      <fn>     <fn>
5 cauc…     2 <dbl…          30 <fn>    <fn>      <fn>     <fn>
6 cauc…     2 <dbl…         100 <fn>    <fn>      <fn>     <fn>
# … with 4 more variables: sim_id <int>, par_list <list>,
#   true_median <dbl>, true_iqr <dbl>
```
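That identity-on-success behaviour can be sketched in base R. The helper check_rows below is hypothetical, and much cruder than assertr’s verify(), but it shows the shape of the idea:

```r
# hypothetical verify-like helper: returns the data unchanged if the
# rule holds for every row, and errors otherwise
check_rows <- function(data, expr) {
  ok <- eval(substitute(expr), data, parent.frame())
  if (!all(ok)) stop("verification failed")
  data  # acts as the identity when the check passes
}

out <- check_rows(iris, Sepal.Length > 0)
identical(out, iris)  # TRUE
```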

### validate

In the readme for validate, we have this description:

> The validate R-package makes it super-easy to check whether data lives up to expectations you have based on domain knowledge. It works by allowing you to define data validation rules independent of the code or data set.

What does “define data validation rules” mean?

> The validate package is intended to make checking your data easy, maintainable and reproducible (Loo and Jonge 2018).

I’m still not entirely sure what defining validation rules involves, but this seems very neat. You can check the data is what you want it to be.

For example, I require all true medians to be positive, as I’ll take $$\log$$.

```
  name items passes fails nNA error warning      expression
1   V1    87     87     0   0 FALSE   FALSE true_median > 0
```

Well now, isn’t that handy?
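validate itself handles this with validator() and confront(); the underlying idea, sketched in base R with hypothetical stand-in simulation results, is that the rule lives as an expression, separate from both the code and the data:

```r
# a validation rule kept as an expression, independent of code and data
rule <- quote(true_median > 0)

# hypothetical stand-in for the simulation results
sim_results <- data.frame(true_median = c(0.5, 1.2, 3.0))

passes <- eval(rule, sim_results)
data.frame(items  = length(passes),
           passes = sum(passes),
           fails  = sum(!passes))
```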

## collecting my thoughts on this

So, I’ve been working on tests in earnest to solve a problem in this package recently, and I find myself persevering rather doggedly with testthat::auto_test(). It really does help me debug more efficiently. This is a workflow changer for me.

Fischetti, Tony. 2018. Assertr: Assertive Programming for R Analysis Pipelines. https://CRAN.R-project.org/package=assertr.

Hester, Jim. 2018. Covr: Test Coverage for Packages. https://CRAN.R-project.org/package=covr.

Loo, Mark van der, and Edwin de Jonge. 2018. Validate: Data Validation Infrastructure. https://CRAN.R-project.org/package=validate.

Wilson, Greg, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy K. Teal. 2017. “Good Enough Practices in Scientific Computing.” Edited by Francis Ouellette. PLOS Computational Biology 13 (6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510.