fuzzr: Fuzz-Testing for R Functions

Matthew Lincoln

2018-05-08

R’s dynamic typing can be both a blessing and a curse. One drawback is that a function author must decide which inputs to accept and which should throw warnings or errors, and then check for them. fuzzr helps you check how cleanly and informatively your function responds to a range of unexpected inputs.

Say we build a function intended to take a single string and a single integer, repeat the string that number of times, and paste the copies together using a given delimiter:

my_function <- function(x, n, delim = " - ") {
  paste(rep(x, n), collapse = delim)
}

my_function("fuzz", 7)
## [1] "fuzz - fuzz - fuzz - fuzz - fuzz - fuzz - fuzz"

Simple enough. However, this function quickly breaks if we pass in somewhat unexpected values:

my_function("fuzz", "bar")
## Warning in paste(rep(x, n), collapse = delim): NAs introduced by coercion
## Error in rep(x, n): invalid 'times' argument
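The warning and the error above are two separate conditions: the warning appears to come from coercing "bar" to a number, and the error from the NA that the coercion leaves behind. As a rough base-R sketch (not fuzzr's own implementation), both can be recorded from a single call without stopping the session. The helper name capture_conditions() is hypothetical and introduced only for this illustration.

```r
my_function <- function(x, n, delim = " - ") {
  paste(rep(x, n), collapse = delim)
}

# Hypothetical helper: evaluate an expression, recording every warning
# message and the error message (if any) instead of stopping.
capture_conditions <- function(expr) {
  warning_msgs <- character()
  error_msg <- NA_character_
  withCallingHandlers(
    # The error handler stores the message and lets execution continue.
    tryCatch(expr, error = function(e) error_msg <<- conditionMessage(e)),
    # The warning handler stores the message, then resumes the expression.
    warning = function(w) {
      warning_msgs <<- c(warning_msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(warnings = warning_msgs, error = error_msg)
}

res <- capture_conditions(my_function("fuzz", "bar"))
res$warnings
res$error
```

Collecting warnings and errors side by side like this is the kind of per-input record that a fuzz test summary is built from.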

Let’s test this with a full battery of fuzz tests:

library(fuzzr)
# Note that, while we are specifically fuzz testing the 'n' argument, we still 
# need to provide an 'x' argument to pass along to my_function(). We do not have
# to supply a delimiter, as my_function() declares a default value for this
# argument.
my_fuzz_results <- fuzz_function(my_function, "n", x = 1:3, tests = test_all())

# Produce a data frame summary of the results
fuzz_df <- as.data.frame(my_fuzz_results)
knitr::kable(fuzz_df)
| n | x | output | messages | warnings | errors | result_classes | results_index |
|---|---|---|---|---|---|---|---|
| char_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 1 |
| char_single | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 2 |
| char_single_blank | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 3 |
| char_multiple | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 4 |
| char_multiple_blank | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 5 |
| char_with_na | 1:3 | NA | NA | NAs introduced by coercion | invalid ‘times’ argument | NA | 6 |
| char_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 7 |
| char_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 8 |
| int_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 9 |
| int_single | 1:3 | NA | NA | NA | NA | character | 10 |
| int_multiple | 1:3 | NA | NA | NA | NA | character | 11 |
| int_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 12 |
| int_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 13 |
| int_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 14 |
| dbl_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 15 |
| dbl_single | 1:3 | NA | NA | NA | NA | character | 16 |
| dbl_mutliple | 1:3 | NA | NA | NA | NA | character | 17 |
| dbl_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 18 |
| dbl_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 19 |
| dbl_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 20 |
| fctr_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 21 |
| fctr_single | 1:3 | NA | NA | NA | NA | character | 22 |
| fctr_multiple | 1:3 | NA | NA | NA | NA | character | 23 |
| fctr_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 24 |
| fctr_missing_levels | 1:3 | NA | NA | NA | NA | character | 25 |
| fctr_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 26 |
| fctr_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 27 |
| lgl_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 28 |
| lgl_single | 1:3 | NA | NA | NA | NA | character | 29 |
| lgl_mutliple | 1:3 | NA | NA | NA | NA | character | 30 |
| lgl_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 31 |
| lgl_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 32 |
| lgl_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 33 |
| date_single | 1:3 | NA | NA | NA | NA | character | 34 |
| date_multiple | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 35 |
| date_with_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 36 |
| date_single_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 37 |
| date_all_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 38 |
| raw_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 39 |
| raw_char | 1:3 | NA | NA | NA | NA | character | 40 |
| raw_na | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 41 |
| df_complete | 1:3 | NA | NA | NA | (list) object cannot be coerced to type ‘double’ | NA | 42 |
| df_empty | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 43 |
| df_one_row | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 44 |
| df_one_col | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 45 |
| df_with_na | 1:3 | NA | NA | NA | (list) object cannot be coerced to type ‘double’ | NA | 46 |
| null_value | 1:3 | NA | NA | NA | invalid ‘times’ argument | NA | 47 |
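With this many rows, it can help to filter the summary down to the inputs that slipped through without an error. The following is a minimal sketch using a small hand-built stand-in for fuzz_df (the object fuzz_df_sketch is hypothetical). It assumes, as the printed table suggests, that the errors column is NA whenever no error was raised.

```r
# Stand-in with a few of the columns and rows shown in the table above.
fuzz_df_sketch <- data.frame(
  n = c("char_single", "int_single", "dbl_single", "date_single"),
  errors = c("invalid 'times' argument", NA, NA, NA),
  result_classes = c(NA, "character", "character", "character"),
  stringsAsFactors = FALSE
)

# Keep only the tests that returned a value instead of failing.
passed <- fuzz_df_sketch[is.na(fuzz_df_sketch$errors), ]
passed$n
```

The same subsetting expression should work on the real summary data frame, provided its errors column follows that convention.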

Almost all the unexpected values for n throw the fairly generic error invalid 'times' argument, which actually comes from the rep() call inside my_function. Some types, like doubles, factors, and even dates (!), don’t throw errors but instead return a result. We can check the value of that result with fuzz_value(), and the call that produced it with fuzz_call(). Both search for the first test result whose test name matches a regex. The argument name should match the name of the argument that was tested in fuzz_function:

fuzz_call(my_fuzz_results, n = "dbl_single")
## $fun
## [1] "my_function"
## 
## $args
## $args$n
## [1] 1.5
## 
## $args$x
## [1] 1 2 3
fuzz_value(my_fuzz_results, n = "dbl_single")
## [1] "1 - 2 - 3"
fuzz_call(my_fuzz_results, n = "date_single")
## $fun
## [1] "my_function"
## 
## $args
## $args$n
## [1] "2001-01-01"
## 
## $args$x
## [1] 1 2 3
# Hm, dates can be coerced into very large integers. Let's see how long this
# result is.
nchar(fuzz_value(my_fuzz_results, n = "date_single"))
## [1] 135873
# Oh dear.
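The length reported above can be reproduced in base R without fuzzr. This sketch assumes only what the fuzz_call() output shows, namely that the date_single test passes the date 2001-01-01 as n and 1:3 as x.

```r
# A Date is stored as the number of days since 1970-01-01.
n_days <- as.integer(as.Date("2001-01-01"))
n_days
# [1] 11323

# Repeating 1:3 that many times gives 3 * 11323 = 33969 one-character
# elements, joined by 33968 three-character delimiters.
result <- paste(rep(1:3, n_days), collapse = " - ")
nchar(result)
# [1] 135873
```

So the function silently treats the date as a repeat count of 11,323, which is almost certainly not what a caller passing a date intended.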

Perhaps we might choose to enforce the expected input with a tailored type check (using assertthat) that catches unexpected values and produces a more informative error message.

my_function_2 <- function(x, n, delim = " - ") {
  assertthat::assert_that(assertthat::is.count(n))
  paste(rep(x, n), collapse = delim)
}

# We will abbreviate this check by only testing against double and date vectors
fuzz_df_2 <- as.data.frame(fuzz_function(my_function_2, "n", x = "fuzz", 
                                         tests = c(test_dbl(), test_date())))

knitr::kable(fuzz_df_2)
| n | x | output | messages | warnings | errors | result_classes | results_index |
|---|---|---|---|---|---|---|---|
| dbl_empty | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 1 |
| dbl_single | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 2 |
| dbl_mutliple | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 3 |
| dbl_with_na | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 4 |
| dbl_single_na | "fuzz" | NA | NA | NA | missing value where TRUE/FALSE needed | NA | 5 |
| dbl_all_na | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 6 |
| date_single | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 7 |
| date_multiple | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 8 |
| date_with_na | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 9 |
| date_single_na | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 10 |
| date_all_na | "fuzz" | NA | NA | NA | n is not a count (a single positive integer) | NA | 11 |
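Note that the dbl_single_na row still produces a less helpful message (missing value where TRUE/FALSE needed). One possible alternative, sketched here in base R rather than with assertthat, is to test for missing values before any comparison. The names is_count() and my_function_3() are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: TRUE only for a single, finite, positive whole number.
# is.numeric() is FALSE for dates, factors, logicals and characters, and
# is.finite() rejects NA, NaN and Inf before the comparisons run.
is_count <- function(n) {
  is.numeric(n) && length(n) == 1 && is.finite(n) && n > 0 && n == trunc(n)
}

my_function_3 <- function(x, n, delim = " - ") {
  if (!is_count(n)) {
    stop("`n` must be a single positive whole number", call. = FALSE)
  }
  paste(rep(x, n), collapse = delim)
}

my_function_3("fuzz", 2)
# [1] "fuzz - fuzz"
```

Because && short-circuits, the length and missingness checks run before n > 0 is evaluated, so a lone NA fails with the same message as every other invalid input.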

Fuzzing multiple arguments

fuzz_function works by mapping several test inputs over one argument of a function while keeping the other arguments static. p_fuzz_function lets you specify a battery of tests for each argument as a named list of named lists. Every combination of tests is then run. These tests can be specified using the provided functions like test_char, or with custom inputs that you provide. Remember that each test condition must, itself, be named.

p_args <- list(
  x = list(
    simple_char = "test",
    numbers = 1:3
  ),
  n = test_all(),
  delim = test_all())

pr <- p_fuzz_function(my_function_2, p_args)
prdf <- as.data.frame(pr)

knitr::kable(head(prdf))
| x | n | delim | output | messages | warnings | errors | result_classes | results_index |
|---|---|---|---|---|---|---|---|---|
| simple_char | char_empty | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 1 |
| numbers | char_empty | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 2 |
| simple_char | char_single | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 3 |
| numbers | char_single | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 4 |
| simple_char | char_single_blank | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 5 |
| numbers | char_single_blank | char_empty | NA | NA | NA | n is not a count (a single positive integer) | NA | 6 |

Specifying multiple arguments can quickly compound the number of total test combinations to run, so p_fuzz_function will prompt the user to confirm running more than 500,000 tests at once.
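To get a feel for how quickly the combinations multiply, the count can be estimated in base R before running anything. The list below is a small hypothetical stand-in for p_args with hand-written test values, and expand.grid() is used purely for illustration, not as a description of how p_fuzz_function builds its combinations.

```r
# Hypothetical named list of named lists, in the shape p_fuzz_function expects.
p_args_sketch <- list(
  x = list(simple_char = "test", numbers = 1:3),
  n = list(int_single = 1L, dbl_single = 1.5, char_single = "a"),
  delim = list(dash = " - ", empty = "")
)

# The number of combinations is the product of the number of tests per argument.
n_combinations <- prod(lengths(p_args_sketch))
n_combinations
# [1] 12

# Enumerate the test-name combinations themselves.
combos <- expand.grid(lapply(p_args_sketch, names), stringsAsFactors = FALSE)
nrow(combos)
# [1] 12
```

By the same arithmetic, if test_all() supplies the 47 tests listed in the first summary table, the p_args example above would run 2 × 47 × 47 = 4,418 combinations.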