Parameterized Letters for Mass Mailings

Robert Nuske

2023-02-26

Mail merge is just a variant of R Markdown’s parameterized reports using komaletter. The process is similar to mail merge in traditional word processing applications:

TLDR: Example

Template letter (saved as template_letter.Rmd):

---
author: Max Mustermann
return-address: [Musterstr. 12, 34567 Musterstadt]
signature: Max

params: 
  name: DefaultNickname
  address: "[Default Name, 123 Default St, Default Town]"
  gift: DefaultGift

output: komaletter::komaletter
---
---
address: `r params$address`
opening: `r paste0('Dear ', params$name, ',')`
closing: "Yours truly,"
---

thank you very much for the beautiful `r params$gift`. It was a pleasure to have you.

Collect data about recipients in data.frame recipients:

recipients <- data.frame(name=c("Bob", "Megan", "Alex"),
                         gift=c("painting", "candlestick", "book"),
                         address=c("[Robert Pitts, 5543 Aliquet St, Fort Dodge GA 20783]",
                                   "[Megan Smith, 4156 Tincidunt Ave, Green Bay IN 19759]",
                                   "[Alexander Fitzgerald, 869 Laurel Ave, St Paul MN 55104]"),
                         stringsAsFactors=FALSE)

Combine template letter with recipients’ data and create lots of PDFs:

for(i in 1:nrow(recipients)){
  rmarkdown::render("template_letter.Rmd", 
                    params=list(name=recipients[i, "name"], 
                                gift=recipients[i, "gift"], 
                                address=recipients[i, "address"]),
                    output_file=paste0("letter_", recipients[i, 'name'], ".pdf"))
}

Basics of Mail Merge with komaletter

To personalize letters for mass mailings, you can include one or more placeholders (parameters) in a komaletter. These parameters need to be declared first and then put to use later in the document. While rendering the letters, you can assign various values to the parameters.

Declaring Parameters

Parameters are declared in the YAML metadata header using the params field. You can specify one or more parameters each on a new line. The default values (below: John & candlestick) given during parameter declaration, will be used if a parameter is not provided during rendering.

---
params:
  name: John
  gift: candlestick

output: komaletter::komaletter
---

Using Parameters

The declared parameters are automatically made available within the knit environment as components of the read-only list params. For example, the values of the above parameters can be accessed with the following R Code:

params$name
params$gift

If the value of a parameter shall be used in the YAML metadata header, the parameter must be declared previously. Since the backtick ` is a reserved character in YAML, the inline R Code snippet has to be wrapped in quotes.

---
params:
  name: John
  gift: candlestick
  
subject: "`r params$gift`"

output: komaletter::komaletter
---

Setting Parameter values

To set the parameter values, you can add the params argument to rmarkdown::render. If a parameter does not get a value, the default defined during parameter declaration is used (eg. John, candlestick).

rmarkdown::render("example.Rmd", params=list(name="Jane"))

Challenges of Mail Merge with komaletter

R Markdown and thus komaletter combines R Code, YAML and Markdown. R Markdown documents are processed by knitr to pure YAML and Markdown which in turn is send to pandoc for conversion to the final document type (pdf in this case). A parametrized letter has to obey the restrictions of all parts involved.

The most common personalization are the address of the recipient and the salutation or opening. komaletter expects the address to be a YAML sequence within the YAML metadata header. YAML sequences can be written in flow or block style. Both need quotation marks to protect the square brackets or to enable the escape code \n during parameter declaration in the YAML metadata.

---
params:
  # scalar:
  name: John Doe
  # flow style sequence:
  address_flow: "[FirstName LastName, 123 Main St, Anytown]"
  # block style sequence:
  address_block: "\n  - FirstName LastName\n  - 123 Main St\n  - Anytown"
---

Since the address and the letter opening are defined in the YAML metadata. The corresponding parameters must be declared and afterwards put to use within the YAML metadata. Parameter values are accessed in R Code. In the YAML metadata header this means inline code snippets `r expression`. Since backticks ` are ‘reserved indicators’ in YAML, the code snippet usually needs to be wrapped in quotes.

---
params:
  name: John Doe
  
opening: "`r paste0('Dear ', params$name, ',')`"

output: komaletter::komaletter
---

The result of the quoted R expression "`r expression`" is a value enclosed in double quotes, i.e. "value", which is not harmful if the result is of R type character corresponding to a YAML scalar. YAML scalars can be enclosed in single or double quotes or not wrapped at all. But everything in quotes is a scalar to YAML.

Consequently this means that expressions that are supposed to supply an address and thus a YAML sequence, must not be enclosed in quotes! Which poses a main problem since we learned above inline code snippets need to be enclosed in quotes in the main YAML metadata header.

Solution

Or rather a hack found as bycatch at stackoverflow circumventing the issue at the moment.

As explained above, the address is a YAML sequence and sequences can not be enclosed in quotes. To access the parameter in the main YAML metadata header the inline R Code snippet has to be wrapped in quotes and R expressions wrapped in quotes evaluate to a value enclosed in quotes.

Mysteriously, the rule that backticks must be enclosed in quotes is not enforced in a second YAML metadata block. You can use two metadata blocks because Pandoc combines YAML metadata blocks while converting the .md file to the final document. So in a second metadata block the inline R Code snippet can be written without quotes resulting in a correctly formatted YAML sequence.

---
author: Max Mustermann
return-address: [Musterstr. 12, 34567 Musterstadt]
signature: Max

params: 
  name: Johnny
  address: "[John Doe, 123 Main St, Anytown]"
  gift: flowers

output: komaletter::komaletter
---
---
address: `r params$address`
opening: `r paste0('Dear ', params$name, ',')`
closing: "Yours truly,"
---

thank you very much for the beautiful `r params$gift`. It was a pleasure to have you.

The template letter above can then be called with various values

recipients <- data.frame(name=c("Bob", "Megan", "Alex"),
                         gift=c("painting", "candlestick", "book"),
                         address=c("[Robert Pitts, 5543 Aliquet St, Fort Dodge GA 20783]",
                                   "[Megan Smith, 4156 Tincidunt Ave, Green Bay IN 19759]",
                                   "[Alexander Fitzgerald, 869 Laurel Ave, St Paul MN 55104]"),
                         stringsAsFactors=FALSE)


for(i in 1:nrow(recipients)){
  rmarkdown::render("template_letter.Rmd", 
                    params=list(name=recipients[i, "name"], 
                                gift=recipients[i, "gift"], 
                                address=recipients[i, "address"]),
                    output_file=paste0("letter_", recipients[i, 'name'], ".pdf"))
}

Further Information

Hints

Since parameters are evaluated roughly speaking during document knitting (conversion from .Rmd to .md), it is helpful to look at the intermediate .md files to hunt down issues in parameterized letters.

Since komaletter is based on rmarkdown’s output format pdf_document, you can use the argument keep_md to obtain the intermediary .md file (rmarkdown >= 1.14). Previously one had to resort to rmarkdown::render(..., clean=FALSE) to cause the intermediate .md files to remain available.