
Test data (SDTM) for the pharmaverse family of packages
{pharmaversesdtm} provides a one-stop shop for SDTM test data in the
pharmaverse family of packages. This includes datasets that are
therapeutic area (TA)-agnostic (DM, VS, EG, etc.) as well as TA-specific
ones (RS, TR, OE, etc.).
The package is available from CRAN and can be installed by running
install.packages("pharmaversesdtm"). To install the latest
development version of the package directly from GitHub use the
following code:
    if (!requireNamespace("remotes", quietly = TRUE)) {
      install.packages("remotes")
    }
    # Install the latest development version directly from GitHub.
    remotes::install_github("pharmaverse/pharmaversesdtm", ref = "main")

Some test datasets have been sourced from the CDISC pilot project, while
other datasets have been constructed ad hoc by the {admiral} team. Please
check the Reference page for detailed information regarding the source of
specific datasets.
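
Once the package is installed, it can be handy to see which datasets it ships. The sketch below uses only base R; the helper name `list_pkg_datasets()` is an illustrative assumption and is not part of {pharmaversesdtm}.

```r
# Hypothetical helper: list the dataset names an installed package provides.
list_pkg_datasets <- function(pkg) {
  items <- utils::data(package = pkg)$results[, "Item"]
  # Entries such as "beaver1 (beavers)" name an object and its source file;
  # keep only the object name.
  sub(" \\(.*\\)$", "", items)
}

# Only query {pharmaversesdtm} if it is actually installed.
if (requireNamespace("pharmaversesdtm", quietly = TRUE)) {
  print(list_pkg_datasets("pharmaversesdtm"))
}
```

The same helper works for any installed package, for example `list_pkg_datasets("datasets")`.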
- Datasets named after the SDTM domain alone (e.g., dm, rs).
- Datasets whose names add a suffix to the domain name (e.g., oe_ophtha,
  rs_onco, rs_onco_irecist).

Note: If an SDTM domain is used by multiple TAs, {pharmaversesdtm} may
provide multiple versions of the corresponding test dataset. For
instance, the package contains both ex and ex_ophtha: the latter contains
ophthalmology-specific variables such as EXLAT and EXLOC, and EXROUTE is
replaced with a plausible ophthalmology value.
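
To make the naming pattern concrete, the following sketch splits a dataset name into its SDTM domain and any suffix. The helper `split_dataset_name()` is hypothetical and not part of {pharmaversesdtm}; it assumes only that the domain is the text before the first underscore, as in ex_ophtha.

```r
# Hypothetical helper (not part of {pharmaversesdtm}): split a dataset name
# into the SDTM domain and an optional suffix, assuming the domain is the
# text before the first underscore.
split_dataset_name <- function(name) {
  parts <- strsplit(name, "_", fixed = TRUE)[[1]]
  list(
    domain = toupper(parts[1]),
    suffix = if (length(parts) > 1) {
      paste(parts[-1], collapse = "_")
    } else {
      NA_character_
    }
  )
}

split_dataset_name("ex")               # domain "EX", no suffix
split_dataset_name("rs_onco_irecist")  # domain "RS", suffix "onco_irecist"
```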
First, open a GitHub issue in {pharmaversesdtm} describing the planned
updates and tag @pharmaverse/admiral so that a member of the core
development team can sanity-check the request. There are then two main
ways to extend the test data: adding new datasets, or extending existing
datasets with new records or variables. Whichever method you choose, it
is worth noting the following:
- Programs that generate test data are kept in the data-raw/ folder.
- If a program needs other packages, load them with library() at the
  start of the program (but please do not call
  library(pharmaversesdtm)).
- Once you have written a program in the data-raw/ folder, you need to
  run it as a standalone R script in order to generate a test dataset
  that will become part of the {pharmaversesdtm} package, but you do not
  need to build the package.
- Each dataset is stored as an .rda file whose name is consistent with
  the name of the dataset, e.g., dataset xx is stored as xx.rda. The
  easiest way to achieve this is to use usethis::use_data(xx).
- The programs in data-raw/ are stored within the {pharmaversesdtm}
  GitHub repository, but they are not part of the {pharmaversesdtm}
  package: the data-raw/ folder is specified in .Rbuildignore.
- When you run a program in the data-raw/ folder, you generate a dataset
  that is written to the data/ folder, which will become part of the
  {pharmaversesdtm} package.
- The datasets are documented in R/*.R, for the purpose of generating
  documentation in the man/ folder.

Note: The documentation process in {pharmaversesdtm} is automated for
consistency and ease of maintenance.
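
The points about data-raw/ programs can be sketched as a minimal standalone script. Everything here is illustrative: the dataset xx and its values are made up, and a temporary folder stands in for the package's data/ folder so that the sketch runs outside the repository.

```r
# Sketch of a standalone data-raw/ program for a hypothetical dataset `xx`.
# Load any required packages here with library(), but not pharmaversesdtm.

# Build a toy dataset (illustrative values only, not real SDTM content).
xx <- data.frame(
  STUDYID = "STUDY01",
  USUBJID = c("STUDY01-001", "STUDY01-002"),
  DOMAIN = "XX",
  stringsAsFactors = FALSE
)

# Inside the package repository, usethis::use_data(xx) would write
# data/xx.rda. Here that step is mimicked with base R in a temporary
# folder (an assumption made so the sketch runs anywhere).
out_dir <- file.path(tempdir(), "data")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "xx.rda")
save(xx, file = out_file)

# Reloading the file restores an object named `xx`.
loaded <- load(out_file)
```

Keeping the file name and the object name identical is what lets `load()` restore the dataset under the expected name.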

{pharmaversesdtm} uses a single JSON file
(inst/extdata/sdtms-specs.json) to store metadata for all SDTM datasets.
This metadata drives the automated documentation process, and the
file is read by data-raw/create_sdtms_data.R to help
generate:
- .R files in R/
- .Rd files in man/
- Test Name/Test Code table inclusion (when present)
- Therapeutic Area

To add a new dataset:

- Create a program in the data-raw/ folder, named <name>.R, where <name>
  should follow the naming convention, to generate the test data and
  output <name>.rda to the data/ folder.
- Use an existing dataset such as dm as input in this program in order to
  create realistic synthetic data that remains consistent with other
  domains (not mandatory).
- Update inst/extdata/sdtms-specs.json with the new dataset metadata.
- Run data-raw/create_sdtms_data.R in order to update NAMESPACE and the
  .Rd files in man/.
- Update .github/CODEOWNERS.
- Update NEWS.md.

To update an existing dataset:

- Locate the existing program <name>.R in the data-raw/ folder and update
  it accordingly.
- Update inst/extdata/sdtms-specs.json to reflect the changes.
- Run the program to output the updated <name>.rda to the data/ folder.
- Run data-raw/create_sdtms_data.R in order to update NAMESPACE and the
  .Rd files in man/.
- Update .github/CODEOWNERS.
- Update NEWS.md.

Along with the authors and contributors, thanks to the following people
for their work on the package:
G Gayatri, Pooja Kumari, Sadchla Mascary, Kangjie Zhang and Zelos Zhu.