The randomNames package contains a single function that generates proportionally correct, gender- and ethnicity-specific first and last names. The package embeds a data set of names drawn from a large-scale United States data set in which the ethnicity and gender of each person were identified. The embedded data set includes ethnicity, gender (for first names), and the probability of each name occurring in the data set. Random names are useful when one needs to share personal information but must obscure the identity of the person to whom the data belong.
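As an illustration of the identity-obscuring use case, here is a minimal sketch that swaps the real names in a small table for random ones. It assumes the package is already installed; the roster data frame, its column names, and its values are hypothetical and exist only for this example.

```r
library(randomNames)

# Hypothetical roster whose real names should not be shared.
# Column names and values are illustrative only.
roster <- data.frame(
  name   = c("Real Person A", "Real Person B", "Real Person C"),
  gender = c(0, 1, 1),   # 0 = male, 1 = female, the coding randomNames uses
  score  = c(88, 92, 75)
)

set.seed(42)  # fix the seed so the draw is repeatable within one session

# Replace each real name with a random name of the matching gender.
roster$name <- randomNames(nrow(roster), gender = roster$gender)
roster
```

The other columns are left untouched, so the table keeps its analytic value while the names no longer point to real people.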
After installing the package from either CRAN or GitHub, generating random names takes only a few simple steps.
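Installation might look like the following sketch. The CRAN call uses the package name given above; the GitHub repository path in the comment is an assumption and should be checked against the project page before use.

```r
# Install from CRAN only if the package is not already available.
if (!requireNamespace("randomNames", quietly = TRUE)) {
  install.packages("randomNames")
}

# Development version from GitHub (the repository path is an assumption):
# remotes::install_github("CenterForAssessment/randomNames")

# Attach the package so randomNames() can be called directly.
library(randomNames)
```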
By default, the randomNames function supplies a single random last and first name separated by a comma.
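A minimal sketch of the default call, assuming the package is installed. The result is random, so no particular name is shown here.

```r
library(randomNames)

# With no arguments, one name is returned in "Last, First" form.
one_name <- randomNames()
one_name
```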
To generate more random names, just supply a positive integer to the randomNames function:
> randomNames(5)
[1] "Twiss, Nicholas" "Gonzales, Aimee" "Martinez, Rashawnya" "Cross, Hunter" "Spellman, Victoria"
The randomNames function accepts several arguments including n, gender, ethnicity, which.names, name.order, and name.sep.
randomNames(n,
            gender,
            ethnicity,
            which.names="both",
            name.order="last.first",
            name.sep=", ",
            sample.with.replacement=TRUE,
            return.complete.data=FALSE)
For complete documentation on the values accepted for each argument, see the function documentation.
The first argument, n, controls the number of names returned by the randomNames function:
> randomNames(5) ## 5 last, first names
[1] "Bayona, Christopher" "Valentine, Allison" "Sandoval, Joseph" "Avants, Kali" "Elliott, Kharim"
The second argument, gender, controls the gender (0=male, 1=female) of the first names returned. This argument can be a vector up to the same length as the number of names requested.
> randomNames(5, gender=1) ## 5 female last, first names
[1] "Siu, Abigail" "Lizardo, Elsa" "Wilcox, Taylor" "Sinath, Christine" "Zamora, Waylene"
> randomNames(5, gender=c(0,0,1,1,1)) ## 2 male and 3 female last, first names
[1] "Alcocer, John" "al-Hannan, Abdul Fattaah" "el-Zaki, Khaira" "Brown, Kelsie" "Stoor, Mickela"
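The third argument, ethnicity, controls the ethnicity of the names returned. The following integer codes/ethnicities are accepted:
1 = American Indian or Native Alaskan
2 = Asian or Pacific Islander
3 = African American
4 = Hispanic
5 = White (not Hispanic)
6 = Middle Eastern/Arabic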
> randomNames(5, gender=0, ethnicity=3) ## 5 African American, male last, first names
[1] "Magraff, Robert" "Fortenberry, Elijah" "Adams, Eli" "Henderson, Seiko" "Teshome, Patrick"
The fourth argument, which.names, controls which names are returned. The argument accepts the values "both" (the default), "last", "first", or "complete.data".
> randomNames(5, gender=1, ethnicity=6, which.names="first") ## 5 Middle Eastern/Arabic, female first names
[1] "Shafee'a" "Fareeda" "Nusaiba" "Fidda" "Mu'hsina"
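The same argument can be used to request surnames only. A short sketch, assuming the package is installed; the variable name is illustrative and the output is random, so it is not reproduced here.

```r
library(randomNames)

# Ask for last names only; gender does not apply to last names,
# so it is left unspecified.
last_only <- randomNames(5, which.names = "last")
last_only
```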
The fifth argument, name.order, controls the order in which the names are returned. The argument is only relevant if which.names="both". The argument accepts the values "last.first" (the default) and "first.last".
> randomNames(5, gender=1, ethnicity=6, name.order="first.last") ## 5 first last names
[1] "Awaatif, al-Rafiq" "Aadila, al-Koroma" "Husniyya, al-Shahan" "Khairiya, el-Barakat" "Mutee'a, el-Atallah"
The sixth argument, name.sep, is a character string that controls the separator used when both names are returned. The default separator is ", " (a comma followed by a space).
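Combining name.order and name.sep gives names in a conventional "First Last" layout. A minimal sketch, assuming the package is installed; the single-space separator is just one possible choice.

```r
library(randomNames)

# First name first, separated from the last name by a single space.
full_names <- randomNames(3, name.order = "first.last", name.sep = " ")
full_names
```

Because some names in the embedded data set contain spaces themselves, a distinctive separator such as "|" may be a safer choice when the result has to be split back into first and last names.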
If you have a contribution or a feature request for the randomNames package, don’t hesitate to write or open an issue on GitHub.