scholidonline provides online utilities for working with
scholarly identifiers. It builds on scholid
for structural detection and normalization, and adds registry-backed
functionality such as existence checks, identifier conversion, metadata
retrieval, and linked-identifier lookup.

This vignette introduces the interface and typical workflows for working with registry-connected identifier data.
scholidonline exposes a small set of user-facing
functions:

- scholidonline_types()
- scholidonline_capabilities()
- id_exists()
- id_convert()
- id_metadata()
- id_links()

You can inspect which identifier types are supported with scholidonline_types().
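A minimal sketch of that call, assuming scholidonline is installed (the guard makes the snippet a no-op otherwise). The shape of the return value is not described here, so the sketch only prints it.

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  supported_types <- scholidonline::scholidonline_types()
  print(supported_types)
}
```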
scholidonline is registry-driven. You can inspect all
supported operations, conversions, and providers:
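A minimal sketch of that call, assuming scholidonline is installed and that the result is a data.frame, as the row names in the filtered output further down suggest. The table that follows lists the capabilities reported in this vignette.

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  caps <- scholidonline::scholidonline_capabilities()
  print(caps)
}
```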
| type | operation | target | providers | default_provider |
|---|---|---|---|---|
| arxiv | exists | NA | auto, arxiv | arxiv |
| arxiv | links | NA | auto, arxiv | arxiv |
| arxiv | meta | NA | auto, arxiv | arxiv |
| assembly | exists | NA | auto, ncbi | ncbi |
| assembly | meta | NA | auto, ncbi | ncbi |
| bioproject | exists | NA | auto, ncbi | ncbi |
| bioproject | meta | NA | auto, ncbi | ncbi |
| doi | exists | NA | auto, doi.org, crossref | doi.org |
| doi | links | NA | auto, crossref | crossref |
| doi | meta | NA | auto, crossref, doi.org | crossref |
| doi | convert | pmid | auto, ncbi, epmc | ncbi |
| doi | convert | pmcid | auto, ncbi, epmc | ncbi |
| geo | exists | NA | auto, ncbi | ncbi |
| geo | meta | NA | auto, ncbi | ncbi |
| openalex | exists | NA | auto, openalex | openalex |
| openalex | links | NA | auto, openalex | openalex |
| openalex | meta | NA | auto, openalex | openalex |
| openalex | convert | doi | auto, openalex | openalex |
| openalex | convert | pmid | auto, openalex | openalex |
| orcid | exists | NA | auto, orcid | orcid |
| orcid | links | NA | auto, orcid | orcid |
| orcid | meta | NA | auto, orcid | orcid |
| pmcid | exists | NA | auto, ncbi, epmc | ncbi |
| pmcid | links | NA | auto, ncbi, epmc | ncbi |
| pmcid | meta | NA | auto, ncbi, epmc | ncbi |
| pmcid | convert | pmid | auto, ncbi, epmc | ncbi |
| pmcid | convert | doi | auto, ncbi, epmc | ncbi |
| pmid | exists | NA | auto, ncbi, epmc | ncbi |
| pmid | links | NA | auto, ncbi, epmc | ncbi |
| pmid | meta | NA | auto, ncbi, epmc | ncbi |
| pmid | convert | doi | auto, ncbi, epmc | ncbi |
| pmid | convert | pmcid | auto, ncbi, epmc | ncbi |
| refseq | exists | NA | auto, ncbi | ncbi |
| refseq | meta | NA | auto, ncbi | ncbi |
| ror | exists | NA | auto, ror | ror |
| ror | meta | NA | auto, ror | ror |
| sra | exists | NA | auto, ncbi | ncbi |
| sra | meta | NA | auto, ncbi | ncbi |
| uniprot | exists | NA | auto, uniprot | uniprot |
| uniprot | meta | NA | auto, uniprot | uniprot |
Not every supported type offers every operation. For example, ROR and UniProt support existence checks and metadata, while DOI and PMID also support linked identifiers and conversion. To inspect one type:
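One way to do this is plain data.frame subsetting on the type column. A minimal sketch, assuming the capabilities are returned as a data.frame with the columns shown in the table above:

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  caps <- scholidonline::scholidonline_capabilities()

  # Keep only the rows that describe one identifier type.
  openalex_caps <- caps[caps$type == "openalex", ]
  print(openalex_caps)
}
```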
| | type | operation | target | providers | default_provider |
|---|---|---|---|---|---|
| 15 | openalex | exists | NA | auto, openalex | openalex |
| 16 | openalex | links | NA | auto, openalex | openalex |
| 17 | openalex | meta | NA | auto, openalex | openalex |
| 18 | openalex | convert | doi | auto, openalex | openalex |
| 19 | openalex | convert | pmid | auto, openalex | openalex |
id_exists()

id_exists() verifies whether identifiers exist in their
respective registries.

If type = NULL, the type is inferred automatically.
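Both call styles can be sketched as follows. This assumes scholidonline is installed and the registries are reachable; the example identifiers are the ones used elsewhere in this vignette, and no particular result is claimed for them.

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  # Explicit type: the input is treated as a DOI.
  exists_doi <- scholidonline::id_exists(x = "10.1000/182", type = "doi")
  print(exists_doi)

  # Type omitted: the default lets the package infer the type per element.
  exists_inferred <- scholidonline::id_exists(
    x = c("10.1000/182", "PMC1234567")
  )
  print(exists_inferred)
}
```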
Return values:
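id_convert()

Many scholarly identifiers are cross-linked across systems.
Conversions listed in the capabilities table include DOI to PMID or PMCID, PMID to DOI or PMCID, PMCID to PMID or DOI, and OpenAlex to DOI or PMID.

If from = NULL, the source type is inferred per
element.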
Unresolvable mappings return NA_character_.
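Because failed conversions come back as NA_character_, they can be separated from successful ones with is.na(). The sketch below runs offline on a made-up result vector; it assumes id_convert() returns a character vector aligned with its input, and the target argument name (to) in the commented call is an assumption rather than a confirmed signature.

```r
# Stand-in for the result of a conversion call such as
#   scholidonline::id_convert(x = ids, from = "pmid", to = "doi")
# The argument name `to` is an assumption; check ?id_convert.
ids <- c("id_a", "id_b", "id_c")  # placeholder inputs, not real identifiers
converted <- c("10.1000/182", NA_character_, NA_character_)  # made-up result

# Pair inputs with outputs and flag the ones that could not be mapped.
result <- data.frame(
  input = ids,
  converted = converted,
  resolved = !is.na(converted),
  stringsAsFactors = FALSE
)
print(result)

# Inputs that need manual follow-up.
unresolved_ids <- result$input[!result$resolved]
print(unresolved_ids)
```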
id_metadata()

id_metadata() retrieves harmonized metadata from
external registries.
Metadata completeness depends on the registry. For NCBI accession
types such as BioProject, title is the short registry title
from Entrez ESummary, not the full project description on the NCBI
website; use url for the complete record.
You can restrict returned fields:
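The package's own mechanism for restricting fields is not reproduced here. As a package-agnostic alternative, columns can be selected from the result with base R. This sketch assumes that id_metadata() accepts x and type like id_exists() does, that it returns a data.frame, and that title and url are among its columns; all three are assumptions to check against the function's help page.

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  # Assumed call shape, mirroring id_exists(x = , type = ).
  meta <- scholidonline::id_metadata(x = "10.1000/182", type = "doi")

  # Keep only the wanted columns that are actually present.
  wanted <- c("title", "url")
  meta_small <- meta[, intersect(wanted, names(meta)), drop = FALSE]
  print(meta_small)
}
```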
id_links()

id_links() returns related identifiers discovered via
registry queries. The result is a long data.frame with one row per link. When the provider exposes no
linked identifiers for a record, the same columns are returned with zero rows.
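Since an empty result keeps its columns, downstream code can branch on the row count alone. A minimal sketch, assuming scholidonline is installed and that id_links() accepts x and type like id_exists() does (the call shape is an assumption):

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  # Assumed call shape, mirroring id_exists(x = , type = ).
  doi_links <- scholidonline::id_links(x = "10.1000/182", type = "doi")

  # Zero rows means the provider reported no linked identifiers.
  if (nrow(doi_links) == 0) {
    message("No linked identifiers reported for this record.")
  } else {
    print(doi_links)
  }
}
```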
A common workflow for messy identifier columns is to detect and normalize the identifiers with scholid, then query the registries with scholidonline.

Example:
x <- c(
  "https://doi.org/10.1000/182",
  "PMCID: PMC1234567",
  "not an id"
)

types <- scholid::detect_scholid_type(x)  # NA where no type is detected

x_norm <- rep(NA_character_, length(x))
for (i in seq_along(x)) {
  if (is.na(types[i])) {
    next  # leave unrecognized input as NA
  }
  x_norm[i] <- scholid::normalize_scholid(
    x = x[i],
    type = types[i]
  )
}

types
x_norm

Calling id_exists(x) on the raw vector uses the default
type = "auto", so each element is classified and normalized
automatically. You do not need to pass a vector-valued type
argument.
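The existence check on the raw vector can be sketched as follows, assuming scholidonline is installed and the registries are reachable. No particular result is claimed, including for the non-identifier element.

```r
# Same messy input as in the scholid example.
x <- c(
  "https://doi.org/10.1000/182",
  "PMCID: PMC1234567",
  "not an id"
)

# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  # No type passed: detection and normalization happen per element.
  found <- scholidonline::id_exists(x)
  print(found)
}
```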
Most functions accept a provider argument.

scholidonline::id_exists(
  x = "10.1000/182",
  type = "doi",
  provider = "crossref"
)
scholidonline::id_exists(
  x = "10.1000/182",
  type = "doi",
  provider = "doi.org"
)

If provider = "auto" (the default), a sensible registry is
chosen automatically, potentially with fallback behavior.
Available providers depend on the identifier type and operation. Use
scholidonline_capabilities() to inspect them.
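For a single type and operation, the relevant row can be picked out of the capabilities table. A minimal sketch, assuming scholidonline is installed and that the result is a data.frame with the providers and default_provider columns shown earlier:

```r
# Guard: skip quietly when scholidonline is not installed.
if (requireNamespace("scholidonline", quietly = TRUE)) {
  caps <- scholidonline::scholidonline_capabilities()

  # Providers offered for DOI existence checks, plus the default one.
  is_doi_exists <- caps$type == "doi" & caps$operation == "exists"
  doi_exists_providers <- caps[is_doi_exists, c("providers", "default_provider")]
  print(doi_exists_providers)
}
```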
The chosen provider affects:
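scholidonline focuses on identifier types with stable
public registries and accessible APIs. The package supports online
operations for arXiv, Assembly, BioProject, DOI, GEO, OpenAlex, ORCID, PMCID, PMID, RefSeq, ROR, SRA, and UniProt identifiers.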
Not every type supports every operation. For example, ROR and UniProt
support existence checks and metadata, while DOI and PMID additionally
support linked identifiers and conversion. Use
scholidonline_capabilities() as the authoritative
summary.
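A support check can be written as a small lookup against that table. The sketch below runs offline on a hand-copied excerpt of the capabilities table; the helper name supports() is hypothetical and not part of scholidonline. In practice the first argument would be the output of scholidonline_capabilities().

```r
# Excerpt of the capabilities table, rebuilt by hand so the sketch runs
# without the package or network access.
caps_excerpt <- data.frame(
  type = c("doi", "doi", "doi", "ror", "ror"),
  operation = c("exists", "links", "meta", "exists", "meta"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when `type` offers `operation`.
supports <- function(caps, type, operation) {
  any(caps$type == type & caps$operation == operation)
}

doi_has_links <- supports(caps_excerpt, "doi", "links")  # TRUE
ror_has_links <- supports(caps_excerpt, "ror", "links")  # FALSE
print(c(doi = doi_has_links, ror = ror_has_links))
```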
Many other identifier types (e.g., ISBN, ISSN, bibcode, RRID) are
structurally supported by scholid, but are not covered by
scholidonline because they lack a stable, open registry API
fit for this package.