Architecture

This vignette provides a high-level overview of the core architectural components in LaminDB. Understanding these concepts will help you navigate the system and effectively manage your data and metadata.

Core concepts

LaminDB is built around a few key ideas:

Instance

A LaminDB instance is a self-contained environment for storing and managing data and metadata. You can think of it like a database or a project directory. Each instance has its own:

For more information about instances, see ?connect() and ?Instance.

Module

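A module in LaminDB is a collection of related registries that provide functionality in a specific domain. For example, the core module provides registries such as Artifact, Collection, and User, and the bionty module provides registries such as CellLine, CellMarker, and Tissue.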

Modules help organize the system and make it easier to find the specific registries you need.

For more information about modules, see ?Module. The core module is documented in the module_core vignette: vignette("module_core", package = "laminr").

Registry

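A registry is a centralized collection of related records. It’s like a table in a database, where each row represents a specific entity. Examples of registries include Artifact, Collection, and User in the core module, and CellLine, CellMarker, and Tissue in the bionty module.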

Each registry has a defined structure with specific fields that hold relevant information.

For more information about registries, see ?Registry. The core registries are documented in the module_core vignette: vignette("module_core", package = "laminr").

Field

A field is a single piece of information within a registry. It’s analogous to a column in a database table. For example, the Artifact registry might have fields for the key, storage location, description, and creator of an artifact.

Fields define the type of data that can be stored in a registry and provide a way to organize and query the metadata.

For more information about fields, see ?Field. The fields of core registries are documented in the module_core vignette: vignette("module_core", package = "laminr").

Record

A record is a single entry within a registry. It’s like a row in a database table. A record combines multiple fields to represent a specific entity. For example, a record in the Artifact registry might represent a single dataset with its key, storage location, description, creator, and other relevant information.
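The table analogy used for registries, fields, and records can be made concrete with a plain R data frame. This is only an illustrative sketch: the column names and values below are hypothetical, chosen for the example rather than taken from a real LaminDB instance, and laminr does not store registries as data frames.

```r
# A toy "registry": each column is a field, each row is a record.
# All values are hypothetical and exist only for illustration.
artifact_registry <- data.frame(
  uid = c("aaaa0001", "aaaa0002"),
  key = c("raw/counts.csv", "processed/counts.parquet"),
  description = c("Raw counts", "Normalized counts"),
  stringsAsFactors = FALSE
)

# Fields correspond to columns
field_names <- names(artifact_registry)

# A record corresponds to a single row, selected here by its uid
record <- artifact_registry[artifact_registry$uid == "aaaa0002", ]

# A field value of that record
record$key
```

Selecting a row and then a column mirrors the two steps of fetching a record from a registry and reading one of its fields.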

Putting it together

In essence, you have instances that contain modules. Each module contains registries, which in turn hold records. Every record is composed of multiple fields. This hierarchical structure allows for flexible and organized management of data and metadata within LaminDB.
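The containment hierarchy described above can be sketched with nested R lists. This is a toy model, not how laminr represents instances internally: the module and registry names mirror those used in this vignette, while the record contents are hypothetical.

```r
# Toy hierarchy: instance -> modules -> registries -> records -> fields.
# Record contents are hypothetical and exist only for illustration.
instance <- list(
  modules = list(
    core = list(
      registries = list(
        artifact = list(
          records = list(
            list(uid = "aaaa0001", description = "Raw counts"),
            list(uid = "aaaa0002", description = "Normalized counts")
          )
        ),
        user = list(
          records = list(
            list(uid = "uuuu0001", handle = "example_user")
          )
        )
      )
    )
  )
)

# Walk down the hierarchy one level at a time
module <- instance$modules[["core"]]
registry <- module$registries[["artifact"]]
record <- registry$records[[2]]
record[["description"]]

# Count the records held by each registry in the module
records_per_registry <- vapply(
  module$registries,
  function(reg) length(reg$records),
  integer(1)
)
```

Each level is reached only through the level above it, which is the same pattern as chaining get_module(), get_registry(), and get() on the base classes.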

Class structure

The laminr package provides a set of classes that mirror the core concepts of LaminDB. These classes allow you to interact with instances, modules, registries, fields, and records in a programmatic way.

The package provides two sets of classes: the base classes and the sugar syntax classes.

Base classes

These classes provide the core functionality for interacting with LaminDB instances, modules, registries, fields, and records. They are documented via ?Instance, ?Module, ?Registry, ?Field, and ?Record. However, they are not intended to be used directly in most cases. Instead, the sugar syntax classes provide a more user-friendly interface for working with LaminDB data.

The class diagram below summarizes the base classes and their members.

laminr
  +connect(String slug): RichInstance

Instance
  +is_default: Boolean
  +initialize(InstanceSettings Instance_settings, API api, Map schema): Instance
  +get_modules(): Module[]
  +get_module(String module_name): Module
  +get_module_names(): String[]
  +get_api(): InstanceAPI
  +get_settings(): InstanceSettings
  +get_py_lamin(Boolean check, String what): PythonModule
  +track(String path, String transform): NULL
  +finish(): NULL

UserSettings
  +email: String
  +access_token: String
  +uid: String
  +uuid: String
  +handle: String
  +name: String
  +initialize(...): UserSettings

InstanceSettings
  +owner: String
  +name: String
  +id: String
  +schema_id: String
  +api_url: String
  +initialize(...): InstanceSettings

InstanceAPI
  +initialize(InstanceSettings Instance_settings)
  +get_schema(): Map
  +get_record(...): Map
  +get_records(...): Map
  +delete_record(...): NULL

Module
  +name: String
  +initialize(Instance Instance, API api, String module_name, Map module_schema): Module
  +get_registries(): Registry[]
  +get_registry(String registry_name): Registry
  +get_registry_names(): String[]

Registry
  +name: String
  +class_name: String
  +is_link_table: Bool
  +initialize(Instance Instance, Module module, API api, String registry_name, Map registry_schema): Registry
  +get_fields(): Field[]
  +get_field(String field_name): Field
  +get_field_names(): String[]
  +get(String id_or_uid, Bool include_foreign_keys, List<String> select, Bool verbose): RichRecord
  +get_record_class(): RichRecordClass
  +get_temporary_record_class(): TemporaryRecordClass
  +df(Integer limit, Bool verbose): DataFrame
  +from_df(DataFrame dataframe, String key, String description, String run): TemporaryRecord
  +from_path(Path path, String key, String description, String run): TemporaryRecord
  +from_anndata(AnnData adata, String key, String description, String run): TemporaryRecord

Field
  +type: String
  +through: Map
  +field_name: String
  +registry_name: String
  +column_name: String
  +module_name: String
  +is_link_table: Bool
  +relation_type: String
  +related_field_name: String
  +related_registry_name: String
  +related_module_name: String
  +initialize(String type, String through, String field_name, String registry_name, String column_name, String module_name, Bool is_link_table, String relation_type, String related_field_name, String related_registry_name, String related_module_name): Field

Record
  +initialize(Instance Instance, Registry registry, API api, Map data): Record
  +get_value(String field_name): Any
  +delete(): NULL

RelatedRecords
  +field: Field
  +initialize(Instance instance, Registry registry, Field field, String related_to, API api): RelatedRecords
  +df(): DataFrame

Sugar syntax classes

The sugar syntax classes provide a more user-friendly way to access and manipulate instances, modules, registries, fields, and records.

For example, to get an artifact with a specific ID using only base classes, you might write:

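library(laminr)

# Connect to the instance
db <- connect("laminlabs/cellxgene")

# Walk from the core module to the artifact registry and fetch one record
artifact <- db$get_module("core")$get_registry("artifact")$get("KBW89Mf7IGcekja2hADu")

# Read a field value through the base Record interface
artifact$get_value("id")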

With the sugar syntax classes, you can achieve the same result more concisely:

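db <- connect("laminlabs/cellxgene")

# Registries are exposed directly on the instance
artifact <- db$Artifact$get("KBW89Mf7IGcekja2hADu")

# Field values are exposed directly on the record
artifact$id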

This sugar syntax is achieved by creating RichInstance and RichRecord classes that inherit from Instance and Record, respectively. These classes provide additional methods and properties to simplify working with LaminDB data.
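To illustrate the idea of layering a shorter accessor on top of a base interface, here is a minimal sketch in base R. It is not laminr's implementation: it swaps in S3 classes and a custom `$` method instead of laminr's own classes, and the names new_record(), new_rich_record(), and get_value() as a free function are hypothetical helpers introduced only for this example.

```r
# Base-style record: field values are only reachable through get_value().
new_record <- function(data) {
  structure(list(data = data), class = "record")
}

get_value <- function(record, field_name) {
  # .subset2() reads the underlying list without dispatching on `$`
  data <- .subset2(record, "data")
  if (!field_name %in% names(data)) {
    stop("Unknown field: ", field_name)
  }
  data[[field_name]]
}

# "Rich" record: the same data, with an extra subclass "rich_record"
new_rich_record <- function(data) {
  record <- new_record(data)
  class(record) <- c("rich_record", class(record))
  record
}

# Sugar: record$field forwards to get_value(record, "field")
`$.rich_record` <- function(x, name) {
  get_value(x, name)
}

# Hypothetical field values, for illustration only
artifact <- new_rich_record(list(id = 1L, description = "Example artifact"))

get_value(artifact, "id")  # verbose form
artifact$id                # sugar form, same value
```

The rich class adds no new data. It only changes how existing values are reached, so code written against the base interface keeps working on rich objects.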

Class diagram

The class diagram below summarizes the sugar syntax classes in the laminr package.

The connect() entry point and the base classes (UserSettings, InstanceSettings, Instance, InstanceAPI, Module, Registry, Field, RelatedRecords, and Record) appear in this diagram unchanged from the base class diagram above. The sugar syntax layer adds the following classes:

RichInstance
  +initialize(InstanceSettings Instance_settings, API api, Map schema): RichInstance
  +Registry Artifact
  +Registry Collection
  +...registry accessors...
  +Registry User
  +Bionty bionty

Core
  +Registry Artifact
  +Registry Collection
  +...registry accessors...
  +Registry User

Bionty
  +Registry CellLine
  +Registry CellMarker
  +...registry accessors...
  +Registry Tissue

RichRecord
  +...field value accessors...

TemporaryRecord
  +save(): NULL

Artifact
  +...field value accessors...
  +cache(): String
  +load(): AnnData | DataFrame | ...
  +open(): SOMACollection | SOMAExperiment
  +describe(): NULL