Neptune

Neptune is a metadata store for MLOps, built for teams that run a lot of experiments.

It gives you a single place to log, store, display, organize, compare, and query all your model-building metadata.

Neptune is used for:

* Experiment tracking: Log, display, organize, and compare ML experiments in a single place.
* Model registry: Version, store, manage, and query trained models and model-building metadata.
* Monitoring ML runs live: Record and monitor model training, evaluation, or production runs live.

Getting started

Register

Go to https://neptune.ai/ and sign up.

You can use Neptune for free for work, research, and personal projects. All individual accounts are free within quota.

Install Neptune R package

If you don't have Python installed and just want to use Neptune, simply run the following code:

install.packages("reticulate")
library(reticulate)
install_miniconda()

install.packages("neptune")
library(neptune)
neptune_install()

This code will install miniconda (a minimal Python environment) and set it up for you. This is a one-time step; after that, you only need to run the last two lines when you want to work with Neptune:

library(neptune)
neptune_install()

If you already have a Python virtual environment set up (conda, miniconda, or virtualenv), you can point to it instead of creating a new one:

# If you are using virtualenv
install.packages("neptune")
library(neptune)
neptune_install(method = "virtualenv", envname = "PATH/TO/YOUR/VIRTUALENV")


# If you are using conda or miniconda
install.packages("neptune")
library(neptune)
neptune_install(method = "conda", envname = "PATH/TO/YOUR/CONDA/ENVIRONMENT")
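
To verify which Python environment reticulate ended up using, you can call reticulate's standard py_config() helper:

# Optional check: show the Python interpreter and environment
# that reticulate (and therefore Neptune) will use
reticulate::py_config()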

Create a tracked run

run <- neptune_init(project="MY_WORKSPACE/MY_PROJECT",
                    api_token="NEPTUNE_API_TOKEN")

This code creates a run in the project of your choice. The returned run object is your gateway for logging metadata to Neptune.

You need to pass your credentials (project and API token) to the neptune_init() function. You can also set the API token globally:

neptune_set_api_token(token = "NEPTUNE_API_TOKEN")
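
If you prefer not to keep the token in your script at all, a minimal sketch (assuming you have exported NEPTUNE_API_TOKEN in your shell) is to read it from an environment variable:

# Sketch: read the API token from an environment variable instead of
# hard-coding it (assumes NEPTUNE_API_TOKEN is set in your shell)
neptune_set_api_token(token = Sys.getenv("NEPTUNE_API_TOKEN"))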

API token

To find your API token:

* Go to the Neptune UI.
* Open the User menu toggle in the upper right.
* Click Get your API token.
* Copy your API token.

Project

The project argument has the format WORKSPACE_NAME/PROJECT_NAME.

To find it:

* Go to the Neptune UI.
* Go to your project.
* Open Settings > Properties.
* Copy the project name.

Stop tracking

Once you are finished tracking metadata, you need to stop the tracking for that particular run:

neptune_stop(run)

# Note that you can also use reticulate-based syntax:
run$stop()

If you are running a script, tracking stops automatically at the end of the script. However, in an interactive environment such as RStudio, you need to stop it explicitly.
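
In interactive or error-prone code, one way to make sure a run is always stopped is base R's tryCatch() with a finally clause (a sketch using plain base R, not a Neptune-specific API):

# Sketch: guarantee neptune_stop() runs even if the code in between errors
run <- neptune_init(project = "MY_WORKSPACE/MY_PROJECT")
tryCatch({
  # ... your training and logging code ...
}, finally = {
  neptune_stop(run)
})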

Track metadata

Log hyperparameters

params <- list(
  "dense_units"= 128,
  "activation"= "relu",
  "dropout"= 0.23,
  "learning_rate"= 0.15,
  "batch_size"= 64,
  "n_epochs"= 30
)
run["parameters"] <- params

If you have parameters in the form of a named list, you can log them to Neptune in batch. This creates a field with the appropriate type for each list entry. You can update the hyperparameters or add new ones later in the code:

# Add additional parameters 
run["model/parameters/seed"] <- RANDOM_SEED

# Update parameters e.g. after triggering an early stopping
run["model/parameters/n_epochs"] <- epoch

Log training metrics

for (i in 1:params$n_epochs) {
  # ... my training loop, producing loss and acc ...
  neptune_log(run["train/epoch/loss"], loss)
  neptune_log(run["train/epoch/accuracy"], acc)
}

# Note that you can also use reticulate-based syntax:
run["train/epoch/loss"]$log(loss)

You can log training metrics to Neptune using series fields. There are three types of series in Neptune: float series, string series, and file series. Each call to neptune_log() appends a new value to the end of the series.
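
As a self-contained illustration (the loss values below are placeholders, not the output of a real model), a float series accumulates one point per call:

# Sketch: log a placeholder float series over a few epochs
n_epochs <- 5
for (i in 1:n_epochs) {
  loss <- 1 / i  # placeholder standing in for a real training loss
  neptune_log(run["train/epoch/loss"], loss)
}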

Log evaluation results

run["evaluation/accuracy"] <- eval_acc
run["evaluation/loss"] <- eval_loss

To log evaluation metrics, simply assign them to a field of your choice. Using the snippet above, both evaluation metrics are stored in the same evaluation namespace.

neptune_upload(run["evaluation/ROC"], "roc.png")

# You can upload ggplot plots directly without saving them to a file
neptune_upload(run["evaluation/ROC"], ggplot_roc)

# If you want to control additional parameters like size of the plot you can pass the same arguments as to ggsave
neptune_upload(run["evaluation/ROC"], ggplot_roc, width=20, height=20, units="cm")

# Note that you can also use reticulate-based syntax:
run["evaluation/ROC"]$upload("roc.png")
run["evaluation/ROC"]$upload(ggplot_roc)

You can log plots and charts easily using the neptune_upload() function. A ggplot object is converted to an image file and uploaded; you can also upload images from the local disk.
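
For example, here is a sketch of building such a plot before uploading it, assuming ggplot2 is installed and roc_df is a data frame with columns fpr and tpr (all three names are illustrative):

# Sketch: build a ggplot object and upload it directly
# (roc_df, fpr, and tpr are illustrative names)
library(ggplot2)
ggplot_roc <- ggplot(roc_df, aes(x = fpr, y = tpr)) +
  geom_line() +
  labs(x = "False positive rate", y = "True positive rate")
neptune_upload(run["evaluation/ROC"], ggplot_roc)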

Upload model file

You can upload any binary file (e.g., model weights) from disk using the neptune_upload() function. If your model is saved as multiple files, you can upload the whole folder as a FileSet using neptune_upload_files().

# Upload a single file
neptune_upload(run["model"], "model.Rdata")

# You can also upload folders in batch if you don't need access to the separate files
neptune_upload_files(run["model"], "models")

# Note that you can also use reticulate-based syntax:
run["model"]$upload("model.Rdata")
run["model"]$upload_files("models")
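
For completeness, here is a sketch of the full round trip, where model stands in for any fitted model object (an illustrative name):

# Sketch: persist a fitted model to disk, then upload the file
# (`model` is an illustrative object name)
save(model, file = "model.Rdata")
neptune_upload(run["model"], "model.Rdata")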

Getting help

If you get stuck, don't worry, we are here to help: