---
title: "GPU / Vulkan Backend"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{GPU / Vulkan Backend}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(eval = TRUE)
```

ggmlR is GPU-first: Vulkan is auto-detected at install time and used by default
when available. All APIs (sequential, functional, autograd, ONNX) fall back to
CPU transparently when no GPU is present.

```{r}
library(ggmlR)
```

---

## 1. Check availability

```{r}
if (ggml_vulkan_available()) {
  cat("Vulkan is available\n")
  ggml_vulkan_status()              # print device list and properties
} else {
  cat("No Vulkan GPU — running on CPU\n")
}

n <- ggml_vulkan_device_count()
cat("Vulkan device count:", n, "\n")
```

---

## 2. Device enumeration

```{r}
# Low-level device registry (all backends including CPU)
ggml_backend_load_all()

n_dev <- ggml_backend_dev_count()
for (i in seq_len(n_dev)) {
  dev  <- ggml_backend_dev_get(i - 1L)   # device indices are 0-based
  name <- ggml_backend_dev_name(dev)
  desc <- ggml_backend_dev_description(dev)
  mem  <- ggml_backend_dev_memory(dev)
  cat(sprintf("[%d] %s — %s\n", i - 1L, name, desc))
  cat(sprintf("    %.1f GB free / %.1f GB total\n",
              mem["free"] / 1e9, mem["total"] / 1e9))
}
```

Full example: `inst/examples/device_discovery.R`

---

## 3. Autograd device selection

```{r}
# Select GPU (falls back to CPU if unavailable)
device <- tryCatch({
  ag_device("gpu")
  "gpu"
}, error = function(e) {
  message("GPU not available, using CPU")
  "cpu"
})

cat("Active device:", device, "\n")
```

After `ag_device("gpu")`, all subsequent `ag_param` and `ag_tensor` calls
allocate on the GPU.  Switch back with `ag_device("cpu")`.
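
For orientation, here is a minimal sketch of that pattern. The exact
`ag_param()` / `ag_tensor()` argument conventions (passing plain R matrices) are
an assumption here; consult the autograd vignette for the real signatures.

```{r, eval = FALSE}
ag_device("gpu")                         # subsequent allocations land on the GPU
W <- ag_param(matrix(rnorm(8), 4, 2))    # trainable parameter, GPU-resident
x <- ag_tensor(matrix(rnorm(4), 1, 4))   # input tensor, GPU-resident

ag_device("cpu")                         # later allocations go back to the CPU
```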

---

## 4. Mixed precision (f16 / bf16)

```{r}
if (device == "gpu") {
  ag_dtype("f16")     # half-precision on Vulkan GPU
  # ag_dtype("bf16") # bfloat16 — falls back to f16 on Vulkan automatically
} else {
  ag_dtype("f32")     # full precision on CPU
}

cat("Active dtype:", ag_default_dtype(), "\n")
```

`bf16 → f16` fallback is automatic on Vulkan because bf16 is not natively
supported in GLSL shaders.
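
As a quick sanity check of the fallback (a sketch; exactly how
`ag_default_dtype()` reports the remapped type is an assumption):

```{r, eval = FALSE}
if (device == "gpu") {
  ag_dtype("bf16")                                 # request bfloat16 on Vulkan
  cat("Active dtype:", ag_default_dtype(), "\n")   # expected to show the f16 fallback
  ag_dtype("f16")                                  # restore the setting used above
}
```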

---

## 5. Vulkan memory info

```{r}
if (ggml_vulkan_available()) {
  mem <- ggml_vulkan_device_memory(0L)
  cat(sprintf("GPU memory: %.1f MB free / %.1f MB total\n",
              mem$free / 1e6, mem$total / 1e6))
}
```
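
The same list can be turned into a simple utilisation figure with base R
arithmetic (assuming, as in the chunk above, that `free` and `total` are byte
counts):

```{r}
if (ggml_vulkan_available()) {
  mem  <- ggml_vulkan_device_memory(0L)
  used <- mem$total - mem$free
  cat(sprintf("GPU memory in use: %.1f MB (%.0f%% of %.1f MB total)\n",
              used / 1e6, 100 * used / mem$total, mem$total / 1e6))
}
```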

---

## 6. Multi-GPU

ggmlR supports multiple Vulkan devices via the backend scheduler.
`dp_train()` distributes data across replicas automatically:

```{r}
n_gpu <- ggml_vulkan_device_count()
cat(sprintf("Using %d GPU(s)\n", n_gpu))

# dp_train handles multi-GPU internally — see vignette("data-parallel-training")
```

For low-level multi-GPU scheduler usage see `inst/examples/multi_gpu_example.R`.
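
For orientation only, a hedged sketch of what a data-parallel training call
might look like (`model`, `x_train` and `y_train` as in the sequential-API
example in the next section). The argument names passed to `dp_train()` are
assumptions; see `vignette("data-parallel-training")` for the actual interface.

```{r, eval = FALSE}
# Hypothetical sketch: dp_train() shards each batch across the detected
# Vulkan devices and aggregates gradients on the host.
# The argument names below are assumptions, not the documented signature.
fit <- dp_train(model, x_train, y_train, epochs = 10L)
```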

---

## 7. Sequential / functional API on GPU

The high-level `ggml_fit()` API picks up the Vulkan backend automatically —
no extra configuration needed:

```{r}
data(iris)
x_train <- scale(as.matrix(iris[, 1:4]))
y_train <- model.matrix(~ Species - 1, iris)

model <- ggml_model_sequential() |>
  ggml_layer_dense(64L, activation = "relu", input_shape = 4L) |>
  ggml_layer_dense(3L,  activation = "softmax") |>
  ggml_compile(optimizer = "adam", loss = "categorical_crossentropy")

# Training runs on GPU if Vulkan is available
model <- ggml_fit(model, x_train, y_train, epochs = 50L,
                  batch_size = 32L, verbose = 0L)
```
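
To get a rough idea of the benefit, the fit can be timed with base R's
`system.time()`; absolute numbers depend entirely on your hardware and driver.

```{r, eval = FALSE}
# Wall-clock timing of the same fit (numbers are hardware-dependent)
system.time(
  model <- ggml_fit(model, x_train, y_train, epochs = 50L,
                    batch_size = 32L, verbose = 0L)
)
```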

---

## 8. ONNX inference on GPU

```{r, eval = FALSE}
# Weights loaded to GPU once at load time
model_onnx <- ggml_onnx_load("model.onnx", backend = "vulkan")

# Repeated inference — no weight re-transfer between calls
# (`batch` is assumed to be a pre-built list of input arrays)
for (i in seq_len(100L)) {
  out <- ggml_onnx_run(model_onnx, list(input = batch[[i]]))
}
```

---

## 9. Build configuration

Vulkan support is compiled in automatically when `libvulkan-dev` and `glslc`
are detected at install time.  SIMD (AVX2/AVX512) for the CPU fallback
requires an explicit flag:

```bash
# Default install (Vulkan auto-detected, CPU fallback without SIMD)
R CMD INSTALL .

# With CPU SIMD acceleration
R CMD INSTALL . --configure-args="--with-simd"
```

To check what was compiled in:

```{r}
cat(ggml_version(), "\n")
ggml_vulkan_status()   # shows "Vulkan not available" if not compiled in
```
