Part 48: GPU Passthrough — The Subset That Actually Works
"GPU passthrough on a dev workstation is possible. It is also fragile. We support the subset that does not require rebuilding your kernel."
Why
Some workloads need a GPU: ML training, inference benchmarks, video transcoding, CUDA-accelerated simulations. A homelab user with an NVIDIA card on their workstation might reasonably want to expose it to a container running inside a VM. The naive approach is to install the NVIDIA driver inside the VM, mount the device, and hope. The reality is more complicated:
- VirtualBox does not officially support GPU passthrough at all. Workarounds exist, but they involve PCI device pass-through that only works on Linux hosts with VFIO and IOMMU. Not portable.
- Hyper-V supports GPU partitioning (Set-VMGpuPartitionAdapter) on Windows hosts only. Reasonably reliable.
- Parallels on macOS supports limited GPU virtualisation but not raw passthrough.
- libvirt/KVM supports VFIO passthrough on Linux. The most reliable, but requires Linux host.
The thesis of this part is: HomeLab supports GPU passthrough on the intersection of "your hypervisor supports it" and "your card is supported", and fails with a clear message when either condition does not hold. The rest of the lab works without a GPU. The user gets one config field; the contributor handles the platform-specific bits.
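As a concrete preflight on a Linux host, both prerequisites can be checked from a shell before touching any config. A rough sketch: the IOMMU check uses the standard sysfs path, and the sample lspci line is illustrative (on a real host you would pipe `lspci -nn` through the filter).

```shell
#!/bin/sh
# Report whether an IOMMU is active: /sys/kernel/iommu_groups is
# populated only when the kernel booted with IOMMU support enabled.
check_iommu() {
  dir="${1:-/sys/kernel/iommu_groups}"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo enabled
  else
    echo disabled
  fi
}

# Extract the [vendor:device] pair that goes into the gpu config block.
# On a real host: lspci -nn | grep -i nvidia | pci_ids
pci_ids() {
  sed -n 's/.*\[\([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\)\].*/\1 \2/p'
}

check_iommu
printf '%s\n' '0a:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD104 [GeForce RTX 4070] [10de:2786] (rev a1)' | pci_ids
# last line prints: 10de 2786
```

The two values printed by `pci_ids` are exactly the vendor_id and device_id the config below expects.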
The shape
public sealed record GpuPassthroughSpec
{
    public bool Enabled { get; init; }
    public string? VendorId { get; init; }       // "10de" for NVIDIA
    public string? DeviceId { get; init; }       // PCI device ID
    public string Mode { get; init; } = "passthrough"; // "passthrough" | "partition"
    public string? PartitionCount { get; init; } // for Hyper-V GPU-P
}

The config:
machines:
  - name: devlab-gpu-worker
    box: frenchexdev/alpine-3.21-gpuhost
    cpus: 8
    memory: 16384
    gpu:
      enabled: true
      vendor_id: 10de
      device_id: 2786  # RTX 4070
      mode: passthrough

The contributor that handles this lives in GpuPassthroughContributor:
[Injectable(ServiceLifetime.Singleton)]
[Order(40)] // after the host overlay
public sealed class GpuPassthroughContributor : IPackerBundleContributor, IMachineTypeContributor
{
    private readonly HomeLabConfig _config;
    private readonly IPlatformInfo _platform;

    public GpuPassthroughContributor(IOptions<HomeLabConfig> config, IPlatformInfo platform)
    {
        _config = config.Value;
        _platform = platform;
    }

    public bool ShouldContribute() => _config.Machines.Any(m => m.Gpu?.Enabled == true);

    public void Contribute(PackerBundle bundle)
    {
        if (!ShouldContribute()) return;

        // 1. Add NVIDIA driver + nvidia-container-toolkit to the image
        bundle.Scripts.Add(new PackerScript("install-nvidia.sh", InstallNvidiaScript()));
        bundle.Provisioners.Add(new PackerProvisioner
        {
            Type = "shell",
            Properties = new()
            {
                ["scripts"] = new[] { "scripts/install-nvidia.sh" },
                ["execute_command"] = "{{ .Vars }} sh '{{ .Path }}'"
            }
        });
    }

    public void Contribute(VosMachine machine)
    {
        var gpu = machine.Config.Gpu;
        if (gpu is null || !gpu.Enabled) return;

        // 2. Add provider-specific Vagrant customisation
        switch (machine.Provider)
        {
            case "virtualbox":
                ConfigureVirtualBoxPassthrough(machine, gpu);
                break;
            case "hyperv":
                ConfigureHyperVPartition(machine, gpu);
                break;
            case "libvirt":
                ConfigureLibvirtPassthrough(machine, gpu);
                break;
            case "parallels":
                throw new NotSupportedException("Parallels does not support GPU passthrough");
            default:
                throw new InvalidOperationException($"GPU passthrough not implemented for {machine.Provider}");
        }
    }

    private void ConfigureVirtualBoxPassthrough(VosMachine machine, GpuPassthroughSpec gpu)
    {
        // VirtualBox PCI passthrough requires a Linux host with IOMMU enabled
        if (!_platform.IsLinux)
            throw new NotSupportedException("VirtualBox GPU passthrough requires a Linux host");

        machine.VagrantCustomizations.Add(new VagrantCustomization(
            Provider: "virtualbox",
            Lines: new[]
            {
                "v.customize ['modifyvm', :id, '--pciattach', '0a:00.0@01:00.0']", // host address@guest address
                "v.customize ['modifyvm', :id, '--vrde', 'on']"
            }));
    }

    private void ConfigureHyperVPartition(VosMachine machine, GpuPassthroughSpec gpu)
    {
        if (!_platform.IsWindows)
            throw new NotSupportedException("Hyper-V GPU partitioning requires a Windows host");

        machine.VagrantCustomizations.Add(new VagrantCustomization(
            Provider: "hyperv",
            Lines: new[]
            {
                "h.gpu_partition_adapter = true",
                $"h.gpu_partition_count = {gpu.PartitionCount ?? "1"}"
            }));
    }

    private void ConfigureLibvirtPassthrough(VosMachine machine, GpuPassthroughSpec gpu)
    {
        machine.VagrantCustomizations.Add(new VagrantCustomization(
            Provider: "libvirt",
            Lines: new[]
            {
                "l.pci :bus => '0x0a', :slot => '0x00', :function => '0x0'"
            }));
    }

    private string InstallNvidiaScript() => """
        #!/bin/sh
        set -eux

        # Alpine carries the NVIDIA driver in the edge/testing repo (as of mid-2025)
        echo 'https://dl-cdn.alpinelinux.org/alpine/edge/testing' >> /etc/apk/repositories
        apk update
        apk add --no-cache nvidia-driver nvidia-utils nvidia-container-toolkit

        # Configure docker to use the nvidia runtime
        cat > /etc/docker/daemon.json.gpu <<'EOF'
        {
          "runtimes": {
            "nvidia": {
              "path": "/usr/bin/nvidia-container-runtime",
              "runtimeArgs": []
            }
          },
          "default-runtime": "nvidia"
        }
        EOF

        # Merge with existing daemon.json
        jq -s '.[0] * .[1]' /etc/docker/daemon.json /etc/docker/daemon.json.gpu > /etc/docker/daemon.json.tmp
        mv /etc/docker/daemon.json.tmp /etc/docker/daemon.json
        rm /etc/docker/daemon.json.gpu
        service docker restart

        # Verify
        nvidia-smi || echo 'nvidia-smi not available — GPU not actually passed through?'
        """;
}

The contributor implements two role interfaces: it touches both the Packer bundle (to install the driver) and the machine config (to add the provider-specific Vagrant customisation).
Exposing the GPU to a compose service
A service that wants the GPU declares it in its compose contributor:
public void Contribute(ComposeFile compose)
{
    compose.Services["ml-worker"] = new ComposeService
    {
        Image = "nvidia/cuda:12.4.0-base-ubuntu22.04",
        Restart = "always",
        Deploy = new ComposeDeploy
        {
            Resources = new ComposeResources
            {
                Reservations = new ComposeReservations
                {
                    Devices = new[]
                    {
                        new ComposeDevice
                        {
                            Driver = "nvidia",
                            Count = 1,
                            Capabilities = new[] { "gpu" }
                        }
                    }
                }
            }
        }
    };
}

Docker Compose v2 reads deploy.resources.reservations.devices and asks the nvidia container runtime to expose the GPU. Inside the container, nvidia-smi works.
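Rendered to YAML, the typed model above should produce the standard Compose GPU reservation block:

```yaml
services:
  ml-worker:
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```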
Failure modes
GPU passthrough has many failure modes. The contributor surfaces them as Result.Failure rather than crashing:
- No NVIDIA card on the host: detected by checking for /dev/nvidia0 (Linux) or the GPU device list (Windows). Failure with "no NVIDIA GPU detected on host".
- Host hypervisor does not support passthrough: detected by the platform / provider combination. Failure with "VirtualBox GPU passthrough requires a Linux host".
- IOMMU not enabled: detected by checking /sys/kernel/iommu_groups/. Failure with "IOMMU is not enabled in your BIOS".
- Driver mismatch: the container's CUDA version must match the host's driver. The contributor warns if the major versions differ.
Each failure is a clear message at validation time, before anyone tries to boot the VM.
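The driver-mismatch check reduces to comparing major versions. A sketch of the comparison, with both version strings hardcoded for illustration (on a real host the first would come from nvidia-smi and the second from the image tag):

```shell
#!/bin/sh
# Warn when the container's CUDA major version differs from the CUDA
# version the host driver supports.
cuda_major() { printf '%s\n' "${1%%.*}"; }

host_cuda="12.4"    # e.g. reported by nvidia-smi on the host
image_cuda="12.4.0" # e.g. parsed from nvidia/cuda:12.4.0-base-ubuntu22.04

if [ "$(cuda_major "$host_cuda")" = "$(cuda_major "$image_cuda")" ]; then
  echo "cuda: ok"
else
  echo "cuda: major version mismatch (host $host_cuda vs image $image_cuda)"
fi
# prints: cuda: ok
```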
The test
[Fact]
public void contributor_skips_if_no_machine_has_gpu_enabled()
{
    var bundle = new PackerBundle();
    var c = new GpuPassthroughContributor(
        Options.Create(new HomeLabConfig { Machines = new() { new() { Gpu = new() { Enabled = false } } } }),
        new FakePlatformInfo(IsLinux: true));

    c.Contribute(bundle);

    bundle.Scripts.Should().NotContain(s => s.FileName.Contains("nvidia"));
}

[Fact]
public void contributor_throws_when_virtualbox_passthrough_requested_on_macos_host()
{
    var c = new GpuPassthroughContributor(
        Options.Create(new HomeLabConfig
        {
            Machines = new() { new() { Provider = "virtualbox", Gpu = new() { Enabled = true } } }
        }),
        new FakePlatformInfo(IsMacOs: true));
    var machine = new VosMachine { Provider = "virtualbox", Config = new() { Gpu = new() { Enabled = true } } };

    Action act = () => c.Contribute(machine);

    act.Should().Throw<NotSupportedException>().WithMessage("*requires a Linux host*");
}

[Fact]
public void hyperv_partition_emits_correct_vagrant_customization()
{
    var c = new GpuPassthroughContributor(
        Options.Create(StandardConfigWithGpu()),
        new FakePlatformInfo(IsWindows: true));
    var machine = new VosMachine { Provider = "hyperv", Config = new() { Gpu = new() { Enabled = true, PartitionCount = "2" } } };

    c.Contribute(machine);

    machine.VagrantCustomizations.Should().ContainSingle();
    machine.VagrantCustomizations[0].Lines.Should().Contain("h.gpu_partition_adapter = true");
    machine.VagrantCustomizations[0].Lines.Should().Contain("h.gpu_partition_count = 2");
}

What this gives you that bash doesn't
A bash script that "passes through a GPU" is a stack-overflow-cobbled snippet that works on the author's machine and breaks on every other configuration. There is no test. There is no platform check. There is no clear error.
A typed GpuPassthroughContributor gives you, for the same surface area:
- Provider-aware behaviour (VirtualBox / Hyper-V / libvirt / Parallels)
- Platform validation (Windows for Hyper-V, Linux for libvirt and VBox passthrough)
- NVIDIA driver installation in the Packer image
- nvidia-container-runtime registration in the Docker daemon
- Compose service GPU reservation via the Docker Compose v2 deploy syntax
- Clear failure messages for unsupported combinations
The bargain pays back the first time you run an ML training container inside a HomeLab VM and nvidia-smi works.
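That final check is a one-liner from inside the guest, using the standard Docker CLI --gpus flag. The sketch below only constructs and prints the command, so you can see the shape; running it requires the guest VM with the GPU attached:

```shell
#!/bin/sh
# Smoke test for the whole chain: runs nvidia-smi inside a CUDA
# container via the nvidia runtime. Image tag matches the ml-worker
# service above. Execute the printed command on the guest itself.
image='nvidia/cuda:12.4.0-base-ubuntu22.04'
cmd="docker run --rm --gpus all $image nvidia-smi"
echo "$cmd"
```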
End of Act VII
We have now added every operational concern HomeLab promised: secrets, observability, backup with restore tests, multi-host scheduling, cost tracking, and GPU passthrough. With these in hand, DevLab is not just running — it is operable, with the depth a security-conscious SRE would actually accept.
Act VIII covers the day-2 operations and the variants: how to upgrade DevLab, how to run it on Podman, how to run multiple instances side-by-side, and how to tear it all down cleanly when you are done.