Compute Platform

The primary compute platform is a Dell PowerEdge T620 running KVM/libvirt. This single server hosts all continuously operating infrastructure services, application workloads, and media services. The choice of a single powerful host over multiple smaller nodes keeps operational complexity low — there is no distributed coordination, no cluster state to manage, and no network storage dependency for VM boot.


Platform Specifications

Server: Dell PowerEdge T620 (tower)
CPUs: Dual Intel Xeon E5-2670 v2 (Ivy Bridge) — 20 cores / 40 threads total
Memory: 128 GB ECC RAM
Storage: Three ZFS pools — see Storage Architecture
Networking: Host bond — 10GbE primary with 1GbE failover — see Network Architecture

ECC memory protects against single-bit memory errors — particularly relevant for ZFS, which computes checksums on data while it is still in RAM. Without ECC, a bit flip in memory would be checksummed along with the data, and ZFS would write the corrupted block to disk while reporting success.


Virtualization Stack

KVM provides the hypervisor layer with near-native performance for CPU-bound workloads. Libvirt manages VM lifecycle operations — definition, provisioning, and runtime control — while virtual machine disks are stored as ZFS volumes, allowing snapshotting and replication to be handled natively by the storage layer.
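The lifecycle operations mentioned above map onto a handful of virsh commands. A minimal sketch — the VM name "webvm" is illustrative, and RUN defaults to echo so the commands are printed rather than executed:

```shell
# Illustrative libvirt lifecycle operations for a hypothetical VM "webvm".
# RUN defaults to echo (dry run); set RUN="" on a real libvirt host.
RUN="${RUN:-echo}"
VM="webvm"

$RUN virsh define "/etc/libvirt/qemu/$VM.xml"   # register the VM definition
$RUN virsh start "$VM"                          # boot the VM
$RUN virsh shutdown "$VM"                       # graceful ACPI shutdown
$RUN virsh undefine "$VM"                       # remove the definition (disk stays in ZFS)
```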

Virtual machines use virtio drivers for both network and storage interfaces. Virtio is a paravirtualized I/O model: the guest knows it is running in a VM and uses an optimized protocol rather than emulating legacy hardware. This eliminates the overhead of device emulation and provides throughput and latency close to bare metal.
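In a libvirt domain definition, the virtio model appears on both the disk and the network interface. A sketch — the zvol path and bridge name are illustrative, not taken from this environment:

```xml
<!-- Hypothetical disk stanza: a raw block device (a ZFS zvol) exposed
     to the guest over the paravirtualized virtio bus. -->
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/zvol/tank-ssd/vm/webvm-root'/>
  <target dev='vda' bus='virtio'/>
</disk>
<!-- Network interface using the virtio model, attached to a host bridge. -->
<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
</interface>
```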

VM disks are provisioned as ZFS zvols — block devices backed by the SSD ZFS pool. Snapshotting, replication, and integrity verification are therefore handled directly by ZFS at the storage layer rather than through hypervisor-managed snapshots, while remaining independent of the guest OS’s filesystem.
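Provisioning a zvol-backed VM disk is a short ZFS command sequence. A dry-run sketch — the pool and VM names are hypothetical, and RUN defaults to echo so nothing is executed:

```shell
# Sketch of provisioning a VM disk as a ZFS zvol. Pool and VM names
# (tank-ssd, webvm) are illustrative. RUN defaults to echo (dry run).
RUN="${RUN:-echo}"
POOL="tank-ssd"   # hypothetical SSD pool
VM="webvm"        # hypothetical VM name
SIZE="32G"

# Create a sparse 32 GiB zvol for the VM's root disk
$RUN zfs create -s -V "$SIZE" "$POOL/vm/$VM-root"

# Snapshot the blank state so it can be cloned for future VMs
$RUN zfs snapshot "$POOL/vm/$VM-root@blank"
```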

Virtual machines are used in preference to containers for infrastructure workloads. VMs provide stronger fault isolation guarantees, independent kernel versions, and clearer operational boundaries. Each VM has its own network stack, its own OS lifecycle, and can be managed, snapshotted, and recovered independently of other workloads.


VM Workload Inventory

Approximately ten VMs run continuously, grouped into three operational categories.

Core Infrastructure

These services are treated as production — they run continuously and other workloads depend on them.

pfSense: Firewall, routing, VLAN policy enforcement
Home Assistant: Home automation platform, Zigbee coordination
Omada Controller: Wireless access point management
Internal DNS: Name resolution for infrastructure services

Application and Personal Services

Long-running workloads with characteristics similar to small production environments.

Nextcloud: Personal cloud storage and collaboration
Reverse Proxy: TLS termination, external access routing
Asterisk: VoIP server
Minecraft: Game server

Media Services

Separated from application services due to specialised hardware requirements (GPU passthrough).

Jellyfin: Media server with hardware-accelerated transcoding

Workload Isolation and Development Separation

Infrastructure services (pfSense, Home Assistant, DNS) are deliberately separated from experimental and development workloads. The Lab/Development VLAN and dedicated lab VMs ensure that testing activity cannot affect production services.

Production VMs follow a validation process before changes are made permanent: a ZFS snapshot is taken first to provide a rollback point, the change is exercised on the running VM, and only once it is validated is the snapshot discarded. Experimental VMs can be rebuilt or rolled back independently without any risk to production services.
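The snapshot-validate-commit workflow can be sketched as follows. The zvol name and the validated() check are placeholders, and RUN defaults to echo so nothing is executed:

```shell
# Sketch of the snapshot-validate-commit workflow. Names are illustrative;
# RUN defaults to echo (dry run).
RUN="${RUN:-echo}"
ZVOL="tank-ssd/vm/nextcloud-root"   # hypothetical zvol backing a production VM
SNAP="$ZVOL@pre-change"

$RUN zfs snapshot "$SNAP"           # rollback point before any change

# ... apply the change and test it on the running VM ...
validated() { true; }               # placeholder for real acceptance checks

if validated; then
  $RUN zfs destroy "$SNAP"          # change accepted: discard the snapshot
else
  $RUN virsh shutdown nextcloud     # rollback requires the VM to be offline
  $RUN zfs rollback "$SNAP"         # change rejected: revert the disk
fi
```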

This separation means the hypervisor hosts active development and testing alongside continuously operating services — without the two interfering with each other.


PCI Passthrough

Two workloads use direct hardware passthrough rather than virtualised devices.

Network — pfSense

The pfSense VM receives three physical NICs via PCI passthrough: a dedicated WAN interface and two LAN interfaces that form a LAGG interface inside pfSense. The LAN LAGG provides a 10GbE primary path with a 1GbE failover path into the switching fabric. Because these interfaces bypass the hypervisor networking stack entirely, pfSense operates independently of the host bridge and bond configuration. Firewall routing and policy enforcement therefore remain unaffected by changes or misconfiguration in host networking.
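In libvirt terms, each passed-through NIC appears in the pfSense domain definition as a hostdev entry. A sketch with a hypothetical PCI address:

```xml
<!-- Illustrative hostdev entry for one passed-through NIC; the PCI
     address 0000:04:00.0 is hypothetical. managed='yes' tells libvirt
     to rebind the device to vfio-pci automatically at VM start. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```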

GPU — Jellyfin

An NVIDIA Quadro P1000 is passed through to the Jellyfin media server VM. Hardware-accelerated H.264 and HEVC transcoding runs on the GPU rather than the CPU, reducing per-stream CPU usage significantly. This allows multiple concurrent transcoding sessions without impacting CPU availability for other VMs.

The Quadro P1000 was selected for its NVENC and NVDEC hardware video engines, low power consumption, and reliable compatibility with Linux KVM GPU passthrough. Unlike consumer GeForce cards, Quadro GPUs do not impose artificial limits on concurrent NVENC encoding sessions.


Design Decisions

Single host over cluster

A cluster introduces distributed state, quorum requirements, and network storage dependencies. A single well-specified server avoids all of this while still providing sufficient compute capacity for the current workload. Operational simplicity outweighs the theoretical availability gain from clustering at this scale.

VMs over containers for infrastructure

Containers share a kernel. A kernel panic or misconfigured namespace affects all containers on the host. VMs provide hardware-level isolation — a kernel crash in one VM does not affect others.

Hardware passthrough only where justified

Passthrough reduces VM portability and complicates live migration. It is used only where the operational benefit (firewall isolation for pfSense, transcoding performance for Jellyfin) clearly outweighs this cost.