Storage Architecture

All storage is built on ZFS. The properties that make ZFS appropriate for this environment are end-to-end checksumming and copy-on-write consistency, built-in snapshot capability, and native send/receive replication. Storage is divided into three separate pools — each tuned for its workload class — rather than a single general-purpose pool.


Design Philosophy

Separating pools by workload class provides several operational benefits: I/O contention between VM storage and bulk data transfers is eliminated, each pool can use the disk type best suited to its workload, and a failure or degradation in one pool does not affect the others.

ZFS generates a checksum for every block written and verifies it on every read. Silent data corruption, such as bit rot on aging media, is therefore detected automatically and repaired where redundant copies exist. If corruption is detected or a pool enters a degraded state, monitoring alerts are generated so the issue can be investigated immediately.


Pool Layout

| Pool       | Disk Type                  | Configuration            | Workload                                             |
|------------|----------------------------|--------------------------|------------------------------------------------------|
| System     | SAS SSD                    | Single disk              | OS, hypervisor configuration                         |
| VM Storage | Enterprise SATA SSD        | RAIDZ1 (multiple disks)  | VM zvols, infrastructure service data                |
| Data       | Enterprise HDD (SAS/SATA)  | RAIDZ1 + L2ARC SSD cache | Bulk data, media, replicated infrastructure datasets |
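A layout like this could be created with commands along the following lines. This is an illustrative sketch only: the pool names (`system`, `vmpool`, `data`) and the `/dev/disk/by-id/` paths are assumptions, not the actual configuration.

```shell
# Illustrative pool creation; pool names and device paths are assumptions.

# System pool: single SAS SSD (no in-pool redundancy).
zpool create system /dev/disk/by-id/ata-SYSTEM_SSD

# VM pool: RAIDZ1 across enterprise SATA SSDs.
zpool create vmpool raidz1 \
    /dev/disk/by-id/ata-VM_SSD_1 \
    /dev/disk/by-id/ata-VM_SSD_2 \
    /dev/disk/by-id/ata-VM_SSD_3

# Data pool: RAIDZ1 HDDs plus a dedicated SSD as an L2ARC read cache.
zpool create data raidz1 \
    /dev/disk/by-id/ata-DATA_HDD_1 \
    /dev/disk/by-id/ata-DATA_HDD_2 \
    /dev/disk/by-id/ata-DATA_HDD_3 \
    cache /dev/disk/by-id/ata-L2ARC_SSD
```

Using `/dev/disk/by-id/` paths rather than `/dev/sdX` names keeps pool membership stable when disks are re-enumerated across reboots.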

System Pool

A small dedicated pool on a SAS SSD holds the host operating system and hypervisor configuration. Keeping this isolated from VM workloads ensures that heavy I/O on the VM pool cannot affect OS responsiveness. SAS SSDs provide consistent latency and enterprise-grade endurance characteristics suited to a continuously running host.

VM Storage Pool

Virtual machine disks are provisioned as ZFS zvols on a RAIDZ1 array of enterprise SATA SSDs. RAIDZ1 tolerates a single disk failure without data loss while delivering low latency suitable for virtualised I/O workloads. The all-SSD composition means snapshot operations, scrubs, and replication activity do not noticeably impact running VMs.
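A zvol is a fixed-size block device carved out of the pool. A hypothetical VM disk might be provisioned like this (the dataset hierarchy and volume name are assumptions):

```shell
# Parent dataset for VM volumes (name is an assumption).
zfs create vmpool/vms

# 64 GiB zvol for a hypothetical VM; the size and volblocksize shown
# here are examples, not the actual configuration.
zfs create -V 64G -o volblocksize=16k vmpool/vms/web01-disk0

# The zvol appears as a block device under /dev/zvol/ and can be
# attached to a VM like any other disk.
ls -l /dev/zvol/vmpool/vms/web01-disk0
```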

Data Pool

Bulk storage uses large-capacity enterprise HDDs (SAS/SATA) in a RAIDZ1 configuration. A dedicated SSD provides an L2ARC read cache, allowing frequently accessed blocks to be served from flash while the primary storage remains on high-capacity disks. This pool stores media libraries, large datasets, and replicated infrastructure data from the primary system.
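The cache device and its utilisation can be observed once the pool is running; on ZFS-on-Linux, L2ARC hit counters are also exposed via kstats. Pool name is an assumption:

```shell
# The L2ARC device appears under a "cache" section in per-vdev iostat output.
zpool iostat -v data

# On Linux, ARC statistics include L2ARC hit/miss counters.
grep -E '^(l2_hits|l2_misses)' /proc/spl/kstat/zfs/arcstats
```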


Snapshot Strategy

ZFS snapshots are created automatically on a schedule across all pools, with defined retention policies governing how long each snapshot is kept.

Snapshots serve several operational purposes:

  • Rollback before changes — a snapshot taken before a configuration change or software update provides an instant rollback point if the change causes problems
  • Accidental deletion recovery — files deleted from a snapshotted dataset can be recovered without restoring from backup
  • Consistent replication points — replication always uses a snapshot as its source, ensuring the transferred state is consistent

Snapshots are instantaneous and do not interrupt running workloads. A snapshot preserves the current filesystem state by retaining references to existing blocks. As the active filesystem changes, new blocks are written and the snapshot continues to reference the original data, so space is consumed only as the two diverge.
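The snapshot workflows above can be sketched as follows. Dataset, snapshot, and file names are illustrative assumptions:

```shell
# Take a snapshot before a risky change (names are assumptions).
zfs snapshot data/media@pre-cleanup

# Instant rollback if the change goes wrong; discards all changes
# made to the dataset since the snapshot.
zfs rollback data/media@pre-cleanup

# Recover a single deleted file without rolling back: snapshots are
# exposed read-only under the hidden .zfs directory of the dataset.
cp /data/media/.zfs/snapshot/pre-cleanup/example.mkv /data/media/

# Space consumed by each snapshot as the live filesystem diverges.
zfs list -t snapshot -o name,used,referenced
```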


Replication Architecture

Critical datasets are replicated to a secondary system on a regular schedule using ZFS send/receive. After the initial full replication, subsequent replications transfer only the changed blocks since the last snapshot — incremental delta transfers keep ongoing replication bandwidth low.
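A full-then-incremental cycle looks roughly like this. The hostname `backup-host`, the dataset names, and the snapshot naming scheme are assumptions for illustration:

```shell
# Initial full replication: send the whole dataset tree to the secondary.
zfs snapshot -r vmpool/services@repl-1
zfs send -R vmpool/services@repl-1 | \
    ssh backup-host zfs receive -u data/replica/services

# Subsequent runs transfer only the blocks changed between snapshots.
zfs snapshot -r vmpool/services@repl-2
zfs send -R -i @repl-1 vmpool/services@repl-2 | \
    ssh backup-host zfs receive -u data/replica/services
```

`receive -u` leaves the replicated datasets unmounted on the secondary so they do not shadow local paths until deliberately promoted.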

Replication protects against primary host hardware failure. If the T620 were to fail catastrophically, recent copies of critical data exist on the secondary system and can be promoted to a functional state.

Replication activity is reviewed periodically to ensure transfers are completing successfully and that replicated datasets remain current.


Operational Practices

ZFS scrubs run on a regular schedule across all pools. A scrub reads every block, verifies the checksum, and corrects errors where redundancy allows. On the spinning disk pool, scrubs are the primary mechanism for detecting pre-failure conditions — UREs (unrecoverable read errors) that have not yet caused a visible failure often appear during a scrub before they would cause data loss.
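A scrub is initiated per pool and runs in the background; results appear in pool status. Pool names and the example schedule are assumptions:

```shell
# Start a scrub manually; the command returns immediately and the
# scrub proceeds in the background.
zpool scrub data

# An example monthly cron entry (the actual schedule is an assumption):
# 0 3 1 * * /usr/sbin/zpool scrub data; /usr/sbin/zpool scrub vmpool

# Progress, completion time, and any repaired errors show in status.
zpool status data
```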

SMART monitoring tracks disk health attributes (reallocated sectors, pending sectors, uncorrectable errors) for all drives. Degrading disk health triggers an alert before the disk reaches a failure state.
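The attributes named above can be pulled per drive with smartmontools; the device list here is an assumption:

```shell
# Report key pre-failure SMART attributes for each drive.
for dev in /dev/sda /dev/sdb /dev/sdc; do
    echo "== $dev =="
    smartctl -A "$dev" | \
        grep -E 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'
done
```

In practice smartd can be configured to watch these attributes continuously and raise alerts itself, rather than polling from a script.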

Pool status monitoring tracks the health of all pools — including any degraded vdevs, checksum errors, or read/write errors that accumulate between scrubs.
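A minimal health check can exploit the fact that `zpool status -x` prints a single known line when everything is healthy. The alert delivery shown here (local `mail` to a placeholder address) is an assumption:

```shell
# Alert on anything other than the healthy-state message.
if ! zpool status -x | grep -q 'all pools are healthy'; then
    zpool status | mail -s "ZFS pool degraded on $(hostname)" admin@example.com
fi
```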

The combination of scheduled scrubs, SMART monitoring, and replication validation means storage problems are detected proactively rather than discovered during a recovery attempt.


Disk Selection Rationale

SAS SSD (System pool)

SAS SSDs offer enterprise endurance ratings and consistent latency under mixed workloads. The system pool sees low write volume but benefits from the reliability characteristics of enterprise-class media.

Enterprise SATA SSD (VM pool)

Consumer SSDs are rated for lower write endurance and variable latency under sustained I/O. Enterprise SATA SSDs maintain consistent performance under the mixed random read/write workload typical of VM storage.

Enterprise/NAS-grade HDD (Data pool)

Enterprise and NAS-grade drives are designed for continuous operation, with higher vibration tolerance and TLER (Time-Limited Error Recovery) firmware appropriate for RAID/ZFS use. Consumer desktop drives lack TLER and can spend tens of seconds retrying a bad sector, stalling pool I/O in a way that can cause ZFS to unnecessarily mark the disk as faulted.