A Gentle Introduction to ZFS, Part 2: Storage Pools and Hard Disk Drives on the Jetson TX2
If you haven’t yet read about ZFS and compiled the source, please do so by following the previous article, A Gentle Introduction to ZFS, Part 1: Compiling ZFS on Ubuntu 18.04 (aarch64), before continuing.
ZFS storage pools are very powerful. What used to require proprietary OS support or hardware RAID cards is now freely available as open source software. Democratized RAID for the masses. Let's set up a storage pool to demonstrate the decisions that go into the configuration, both software and hardware.
ZFS Storage Pools
A ZFS storage pool is typically exposed as one or more ZFS filesystems (as opposed to ZFS volumes, which present block devices). Storage pools are made up of virtual devices (vdevs), and these vdevs are in turn made up of physical hard disk drives.
Storage Pool Capacity
Let’s start by defining the requirements. One might think we’d start by defining the terabytes of data we would like in the array, but we are actually going to start with the number of hard disk drives we can fit into our system. The reason for this is that the number of drives in a storage pool is not allowed to change, but the drives can be replaced with higher capacity drives, one by one, resilvering between each drive change (more on that in a later article). Let’s determine the number of hard disk drives over the next few sections.
Storage Pool Throughput
How much throughput do we need for the array? There are two basic ways for data to be read from or written to the storage pool: 1) local disk access and 2) external access (ethernet, USB, fibre channel, etc.). Because I will be setting up a Network Attached Storage (NAS), I will be bandwidth limited by the single gigabit ethernet port on the TX2 (1Gbps or 125MB/s), since all data will be read/written from the network and not locally on the NAS.
The ethernet interface sets the upper bound on the NAS bandwidth, and that may be entirely comfortable, depending on the type of data you want to access. For example, for all of my photography work, the RAW files are around 50MB or less. Accessing a RAW photo in 0.4s is perfectly fine. For a 2GB video, however, reading the entire file takes 16s. Now we’re starting to get slow and unsuitable for video editing.
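These transfer-time estimates follow directly from the 125 MB/s line rate. As a quick sanity check (a sketch; protocol overhead is ignored):

```shell
# Transfer time (s) = file size (MB) / link rate (MB/s).
awk 'BEGIN {
    rate = 125                          # gigabit ethernet, MB/s
    printf "50 MB RAW photo: %.1f s\n", 50 / rate
    printf "2 GB video:      %.0f s\n", 2000 / rate
}'
# 50 MB RAW photo: 0.4 s
# 2 GB video:      16 s
```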
Now that we know the upper bound of the NAS, we can scale our hard drive access speeds accordingly. A typical 5400 RPM hard drive has a read/write bandwidth of 100MB/s, while a 7200 RPM hard drive clocks in at 120 MB/s on average. Since we will also be allowing parallel access to each disk, 5400 RPM hard drives are more than enough for the task. Disks with a rotation speed of 5400 RPM in a striped configuration (parallel access) will have a throughput of N x 100MB/s for N hard disk drives:
- 2 drives = 200 MB/s
- 3 drives = 300 MB/s
- and so on…
Remember that our gigabit ethernet bandwidth caps out at 125 MB/s. So, any configuration of 5400 RPM drives past 2 drives will be ethernet-limited. Don’t even bother with SSDs, as they’re too fast for the ethernet interface. Besides, 5400 RPM hard drives are cheaper than all the rest. Now that we know the throughput of the ethernet interface and the speed of the hard drives, we can talk about the TX2’s drive capacity.
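The ceiling described above can be expressed as min(N × 100, 125) MB/s. A small sketch, using the 5400 RPM per-drive figure from earlier:

```shell
# Effective NAS throughput: striped drive bandwidth, capped by gigabit ethernet.
for n in 1 2 3 4; do
    awk -v n="$n" 'BEGIN {
        drives = n * 100                # 5400 RPM drives, ~100 MB/s each
        cap    = 125                    # gigabit ethernet, MB/s
        printf "%d drive(s): %d MB/s\n", n, (drives < cap) ? drives : cap
    }'
done
# 1 drive(s): 100 MB/s
# 2 drive(s): 125 MB/s
# 3 drive(s): 125 MB/s
# 4 drive(s): 125 MB/s
```

Past two drives, every configuration reports the same 125 MB/s: the ethernet link, not the disks, is the bottleneck.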
Expanding the TX2’s Drive Capacity
There’s a small problem for our ZFS storage array. The Jetson TX2 carrier board has exactly one SATA II port. And yes, that’s not a typo, the port is SATA II and not III, limiting the port to 3 Gbps. Fortunately, a sustained read/write from a 5400 RPM hard drive is, at most, 800 Mbps, or 27% of the SATA II bandwidth. Even if there’s a cached data burst (i.e. the data has already been read, and doesn’t need to be re-read), there’s still enough headroom that the limited SATA II bandwidth is not a bottleneck.
But are you going to be satisfied with one hard drive in your ZFS storage pool? Clearly not. One drive doesn’t make any meaningful ZFS storage array. For one, we will need at least two drives to achieve redundancy. This just means that if one drive fails, we can replace it with another with no loss of data. One way to increase the storage capacity of the TX2 is to use its onboard PCI-e Gen2 x4 slot.
The Jetson TX2’s PCI-e Gen2 x4 slot has a theoretical bandwidth of 2GB/s. If we are planning to add 8 x SATA III (6Gbps) drives, that’s 48Gbps (6GB/s) of aggregate bandwidth, far higher than the Gen2 x4 slot can handle. Fortunately, we don’t need the maximum theoretical bandwidth, just the actual sustained transfer speed per drive.
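To make the headroom argument concrete, here is the same arithmetic for both links (a sketch; 1 GB/s = 8 Gbps, and protocol overheads are ignored):

```shell
awk 'BEGIN {
    # SATA II port: 3 Gbps. A 5400 RPM drive sustains ~100 MB/s = 0.8 Gbps.
    printf "SATA II utilization:  %.0f%%\n", 0.8 / 3 * 100
    # PCI-e Gen2 x4: ~2 GB/s. 8 drives actually need 8 x 100 MB/s = 0.8 GB/s.
    printf "PCI-e x4 utilization: %.0f%%\n", 0.8 / 2 * 100
}'
# SATA II utilization:  27%
# PCI-e x4 utilization: 40%
```

In other words, sustained throughput from nine spinning disks fits comfortably inside both links, even though the drives' theoretical interface speeds do not.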
OpenZFS states the following recommendations for controller cards:
The ideal storage controller for ZFS has the following attributes:
* Driver support on major OpenZFS platforms: Stability is important.
* High per-port bandwidth: PCI Express interface bandwidth divided by the number of ports
* Low cost: Support for RAID, Battery Backup Units and hardware write caches is unnecessary.
We will install the Syba PCI-e Mini SAS Expander (SI-PEX40137) to extend the storage capacity of the TX2. This is not to be confused with a RAID controller card. We are not looking to control the RAID array in hardware, only to expand the number of SATA ports available to the TX2. Please also see the note I made about the Syba in my previous post, A Gentle Introduction to ZFS on Ubuntu 18.04 (aarch64):
Note that I don’t actually recommend the Syba today. The mini SAS ports will work on the Syba, but try the MZHOU B082D6XSZN instead. The MZHOU contains the same Marvell 9215 ASIC as the Syba but with 8x SATA III ports integrated directly on the expansion card. The Syba requires mini SAS to SATA converter cables, whereas the MZHOU’s onboard SATA III ports make that specialized cable unnecessary.
In addition, OpenZFS gives the following caveats:
ZFS depends on the block device layer for storage. Consequently, ZFS is affected by the same things that affect other filesystems, such as driver support and non-working hardware. Consequently, there are a few things to note:
Never place SATA disks into a SAS expander without a SAS interposer. If you do this and it does work, it is the exception, rather than the rule.
Do not expect SAS controllers to be compatible with SATA port multipliers. This configuration is typically not tested. The disks could be unrecognized.
Support for SATA port multipliers is inconsistent across OpenZFS platforms. Linux drivers generally support them. Illumos drivers generally do not support them. FreeBSD drivers are somewhere between Linux and Illumos in terms of support.
So, the Jetson TX2 has the hard drive capacity for 1 (SATA II) + 8 (Syba, SATA III) = 9 total hard disk drives.
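Once the controller and drives are cabled up, it's worth confirming that Linux actually sees all nine disks and their link types. A quick sketch using lsblk (part of util-linux, present on Ubuntu 18.04):

```shell
# List physical disks only (-d), with size, rotational flag (1 = spinning),
# and transport (sata, usb, etc.). Expect 9 rows with TRAN=sata.
lsblk -d -o NAME,SIZE,ROTA,TRAN
```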
Computer Case and Noise
One thing we didn’t take into account when deciding our storage capacity is how we would arrange these hard disk drives in a computer case, and whether spinning disk noise is going to be an issue. I’ve already assembled these hardware components into a case, the Fractal Design Node 804 micro ATX chassis. One could go much cheaper, as the TX2 is meant to be a standalone development kit. One could, for example, just stack-and-space the hard drives in a 3D printed enclosure, while leaving the TX2 to run bare-board. Those who have young children are acutely aware: anything that isn’t enclosed or locked down eventually grows legs or is spirited away (and found half a year later behind the couch). So, only run bare-board if you don’t expect anything to bump the expansion card out of the PCI-e slot or any other disturbing event.
Regarding noise, 5400 RPM hard drives have about 3dB less noise than 7200 RPM drives, which is at least an order of magnitude below fan noise. In other words, it doesn’t really matter. Also, I wouldn’t recommend going fanless for a NAS. We want to keep the hard disk drives as cool as possible.
Creating the ZFS Storage Pool
The commands for creating a storage pool, after you’ve loaded the ZFS kernel module, all follow a similar form:
$ sudo zpool create -f -m /tank tank \
    draid2 \
    wwn-0x0000000000000000 \
    [... eight more wwn-* device names, one per disk]
I’ll get into more complicated ZFS storage pool configurations later, but I just want to highlight the general syntax first:
- zpool is one of the primary commands we will use to interact with ZFS.
- create is a set of sub-commands surrounding creation of a ZFS storage pool.
- -f forces use of vdevs, even if they appear in use or specify a conflicting replication level. This can occur if you’ve used a vdev in the past to create a pool but haven’t wiped the drive prior to the zpool create sub-command.
- tank is the name of your storage pool.
- -m /tank is the mount point, where the local filesystem can access the ZFS storage pool.
- draid2 is the vdev layout (RAID type) for the pool. Here I’ve specified the new dRAID layout in ZFS. The 2 part means 2 parity drives and the rest for data with no spares. Later, we’ll be configuring the dRAID layout further, so hold tight.
- wwn-0x0000000000000000 is the World Wide Name (WWN) of the hard disk and is unique to that disk. Why this way and not sda, sdb, etc.? Because those names are not guaranteed to point to the same disk on reboot, whereas the WWN is. In my example, I’ve shown 9 disks.
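To find the WWN device names to pass to zpool create, you can list the persistent symlinks that udev maintains on Linux (a sketch; these are the standard by-id paths):

```shell
# Each wwn-* symlink points to the underlying sdX device,
# so you can match WWNs to the disks you physically installed.
ls -l /dev/disk/by-id/ | grep wwn
```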
After this command is run, a storage pool will have been created. Use zpool status to verify. If you don’t see your pool, run sudo zpool import -a and check the status again.
Summary
- We sized the NAS hardware in terms of ethernet throughput, hard disk drive rotation speed, and the SATA II/III ports provided by the TX2 and an expander card on PCI-e Gen2 x4.
- We presented the basic command to create a ZFS storage pool and explained its parameters.
In the next article, we’ll use the ZFS storage pool we just created to create an NFS share that can be accessed from anywhere in our network and performance test our network channel.