Studying Simple Pooled Storage and Striped Sets

Before we get started I want to acknowledge the obvious. Striping drives is not for everyone. Unless you are confident in your local and offsite backup solutions I do not recommend playing with your data in this way.

This is an examination of the built-in Windows 10 striping solutions. I am looking at two basic questions.

  • Are they actually faster than a normal disk configuration?
  • Which one should you choose?

For the uninitiated I’ve provided a brief summary of logical and physical storage along with an idea of how striping works.

An introduction to storage

Storage solutions are an overlooked aspect of building computers. Computer enthusiasts often have multiple drives and even multiple types of drives in their computer. The simplest way to set these up is to use each physical drive as a discrete logical drive. That is, each drive physically in your computer corresponds to a drive logically mounted in your operating system.

Here is a common enthusiast setup:

Logical				       Physical

500GB - Boot drive (C:) <== 500GB SSD A

1TB - Movies and TV Shows drive (D:) <== 1TB HDD A

Here we have two physical drives and two logical drives.

However, it is possible to combine multiple physical drives into a single logical drive. Here is an example of a logical drive arrangement called a Striped Set:

Logical				       Physical

500GB - Boot drive (C:) <== 500GB SSD A

1TB - Movies and TV Shows drive (D:) <== 500GB HDD B + 500GB HDD C

There are two main reasons to play with logical drive configurations: speed or resiliency. I want to emphasize the “or” of that statement. In most cases you will be choosing between the two.

The most common versions of these are Striped Drive Sets (speed) and Mirrored Drive Sets (resiliency). You may also have heard these referred to as RAID 0 and RAID 1 respectively.

We are going to focus primarily on Striped Drive Sets.

What are striped sets?

Here is the most basic and unhelpful definition of a striped drive set:

A set of drives that have data striped across them.

But what is striped data? Wikipedia has a great definition.

Data Striping is the technique of segmenting logically sequential data, such as a file, so that consecutive segments are stored on different physical storage devices.

Striping is useful when a processing device requests data more quickly than a single storage device can provide it. By spreading segments across multiple devices which can be accessed concurrently, total data throughput is increased.

Basically, the computer can read and write your files quicker when talking to two drives than it can talking to one.
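To make the idea concrete, here is a minimal Python sketch of round-robin striping. It is an illustration only, not how Windows implements it: the function names are made up for this example, and the 64 KiB stripe size is an assumed value chosen for the demonstration.

```python
from __future__ import annotations

STRIPE_SIZE = 64 * 1024  # assumed stripe unit size, for illustration only


def locate_stripe(offset: int, num_drives: int, stripe_size: int = STRIPE_SIZE) -> tuple[int, int]:
    """Map a logical byte offset to (drive index, byte offset on that drive)."""
    stripe_index = offset // stripe_size   # which stripe unit, counted across the whole set
    drive = stripe_index % num_drives      # units are dealt out round-robin
    row = stripe_index // num_drives       # units already placed on that drive
    return drive, row * stripe_size + offset % stripe_size


def split_request(offset: int, length: int, num_drives: int,
                  stripe_size: int = STRIPE_SIZE) -> dict[int, int]:
    """Count how many bytes of one sequential request land on each drive."""
    per_drive = {d: 0 for d in range(num_drives)}
    end = offset + length
    while offset < end:
        drive, _ = locate_stripe(offset, num_drives, stripe_size)
        # Stop at the end of the current stripe unit or the end of the request.
        chunk = min(stripe_size - offset % stripe_size, end - offset)
        per_drive[drive] += chunk
        offset += chunk
    return per_drive


if __name__ == "__main__":
    # A 1 MiB sequential read on a two-drive set is split evenly.
    print(split_request(0, 1024 * 1024, num_drives=2))  # {0: 524288, 1: 524288}
```

Because each drive only has to deliver half of the request, and both can work at the same time, the sequential transfer can finish sooner than it would on a single drive.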

Why you shouldn’t do this

I want to take a moment and recognize the biggest pitfall of this method of drive management. If any drive in your striped set fails, you will lose all of the data across all of your striped drives. It is very difficult to recover files lost to a failed striped set.

Only do this if you are okay with losing your files on your striped set or you have a backup solution.
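The risk also grows with every drive you add to the set. As a rough sketch, assuming drives fail independently and with the same probability (a simplification), the chance of losing the whole set is the chance that at least one drive fails. The 5% figure below is a made-up placeholder, not a measured failure rate.

```python
def stripe_loss_probability(per_drive_failure: float, num_drives: int) -> float:
    """Chance that at least one drive fails, assuming independent failures."""
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("per_drive_failure must be between 0 and 1")
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    # The set survives only if every drive survives.
    return 1.0 - (1.0 - per_drive_failure) ** num_drives


if __name__ == "__main__":
    hypothetical_rate = 0.05  # placeholder value, not real data
    for drives in (1, 2, 4):
        risk = stripe_loss_probability(hypothetical_rate, drives)
        print(f"{drives} drive(s): {risk:.1%} chance of losing the set")
```

With two drives the risk is a little under double that of a single drive, and it keeps climbing as the set gets wider.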

So what is this post actually about?

The easiest way to logically configure drives in Windows 10 is to use one of two built-in tools: Disk Management and Storage Spaces. Almost all online discussion about these tools revolves around the mirroring and parity aspects. Since I couldn’t find a great breakdown of how these tools differ in striped set performance, and I happen to have my computer taken apart right now, I decided to make my own.

I’ll just be using a couple of the drives I own. My results are not necessarily reflective of how these tools work universally, but I believe some data is better than none.

These are the drives I’ll be testing:

  • Hitachi GST Travelstar 7K500 HTS725050A9A364, 2.5" 7200 RPM (2 of them)
  • Seagate Momentus ST9750420AS, 2.5" 7200 RPM
  • WD Red WD20EFRX, 3.5" 5400 RPM
  • Seagate BarraCuda ST2000DM006, 3.5" 7200 RPM (2 of them)
  • Crucial M500 CT240M500SSD1, 2.5" SSD

This is a complete hodgepodge of drives dating back all the way to 2010.

A quick note: I ended up breaking one of my Hitachi drives during testing by handling it roughly while it was powered on (not my proudest moment). This is why I don’t have complete results from that array.

The Methodology

I used a couple of tests to measure drive performance.

For the synthetic test, I used CrystalDiskMark 6.0 with a 32 GiB test size as my standard benchmark. It is my understanding that this is the most accurate method of benchmarking within CrystalDiskMark.

I also tested a single large file transfer of 19.7 GB as well as a folder containing 201 files of varying size up to 10 GB.

All drives were freshly formatted to NTFS prior to testing.
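If you want to repeat the file transfer tests on your own drives, timing can be scripted. This is a minimal Python sketch of one way to do it, not a description of how the transfers in this post were timed. The function name and the paths in the usage comment are hypothetical, and the operating system's file cache can flatter the numbers for small files.

```python
from __future__ import annotations

import shutil
import time
from pathlib import Path


def timed_copy(src: Path, dst: Path) -> tuple[float, float]:
    """Copy a file or folder and return (seconds elapsed, throughput in MB/s)."""
    if src.is_dir():
        total_bytes = sum(p.stat().st_size for p in src.rglob("*") if p.is_file())
    else:
        total_bytes = src.stat().st_size

    start = time.perf_counter()
    if src.is_dir():
        shutil.copytree(src, dst)  # dst must not exist yet
    else:
        shutil.copy2(src, dst)
    # Guard against a zero reading on very small, very fast copies.
    elapsed = max(time.perf_counter() - start, 1e-9)

    return elapsed, total_bytes / 1_000_000 / elapsed


# Example usage with hypothetical paths:
# seconds, mb_per_s = timed_copy(Path(r"C:\bench\large_file.bin"), Path(r"D:\large_file.bin"))
# print(f"{seconds:.1f} s, {mb_per_s:.1f} MB/s")
```

Running each copy a few times and comparing the results helps separate real differences from run-to-run noise.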

On to the testing!

Individual Drives

First, the synthetic test. Keep in mind this graph is on a logarithmic scale, so look at the values rather than the length of the lines. Higher is better.

Synthetic Benchmark: Higher is better.

Next, my own real life benchmarks. In this case, smaller is better.

Basically all of the drives perform exactly as you would expect. I am pleased to see my new Barracuda drives perform better than the Hitachi drives I am replacing. Nothing really to learn here but it will be useful as a comparison for our future tests.

Drive Pairs

Next up, the Hitachi and Barracuda drive sets in both Striped Set and Simple Pool. This gives us the most general idea of how these tools differ. Once again, mind the logarithmic scale.

Synthetic Benchmark: Higher is better.

The striped sets consistently outperform the simple pooled storage, nearly doubling the speed of the Seq Q32T1 Write in both instances. Interestingly, while the striped set provides a significant performance benefit, the pooled storage seems to offer little benefit compared to the individual drives.

Real Life Benchmark: Lower is better.

The real life benchmark confirms my findings; however, the results are not as extreme.

Getting wild with Pooled Storage

Drive Pairs with SSD

I was interested in seeing how adding an SSD to a pair of identical drives affects performance.

Synthetic Benchmark: Higher is better.
Real Life Benchmark: Lower is better.

A small but consistent improvement.

Other Pooled Arrangements

I was also interested in how other drive arrangements affected pooled speed.

Synthetic Benchmark: Higher is better.
Real Life Benchmark: Lower is better.

Conclusions

Let’s take a look at my initial questions and see if we can draw some conclusions.

  • Are they actually faster than a normal disk configuration?
  • Which one should you choose?

Are they actually faster than a normal disk configuration?

Short answer: yes, probably. How much is heavily dependent on the type of load and the disk configuration.

Striping the disks provided roughly a 90% performance benefit all around.
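For anyone comparing their own numbers, a percentage like this can be worked out as follows. This is a minimal Python sketch with hypothetical function names. The values in the example are placeholders picked to make the arithmetic easy to follow, not measurements from my tests.

```python
def percent_gain(baseline: float, combined: float) -> float:
    """Gain in percent for higher-is-better metrics such as MB/s."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (combined - baseline) / baseline * 100.0


def speedup_from_times(baseline_seconds: float, combined_seconds: float) -> float:
    """How many times faster a transfer finished, for lower-is-better timings."""
    if baseline_seconds <= 0 or combined_seconds <= 0:
        raise ValueError("times must be positive")
    return baseline_seconds / combined_seconds


if __name__ == "__main__":
    # Placeholder values: a single drive at 100 MB/s, a striped pair at 190 MB/s.
    print(f"{percent_gain(100.0, 190.0):.0f}% faster")
    # Placeholder values: a transfer that took 60 s on one drive and 40 s on the set.
    print(f"{speedup_from_times(60.0, 40.0):.2f}x speedup")
```

A perfect two-drive stripe would top out at a 100% gain, so a result close to that means very little is being lost to overhead.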

Pooled storage with identical disks seems to slightly hinder sequential performance, but it does provide some benefit for random reads and writes, though not quite as much as striping.

I find it very interesting how adding an SSD to the pool doesn’t affect sequential speeds much but tremendously increases random speeds.

Which one should you choose?

Based on my results, if you have an identical disk set, you should stripe. The striped set performance was consistently better than the pooled set.

If you are combining different drives, even if they are the same capacity, it may be worth looking into pooled storage.

Wrapping up with a few notes

I hope you found my collection of benchmarks interesting and possibly useful. I know this collection of graphs is not the easiest way to ingest the data so I have provided a link to my data in a spreadsheet here.

Thanks for reading.
