Most HDDs are going to the server market, and from that point of view the capacity increase has been very slow.
Edit: It seems a lot of our computing components have reached a plateau, or will within this decade: HDDs, DRAM, NAND, chip process nodes, etc. That is not to say they won't improve, but their unit costs aren't dropping any more.
And for the server market there's a sweet spot, per workload, for how big you want your spinning disks to be. These disks aren't getting any faster, and rebuilding a 22 TB HDD is going to take a while, which imposes serious durability risks.
It's not just the rebuild: some storage software (Ceph, for example) also validates the data on the disks from time to time, and since IOPS are quite limited on spindles, it takes more and more of that IOPS bucket just to verify the data you already have on the disks.
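Rough back-of-envelope on that point (all figures are assumptions, not vendor specs: roughly 200 MB/s average sequential read, a 22 TB drive, one full verify pass per week):

    # How much of a spindle's time a periodic full-disk verify eats.
    # All numbers are assumptions, not measurements.
    CAPACITY_TB = 22
    SEQ_MB_S = 200            # assumed average sequential read rate
    SCRUB_PERIOD_DAYS = 7     # e.g. one deep-scrub pass per week

    scrub_hours = CAPACITY_TB * 1e6 / SEQ_MB_S / 3600
    busy_fraction = scrub_hours / (SCRUB_PERIOD_DAYS * 24)
    print(f"full verify pass: {scrub_hours:.0f} h")
    print(f"share of the drive's time spent verifying: {busy_fraction:.0%}")

Even a purely sequential verify pass keeps the spindle busy for well over a day per week, and deep scrubs are rarely purely sequential in practice.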
ZFS also periodically resilvers the disks to keep the filesystem in top shape. IIRC ZFS tries to resilver when traffic is low, but that's not always possible.
I bet that disks in the 16+ TB range will be used for the colder tiers of storage. They should also be useful in the OSTs of Lustre, since the random-read storm hits the MDT more severely.
> And the scrub has to be initiated somehow, typically cron job, it's not automatic in ZFS.
The devices were Oracle/Sun ZFS appliances (I tortured a 7320, we liked it and bought a full-out 7420), so maybe it was set up to scrub/resilver automatically in some cases.
As I mentioned, we're more of a Lustre shop and retired those systems some years ago.
Thank you. I'm no expert in ZFS, TBH (we use Lustre much more), but IIRC, when I was benchmarking the then-new Sun/Oracle ZFS 7320, it resilvered the disks at night after especially torturous loads.
Maybe it was specific to the appliances (our behemoth 7420 did the same), or something was wrong. I remember the Oracle/Sun guys jokingly asking me whether I had managed to make it resilver the disks, and hearing that it had indeed done so a dozen times visibly upset them. All they said was, "Pack it up, we need to go".
Since all heads are mounted on the same arm, only one head can lock onto a track at a given time. What if each head had an independent micro-actuator, so that all heads could lock onto their tracks at the same radius, with the data distributed across all heads? Wouldn't this improve throughput n-fold?
Edit: It seems modern hard drives already have micro-actuators on each head to overcome the precision and bandwidth limits of the main arm actuator, but none of them appear to offer a range of motion large enough to lock multiple heads simultaneously.
For a long time, mainframes had hard disks with a set of heads on opposite sides of the platters. I'm not sure the use cases for spinning disks these days justify that kind of investment: they aren't (or shouldn't be) used for random-write-heavy workloads such as frequently updated databases, but more for archival, and they often sit behind a flash disk acting as a cache. (I do that for my home server; a lot of write traffic never hits the disk because it's overwritten before being evicted from flash.)
Data verification to predict drive failure seems like a decent use case, though. The drive is always spinning, so you essentially gain an extra data path for read-verify-fix IOPS. Whether you can build this in cost-effectively is a big question. But with rebuild times climbing to upwards of a week, being unable to do continuous health monitoring starts to get really problematic.
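To put a number on "upwards of a week" (the rates are assumptions; the real figure depends on the RAID/erasure-code layout and competing foreground traffic):

    # Rebuild-time estimate for one large HDD at assumed effective rates.
    CAPACITY_TB = 24
    for rate_mb_s in (250, 100, 40):   # best-case sequential .. heavily throttled
        days = CAPACITY_TB * 1e6 / rate_mb_s / 3600 / 24
        print(f"{rate_mb_s:>4} MB/s -> {days:.1f} days")

Best case it's a bit over a day; once the rebuild is throttled to protect foreground I/O, a week is easy to hit.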
This would be cool, but it can be done with a single set of heads if the utilization is less than 100%.
Multiple sets of heads would be useful if the limiting factor is positioning.
BTW, I don't know how a multi-platter drive records its disk blocks. Is a block contained on a single platter, or is it spread across all platters and read/written by all heads at the same time?
Dual arms would be handy if the drive had a RAID-like checksumming scheme between platters. If a platter is corrupted but still readable/writable (i.e. it wasn't a head malfunction), the drive could rebuild itself without the help of a computer.
Even if it is a head malfunction, the data could be redistributed among the other platters, reducing the drive's capacity.
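A toy model of that idea, treating each platter surface as a column with one surface holding XOR parity (this is just a sketch of the proposal, not how any real drive works):

    from functools import reduce

    def xor_blocks(blocks):
        # byte-wise XOR across equally sized blocks
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    surfaces = [bytes([i]) * 8 for i in range(1, 5)]   # data on four surfaces
    parity = xor_blocks(surfaces)                      # stored on a fifth surface

    lost = 2                                           # pretend one surface failed
    survivors = [s for i, s in enumerate(surfaces) if i != lost] + [parity]
    assert xor_blocks(survivors) == surfaces[lost]     # rebuilt from the rest

The catch is the same as with RAID 4/5: every write has to touch the parity surface as well, which costs IOPS the spindle doesn't really have.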
The specifics of how data is physically arranged across multiple platters are undocumented, complicated, and vary between models. But with some clever benchmarks, much of that information can be inferred: https://blog.stuffedcow.net/2019/09/hard-disk-geometry-micro...
A hard drive will only use one head on one platter at a time. A single logical block will be contained within a single track on one platter. The next logical block will usually be on the same track or an adjacent track on the same platter. Seeking from one track to the next using the same head is generally a bit quicker than switching to a different head on a different platter and getting it lined up with a nearby track.
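The linked article's approach is essentially timing-based: read single blocks with O_DIRECT and look for latency steps that betray track and head switches. A minimal sketch of that idea (needs Linux, root, and a real device in place of the /dev/sdX placeholder; the real methodology is far more careful):

    import os, mmap, time

    DEV = "/dev/sdX"                  # placeholder block device
    BLOCK = 4096                      # assumed logical block size

    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, BLOCK)        # page-aligned buffer, required by O_DIRECT

    prev = None
    for lba in range(0, 4096, 8):     # sample a small range of LBAs
        os.lseek(fd, lba * BLOCK, os.SEEK_SET)
        t0 = time.perf_counter()
        os.readv(fd, [buf])
        dt = (time.perf_counter() - t0) * 1e6
        if prev is not None and dt > 3 * prev:
            print(f"latency jump near LBA {lba}: {dt:.0f} us")
        prev = dt
    os.close(fd)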
If you've got slow HDDs, the usual solution is to RAID them together. At that point your limiter starts becoming how fast you can slurp data down the line. RAID-0 would sort of emulate what you're talking about here.
I think a big problem, too, is that SSD prices haven't come down fast enough. Basically the only reason spinning rust is still a thing is that SSDs are far more expensive.
Apples and oranges, right? You need ten thousand hard drives to match the IOPS of one SSD and even with the 10000 disks your service latency will still be three orders of magnitude worse. The other advantage of an SSD is in terms of bytes per volume, in case rack density matters to you.
Putting 24TB parts in a Backblaze Storage Pod 6.0 would presumably allow 1440TB in a 4U rack mount server.[1] In practice, how would you reach the same density with SSDs? (I haven't looked into it, just curious if you know that SSD options more dense than that exist, and whether they are equally openly documented.)
There aren’t many full open-source solutions available, but you can buy 100TB 3.5” SSDs for data center use.[1] At Storage Pod densities, that’s 6000TB in 4U. I’m not sure if some other factor comes in that limits density, but that’s a first-order estimate.
There are 1U servers with 32 EDSFF slots, which were advertised to reach 1PB with 32TB SSDs. But 16TB is more common, and that's still 2PB in 4U that you can buy today without getting into exotic pricing.
You can put 36 15TB NGSFF SSDs into a 1U height enclosure and only 12cm deep. SSD volumetric density is a lot higher than disk and has been for a few years now.
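Putting the numbers from this subthread side by side (drive counts and capacities as cited above, not verified spec sheets):

    # Rack density for the configurations mentioned in this thread.
    configs = {
        "Storage Pod 6.0, 60 x 24 TB HDD, 4U":  (60 * 24,  4),
        "Storage Pod 6.0, 60 x 100 TB SSD, 4U": (60 * 100, 4),
        "1U EDSFF, 32 x 16 TB SSD":             (32 * 16,  1),
        "1U NGSFF, 36 x 15 TB SSD":             (36 * 15,  1),
    }
    for name, (tb, ru) in configs.items():
        print(f"{name:40s} {tb:5d} TB  ({tb / ru:6.0f} TB per U)")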
SSDs are smaller, so you can pack more of them into the same volume. I'm not aware of really open designs (maybe there are some in the Open Compute Project), but just from a quick look at Supermicro's homepage:
That depends very much on the workload. Four hard drives can deliver an aggregate sequential bandwidth that exceeds any one consumer SATA SSD, or the sustained write bandwidth of most consumer NVMe SSDs. But when people discuss IOPS, the usual implication is non-sequential access at relatively small block sizes. For those workloads, the difference between consumer SSDs and hard drives is still measured in orders of magnitude.
So, what workloads did you have in mind when you said "generally"?
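For concreteness, with assumed (not benchmarked) per-device figures of roughly 250 MB/s and 150 random IOPS per HDD versus 550 MB/s and 90k IOPS for a SATA SSD:

    # Illustrative comparison only; all figures are assumptions.
    HDD_SEQ_MB_S, HDD_IOPS = 250, 150
    SSD_SEQ_MB_S, SSD_IOPS = 550, 90_000
    N = 4

    print(f"{N} HDDs sequential: {N * HDD_SEQ_MB_S} MB/s vs one SATA SSD: {SSD_SEQ_MB_S} MB/s")
    print(f"{N} HDDs random:     {N * HDD_IOPS} IOPS vs one SATA SSD: {SSD_IOPS} IOPS "
          f"(~{SSD_IOPS // (N * HDD_IOPS)}x)")

Four spindles win on streaming bandwidth and still lose by a couple of orders of magnitude on small random I/O.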
I'm not sure whether the smaller size is an asset or a liability here. There are likely some tradeoffs in manufacturing tolerances and/or thermals. (Physics people, speak up?)
Today you can fit (with off-the-shelf components) four 8TB M.2 drives (in RAID) in the volume of a 20TB spinning-disk drive. Yes, the cost is higher, but when it comes down ... Isn't the industry expecting solid-state density and cost to eclipse that of disk drives in the next 5(?) years?
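Quick sanity check on the volume claim, using nominal form-factor dimensions (the M.2 height with components is an assumption, and a real enclosure also needs a controller, connectors and airflow):

    # Raw volume of a 3.5" bay vs an M.2 2280 module (approximate, in mm).
    HDD_35 = 146.99 * 101.6 * 26.1      # standard 3.5" drive envelope
    M2_2280 = 80.0 * 22.0 * 3.8         # 2280 module, assumed 3.8 mm with components

    print(f'3.5" bay : {HDD_35 / 1000:.0f} cm^3')
    print(f"M.2 2280 : {M2_2280 / 1000:.1f} cm^3")
    print(f"modules per bay by raw volume: {HDD_35 / M2_2280:.0f}")

By raw volume alone dozens of modules would fit, so four plus a small RAID controller is comfortably within a 3.5" envelope.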
SSD/flash becomes more fragile with node shrinks, so flash has (currently) more or less hit the limit on shrinking, and vendors do "layering" to get more storage density.
I'm not sure how much they can keep pushing layering. I think consumer parts are already at 96 layers and are having problems above 128, but I haven't checked the state of SSDs in six months.
I suppose the investment cost will age out and SSD multilayer prices will continue to drop, but flash supply (and HDD supply) is effectively a cartel these days, with not that many players/competitors.
>"Seagate is confident that heating the media using laser (HAMR) is the best solution possible, while Toshiba and Western Digital believe that using microwaves to change coercivity of magnetic disks (MAMR) is more viable for the next several years. Furthermore, Western Digital even uses 'halfway-to-MAMR' energy-assisted perpendicular magnetic recording (ePMR) for its latest HDDs. Meanwhile, everyone agrees that HAMR is the best option for the long term.
HAMR requires new heads and an immediate transition to glass platters with an all-new coating, whereas MAMR only needs new heads and can continue using aluminum media with a known coating. Even if HAMR offers a higher areal density than MAMR, it is possible to increase platter count to expand the capacity of a MAMR drive to match that of a HAMR-based HDD. There is a catch though: thin MAMR platters will have to rely on a glass substrate."
This is interesting; it seems that we might be halfway between metal and glass hard drives...
Which brings up an interesting thought, phrased as a challenge, for all Physicists out there...
Given a piece of glass, ordinary glass, with no magnetic metal platter, my challenge is:
a) How do you write data to it?
b) How do you read that data back from it?
c) What's the smallest size / depth you can accomplish this at, and why?
d) Does a formula govern c, and if so, what is it?
e) What do you do about imperfections in the glass?
We currently have CDs, DVDs and Blu-ray discs that use lasers to write to chemical compounds inside those media.
My challenge is: can we do it without those chemical compounds? Could we use simple, pure glass, and if so, how?
(Note to Future Self: Work on this in the future... <g>)
I don't get why they should all vanish in one go. Unless there is a problem with all the heads and all the platters, you should still be able to recover a lot of data.
But what do I know: I returned an 8TB drive three weeks ago after it started developing bad sectors and soon died, with only 2TB of workload and less than 24 hours of runtime. First the OS couldn't see it, then the BIOS stopped seeing it, and now I don't think I'll ever see it again. Good to know it has a 5-year warranty, but it shattered my trust in the model.
One head crash kills the entire drive because the access arm is a single block of aluminum, with a head for each platter surface, that swings in and out as a unit. The heads do not move individually.
Imagine a vinyl record player. Stack six of them vertically and connect all the tonearms, at their counterweights, to a single bar running heightwise.