
What is so interesting about buying up tonnes of hard disks and connecting them?


There must be something hard about it. I work at a huge company and constantly get emails about deleting files from my home directory, because there is only 100G of storage for everyone.

All I know is that I have a 1TB 3-way RAID-1 array at home, and it cost less to build that than it costs to pay me to cleanup my inbox. Not My Problem, I suppose...
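For a rough sense of that tradeoff, here's the arithmetic with assumed prices and rates (none of these figures are from the thread; they're illustrative):

```python
# Back-of-envelope: 3-way RAID-1 at home vs. paying someone to tidy up.
# Every number below is an assumption for illustration.
drive_price = 60.0      # assumed price of one 1 TB consumer drive, USD
drives = 3              # 3-way mirror: three copies of the same data
raid_cost = drive_price * drives          # 180.0

hourly_rate = 75.0      # assumed fully-loaded engineer cost, USD/hour
cleanup_hours = 4       # assumed time spent triaging an inbox/home dir
cleanup_cost = hourly_rate * cleanup_hours  # 300.0

assert raid_cost < cleanup_cost  # at these prices, disks beat labor
```

At consumer disk prices the hardware loses to even a few hours of labor; the rest of the thread is about why enterprise storage doesn't get consumer prices.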


Where I work we have a super fast SAN for our storage, with a giant cache, fiber everything, and separate channels for backups. The cost works out to something like $1k/GB. Beyond that we're perennially short on rack space, so there's a cost to buying new storage. I used to believe it would be cheaper for the company to add storage than to make me clean up my home directory, but I worked out the numbers one day and it's simply not the case.
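The same arithmetic flips once storage costs $1k/GB. A sketch, using the figure quoted above and an assumed labor rate:

```python
# At SAN prices, reclaiming space beats buying more of it.
# The $1k/GB figure is from the comment above; labor numbers are assumed.
san_cost_per_gb = 1000.0   # quoted all-in SAN cost, USD/GB
freed_gb = 5.0             # assumed space reclaimed by one cleanup pass
storage_saved = san_cost_per_gb * freed_gb   # 5000.0

hourly_rate = 75.0         # assumed fully-loaded engineer cost, USD/hour
cleanup_hours = 2          # assumed time for the cleanup pass
labor_cost = hourly_rate * cleanup_hours     # 150.0

assert labor_cost < storage_saved  # cleanup wins by a wide margin
```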


If you're paying $1k/GB you're getting the short end of the stick.

To put this in perspective, you could have a redundant array of inexpensive RAM + network storage for less than this (excluding energy costs).


The hard part is the cost.

Unlike at home, there are a few more things required. Those might include (but aren't limited to):

  local RAID (*)
  mirror to remote-site
  versioned backup
  off-site backup archives
And every piece of that puzzle needs its own:

  fiber
  ethernet
  physical space
  cooling
  electricity
  management (support team)
All of the above need to be redundant and enterprise-quality (to minimize downtime and errors).

Someone once told me the price calculated for each GB. I thought he was crazy, and then did the math myself. (he was quite sane)

(*) or, more commonly, storage arrays from HDS/IBM/etc., which start at $100K+ and go north (way, way, way north)
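The multiplication is easy to sketch. Every figure below is an illustrative assumption (the exact per-GB result doesn't matter; the point is that each usable GB drags several raw copies plus per-GB overhead behind it):

```python
# Hypothetical tally of what one "usable GB" costs once every layer
# listed above is in place. All numbers are assumptions.
raw_disk = 0.10              # assumed USD per raw GB of enterprise disk
copies = {                   # raw GB consumed per usable GB, per layer
    "local RAID": 2.0,       # mirrored locally
    "remote mirror": 2.0,    # synced to a second site
    "versioned backup": 3.0, # several restore points retained
    "off-site archive": 1.0,
}
raw_gb = sum(copies.values())           # 8 raw GB behind each usable GB
media_cost = raw_gb * raw_disk          # 0.80 USD/GB in bare media

# fiber, rack space, cooling, power, and a support team dwarf the media;
# folded here into a single assumed multiplier
overhead_multiplier = 50
total = media_cost * overhead_multiplier  # 40.0 USD per usable GB
```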


The hard part is "in such a way that maintains an acceptable access time for 10,000 concurrent users". That's what you pay EMC the big bucks for.

Though I have an idea: you could "shard" the storage by placing a "volume" inside everyone's desktop computer...


The artificially low file storage quota is a plausibly deniable tactic to get you to delete your old files, minimizing the chance of legally discoverable material being subpoenaed in the future.


It's actually because if you give users 100gig shared they will manage to use it and be slightly miffed about having to do housework on it every so often. But if you give users 1TB shared they will manage to use it and be slightly miffed about having to do housework on it every so often.


Specifically, I asked the IT department at a previous employer about the 50MB (no joke) email quota, and they said it was for legal reasons (not on the record, of course).


Document retention policies are very common and rarely hush-hush. The only time they would have a need to make it "off the record" would be when they were subject to discovery regarding a lawsuit or criminal charge.


Re document-retention policies: Typically the real concern is the cost of document review: If there's a lawsuit, a lawyer or paralegal ($$$) will have to review all the potentially-relevant documents to determine whether there are any that should be (i) labeled as confidential, and/or (ii) withheld on grounds of attorney-client privilege.

(Footnote: The latter gives rise to the bigger fear -- if you were to inadvertently give the other side a copy of a privileged document, you might be deemed to have done a "subject-matter waiver," entitling the other side to all other privileged documents covering the subject(s) discussed in the first document. Moreover, the other side might be able to take the depositions of business people and lawyers, which amounts to interviewing them on videotape, to find out just what the client discussed with its lawyers, including who said what. The other side can usually be counted on to play any awkward sound bites for the jury.)


Seek bandwidth is often the bottleneck: a 7200 RPM (or 10K or 15K RPM) spindle can only do so many random operations per second, so you sometimes can't use 1 TB 7200 RPM disks in enterprise file servers with acceptable performance and need more expensive (per GB/TB) options instead.
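The ceiling is mostly mechanical. A rough model, using typical published seek times as assumptions:

```python
# Rough random-IOPS ceiling of a single spinning disk: one op costs an
# average seek plus, on average, half a rotation. Seek figures below
# are typical published specs, used here as assumptions.
def max_random_iops(rpm, avg_seek_ms):
    rotational_latency_ms = 0.5 * 60_000 / rpm  # half a rotation, in ms
    return 1000 / (avg_seek_ms + rotational_latency_ms)

print(round(max_random_iops(7200, 8.5)))   # ~79 for a 7200 RPM SATA disk
print(round(max_random_iops(15000, 3.5)))  # ~182 for a 15K RPM SAS disk
```

So even the fast spindle gives you a couple hundred random IOPS; serving thousands of concurrent users means many spindles (or cache), regardless of how much capacity each disk holds.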

Then there are the costs associated with backups.

Not that this necessarily explains away your complaint entirely.


Get a distributed file system and keep it on a number of different machines -- then you don't have to care about the connections, and it scales more easily to more users.
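A minimal sketch of the placement idea: hash each path to pick a machine. Host names are hypothetical, and real distributed file systems (GlusterFS, Ceph, HDFS) add replication and rebalancing on top of this:

```python
# Deterministic file placement across machines by hashing the path.
# HOSTS is a hypothetical cluster; replication is deliberately omitted.
import hashlib

HOSTS = ["storage01", "storage02", "storage03", "storage04"]

def host_for(path):
    digest = hashlib.sha256(path.encode()).digest()
    # take 8 bytes of the digest as an integer and pick a host
    return HOSTS[int.from_bytes(digest[:8], "big") % len(HOSTS)]

print(host_for("/home/alice/report.doc"))  # same path -> same host
```

The catch, as noted elsewhere in the thread, is that naive modulo placement reshuffles nearly everything when you add a machine, which is why production systems use consistent hashing or explicit placement maps.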


Is your storage at home Sarbanes-Oxley compliant?

It's not the primary storage that's expensive; it's the backups and data retention.



