Click/traffic data for a top 100 website.
We weren't doing a ton with it: basic recommendation processing, search improvement, pattern matching on user behavior, etc.
Normally we'd only need to process, say, the last 10 days of user data to get decent recommendations, but occasionally it made sense to run jobs over the entire dataset.
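Roughly, the split looks like this. A toy sketch of the windowed vs. full-scan pattern; the table and column names are made up, not from the actual system:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# In-memory stand-in for the click/traffic store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, url TEXT, ts TEXT)")

now = datetime.now(timezone.utc)
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?, ?)",
    [
        ("u1", "/home", (now - timedelta(days=2)).isoformat()),    # inside the window
        ("u2", "/search", (now - timedelta(days=30)).isoformat()),  # outside the window
    ],
)

# Routine job: only scan the trailing 10-day window.
cutoff = (now - timedelta(days=10)).isoformat()
recent = conn.execute(
    "SELECT user_id, url FROM clicks WHERE ts >= ?", (cutoff,)
).fetchall()
print(recent)  # [('u1', '/home')]

# Occasional job: a full scan over all history.
total = conn.execute("SELECT COUNT(*) FROM clicks").fetchone()
print(total)  # (2,)
```

The point being: the routine window query touches a tiny slice of the data, while the full-history jobs are what actually force you to care about total dataset size.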
Also, this isn't that large when you consider binary artifacts (say, healthcare imaging) being stored directly in a database, which I'm pretty sure is what a lot of electronic healthcare record systems do.
A random company I bumped into has a 40TB OLTP database for exactly this reason.
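A toy illustration of why those databases balloon; the table name and sizes here are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ehr_attachments (patient_id TEXT, scan BLOB)")

# Stand-in for a ~5 MB imaging file stored as a BLOB.
fake_scan = bytes(5 * 1024 * 1024)
conn.execute("INSERT INTO ehr_attachments VALUES (?, ?)", ("p123", fake_scan))

# A few thousand rows like this per day and you're into terabytes fast.
(size,) = conn.execute("SELECT LENGTH(scan) FROM ehr_attachments").fetchone()
print(size)  # 5242880
```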