Processing data that cannot be handled on a single machine is a fundamentally different problem from processing data that can. It's useful to have a term for that distinction.
As you say, single machines can scale up incredibly far. That just means 16 TB datasets no longer demand big data solutions.
I get your point, but I don’t know if big data is the right term anymore.
Many people like to think they have big data, and you kinda have to agree with them if you want their money. At least in consulting.
Also, you could go well beyond a 16 TB dataset on a single machine. You're assuming the whole uncompressed dataset has to fit in memory, but many workloads don't need that.
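A minimal sketch of the point above: an aggregation that streams a line-delimited numeric file, so memory use is bounded by one line at a time rather than the dataset size. The file layout and function name are illustrative assumptions, not anyone's actual pipeline; real out-of-core tools work the same way in principle.

```python
# Minimal sketch: out-of-core aggregation over a line-delimited numeric
# file. Only the current line is held in memory, so the dataset can be
# far larger than RAM. The file format (one number per line) is an
# assumption made for the example.
import os
import tempfile


def stream_sum(path):
    """Sum one number per line without loading the whole file."""
    total = 0.0
    count = 0
    with open(path) as f:
        for line in f:  # the file object iterates lazily, line by line
            total += float(line)
            count += 1
    return total, count


# Demo on a small stand-in file; a real run would point at the huge dataset.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(1000):
        tmp.write(f"{i}\n")
    path = tmp.name

total, count = stream_sum(path)
print(total, count)  # 499500.0 1000
os.remove(path)
```

The same chunked pattern is what libraries like pandas expose via `read_csv(..., chunksize=...)`: compute a running aggregate per chunk instead of materializing everything.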
How many people in the world actually have datasets that big to analyse within a reasonable time?
The risk is building very good echo chambers. No one should have to read AI slop or despicable opinions in their free time, but some exposure to alternative views that are respectable and not idiotic should be part of the design.
It depends. Gaming PCs are fine for small models. Apple hardware can run much bigger models without you having to open a window to cool down the room. If money isn't an issue, NVIDIA isn't overpriced for no reason, and a server full of NVIDIA AI GPUs is neat.
And it’s open weight. Not open source.