English Wikipedia alone is ~4.8b words, and the model's knowledge cutoff is about 6 months old. A valid use case is to search across Wikipedia and ground your answers in what you retrieve.
This trivially proves that RAG is still needed.
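A quick back-of-envelope check makes the point concrete. This is a sketch under stated assumptions: the ~1.3 tokens-per-word ratio and the 1M/10M context sizes are illustrative figures, not from the source.

```python
# Back-of-envelope: can English Wikipedia fit in a context window?
# Assumption: ~1.3 tokens per English word (rough rule of thumb).
WIKIPEDIA_WORDS = 4.8e9   # ~4.8b words on English Wikipedia
TOKENS_PER_WORD = 1.3

wiki_tokens = WIKIPEDIA_WORDS * TOKENS_PER_WORD  # ~6.2b tokens

# Compare against illustrative context window sizes.
for context in (1_000_000, 10_000_000):
    ratio = wiki_tokens / context
    print(f"{context:>10,}-token context: Wikipedia is ~{ratio:,.0f}x too large")
```

Even a 10M-token window is hundreds of times too small to hold Wikipedia, so retrieval over an index remains the only way to ground answers in it.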
And this is only for the small model; the medium model is still at 1M tokens (like Gemini 2.5).
Even if we could get the mid-sized models to 10M tokens, that would still only cover a medium-sized repo at best. Repo sizes will also grow faster as LLMs generate more code, so there's no way for context windows to catch up.
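The repo claim can be sanity-checked the same way. The ~10 tokens-per-line figure and the example line counts below are rough assumptions for illustration, not from the source.

```python
# Back-of-envelope: how big a repo fits in a 10M-token context?
# Assumption: ~10 tokens per line of source code, on average.
CONTEXT_TOKENS = 10_000_000
TOKENS_PER_LOC = 10

max_loc = CONTEXT_TOKENS // TOKENS_PER_LOC  # lines of code that fit

# Illustrative codebase sizes (rough public figures).
repos = {
    "medium repo (~1M LOC)": 1_000_000,
    "Linux kernel (~30M LOC)": 30_000_000,
}
for name, loc in repos.items():
    verdict = "fits" if loc <= max_loc else "does not fit"
    print(f"{name}: {verdict} in a {CONTEXT_TOKENS:,}-token context")
```

A 10M-token window tops out around a million lines of code, so large monorepos stay well out of reach even before LLM-generated code accelerates their growth.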