Hacker News

Is Gemini tied to, or benefiting from, Google TPU hardware? You need hardware in the data center to run this, and I feel it is somewhat specialised.





Gemini models are written in JAX, which, through the XLA compiler, can be compiled for either TPU or GPU hardware.

Performance may differ, but Google (and Nvidia) are very interested in having good performance on both platforms.
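A minimal sketch of the point above, assuming only stock JAX: the same jit-compiled function lowers through XLA to whichever backend is present, with no source changes. The function name and shapes here are made up for illustration.

```python
import jax
import jax.numpy as jnp

# One program, many backends: jax.jit traces the function and XLA
# compiles it for whatever devices JAX finds (CPU, GPU, or TPU).
@jax.jit
def matmul(a, b):
    return a @ b

x = jnp.ones((4, 4))
print(jax.devices())       # e.g. [CpuDevice(id=0)] on a laptop
print(matmul(x, x)[0, 0])  # 4.0 on any backend
```

Running the identical script on a GPU or TPU host only changes what `jax.devices()` reports, not the model code.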


The raw computation is just a bunch of matrix multiplications in a row; most of the algorithmic complexity/secret stuff would be around scaling and efficiency.
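To make "matrix multiplications in a row" concrete, here is a toy NumPy sketch of a feed-forward layer, the kind of chained GEMM that dominates transformer compute. Shapes and names are made up; real models interleave attention, normalisation, and more layers.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, seq = 8, 32, 4

x = rng.standard_normal((seq, d_model))       # token activations
w_up = rng.standard_normal((d_model, d_ff))   # expand projection
w_down = rng.standard_normal((d_ff, d_model)) # contract projection

h = np.maximum(x @ w_up, 0.0)  # GEMM + ReLU nonlinearity
y = h @ w_down                 # GEMM back to model width
print(y.shape)                 # (4, 8)
```

Everything hardware-specific (TPU systolic arrays, GPU tensor cores) is just a faster way to execute these products.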

For training, the hardware is much more important, as you need to scale up to as many chips as possible without being bottlenecked by the network.
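A hypothetical sketch of why the network matters when scaling training: in data-parallel training, each chip computes gradients on its own shard, then an all-reduce averages them across chips. The chip count and gradient values here are invented; the all-reduce is simulated with a plain mean.

```python
import numpy as np

n_chips = 4

# Each "chip" computes gradients on its local data shard
# (values invented so the averaged result is easy to check).
grads = [np.full(3, float(i)) for i in range(n_chips)]

# All-reduce step: every chip ends up with the mean gradient.
# On real hardware this is the cross-chip network traffic that
# becomes the bottleneck as n_chips grows.
avg = np.mean(grads, axis=0)
print(avg)  # [1.5 1.5 1.5]
```

The flops scale with chips almost for free; it is this communication step that the interconnect (TPU torus links, NVLink/InfiniBand) has to keep cheap.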

This would just be inference, and it doesn't need to be very efficient, as it's for on-prem usage, not selling API access. So you could strip out any efficiency secrets, and it would probably look like a bigger Gemma (their open-source model).

I wonder if they would/could try to strip out stuff like whatever tricks they use for long context + video support (both of which they are a bit ahead of everyone else on).


The model itself is likely built on their own open-source framework JAX, so it should be usable on Nvidia hardware. Of course, cost efficiency is going to be a different story.

TPUs are definitely the reason Gemini models have both massive context and very low prices: there is no Nvidia tax to pay.

The Google blog post notes that it's a partnership with Nvidia, so apparently using CUDA rather than TPUs.

Makes complete sense, as Nvidia has a lot more experience building these types of appliances.

Someone said it could also mean Google hardware has some advantage they would rather keep inside the G-silo.

Google abandoned Coral in true Google style.


