It's split between the bot and the cloud! High-level decision making and training run in the cloud for performance reasons, while low-level control and VLA inference run on the robot, which has a GPU onboard!
And yeah, it's a powerful architecture, but it's not enough on its own if you want to perform tasks: the VLMs orchestrate, but you need another model for manipulation. And we put all of these together :)
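If it helps to picture the split, here's a rough toy sketch in Python. Everything in it is made up for illustration and is not our actual code: `cloud_plan` stands in for the cloud-side VLM orchestrator, `onboard_policy` stands in for the VLA running on the robot's GPU, and the 7-value action vector is just an assumed placeholder.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subtask:
    instruction: str  # e.g. "pick up the cup"


def cloud_plan(goal: str) -> List[Subtask]:
    """Hypothetical stand-in for the cloud-side VLM orchestrator.

    A real orchestrator would reason over images and language; here we
    simply split the goal text on " then " to get an ordered subtask list.
    """
    return [Subtask(step.strip()) for step in goal.split(" then ")]


def onboard_policy(subtask: Subtask) -> List[float]:
    """Hypothetical stand-in for the on-robot VLA (manipulation model).

    A real VLA would map camera frames plus the instruction to motor
    commands; here we return a zeroed placeholder action vector
    (length 7 is an assumption, not a real spec).
    """
    return [0.0] * 7


def run(goal: str) -> List[Tuple[str, List[float]]]:
    """High-level planning happens 'in the cloud', low-level control 'on the robot'."""
    return [(s.instruction, onboard_policy(s)) for s in cloud_plan(goal)]


if __name__ == "__main__":
    for instruction, action in run("pick up the cup then place it on the shelf"):
        print(instruction, action)
```

The point is just the shape: the orchestrator only decides what to do next, and a separate model on the robot turns each step into actual motion.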
Happy to chat further!