Even the smallest multimodal LLM would be way bigger than a model exported from this:
https://www.tensorflow.org/lite
TF Lite has first-class Android support with hardware acceleration if I'm not mistaken.