Please correct my likely misunderstanding here, but on the surface, it seems to me that "call some tools then return JSON" has some pretty common use cases.
Let's say you wanna build an app that gives back structured data after a web search. First a tool call to a search api. Then do some reasoning/summar/etc on the data returned by the tool. And finally return JSON.
Suppose there's a pdf with lots of tables i want to scrape. I mention the pdf url in my message and with gemini's url context tool, i now have access to the pdf.
I can ask gemini to give me the pdf's content as a json and it complies most of the time. But at times, there's an introductory line like "Here's your json:". Those introductory lines interfere with programmatically using the output. They're sometimes there, sometimes not.
If I could have structured output at the same time as tool use, I can reliably use what gemini spits out as it'll be in a json, no annoying intro lines.