I'm not very confident either; some of the users in our user study reported the same. :)
We have recently been thinking about how to have AI provide more information in ways we can interpret. For example, we might add a feature that hints when the AI's computation uses a specific formula the user didn't mention. I feel working with AI is like the first week of working with someone we've never met before.
We still really should not be confident. We have seen time and time again that the output cannot be relied on.
People keep trying to downplay the problem as "not a big deal" or say that it happens rarely, but that is kinda the problem. It is right often enough that people get confident in it and stop double-checking the work. When it is manipulating data, that is even more problematic.
Now if what is happening here is that the AI is generating the code that makes the visualizations, and that code still works in a traditional way (accessing the data directly), then I think it's fine. Sure, some of the code may be wrong, but at least it is not manipulating the data and coming up with wild conclusions.
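To make that distinction concrete, here is a minimal Python sketch, assuming the AI's output is plain code that runs over the raw rows. The data, `generated_summary`, and both checks are made up for illustration; the point is only that the input can be shown to be untouched and the output can be recomputed independently.

```python
import copy

# Hypothetical raw data; in practice this would come from the real source.
rows = [
    {"region": "north", "sales": 120.0},
    {"region": "south", "sales": 80.0},
    {"region": "north", "sales": 30.0},
]

def generated_summary(data):
    """Stand-in for AI-generated code: total sales per region, read-only."""
    totals = {}
    for row in data:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["sales"]
    return totals

snapshot = copy.deepcopy(rows)
summary = generated_summary(rows)

# Check 1: the generated code did not modify the underlying data.
data_untouched = rows == snapshot

# Check 2: the per-region totals still add up to the raw grand total.
grand_total = sum(row["sales"] for row in rows)
totals_consistent = abs(sum(summary.values()) - grand_total) < 1e-9

print(summary, data_untouched, totals_consistent)
```

Checks like these don't prove the code answers the right question, but they do catch the case where numbers get silently changed along the way.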
This is something I constantly struggle with. It's fine for some generated sales or marketing language, but for data analysis, wearing the risk that the output is even 0.1% wrong is not an option for many.
So you end up checking what the AI does, which negates the productivity argument entirely.
I would love it if we could seed LLMs with specific books, or give our own weightings to sources. I'm sure most books are in there already. I would happily pay extra for known provenance of advice, and would want that money passed back to the original authors as royalties. For PyData code, I'm always looking at Effective Python/Polars.
This is a big challenge! The more complicated the code generated by AI, the more likely it is to go wrong and the harder it is to verify. (more magic == more risk)
I'm curious how to restrict what the AI generates at each step so that it is simple enough to verify and edit, without making it seem too verbose and slow.
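One possible shape for this, as a rough Python sketch and not a claim about how any real tool works: limit each generated step to a small allowlist of simple operations and keep every intermediate result so it can be inspected. `ALLOWED_OPS`, `run_plan`, and the sample plan are all hypothetical names invented for the example.

```python
# Hypothetical allowlist: the only operations a generated step may use.
ALLOWED_OPS = {
    "filter_gt": lambda rows, col, value: [r for r in rows if r[col] > value],
    "select": lambda rows, *cols: [{c: r[c] for c in cols} for r in rows],
    "sum": lambda rows, col: sum(r[col] for r in rows),
}

def run_plan(rows, plan):
    """Run a plan step by step, keeping every intermediate result for review."""
    trace = []
    current = rows
    for op, *args in plan:
        if op not in ALLOWED_OPS:
            raise ValueError(f"step {op!r} is not on the allowlist")
        current = ALLOWED_OPS[op](current, *args)
        trace.append((op, args, current))
    return current, trace

rows = [
    {"name": "a", "price": 5},
    {"name": "b", "price": 12},
    {"name": "c", "price": 20},
]
# A plan the AI might propose: keep rows with price > 10, then total the prices.
plan = [("filter_gt", "price", 10), ("sum", "price")]

result, trace = run_plan(rows, plan)
for op, args, value in trace:
    print(op, args, value)
```

Each step is small enough to read at a glance, and the trace shows what it did. The trade-off is the one raised above: the smaller the vocabulary, the more steps a plan needs.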