Good question.
Just one anecdotal data point here, so take it for what it's worth.
I'm a grad student focusing on reinforcement learning, but I interact with a lot of deep learning folks. I'd say they seem mostly motivated by making deep learning even more powerful and by applying it, not so much by figuring out what's going on inside the box.
We do understand the general principles, just not every neuron and weight in particular, because there are far too many of them for limited human working memory to hold at once.
A labyrinth can be solved by hand up to a certain size, but past that it becomes "impossible to understand" and we have to rely on computers to find the route for us.
That's not computer magic, just the ability to hold more data at once. Human working memory tops out at around 7-8 items (try remembering a longer string of digits and see how far you get), so any computation that can't be broken down into roughly that many understandable parts is hard for us to grasp.
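To make the labyrinth analogy concrete, here's a minimal sketch (in Python, with a made-up toy maze) of the kind of route-finding we happily hand off to a computer. The algorithm itself fits in a handful of understandable parts; it's the size of the maze, and of the route it returns, that outgrows our heads.

```python
# Breadth-first search over a grid maze of '.' (open) and '#' (wall).
# The tiny `maze` below is just an illustrative placeholder.
from collections import deque

def solve_maze(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    queue = deque([start])
    came_from = {start: None}          # parent pointers for path recovery
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent pointers back to reconstruct the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == '.' and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no route exists

maze = ["..#.",
        ".##.",
        "....",
        ".#.."]
print(solve_maze(maze, (0, 0), (3, 3)))
```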
Of course, we can build ML tools that synthesize the preferred input of any given neuron. That does help a little.
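For anyone curious what that looks like, here's a minimal sketch of activation maximization (feature visualization) in PyTorch. The tiny conv net, the channel index, and the hyperparameters are all placeholders; in practice you'd plug in a trained model and the specific unit you care about.

```python
# Synthesize an input that strongly activates one chosen channel,
# by gradient ascent on that channel's mean activation.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder model -- stands in for any trained network.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
)
model.eval()

target_channel = 7                                   # the "neuron" to probe
img = torch.randn(1, 3, 64, 64, requires_grad=True)  # start from noise
optimizer = torch.optim.Adam([img], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    activation = model(img)[0, target_channel]
    loss = -activation.mean()   # negate so the optimizer does *ascent*
    loss.backward()
    optimizer.step()

# `img` is now an input the chosen channel responds strongly to --
# the kind of visualization that "helps a little" with interpretation.
```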
For many researchers, "understand" relates more to the underlying math: questions like "what kind of convergence properties and guarantees are there?" and "formally, what can this model learn and not learn?"
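To give one concrete flavor of such a guarantee (the textbook result for plain gradient descent with step size $1/L$ on a convex, $L$-smooth objective $f$ with minimizer $x^*$):

$$ f(x_k) - f(x^*) \;\le\; \frac{L \,\lVert x_0 - x^* \rVert^2}{2k} $$

That is, the error shrinks at an $O(1/k)$ rate. It's a statement about the model and optimizer as mathematical objects, independent of what any individual neuron ends up doing.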