It depends on what you’re testing. Much of a typical page is visual noise that is invisible to the accessibility tree but is often still something you’ll want tests for. It’s also not uncommon for accessible UI paths to differ from the regular ones via invisible screen-reader-only content, e.g. in a complex dropdown list. So you can end up in a situation where you’ve tested that the accessible path works, but not regular clicks!
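Concretely, with something like Playwright those end up as two different tests. A rough sketch (the /form page, selectors, and values are all made up); if the two queries resolve to different nodes, passing one says nothing about the other:

    import { test, expect } from '@playwright/test';

    // Made-up page: a "Country" dropdown whose options carry extra
    // visually hidden text for screen readers ("Finland, 3 of 10").
    test('accessible path: query by role and accessible name', async ({ page }) => {
      await page.goto('https://example.com/form');
      await page.getByRole('combobox', { name: 'Country' }).click();
      // The accessible name includes the hidden text, so match loosely.
      await page.getByRole('option', { name: /Finland/ }).click();
      await expect(page.getByRole('combobox', { name: 'Country' })).toHaveValue('FI');
    });

    test('regular path: query what a mouse user sees', async ({ page }) => {
      await page.goto('https://example.com/form');
      await page.locator('.country-select').click();
      await page.getByText('Finland').click();
      await expect(page.locator('.country-select input')).toHaveValue('FI');
    });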
If you really want gold standard screen reader testing, there’s no substitute for testing with actual screen readers. Each uses the accessibility tree in its own way. Remember also that each browser has its own accessibility tree.
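If nothing else, it’s worth eyeballing what each engine actually computes. Recent Playwright versions (1.49+) can render a locator’s accessibility tree as YAML via ariaSnapshot, so you can diff the same page across engines (URL is a placeholder):

    import { chromium, firefox, webkit } from 'playwright';

    // Dump the accessibility tree the same page produces in each engine.
    (async () => {
      for (const engine of [chromium, firefox, webkit]) {
        const browser = await engine.launch();
        const page = await browser.newPage();
        await page.goto('https://example.com');
        console.log(`--- ${engine.name()} ---`);
        // Renders this engine's computed accessibility tree as YAML.
        console.log(await page.locator('body').ariaSnapshot());
        await browser.close();
      }
    })();

That still only shows the tree, not how NVDA or VoiceOver will actually read it, so it complements real screen reader runs rather than replacing them.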
When UI is only visual noise and has no impact on functionality, I don't see much value in automated testing for it. In my experience these cases usually involve animations, which are notoriously difficult to write automated tests for anyway.
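(Tooling has improved a bit here. Playwright, for instance, can freeze CSS animations while comparing screenshots, though whether that's worth the maintenance cost is another question. Sketch with a made-up page and thresholds:)

    import { test, expect } from '@playwright/test';

    // Visual regression check with animations neutralized. The URL,
    // snapshot name, and threshold are all illustrative.
    test('hero section renders as expected', async ({ page }) => {
      await page.goto('https://example.com');
      await expect(page).toHaveScreenshot('hero.png', {
        animations: 'disabled',  // finish transitions, pause infinite animations
        maxDiffPixelRatio: 0.01, // tolerate minor anti-aliasing noise
      });
    });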
When UX diverges between the UI and the accessibility tree, I'd really expect that to be the exception rather than the rule. You'd need a way to test both in isolation, because when one use case diverges down two separate code paths it's begging for hard-to-find bugs and regressions.
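Where the divergence is unavoidable, the least you can do is drive both paths through one shared assertion so they can't silently drift apart. Something like this (page, selectors, and values all made up):

    import { test, expect } from '@playwright/test';

    // Parameterize the test over both interaction paths; the expectation
    // at the end is shared between them.
    for (const mode of ['mouse', 'keyboard'] as const) {
      test(`country select works via ${mode}`, async ({ page }) => {
        await page.goto('https://example.com/form');
        const combo = page.getByRole('combobox', { name: 'Country' });
        if (mode === 'mouse') {
          await combo.click();
          await page.getByRole('option', { name: /Finland/ }).click();
        } else {
          await combo.focus();
          await page.keyboard.press('ArrowDown'); // open the listbox
          await page.keyboard.press('ArrowDown'); // (on this made-up page, lands on Finland)
          await page.keyboard.press('Enter');     // commit the selection
        }
        // One expectation, whichever code path ran.
        await expect(combo).toHaveValue('FI');
      });
    }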
Totally agree on testing with screen readers directly, though. I can't count how many weird differences I've come across between Windows (IE or Edge) and Mac over the years. If I remember right, there was a proposed spec for unifying the accessibility tree and related APIs, but I don't think it has gone anywhere yet.