1. It's confusing. There's a "role" attribute that is implicit on some elements and has to be declared on others, plus a whole set of "aria-XYZ" attributes on top of that, which is stupid. The purist HTML take was always "markup should describe content" - now we're describing content with attributes?
2. It's positioned specifically as being for accessibility. This is bad for two reasons. First, it shouldn't be necessary. The entire goal of semantic markup in the first place was to make the content of documents readable by machines (both screen readers and other systems) - tacking on a special system just for accessibility is basically giving up on that goal entirely. Second, a lot of people don't care about accessibility. Like I said, in our current system people barely know how to write HTML, or they write some bastardized version of it in JSX or whatever. You think they're going to learn the intricacies of ARIA? No.
The goal should be for HTML to be a language that, when written properly, is accessible by default. Not just "hey everybody, it's cool if you use divs for everything, just make sure screen readers know what it's supposed to be."