You're right: It's quite possible that the new crop of linear RNNs may not be able to solve the same problems as Transformers.
It could also be that we will have to increase the depth of these linear RNNs (possibly even dynamically) before they perform comparably well on all problems. Right now it's hard to tell for sure; I don't know.
What I do know is that recently proposed linear RNNs perform comparably to Transformers at similar model sizes, but they have not yet been evaluated at state-of-the-art scale.
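To make "linear RNN" concrete, here is a minimal sketch of the kind of state update these models share: the hidden state is updated with no nonlinearity inside the recurrence, which is what allows the whole sequence to also be computed in parallel (as a prefix scan or a convolution). Everything below (the diagonal transition a, the toy dimensions, the function name) is an illustrative assumption, not any particular published model.

```python
import numpy as np

def linear_rnn_scan(x, a, B, C):
    """Sequential form of a diagonal linear recurrence (illustrative only):
        h_t = a * h_{t-1} + B @ x_t   # linear in h: no tanh/sigmoid inside
        y_t = C @ h_t
    The linearity of the state update is what permits an equivalent
    parallel formulation over the time dimension.
    """
    T, _ = x.shape
    h = np.zeros(a.shape[0])
    ys = []
    for t in range(T):
        h = a * h + B @ x[t]   # linear state update, per-channel decay a
        ys.append(C @ h)       # linear readout
    return np.stack(ys)

# Toy usage: 10 steps, 4-dim input, 8-dim state, 4-dim output
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))
a = rng.uniform(0.5, 0.99, size=8)     # decay rates in (0, 1) for stability
B = rng.normal(size=(8, 4)) * 0.1
C = rng.normal(size=(4, 8)) * 0.1
print(linear_rnn_scan(x, a, B, C).shape)  # (10, 4)
```

Note that in architectures built from such layers, nonlinearity enters between layers rather than inside the recurrence, which is one reason added depth (the point above) could matter more for these models than it does for Transformers.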