I've successfully applied the S4 operator to classification of long videos. It's massively more efficient than a similarly scaled transformer, but it doesn't train as well. Still, even with S4 I got some impressive results, and I'm looking forward to more.
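
For anyone wondering where the efficiency comes from, here is a rough sketch. It is not the actual S4 parameterization (no HiPPO initialization, no structured kernel computation), just a toy diagonal linear state space model with made-up sizes and values. All function names and parameters are hypothetical. It shows that the same layer can be run as a recurrence, which costs time linear in sequence length, or as a causal convolution, whereas self-attention compares every pair of positions.

```python
import math

def ssm_recurrent(a, b, c, u):
    """Run a diagonal linear state space model one step at a time.

    x[k] = a * x[k-1] + b * u[k]   (elementwise, because A is diagonal)
    y[k] = sum(c * x[k])
    Cost is O(L * N) for sequence length L and state size N.
    """
    x = [0.0] * len(a)
    ys = []
    for u_k in u:
        x = [a_n * x_n + b_n * u_k for a_n, b_n, x_n in zip(a, b, x)]
        ys.append(sum(c_n * x_n for c_n, x_n in zip(c, x)))
    return ys

def ssm_kernel(a, b, c, length):
    """Unrolled convolution kernel: K[j] = sum_n c_n * a_n**j * b_n."""
    return [
        sum(c_n * (a_n ** j) * b_n for a_n, b_n, c_n in zip(a, b, c))
        for j in range(length)
    ]

def ssm_convolutional(a, b, c, u):
    """Same output as the recurrence, written as a causal convolution.

    This direct sum is O(L^2); it is written this way only for clarity.
    An FFT-based convolution would bring it down to O(L log L).
    """
    K = ssm_kernel(a, b, c, len(u))
    return [sum(K[j] * u[k - j] for j in range(k + 1)) for k in range(len(u))]

# Toy parameters (assumed values, chosen so that |a_n| < 1 keeps the state stable).
a = [0.9, 0.5, -0.3]
b = [1.0, 0.5, 2.0]
c = [0.2, -1.0, 0.7]
u = [math.sin(0.1 * k) for k in range(64)]

y_rec = ssm_recurrent(a, b, c, u)
y_conv = ssm_convolutional(a, b, c, u)
max_diff = max(abs(p - q) for p, q in zip(y_rec, y_conv))
print("max difference between the two forms:", max_diff)
```

The two forms agree up to floating-point error. The convolutional form is the one that parallelizes well during training, and the recurrent form keeps only a fixed-size state per step, which is what makes very long inputs such as video tractable.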