Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

No, because LSTMs are recurrent. You couldn't use the same algorithm outlined here. Instead you'd have to iteratively pass elements of the sequence through the same layer over and over.


You are confused. The recurrence is within a layer, not between layers. The algorithm shown is for applying a stack of layers, but it doesn't really matter what the layers are. You can do the same (and people have been doing the same) with RNNs, convolutional networks, etc.

In reality it would typically be more complex for decoders, because you want to pass along a cache (such as a key-value cache in a transformer), add residual connections, etc.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: