This looks very intriguing, but I'm at a loss for what it is and how it works. Can someone provide an explanation at a level between a tweet and an academic paper?
Ah, algebraic effect systems, if only more languages had them. Then we wouldn't need keywords like async, mutable, immutable, etc., or function colors at all. Rust is beginning to add something like this.
Add and Mult are added as extensions of the type Effect.t, and run installs the effect handler. To add more operations you would have to extend Effect.t again and then set up a handler somewhere, roughly like the sketch below.
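The pattern looks something like this (a minimal sketch assuming OCaml 5's Effect module; the integer payloads and the body of run are my own illustration, not the article's exact code):

```ocaml
open Effect
open Effect.Deep

(* Declare new operations by extending the built-in extensible type. *)
type _ Effect.t += Add : int * int -> int Effect.t
                 | Mult : int * int -> int Effect.t

(* run installs a deep handler that interprets Add and Mult and
   resumes the suspended computation with each result. *)
let run (f : unit -> 'a) : 'a =
  match_with f ()
    { retc = (fun x -> x);
      exnc = raise;
      effc = (fun (type c) (eff : c Effect.t) ->
        match eff with
        | Add (x, y) ->
            Some (fun (k : (c, _) continuation) -> continue k (x + y))
        | Mult (x, y) ->
            Some (fun (k : (c, _) continuation) -> continue k (x * y))
        | _ -> None) }

let () =
  (* perform suspends the computation; the handler supplies the answer. *)
  run (fun () ->
    let s = perform (Add (1, 2)) in
    Printf.printf "%d\n" (perform (Mult (s, 4))))  (* prints 12 *)
```

The `| _ -> None` case is what makes this composable: effects this handler doesn't know about propagate outward to an enclosing handler, which is how you would layer in new operations later.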
The current effects syntax is very rough; it's considered an experimental feature by the compiler team. Currently, effect handlers are exposed to users as a thin wrapper on top of their implementation in the runtime. A better syntax will come, but the OCaml dev team wants to take the time necessary to get it right.
I'm only superficially involved with machine learning: is it ever required for applications to implement their own differentiation for something specific? In my line of work it's mostly transfer learning, maybe with small modifications to architectures and retraining. Is this only for researchers?
Automatic differentiation gives you the ability to freely define the forward propagation of your neural network, and you get backpropagation for free. The NN library "Flux" for Julia makes great use of this: having automatic differentiation makes it very simple to define novel layers with very little work.
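To make the "backpropagation for free" point concrete in the language the article is about: below is a minimal reverse-mode AD sketch using OCaml 5 effects (my own illustration of the technique, not Flux's or the article's actual code). You only write the forward expression; the handler performs the backward pass as the continuations return:

```ocaml
open Effect
open Effect.Deep

type num = { v : float; mutable d : float }  (* value and adjoint *)

type _ Effect.t += Add : num * num -> num Effect.t
                 | Mult : num * num -> num Effect.t

let ( +! ) a b = perform (Add (a, b))
let ( *! ) a b = perform (Mult (a, b))

(* grad f x: run the forward pass under a handler; each clause resumes
   the continuation, then accumulates adjoints once it returns. *)
let grad (f : num -> num) (x : float) : float =
  let x = { v = x; d = 0.0 } in
  ignore (match_with f x
    { retc = (fun r -> r.d <- 1.0; r);  (* seed: d(output)/d(output) = 1 *)
      exnc = raise;
      effc = (fun (type c) (eff : c Effect.t) ->
        match eff with
        | Add (a, b) -> Some (fun (k : (c, _) continuation) ->
            let y = { v = a.v +. b.v; d = 0.0 } in
            let r = continue k y in    (* rest of the forward pass *)
            a.d <- a.d +. y.d;         (* backward: d(a+b)/da = 1 *)
            b.d <- b.d +. y.d;
            r)
        | Mult (a, b) -> Some (fun (k : (c, _) continuation) ->
            let y = { v = a.v *. b.v; d = 0.0 } in
            let r = continue k y in
            a.d <- a.d +. y.d *. b.v;  (* backward: d(a*b)/da = b *)
            b.d <- b.d +. y.d *. a.v;
            r)
        | _ -> None) });
  x.d

let () =
  (* f(x) = x*x + x, so f'(x) = 2x + 1, and f'(3) = 7 *)
  Printf.printf "%g\n" (grad (fun x -> (x *! x) +! x) 3.0)
```

The trick is that `continue` only returns after the rest of the forward pass has finished, so the code after it runs in exactly reverse order: that is the backward pass.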
> is it ever required for applications to implement their own differentiation for something specific?
If you want to do backpropagation you need to manually or automatically calculate derivatives with respect to your parameters.
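Concretely, for a loss $L$, network output $\hat{y}$, and a parameter $\theta$, backpropagation is just the chain rule applied through every layer:

$$\frac{\partial L}{\partial \theta} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial \theta}$$

AD automates exactly this bookkeeping so you never have to write the middle factors by hand.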
IIRC even two years ago you could get gradients via AD in Flux (you are completely correct about the name).
Nowadays you have Zygote (https://fluxml.ai/Flux.jl/stable/training/zygote) to calculate gradients with AD.
In any case, AD in general is useful for NNs when you want to implement a novel layer. Of course, you could derive backpropagation by algebraic or manual differentiation instead; see the sketch below.
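For contrast, here is what the manual route looks like for a toy function: you work out the derivative algebraically once and hard-code it next to the forward pass (a hypothetical example, not from the thread). This is fine for a single layer but does not compose as the architecture changes:

```ocaml
(* Manual differentiation of f(x) = x*x + x:
   the derivative f'(x) = 2x + 1 is derived by hand and hard-coded. *)
let f x = (x *. x) +. x
let f' x = (2.0 *. x) +. 1.0

let () = Printf.printf "f(3) = %g, f'(3) = %g\n" (f 3.0) (f' 3.0)
(* prints f(3) = 12, f'(3) = 7, matching what AD computes automatically *)
```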