> In contrast, running the neural network takes 5μs (twenty thousand times faster) with only a small loss in accuracy. This “approximate function inversion via gradients” trick is a very general one that can not only be used with dynamical systems, but also lies behind the fast style transfer algorithm.

Very interesting! Follow-up question -- how would you choose the network architecture?



Since the network only acts on a small portion of the entire system, we can constrain it so that remarkably simple NNs work just fine.

`FastChain(FastDense(3,32,tanh), FastDense(32,32,tanh), FastDense(32,2))` (from [0]) takes three inputs from your basis, runs them through two 32-unit tanh hidden layers, and gives you two outputs.
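
If it helps to see that chain in context, here's a rough sketch of plugging it into a UDE. Everything besides the chain itself is made up for illustration: a hypothetical two-state system where the linear terms are known and only the interaction terms come from the NN.

    using DiffEqFlux, OrdinaryDiffEq

    # Same shape as the chain above: 3 inputs (two states plus time), 2 outputs.
    nn = FastChain(FastDense(3, 32, tanh), FastDense(32, 32, tanh), FastDense(32, 2))
    p_nn = initial_params(nn)

    # Universal differential equation: the known physics stays hand-written,
    # the NN only supplies the two unknown interaction terms.
    function ude!(du, u, p, t)
        z = nn([u[1], u[2], t], p)
        du[1] =  1.3 * u[1] + z[1]   # known growth + learned correction
        du[2] = -1.8 * u[2] + z[2]   # known decay  + learned correction
    end

    prob = ODEProblem(ude!, [0.44, 4.6], (0.0, 3.0), p_nn)
    sol = solve(prob, Tsit5(), saveat = 0.1)

The point is that the NN never has to learn the whole vector field, just the small piece you can't write down, which is why such a small chain is enough.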

The example in [1] also uses two hidden layers; it's one of the more complex solutions I've seen so far. To move up to that complexity from a simpler chain, we first make sure the solution isn't stuck in a local minimum [2], then increase the parameter count if the NN still fails to converge.
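
One common way to avoid that kind of local minimum is to fit a short chunk of the trajectory first and then grow the training window, warm-starting each stage from the last. A rough sketch continuing from the snippet above (`data` here is just a stand-in for your 2xN measurement matrix):

    # `data` stands in for measurements sampled every 0.1 time units;
    # random numbers here just so the sketch runs end to end.
    data = rand(2, 31)

    function loss(p, tspan)
        _prob = remake(prob, p = p, tspan = tspan)
        sol = Array(solve(_prob, Tsit5(), saveat = 0.1))
        sum(abs2, sol .- data[:, 1:size(sol, 2)])
    end

    # Stage 1: train on a short window to avoid bad local minima.
    res1 = DiffEqFlux.sciml_train(p -> loss(p, (0.0, 1.0)), p_nn,
                                  ADAM(0.05), maxiters = 300)

    # Stage 2: warm-start from stage 1 and fit the full time span. If this
    # still won't converge, widen the FastDense layers and try again.
    res2 = DiffEqFlux.sciml_train(p -> loss(p, (0.0, 3.0)), res1.minimizer,
                                  ADAM(0.01), maxiters = 500)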

[0] https://diffeqflux.sciml.ai/dev/FastChain/
[1] https://github.com/ChrisRackauckas/universal_differential_eq...
[2] https://diffeqflux.sciml.ai/dev/examples/local_minima/



