I wince at submitting my own tweets to HN, since I feel like tweets should have to clear a very high bar to be featured here.
But this is sort of an Ask HN, or "ask the world" type question, and I spent some time formulating an exact description of the idea in that tweet chain.
Anyone know of prior research into this? This is a "meta" neural network. It's a network that learns the best techniques to help some other network learn some dataset.
Not only would this yield an optimizer tuned to one training session, but that optimizer would also be tuned to that particular dataset!
The advantage is that you'd be able to make that optimizer available to everyone else who's training on that same dataset, and say "as long as you keep the dataset the same, here's an optimizer that goes from 3.0 loss to 0.1 loss in the fewest training steps."
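
To make the idea a bit more concrete, here's a toy sketch of the outer/inner loop structure. Everything in it is an assumption made for illustration: the "other network" is just a linear regression model, the "meta network" is shrunk down to two learnable numbers (a log learning rate and a momentum logit), and the meta-training is plain random hill climbing rather than backprop through the training run. The function names (`make_dataset`, `run_inner_training`, `meta_train`) are made up. A real version would presumably replace the two numbers with a small network that maps gradients to updates.

```python
import numpy as np


def make_dataset(seed=0, n=64, d=5):
    """Fixed synthetic regression dataset (a stand-in for 'the dataset')."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    return X, X @ w_true


def loss_and_grad(w, X, y):
    """Mean squared error of a linear model, plus its gradient w.r.t. w."""
    err = X @ w - y
    loss = float(np.mean(err ** 2))
    grad = 2.0 * (X.T @ err) / len(y)
    return loss, grad


def run_inner_training(opt_params, X, y, steps=20):
    """Train a fresh model for a fixed step budget using the 'learned'
    update rule described by opt_params; return the final loss."""
    log_lr, momentum_logit = opt_params
    lr = np.exp(log_lr)                             # keeps lr positive
    beta = 1.0 / (1.0 + np.exp(-momentum_logit))    # keeps momentum in (0, 1)
    w = np.zeros(X.shape[1])
    velocity = np.zeros_like(w)
    for _ in range(steps):
        _, grad = loss_and_grad(w, X, y)
        velocity = beta * velocity - lr * grad
        w = w + velocity
    final_loss, _ = loss_and_grad(w, X, y)
    # A diverged run counts as the worst possible outcome.
    return final_loss if np.isfinite(final_loss) else float("inf")


def meta_train(X, y, rounds=200, seed=1):
    """Outer loop: hill-climb on the optimizer's own parameters so that
    inner training on this one dataset ends at the lowest loss."""
    rng = np.random.default_rng(seed)
    best = np.array([np.log(1e-3), 0.0])  # deliberately poor starting point
    best_loss = run_inner_training(best, X, y)
    for _ in range(rounds):
        candidate = best + rng.normal(scale=0.3, size=2)
        candidate_loss = run_inner_training(candidate, X, y)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss


if __name__ == "__main__":
    X, y = make_dataset()
    start = np.array([np.log(1e-3), 0.0])
    print("loss with untuned optimizer:", run_inner_training(start, X, y))
    tuned, tuned_loss = meta_train(X, y)
    print("loss with meta-trained optimizer:", tuned_loss)
```

The array that `meta_train` returns is the thing you'd hand to other people training on the same dataset. Whether it transfers to a different dataset or a different model is exactly the open question.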