Teaching a neural network to use a calculator (reiinakano.com)
78 points by baylearn on Nov 14, 2019 | hide | past | favorite | 12 comments



Here the neural network was given examples of how to use the calculator for each question, which means it wasn't generating its own abstractions.

If you wanted to use this to solve other (e.g. programming) problems you would need examples of every step required for almost every problem.

Using neural networks in this way is akin to locality-sensitive hashing. Instead, the network should understand what its lowest-level operators do and discover useful combinations of them that can solve new problems.


I haven't been following this field, but anyone know what happened to Neural Programmer Interpreters (2015)? It seemed like such a promising direction back then. It showed that a neural network can learn to use arbitrary commands to execute algorithms such as multidigit addition and bubble sort: http://www-personal.umich.edu/~reedscot/iclr_project.html

That seems like a much better demo of using black-box tools as substeps in problem solving. Is there a reason why it shouldn't work when the black box is a more complex function like sympy's eval?


> Something that intrigued me in Saxton et. al.’s paper was how high a baseline transformer scored on probability tasks (~0.77 and ~0.73), given that working these out are a multi-step process. How could basic pattern-matching score so highly on such a task? Is mere perception enough to figure out something like the probability product rule, on such a generic architecture without any prior knowledge of numbers or probability?

> To try and explain this, we point out that although questions are unique, a lot of them will share the same answers. For example, Calculate prob of sequence aad from abcda, Calculate prob of sequence bbz from zbbmn, and Calculate prob of sequence rpr from {r: 2, p: 1, x:2} all lead to the same answer, 1/30.

> Doing a bit of analysis on training set questions, we find that out of 1 million samples each, swr_p_level_set and swr_p_sequence have 977179 and 978045 unique questions, respectively. This seems reasonable, as duplicates are limited to <3% of the training set and the distribution over questions appears fairly uniform.

> On the other hand, doing analysis on training set answers reveals that out of 1 million samples each, swr_p_level_set and swr_p_sequence have 1458 and 1865 unique answers, respectively.

> Counting the collective number of samples that share the top K most common answers reveals even more imbalance.

This is the real takeaway for me from the article.
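The 1/30 shared by all three example questions is easy to check by hand: drawing without replacement, each sequence works out to (2/5)(1/4)(1/3). A small sketch (my own helper, not code from the article) that reproduces the arithmetic:

```python
from collections import Counter
from fractions import Fraction

def seq_prob(seq, pool):
    # Probability of drawing the letters of `seq`, in order,
    # without replacement, from the multiset of letters in `pool`.
    counts = Counter(pool)
    total = sum(counts.values())
    p = Fraction(1)
    for ch in seq:
        p *= Fraction(counts[ch], total)
        counts[ch] -= 1
        total -= 1
    return p

print(seq_prob("aad", "abcda"))    # 1/30
print(seq_prob("bbz", "zbbmn"))    # 1/30
print(seq_prob("rpr", "rrpxx"))    # {r: 2, p: 1, x: 2} -> 1/30
```

All three collapse onto the same answer, which is exactly the imbalance the quoted analysis is pointing at.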


From the title, I was expecting the neural network to take an input (e.g., speech or a string "5+11+3=") and then control mouse movements to push the keys on a calculator program (e.g., Windows Calculator). I.e., a neural network driving an existing user interface based on commands from a user.

But the article is more about using neural network transformers to build steps of a mathematical proof with each step checked by a symbolic "calculator". I.e., transformers applied to mathematical proofs.


The fact that a neural network can't reliably calculate, even when trained specifically to do so, shows how limited neural-network-only AGIs are.


Of course you could train a NN to do arithmetic, but this is much more impressive. Training a NN to solve problems with available tools requires more abstraction, and is closer to AGI than essentially learning a LUT.


> Of course you could train a NN to do arithmetic

Are we really capable of teaching a NN to parse and calculate an arbitrary arithmetic expression? Because that sounds incredibly impressive...


Yes, it can be done.

https://openreview.net/pdf?id=S1eZYeHFDS

Natural language is harder.
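For a sense of what "teaching a NN to calculate" involves as a supervised task: the training data is just (expression string, result string) pairs. A hypothetical data-generation sketch (my own, not from the linked paper):

```python
import random

def random_expr(max_terms=3):
    # Build a random flat arithmetic expression like "5+11*3".
    n = random.randint(2, max_terms)
    terms = [str(random.randint(0, 99)) for _ in range(n)]
    ops = [random.choice("+-*") for _ in range(n - 1)]
    expr = terms[0]
    for op, t in zip(ops, terms[1:]):
        expr += op + t
    return expr

random.seed(0)
# Supervised pairs: the model sees the string, must emit the digits.
pairs = [(e, str(eval(e))) for e in (random_expr() for _ in range(1000))]
```

The hard part isn't generating the data, it's getting a sequence model to generalize the carrying/precedence structure rather than memorize, which is what the linked paper is about.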


I'm not sure. Humans are general intelligences, but they have to learn basic maths too.


Here’s one that does so using roman numerals:

http://static.offd.es/numerals/

It’s unsurprisingly easy to implement
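The training set for such a net is straightforward to generate; a sketch (my own encoder, not the linked demo's code), using standard subtractive Roman numerals:

```python
def to_roman(n):
    # Standard subtractive encoding for 1..3999.
    vals = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
            (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
            (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = ""
    for v, s in vals:
        while n >= v:
            out += s
            n -= v
    return out

# Supervised pairs for Roman-numeral addition, e.g. ("XII+IX", "XXI").
pairs = [(f"{to_roman(a)}+{to_roman(b)}", to_roman(a + b))
         for a in range(1, 50) for b in range(1, 50)]
```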


Interesting, but can it do more than addition? Also, it doesn't seem to have 100% accuracy.


Yeah, I only trained addition. Actually, exploring the impact of training a net to perform a range of operations on the minimum plausible neuron count would be quite interesting.

I don’t see any reason why it would be significantly harder to do, however

You’re right about accuracy. I didn’t train the model long enough to push the error low enough to guarantee exact results over the input range. But then again, this was designed as a toy experiment, not something people should rely on.



