It's surprising that GPT-4 could improve so much in maths, especially since GPT-3.5 didn't seem to understand many word problems correctly.
For example, ChatGPT still struggles with this very simple problem; how GPT-4 could do so much better is a bit of a mystery to me:
Mina has a mix of boxes, some yellow and some purple. She sorts 27 greeting cards into the boxes, putting exactly 3 cards into each yellow box, and 7 cards into each purple box. How many purple boxes does Mina have?
(After trying values from 3 to 10, it gave up and said the problem is not solvable. In another run, it mimicked a correct strategy but completely botched the division. It got the correct answer in only one run.)
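For reference, the problem boils down to 3y + 7p = 27, where y is the number of yellow boxes and p the number of purple boxes. Here is a quick brute-force sketch in Python. The function and parameter names are just mine, and I'm assuming "a mix" means at least one box of each colour:

```python
def solve(total=27, yellow_size=3, purple_size=7):
    """Return all (yellow, purple) box counts that use up every card.

    Assumption: "a mix" means at least one box of each colour.
    """
    solutions = []
    # Try every possible number of purple boxes, starting from one.
    for purple in range(1, total // purple_size + 1):
        remaining = total - purple * purple_size
        # The leftover cards must fill a whole, non-zero number of yellow boxes.
        if remaining > 0 and remaining % yellow_size == 0:
            solutions.append((remaining // yellow_size, purple))
    return solutions


print(solve())  # [(2, 3)]
```

So the only answer under that reading is 2 yellow boxes and 3 purple boxes (2 × 3 + 3 × 7 = 27). Without the "at least one of each" assumption, 9 yellow and 0 purple would also satisfy the equation.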
I can't wait to test it out.