Hacker News
Terence Tao Uses GPT-4 to Study Mathematics (twitter.com/blader)
31 points by irthomasthomas on July 4, 2023 | 26 comments



Sorry for the twitter link, but the original article[0] was already submitted 5 days ago and didn't get picked up, so I can't repost it. Thought HN would be interested in Tao using GPT-4 at work.

[0] https://unlocked.microsoft.com/ai-anthology/terence-tao/

Edit: Some choice quotes from Tao (a rough sketch of the first workflow follows the quotes).

  I could feed GPT-4 the first few PDF pages of a recent math preprint and get it to generate a half-dozen intelligent questions that an expert attending a talk on the preprint could ask.

  Strangely, even nonsensical LLM-generated math often references relevant concepts. With effort, human experts can modify ideas that do not work as presented into a correct and original argument.


  The 2023-level AI can already generate suggestive hints and promising leads to a working mathematician and participate actively in the decision-making process. When integrated with tools such as formal proof verifiers, internet search, and symbolic math packages, I expect, say, 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well.
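
The first quote describes a concrete workflow, so here is a rough sketch of what it might look like in code. This is not Tao's actual setup; the file name, page count, and prompt wording are hypothetical, and it uses the 2023-era openai Python API.

  # Hypothetical sketch: ask GPT-4 for expert-style questions about the
  # opening pages of a preprint. "preprint.pdf" is a placeholder, and the
  # openai library reads OPENAI_API_KEY from the environment.
  from pypdf import PdfReader
  import openai  # pre-1.0 (2023-era) API

  reader = PdfReader("preprint.pdf")
  excerpt = "\n".join(
      reader.pages[i].extract_text()
      for i in range(min(3, len(reader.pages)))  # first few pages only
  )

  resp = openai.ChatCompletion.create(
      model="gpt-4",
      messages=[{
          "role": "user",
          "content": "Here are the opening pages of a math preprint:\n\n"
                     + excerpt
                     + "\n\nGenerate six intelligent questions that an "
                       "expert attending a talk on this preprint might ask.",
      }],
  )
  print(resp.choices[0].message.content)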


Thank you for the original link. Here is the point I like (my emphasis):

> Current large language models (LLM) can often persuasively mimic correct expert response in a given knowledge domain (such as my own, research mathematics). But as is infamously known, the response often consists of nonsense when inspected closely. Both humans and AI need to develop skills to analyze this new type of text. The stylistic signals that I traditionally rely on to “smell out” a hopelessly incorrect math argument are of little use with LLM-generated mathematics.

I found the same problem/challenge when using GPT-4 to dig into subjects that are tangential to my main expertise. The good thing is that, since I know the LLM can provide answers that are totally wrong, I am forced to be more critical of the answer than when just reading a book on the subject. I am more active in exploring. Usually I end up with a chat and many open Wikipedia tabs and scientific papers.


Absolutely nailed it. I love articles by people like Tao and Wolfram, as they tend to cut the B.S. and get down to the real utility of it rather quickly.

I was also pleased, in a schadenfreude kind of way, that Tao followed the same dead-end paths as me, and I presume most others, when learning to use GPT-4. Like starting out by trying to be very precise and descriptive, before throwing caution to the wind, embracing the non-deterministic nature of the thing, and just throwing a ton of keywords and loosely worded requests at it.

  The good thing is that, since I know the LLM can provide answers that are totally wrong, I am forced to be more critical of the answer than when just reading a book on the subject.
Yep, having to fact-check its hallucinations has been far less detrimental than I expected. I find, often, if it's a subject I am vaguely familiar with, that the surprises jump out at me; then I can fact-check them and learn something. Actually, many times I was convinced it was hallucinating some CLI tool or option flag, and it turned out to be correct. And those times when we are embarrassingly wrong tend to be the most instructive.

Browsing mode, when used right, was a huge boost for this. I found the optimal use was to structure a prompt like normal, as if targeting GPT-4 WITHOUT browser mode, and then tack a carefully crafted search request, or two, onto the end. This way, it writes a response first, before performing the web searches and augmenting the answer. This acts like chain-of-thought reasoning by expanding the information in the initial prompt. And, as a bonus, it meant you had something to read while waiting for it to finish browsing.
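
Concretely, the prompt shape being described looks something like this (a minimal sketch; the question and search terms are made up):

  # Hypothetical prompt structure: a self-contained question first, with an
  # explicit search instruction tacked onto the end.
  question = (
      "Explain how formal proof verifiers like Lean check a proof, "
      "in a few paragraphs."
  )
  search_tail = (
      "\n\nAfterwards, search the web for 'Lean 4 proof checking' "
      "and revise your answer with anything new you find."
  )
  prompt = question + search_tail  # model answers first, then browses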


> Sorry for the twitter link,

Try:

https://publish.twitter.com/?query=https%3A%2F%2Ftwitter.com...

The publish.twitter.com link generates publishable embed code for magazines, etc.

I don't think they'll rate-limit or login-protect that one.

Also, Terry Tao on Mathstodon on GPT-4 (3 days ago):

https://mathstodon.xyz/@tao/110601051375142142


Thanks, that's good to know for the future.


I think it's better to add a URL fragment like #resubmit in this case.


(Deleted)


His first use case

  > I could feed GPT-4 the first few PDF pages of a recent math preprint and get it to generate a half-dozen intelligent questions that an expert attending a talk on the preprint could ask.
Is that not part of the process of studying math?

Also,

  Strangely, even nonsensical LLM-generated math often references relevant concepts. With effort, human experts can modify ideas that do not work as presented into a correct and original argument.
And,

  The 2023-level AI can already generate suggestive hints and promising leads to a working mathematician and participate actively in the decision-making process. When integrated with tools such as formal proof verifiers, internet search, and symbolic math packages, I expect, say, 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well.


> One of my colleagues was moved to tears by a GPT-4-generated letter of condolence to a relative who had recently received a devastating medical diagnosis.

Nice to know that Theodore Twombly's job from the movie "Her" has already been automated away.

More to the point, I find it hilarious that in the near future AIs will be more creative, convincing, and compassionate than us, and humans will likely be relegated to menial jobs. This is not what sci-fi promised us.


I think most people will eventually find such AI-generated materials to be "cheap". But since you won't be able to escape them in digital media, it may restore value to the handwritten notes and face-to-face conversations that social media has diluted over the previous decades.


That's like saying people will eventually find the printing press cheap because our paperback novels weren't laboriously drawn by monks. People prefer the clarity and uniformity of mechanized typesetting over the artistry of blackletter scribbled in cursive with a bird feather. CGI in movies is a similar story. People want to hear stories told by computers rather than puppets. People don't play Dungeons & Dragons anymore because it's now cheap to have AI generate realistic-looking 3D fantasy worlds on the fly, thanks to video games. Technology wins every single time, because it gives us what we want and it solves the greatest problem people have always had, which is other people. Anyone who says differently is kidding themselves.


> People don't play Dungeons & Dragons anymore

You sure about that?

Also, the examples you're giving aren't remotely analogous to AI-generated letters. Your examples are about the means of communication changing, that is, the medium and presentation, while the underlying meaningful content is still fundamentally generated by human beings. With AI-generated letters there's no underlying meaning or intentionality at all, because no human generated it. So it isn't just the medium that's being changed fundamentally this time, but what's being communicated with it. And we already have an equivalent of AI-generated letters: the premade cards for all occasions you can find at grocery stores. Sure enough, those cards aren't valued highly in comparison to handwritten notes or talking in person.


An example that's more analogous to how you're feeling would be the many great crises of faith that happened in the course of modernity, as science unriddled nature. What you're telling me is that the human mind has a magic touch which imparts meaning and legitimacy to the things it creates. How different is that, really, from our ancestors, who saw magic in the forces that governed the natural world around them? Some people saw the celestial bodies as spirits or even gods who were powerful and watched over humanity. It was a rude awakening for many of them to discover that some scientist in his garage could predict and explain their movements, and to learn that the special roles we'd imagined for ourselves actually had more in common with the things we looked down upon. I think we're at a similar moment today, because science has finally produced a theory of consciousness that's compelling precisely because it lets us recreate it.


That's a straw man. I'm not insisting there's anything magical or spiritual or metaphysically special about consciousness. I am merely insisting that consciousness is a particularly complex, recursive metaprocess that is utterly absent from what GPT is doing. Not that it couldn't theoretically be replicated by a very large and complex neural-network model; just that it is abundantly clear that the sort of thing GPT does and the sort of thing we do are simply not the same. It does not have meaning or intentionality, and it does not have consciousness. You only need to spend a little time actually looking at the drivel it produces when you ask it things to realize that not only is it highly formulaic, it has no underlying understanding of the concepts, principles, facts, or intentions behind what it is doing; it is merely producing an assemblage of words that it has been trained to deem fairly convincing.

The stochastic-parrot model that GPT follows is not remotely a convincing theory of consciousness, and it has certainly not let us reproduce consciousness; I can hardly believe you are saying it has. That's like saying that traditional game-graphics rendering, before ray tracing became mainstream, is a "compelling theory" of how physical light actually works because some game looks almost photorealistic. In this analogy, I'm the one pointing out that what light does in the real world is extremely different from what traditional game graphics do, and that even if you can get the result superficially close, it will never be as powerful as the real thing; for that you would need a much more computationally expensive ray-tracing model. I'm not saying there's anything inherently unique about human brains, or that you can't model consciousness with a computer. I'm saying we haven't gotten there yet.


But it is, though. You could learn a lot about physics by reading video game source code. I'm not saying that understanding a thing well enough to emulate it implies mastery over the thing itself. For example, the Blink emulator I wrote does a pretty good job of passing as an x86-64 Linux computer, and I think that makes it x86-64-Linux enough, even though I don't necessarily understand how Intel's foundries work. I'm not saying video games and GPT are a theory of everything. But if it acts consciously enough to fool us into thinking it's conscious, then I think that makes it conscious enough, which would mean the ideas that went into building it reflect the deepest understanding of consciousness we've got.


It doesn't act conscious enough to fool us, though. It doesn't seem remotely conscious even at first blush, and it seems less so the more you interact with it. And I think the stochastic-parrot "model" of consciousness is very obviously not a good model for understanding the processes going on in the human brain, because it leaves out so many aspects: that self-awareness and metacognition are key components of consciousness; that humans are trained as they grow up by experimenting with an environment that rewards understanding consistent principles and concepts, rather than being rewarded merely for stringing together vaguely convincing-sounding sentences; among others. Just because it can create something vaguely, superficially similar doesn't mean it's a good model for understanding consciousness. That's not how this works.


Are you saying the Turing test wasn't passed? If you think you understand machine learning well enough to dismiss it as a science because you know a buzzword from a feminist critique that also came out of Google Brain, then I'd say their inclusivity initiative certainly did its job. The reason it passes the Turing test is that the rest of us can't tell it apart from people like you.


I have to ask, are you seriously claiming that stuff like ChatGPT appears conscious to you? Do you really think we've created consciousness?


[flagged]


TL;DR: just because one person, or even a group of them, is gullible enough to be fooled by something doesn't mean that thing is indistinguishable enough from consciousness to count as conscious for all intents and purposes, which is the idea behind the Turing test. Humans tend to anthropomorphize and to be trusting and gullible, and the Turing test is a short, limited scenario with no real larger context in play. To actually figure out whether something is indistinguishable from a conscious being, you'd need many in-depth interactions with it over a long period of time, to really get a sense for how it thinks and operates. And just because something superficially looks conscious doesn't mean it will actually be able to do all the things conscious beings can, i.e., that it will actually be conscious; nor does it mean that the processes going on under the hood are the same as those in conscious beings. Nor should bringing up philosophy of mind make you think I'm an LLM, and if it does, that probably means you can't tell the difference between bullshit and abstraction, which says more about your intelligence than mine.


I think your comparisons miss the point: those technologies improved the means of producing things made by humans, whereas these applied-statistics models seek to emulate the humans themselves. You say "it gives us what we want", but does it?


Absolutely, for some. Language models could satisfy the need people have for social rewards. Imagine a chatbot that masqueraded as a group of adoring online fans who respected you and made you feel like a famous, high-status individual. People would love that. If that happened, it would be par for the course in the history of modernity. Consider the role restaurants play for the middle class: you don't have to be a literal king or noble lord to have people waiting on you and serving you. Yes, people know the power is illusory, but they still pay to go to restaurants. People who act abrasively towards each other online tend to assume that AI will behave the same way they do, but the underlying impulse, I think, has more to do with a desire for social status and influence, and AI could offer a more effective and pragmatic means of giving it to them.


A relevant comic from SMBC: https://www.smbc-comics.com/comic/sad-2


About 15 years ago, I spent a while on a project to train a language model on a corpus of math, with the idea of creating a math-exploration assistant. The project foundered on my inability to extract the typeset math and proofs from print materials into a consistent, machine-readable form. How did OpenAI do it? Don't tell me they OCRed Google Books.
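
For what it's worth, the failing step is easy to illustrate. A minimal sketch, assuming a plain-text extraction pass with Poppler's pdftotext CLI ("paper.pdf" is a placeholder):

  # Naive plain-text extraction flattens typeset math: superscripts,
  # subscripts, and fraction bars lose their 2D structure.
  import subprocess

  text = subprocess.run(
      ["pdftotext", "-layout", "paper.pdf", "-"],  # "-" writes to stdout
      capture_output=True, text=True, check=True,
  ).stdout
  # A formula like x^2 + y^2 = z^2 typically comes out as "x 2 + y 2 = z 2"
  # or worse, with no reliable way to recover the original LaTeX.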


Probably in combination with Mechanical Turk workers.


Twitter links are pretty useless now that non-users can't see tweets anymore.


I know, I'm sorry. Another user shared this link, which can be viewed while logged out: https://publish.twitter.com/?query=https%3A%2F%2Ftwitter.com...

And the main article is here: https://unlocked.microsoft.com/ai-anthology/terence-tao/



