There is no such thing as too much optimization. Early stopping exists to prevent overfitting to the training set. It's a trick, like most advances in deep learning, because the underlying mathematics is fundamentally not suited for creating intelligent agents.
Is overfitting different from 'too much optimization'? Optimization still needs a value that is optimized. Overfitting is the result of too much optimization toward not quite the right value (i.e. training error, when what you actually want to reduce is prediction error).
I think the miscommunication is due to the proxy nature of our modeling. From one perspective, yes, you're right, because it all comes down to your optimization function and objectives. But if we're in a context where we recognize that the practical usage of our model relies on it being an inexact representation (a proxy), then certainly there is such a thing as too much optimization. I mean, most of what we try to model in ML is intractable.
In fact, the entire notion of early stopping is due to this. We use a validation set as a pseudo test set to inject information into our optimization process without leaking information from the test set (this is why you shouldn't choose parameters based on test results - that is spoilage, and it doesn't matter if it's the status quo, it's still spoilage).
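To make the mechanics concrete, here's a minimal, self-contained sketch in Python (numpy only; the over-parameterized polynomial fit is a toy stand-in for a real model, not any particular library's API):

```python
# Toy early stopping: gradient descent on an over-parameterized polynomial
# fit, halted when the held-out validation loss stops improving.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=60)
y = np.sin(3 * x) + 0.3 * rng.normal(size=60)      # noisy target
X = np.vander(x, 15)                               # 15 features: easy to overfit
X_tr, y_tr, X_va, y_va = X[:40], y[:40], X[40:], y[40:]

w = np.zeros(15)
best_val, best_w = np.inf, w.copy()
bad, patience = 0, 50
for step in range(20000):
    grad = X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)  # MSE gradient on train only
    w -= 0.1 * grad
    val = np.mean((X_va @ w - y_va) ** 2)
    if val < best_val:
        best_val, best_w, bad = val, w.copy(), 0
    else:
        bad += 1
        if bad >= patience:                        # validation stopped improving
            break

w = best_w                                         # roll back to the best checkpoint
print(f"best validation MSE: {best_val:.3f}")
# The test set never enters this loop; the validation set does, which is
# exactly the "injected information" described above.
```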
But we also need to consider that a lack of divergence between train/val does not mean there isn't overfitting. Divergence implies overfitting, but the inverse statement is not true. I state this because it's both relevant here and an extremely common mistake.
Most practitioners seem to understand that what they are doing is creating executable models and they don't confuse the model based on numeric observations with the actual reality. This is why I very much dislike all the AI hype and the rebranding of statistical models as artificial "intelligence": people who are not aware of what the words mean get very confused and start thinking that they themselves are nothing more than computers executing algorithms to fit numerical data to some unspecified cognitive model.
> Most practitioners seem to understand that what they are doing is creating executable models and they don't confuse the model based on numeric observations with the actual reality.
I think you're being too optimistic, and I'm a pretty optimistic person. Maybe it is because I work in ML, but I've had to explain this concept to a large number of people. It doesn't matter whether it's academia or industry; it is true for both management and coworkers. As far as I can tell, people seem very happy to operate under the assumption that benchmark results are strong indicators of real-world performance __without__ the need to consider the assumptions behind your metrics or data. I've even proven this to a team at a trillion-dollar company, where I showed that a model with lower test set performance had more than double the performance on actual customer data. The response was "cool, but we're training a much larger model on more data, so we're going to use that because it is a bit better than yours." My point was that the problem still exists in that bigger model with more data, but that increased params and data do a better job of hiding the underlying (and solvable!) issues.
In other words, in my experience people are happy to be Freeman Dyson in the conversation Calavar linked[0] and very upset to hear Fermi's critique: being able to fit data doesn't mean shit without either a clear model or a rigorous mathematical basis. Much of data science is happy to just curve fit. But why shouldn't they? You advance your career the same way, judged by bureaucrats who understand the context of the metrics even less.
I've just experienced too many people who cannot distinguish empirical results from causal models. And a lot of people who passionately insist there is no difference.
Are you launching into a semantic argument about the word 'experience'? If so, it might help to state what essential properties AlphaGo was missing that make it 'not have an experience'.
Otherwise this can quickly devolve into the common useless semantic discussion.
Just making sure no one is confused by common computationalist sophistry and how they attribute personal characteristics to computers and software. People can have and can create experiences, computers can only execute their programmed instructions.
And I am saying they are confused because they are attributing personal characteristics to computers and software. By spelling out what computers are doing it becomes very obvious that there is nothing that can be aware of any experiences in computers as it is all simply a sequence of arithmetic operations. If you can explain which sequence of arithmetic operations corresponds to "experiences" in computers then you might be less confused than all the people who keep claiming computers can think and feel.
> By spelling out what computers are doing it becomes very obvious that there is nothing that can be aware of any experiences in computers as it is all simply a sequence of arithmetic operations.
By spelling out what brains are doing it becomes very obvious that it's all simply a sequence of chemical reactions - and yet here we are, having experiences. Software will never have a human experience - but neither will a chimp, or an octopus, or a Zeta-Reticulan.
Mammalian neurons are not the only possible substrate for intelligence; if they're the only possible substrate for consciousness, then the fact that we're conscious is an inexplicable miracle.
If an algorithmic process is an experience and a collection of experiences is intelligence, then we get some pretty wild conclusions that I don't think most people would attempt to claim, as it'd make them sound like a lunatic (or a hippy).
Consider the (algorithmic) mechanical process of driving a screw into a board. This screw has an "experience" and therefore intelligence. So... the screw is intelligent? Very low intelligence, but intelligent according to this definition.
But we have an even bigger problem. There's the meta-set of experiences: the collection of several screws (or the screw, board, and screwdriver together). So we now have a meta-intelligence! And we have several, because there are different operations to perform on these sets.
You might be okay with this, or maybe you're saying it needs memory. If the latter, you hopefully quickly realize that this means a classical computer is intelligent, but given the many ways information can be stored, it does not solve our conundrum above.
So we must then come to the conclusion that all things AND any set of things have intelligence. Which kinda makes the whole discussion meaningless. Or we need a more refined definition of intelligence, one which more closely reflects what people are actually trying to convey when they use this word.
> If an algorithmic process is an experience and a collection of experiences is intelligence
Neither, what I'm saying is that the observable correlates of experience are the observable correlates of intelligence - saying that "humans are X therefore humans are Y, software is X but software is not Y" is special pleading. The most defensible positions here are illusionism about consciousness altogether (humans aren't Y) or a sort of soft panpsychism (X really does imply Y). Personally I favor the latter. Some sort of threshold model where the lights turn on at a certain point seems pretty sketchy to me, but I guess isn't ruled out. But GP, as I understand them, is claiming that biology doesn't even supervene on physics, which is a wild claim.
> Or we need a more refined definition of intelligence, one which more closely reflects what people are actually trying to convey when they use this word.
Well that's the thing, I don't think people are trying to convey any particular thing. I think they're trying to find some line - any line - which allows them to write off non-animal complex systems as philosophically uninteresting. Same deal as people a hundred years ago trying to find a way to strictly separate humans from nonhuman animals.
Continuing this reductio ad absurdum, you might reach the fallacious conclusion, as some famous cranks in the past did, that intelligence is even found in plants, animals, women, and even the uncivilized savages of the new continent.
Intelligence appears in gradients, not a simple binary.
> Intelligence appears in gradients, not a simple binary.
Sure, I'm in no way countering such a notion and your snarky comment is a gross mischaracterization of my comment. So far off I have a difficult time believing it isn't intentional.
The "surprise" is not that plants, animals, or even women turn out to be intelligent under the definition of "collection of experiences" but that rocks have intelligence, atom, photons, and even more confusingly groups of photons, the set of all doors, the set of all doors that such that only one door per city exists in the same set. Or any number of meta collections. This is the controversial part, not women being intelligent. Plants are still up for debate, but I'm very open to a broad definition of intelligence.
But the issue is that I, and the general fields of cognitive science, neuroscience, psychology, and essentially everyone except for a subset of computer scientists, agree that intelligence is more than a collection of experiences (even if that collection has memory). In other words, it is more than a Turing machine. What that "more" is, is debated, but it is still generally agreed that intelligence requires abstraction, planning, online learning, and creativity. All of these in turn have complicated, nuanced definitions that go well beyond what the average person thinks they mean. That's a classic issue: academics use the same words normal people do, but with far more restrictions on their meaning. This often confuses the average person when they are unwilling to accept that words can have different meanings in different contexts (despite the fact that we all do this quite frequently, and the concept appears in both our comments).
You seem to use the word intelligence to mean `consciousness` (if you replaced the former with the latter, I would agree with your argument).
I would define "intelligence" as (1) the ability to learn or understand or to deal with new or trying situations and (2) the ability to apply knowledge to manipulate one's environment.
It turns out that this is also the Merriam-Webster definition [0].
By that definition, yes, AlphaZero was learning and understanding how to deal with situations and is intelligent; and yes, most machine-learning systems, and many other systems that have a specific goal and manipulate data or the environment to optimize for that goal, are intelligent.
By this definition, a non-living, non-conscious entity can be intelligent.
And intelligence has nothing to do with "experiences" (which seem to belong in the "consciousness" debate).
This is a common retort. You can read my other comments if you want to understand why you're not really addressing my points: I have already addressed why reductionism does not apply to living organisms but does apply to computers.
The comments where you demand an instruction set for the brain, or else you'll dismiss any argument saying its actions can be computed? Even after people explained that lots of computers don't even have instruction sets?
And where you decide to assume that non-computable physics happens in the brain based on no evidence?
What a waste of time. You "addressed" it in a completely meaningless way.
Every software platform is essentially a digital panopticon and as AI gets deployed more widely in the real world this will also be increasingly true for non-software interactions. My guess is everyone is eventually going to carry an AI "assistant" that records all signals and gives guidance to its owner just like most people in the developed world today carry a cell phone and a credit card.
LLMs can be whatever labels people choose to attribute to the system executing the instructions to generate "answers". It is fundamentally a category error to attribute any meaning to whatever arithmetic operations the hardware is executing because neither the hardware in the data center nor the software have any personal characteristics other than what people erroneously attribute to them because of confused ontologies and metaphysics about computers and software.
At which point would such attributions be accurate? Humans are fundamentally just computers too. A different medium, but still transforming electrical signals.
Extremely weird to me when people compare themselves to computers. What is that philosophical stance called and do you have any references for long form writing which makes the case for why people are "just" computers?
I am familiar with the computational theory of cognition. What I wanted to know was whether there were any people who actually claimed their thinking is nothing more than programmed computation. I am very curious to know if they have mapped out the instruction set for their mind along the lines of something like the SKI combinators.
A mental instruction set would be extremely interesting. Unfortunately, nobody has that level of understanding of brain processes (and it might be quite difficult to formulate in such a linear way, since the underlying mechanism is so very parallel), but the idea that human cognition is computable falls pretty naturally out of the idea that nature is computable, which I think is a common position (sometimes called the Church-Turing-Deutsch principle).
Yes, I understand why some scientists claim that nature is "just" some computer, but still no one has answered my very basic question: what is the instruction set that the people who claim they are computers are using to think? Surely there must be one if they are nothing more than programmable computers, as they claim.
Just trying to figure out how rigorously people have thought about this. A computer with an undefined instruction set seems somewhat useless as a computer.
If you don't know how something works, do you assume it is magic? Why? It's wildly irrational to assume that it is magic rather than to assume, absent evidence to the contrary, that it works according to known physics.
Well, that's essentially a logical tautology. Everything works according to the known laws of physics, but it's certainly also true that everything must work according to as-yet-unknown laws of physics, because of basic human ignorance.
No, it is not. The argument is that absent any evidence of such unknown physics happening in the brain, the most logical assumption is that there isn't any, rather than presuming the existence of something we've never observed any hint of.
Rule 110 can be specified with a rewrite system; it is a cellular automaton: https://arxiv.org/abs/0906.3248. Cellular automata have a correspondence with contextual grammars: https://www.cis.upenn.edu/~cis5110/notes/tcbook-lang.pdf. Each is equivalent to a Turing machine, which is another way of saying that there is a program for it that can be specified on a Turing machine with the usual Turing machine instruction set for writing, reading, and erasing binary digits on a tape. That program can then be "compiled" into a rewrite system corresponding to the instruction set for rule 110.
The reason rule 110 is said to be Turing complete is because someone went through the trouble of specifying an instruction set for rule 110 so that other people could verify that it would be possible to write programs with it. This is not the case for the people who claim that they are computers. They always leave the instruction set undefined which makes their claims hard to believe.
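For anyone curious how little machinery is involved, here's a minimal sketch in Python (the periodic boundary and single-live-cell seed are my arbitrary choices):

```python
# Rule 110 as a rewrite over a row of cells: each new cell is a fixed
# function of its three-cell neighborhood, with the bits of the number
# 110 serving as the entire lookup table.
def rule110_step(cells):
    n = len(cells)
    return [
        (110 >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

row = [0] * 31 + [1]                        # single live cell at the right edge
for _ in range(16):
    print("".join(".#"[c] for c in row))    # '#' = 1, '.' = 0
    row = rule110_step(row)
```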
I personally have no problem with people who think they're computers but if they're not programmable then I'm not sure what the point would be of calling themselves computers.
> This is not the case for the people who claim that they are computers. They always leave the instruction set undefined which makes their claims hard to believe.
What is your alternative? Can you explain to us, down to the atomic level, how the brain could possibly do something that not only is impossible to simulate, but that also does not still constitute computation?

We don't even have language for talking about "operations" so different from all forms of computation that they are not just another form of computation.
Just try to describe one such hypothetical state change that can not be reproduced with a Turing computable function.
At the same time, your insistence on "instruction sets" is meaningless. An "instruction set" as we tend to consider them is not necessary to parameterize a function. A neural network whose input/output is used to provide the "tape" can trivially be made Turing complete. If you consider the weights or connections of the network an instruction set, then there you go: the fact that we don't know how to measure and extract all the details of the neural network of a brain does not mean we can't observe its presence. And it also does not mean we haven't done a vast number of measurements without observing any hint of unknown physics affecting state transitions.
To simplify: even a simple mechanical thermostat is parameterized - the dial provides "an instruction set" in the form of a single settable threshold that alters the behaviour of the function computed.
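If it helps, here's the same point as a few lines of Python (a software stand-in for the mechanical device, obviously):

```python
# A thermostat as a parameterized function: the setpoint "dial" is the whole
# "instruction set", yet turning it changes which function gets computed.
def make_thermostat(setpoint):
    return lambda temperature: temperature < setpoint   # True = heat on

heat_on = make_thermostat(setpoint=20.0)
print(heat_on(18.5), heat_on(21.0))                     # True False
```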
But if you expect something that looks like what we typically talk about when we talk about an instruction set, then that is a very limiting view of computation, and one I've already pointed out to you is just one part of the multiple types of computational devices we've built. Including heavily parameterisable ones.
I expect claims to be backed by evidence that is consistent with our current state of knowledge. I have seen no such evidence so that's why I asked for references. In any case, this discussion has run its course, best of luck to you and your future computations.
And wouldn't that language need to be able to account for different physiological states? Thinking when one is hungry or sleepy is quite different than thinking when one is well-fed or fully rested.
Yes. To validate the claim would require not only a formal instruction set but also the code to account for all sorts of cognitive states and processes. I'm not ruling out that some people are indeed programmable computers but I would like to see some actual evidence presented by the people who make these claims about themselves.
For us not to be would require brains to be able to compute functions that cannot be computed by an artificial computer. That would seem an extraordinary suggestion, given that we have no indication of unusual physics in the brain.
You'll have to define your terms first. Physicists now believe there is such a thing as dark matter and that there are objects so massive that no amount of observation can ever make sense of how massive they are because it is impossible to model it mathematically.
I am not the one making any extraordinary claims. Physicists themselves admit there are aspects of reality with no computational basis.
These terms have well understood meanings, and dark matter or black holes are entirely irrelevant to what I said.
For brains not to be computers would mean the physical Church-Turing thesis is invalid, and proof of that would be extraordinary enough to be Nobel Prize material.
Whether something is physical or not is orthogonal to whether it computes or not. You're the one who brought up physics, so that's why I showed why your logic was invalid. My contention was that calling something a computer without providing an instruction set is nonsensical, and I wanted to know whether someone had actually spent the time to rigorously think about what a computer without an instruction set would entail. So far it seems like no one has spent any time really thinking about it, but that's probably for the best anyway. I'm sure an LLM will eventually figure out an instruction set for programming people and then take over the world.
The idea that a discernable instruction set is needed for something to compute suggests you don't understand how fundamental computation is.
We have built computers without instruction sets, e.g. in the form of mechanical devices to carry out calculations. Fairly complex computations were done that way before general purpose programmable computers, but even many early programmable computers had no fixed instruction set.
There is a rich history of computation through wiring up calculations without any instructions involved. And for that matter of mechanical computation.
Here's an outline for a simple computational device:
A bucket.
Pour predefined quantities of water into a bucket, and you can compute a threshold. Use buckets of different sizes and overflows, and you can separate a numeral into binary digits. Drain them into containers of different sizes and you can carry out logical operations. (Actual computation has been done this way - fluidics is one example, a field that dates back to the Tesla valve in 1920.)
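A toy rendering of the same idea, for anyone who wants to play with it (Python standing in for the plumbing; the capacities are chosen by hand to make the logic work):

```python
# Buckets as logic gates: pour fixed quantities in, and overflow past the
# capacity is the output - no instruction set anywhere, yet it computes.
def bucket_overflows(pours, capacity):
    level, overflowed = 0.0, False
    for amount in pours:
        level += amount
        if level > capacity:
            level, overflowed = capacity, True   # the excess spills over
    return overflowed

AND = lambda a, b: bucket_overflows([a, b], capacity=1.5)  # needs both pours
OR  = lambda a, b: bucket_overflows([a, b], capacity=0.5)  # one pour is enough

print(AND(1, 1), AND(1, 0), OR(1, 0), OR(0, 0))  # True False True False
```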
Every physical interaction is computation, whether or not it is useful computation. The notion that computation requires an instruction set confuses a very limited notion of classical programmable computers with the general concept of computation.
It is also a notion contradicted by the history of computation, which is full of computation without an instruction set, and of implementing computers with instruction sets in terms of computations of fixed function devices without one.
E.g. it's not turtles all the way down - that instruction set runs on a CPU that ultimately is built of fixed function logic.
Instruction sets are an optional high level abstraction.
Steam engines and gears are a specific physical manifestation of computation. Computation does not have a single, specific physical manifestation - it can be, and has been, done with organic matter, electronics, gears, pipes of water, light.

Per the Church-Turing thesis these can all compute the same set of functions, and unless you can demonstrate that brains, and only brains, can evoke unknown physics that allows them to compute a set of functions that cannot be computed by other means, the most logical assumption is that the thesis holds, including for brains.
Especially given how much we measure brains without seeing any signs of unusual physics.
I think I understand. So what you're saying is that every function that can be implemented with computers must be computable. Your claim is that the brain is actually a computable function, can you tell me which one it is using your favorite version of a Turing complete instruction set? Or maybe I misunderstood and what you're saying is that the brain is not the function but what it does is compute a specific function called your mind in some unknown instruction set?
I'm saying that per the physical Church-Turing thesis, any function that is computable by ordinary physical means is Turing computable, and we have no evidence that even hints at the physical Church-Turing thesis not holding.
For it not to hold, there would need to be something unique about the physics of a brain that allows it to compute a class of functions which are inherently impossible to compute by other means. That'd imply entirely new/unknown physics that we're somehow not seeing any hints of.
> Your claim is that the brain is actually a computable function, can you tell me which one it is using your favorite version of a Turing complete instruction set?
No, my claim is that absent evidence of unknown physics or another way of disproving the physical Church-Turing thesis, the rational assumption is that the brain follows the same laws of physics as everything else, and so is limited to computation that is equivalent in power to Turing computable functions, just like everything else we know of.
For the brain not to be a computer would imply "magic" - not just that we don't know how the brain works, but for the brain to work in ways inconsistent with all known physics, and inconsistent in ways impossible to simulate with Turing computable functions. No sign of any such unknown physics happening in the brain has ever been recorded.
You made an incorrect assessment of a basic calculation in algebraic topology and claimed that it was correct. You didn't even look at what it was computing and simply looked at the final answer, which lined up with the answer on Wikipedia. Simplicial calculations for projective planes are not simple. The usual calculations are done with a cellular decomposition, and that's why the LLM gives the wrong answer: the actual answer is not in the dataset and requires reasoning.
Are you confusing me with someone else? When I asked it GPT computed the homology from the CW decomposition of RP^2 with three cells. Which is a very simple exercise.
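For anyone following along, the exercise goes like this (a standard computation, stated here for reference): give RP^2 the CW structure with one 0-cell, one 1-cell, and one 2-cell. The cellular chain complex is

$$0 \longrightarrow \mathbb{Z} \xrightarrow{\ \times 2\ } \mathbb{Z} \xrightarrow{\ 0\ } \mathbb{Z} \longrightarrow 0,$$

where the $\times 2$ records the attaching map of the 2-cell wrapping twice around $\mathbb{RP}^1$, so $H_0 = \mathbb{Z}$, $H_1 = \mathbb{Z}/2\mathbb{Z}$, and $H_2 = 0$.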
That's ok. It seems like LLMs know all about simplicial complexes and homology so I'll spend my time on more fruitful endeavors but thanks for the advice.
To be fair, it's not a simplicial complex, but simplicial and cellular homology coincide on triangulatable spaces like RP^2 so I gave it the benefit of the doubt =) algebraic topology is a pretty fun field regardless of how much a language model knows about it IMO.
TypeScript has a Turing complete type system, so it's as powerful as it gets in terms of what can be expressed in the type system. As for learning what is going on with type systems in general, you'll have to go through a textbook if you really want to develop an understanding. Type systems (at least the non-Turing-complete ones) are fundamentally logical systems, and I don't know how you can understand them without actually going through the trouble of understanding the underlying logical foundations.
No one at the commercial AI research labs knows what they're doing. As far as they are concerned, there is nothing beyond gradient descent, but it's accepted among academic researchers that gradient descent is insufficient for creating anything truly intelligent.
Eventually people will realize any underdetermined system of equations has infinitely many solutions. Give me any open source AI model and I will beat any SOTA benchmark. Why am I so confident? Because curve fitting can be applied to any data set to get as good a result as needed. Combine this approach with mixtures of "experts" and any predetermined set of benchmarks will fall to a curve fit to the benchmark.
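The linear-algebra version of that claim fits in a few lines of numpy (the dimensions here are arbitrary; the point is simply parameters > constraints):

```python
# An underdetermined system: 5 constraints, 10 parameters. Any null-space
# direction can be added to a solution without changing the fit at all.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 10))               # 5 "benchmark" equations
b = rng.normal(size=5)

x0 = np.linalg.lstsq(A, b, rcond=None)[0]  # minimum-norm exact solution
null_basis = np.linalg.svd(A)[2][5:]       # rows spanning A's null space
x1 = x0 + 3.0 * null_basis[0]              # a second, equally exact solution

print(np.allclose(A @ x0, b), np.allclose(A @ x1, b))  # True True
```

Swap "parameters" for model weights and "equations" for benchmark items, and you get the same picture: hitting the benchmark pins down almost nothing about the rest of the function.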
The hype is really getting tiresome. There is no way to get from here to any intelligent system with the current techniques. New breakthroughs will require insights into discrete spaces which are not amenable to curve fitting with gradient descent.