This seems overly pessimistic. We regularly engage with other cultures and their texts and understand them — yes, it takes time and knowledge of context to do so. Someone needs to explain quite a bit about Roman society under Augustus for you to understand what is going on in detail in Ovid's Amores, and the Epic of Gilgamesh is pretty bizarre unless you know quite a bit about ancient Mesopotamia. The Tao te Ching suffers from being written in a style very much aimed at people in the know in a certain milieu, from a limited surviving corpus of related texts, and from a huge amount of cultural baggage on top. The most interesting recent scholarly translation I know of is by Victor Mair, of a different, recently discovered text, and his contention is that the book is a 'mirror for princes' and not a mystical text at all.
> they are still fighting over what 'Hwaet' means
I don't think anyone is particularly fighting over what it means, just how to translate it when there isn't a parallel in modern English. My personal favorite is a translation that opens with 'Bro!'.
>Ovid's Amores, and the Epic of Gilgamesh is pretty bizarre unless you know quite a bit about ancient Mesopotamia.
But these cultures and texts are much closer to us than the Chinese Tao te Ching.
Taxonomically, Latin and English share a common ancestor in Proto-Indo-European, sure, but beyond that there's a lot of horizontal influence of Latin on English, as there is on the Romance languages (I'm a native Spanish speaker). Newton (and many other English writers) wrote in both Latin and English. The alphabet is the same.
Regarding the Epic of Gilgamesh, I haven't read much about it. Cuneiform must be insanely hard to read, even through translations. That said, the fact that it seems to be an influence on the Noah's Ark story brings it much closer to Western culture than to Asian culture.
Same thing with Greek literature: it's a bit farther away, but some of it, like the fictional Odyssey, will be somewhat approachable through a translation; the rhymes and a million temporal references will obviously be completely lost.
Even some Arabic math texts I would consider to be somewhat more approachable by virtue of being so foundational to maths in general.
But religious Chinese? That must be one of the most unapproachable combinations for a non-native reader.
> Copy-paste his post into any LLM and ask it whether the post is contradictory or whether it's ambiguous whether this is production-grade software or not. No objective reader of this would come to the conclusion that it's ambiguous or misleading.
That's hilarious! You might want to add a bit more transition for the joke before the other points above, though.
Indemnification only means something if the indemnifying party exists and is solvent. If copyright claims on training data got traction, it would be neither, so it doesn't matter if they provide this or not. They probably won't exist as a solvent entity in a couple years anyway, so even the question of whether the indemnification means anything will go away.
All these transforms are switching to an eigenbasis of some differential operator (one that usually corresponds to a differential equation of interest): spherical harmonics, Bessel and Hankel functions (the radial analogues of sines/cosines and complex exponentials, respectively), and on and on.
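A minimal numerical sketch of that eigenbasis idea (the grid size and test function here are my own arbitrary choices): in the Fourier basis, the derivative operator d/dx acts by multiplication with ik, so differentiation reduces to an FFT, a pointwise multiply, and an inverse FFT.

```python
import numpy as np

# The Fourier basis diagonalizes d/dx: differentiate f(x) = sin(x)
# on a periodic grid by multiplying its Fourier coefficients by i*k,
# then compare against the exact derivative cos(x).
n = 256
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
f = np.sin(x)

k = np.fft.fftfreq(n, d=2 * np.pi / n) * 2 * np.pi  # integer angular wavenumbers
df = np.fft.ifft(1j * k * np.fft.fft(f)).real       # spectral derivative

assert np.allclose(df, np.cos(x), atol=1e-10)
```

Because sin(x) is band-limited on this grid, the spectral derivative is exact to machine precision rather than merely approximate like a finite difference.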
The next big jumps were to collections of functions not parameterized by subsets of R^n. Wavelets use a tree-shaped parameter space.
There’s a whole interesting area of overcomplete basis sets, which I have been meaning to look into, where you give up orthogonality and all its nice properties in exchange for multiple options for adapting better to different signal characteristics.
I don’t think these transforms are going to be relevant to understanding neural nets, though. They are, by their nature, doing something with nonlinear structures in high dimensions that are not smoothly extended across their domain — the opposite of the problem all our current approaches to functional analysis deal with.
You may well be right about neural networks. Sometimes models that seem nonlinear turn linear if those nonlinearities are pushed into the basis functions, so one can still hope.
For GPT-like models, I see sentences as trajectories in the embedding space. These trajectories look quite complicated and are not obvious from a geometric standpoint. My hope is that if we get the coordinate system right, we may see something more intelligible going on.
This is just a hope, a mental bias. I do not have any solid argument for why it should be as I describe.
> Sometimes models that seem nonlinear turn linear if those nonlinearities are pushed into the basis functions, so one can still hope.
That idea was pushed to its limit by Koopman operator theory. The argument sounds quite good at first, but unfortunately it can’t really work in all cases in its current formulation [1].
We know that under benign conditions an infinite-dimensional basis must exist, but finding it from finite samples is very non-trivial; we don't know how to do it in the general case.
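For intuition, here's a toy discrete-time example (my own construction, not taken from the cited paper) where the Koopman lifting happens to be finite-dimensional: a quadratic map becomes exactly linear in the lifted coordinates (x1, x2, x1^2).

```python
import numpy as np

# Toy Koopman lifting. The nonlinear map
#   x1 -> a*x1,  x2 -> b*x2 + c*x1**2
# is linear in the lifted state z = (x1, x2, x1**2),
# because (x1**2) -> (a*x1)**2 = a**2 * x1**2.
a, b, c = 0.9, 0.5, 1.0

def step(x1, x2):
    return a * x1, b * x2 + c * x1**2

K = np.array([[a, 0.0, 0.0],
              [0.0, b,   c  ],
              [0.0, 0.0, a**2]])  # exact linear dynamics on z

x1, x2 = 0.7, -0.3
z = np.array([x1, x2, x1**2])
for _ in range(5):
    x1, x2 = step(x1, x2)
    z = K @ z
    assert np.allclose(z, [x1, x2, x1**2])
```

In general the invariant subspace is infinite-dimensional (or doesn't close at all), which is exactly the difficulty described above; this example is hand-picked so that the quadratic term closes after one extra coordinate.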
I’m not sure what you mean by a change of basis making a nonlinear system linear. A linear system is one where solutions add as elements of a vector space. That’s true no matter what basis you express it in.
For example, if you parameterize the x, y coordinates of a circular trajectory in the plane in terms of the angle \theta, they are nonlinear functions of \theta.
However, if you parameterize a point in terms of the tuple (cos \theta, sin \theta), it comes out as a scaled sum. Here we have pushed the nonlinear functions cos and sin inside the basis functions.
A conic section is a nonlinear curve (not a line) when considered in the variables x and y. However, in the basis x^2, xy, y^2, x, y it's linear (well, technically affine).
Consider the Naive Bayes classifier. It looks nonlinear until one parameterizes it in log p; then it's linear in log-p and log-odds.
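The conic case is easy to check numerically. A sketch (the circle's center and radius are arbitrary choices of mine): ordinary least squares in the lifted features recovers the circle's equation, even though the curve is nonlinear in x and y.

```python
import numpy as np

# A conic is nonlinear in (x, y) but linear in the lifted features
# (x^2, x*y, y^2, x, y): fit those features to the constant 1.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 50)
x, y = 2 * np.cos(theta) + 1, 2 * np.sin(theta) - 3  # circle r=2 at (1, -3)

A = np.column_stack([x**2, x * y, y**2, x, y])
w, *_ = np.linalg.lstsq(A, np.ones_like(x), rcond=None)

# (x-1)^2 + (y+3)^2 = 4  <=>  -(1/6)x^2 - (1/6)y^2 + (1/3)x - y = 1,
# and that is exactly the coefficient vector least squares recovers.
assert np.allclose(w, [-1/6, 0, -1/6, 1/3, -1], atol=1e-8)
```

The "nonlinearity" never went away; it was moved into the fixed feature map, leaving the unknown coefficients in a plain linear problem.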
If one is ok with an infinite-dimensional basis, this linearisation idea can be pushed much further. Take a look at this if you are interested
From the abstract and skimming a few sections of the first paper, imho it is not really the same. The paper is moving the loss gradient to the tangent dual space where weights reside for better performance in gradient descent, but as far as I understand neither the loss function nor the neural net are analyzed in a new way.
The Fourier and Wavelet transforms are different as they are self-adjoint operators (=> form an orthogonal basis) on the space of functions (and not on a finite dimensional vector space of weights that parametrize a net) that simplify some usually hard operators such as derivatives and integrals, by reducing them to multiplications and divisions or to a sparse algebra.
So in a certain sense these methods are looking at projections, which are unhelpful when thinking about NN weights since they are all mixed with each other in a very non-linear way.
Thanks a bunch for the references. Reading the abstracts, these use a different idea compared to what Fourier analysis is about, but they should nonetheless be a very interesting read.
"Time Enough for Love" is the only Heinlein I've felt any inclination to reread in the past few decades that held up at all (and it's still a good read).
Lamport's website has his collected works. The paper to start with is "Time, clocks, and the ordering of events in a distributed system." Read it closely all the way to the end. Everyone seems to miss the last couple sections for some reason.
The actual constituted nations of Europe as they exist today? Aside from the UK, not long. Most of them are states that came into existence in the 20th century. Germany in its current form dates to reunification in 1990. The rest are generally 19th century.
Because they're not negligible. It's worth calculating it out. Length contraction produces a very slight increase in charge density of the nuclei, but there are a lot of charges and electromagnetism is very strong.
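A rough version of that calculation, in the style of Purcell's textbook argument (the wire parameters are assumed: 1 A through a 1 mm^2 copper wire, ~8.5e28 conduction electrons per m^3): the drift velocity and the fractional contraction are absurdly small, yet the Coulomb force from the resulting line-charge imbalance reproduces the familiar magnetic force between two parallel currents.

```python
import math

# Physical constants (SI)
e = 1.602e-19      # elementary charge, C
c = 2.998e8        # speed of light, m/s
eps0 = 8.854e-12   # vacuum permittivity, F/m
mu0 = 4e-7 * math.pi

# Assumed wire parameters
n = 8.5e28         # conduction electrons per m^3 in copper
A = 1e-6           # cross-section, m^2 (1 mm^2)
I = 1.0            # current, A
d = 1.0            # separation between the two wires, m

v = I / (n * e * A)                 # drift velocity: ~7e-5 m/s
gamma_minus_1 = v**2 / (2 * c**2)   # fractional contraction: ~3e-26

# Yet the relativistic imbalance ~lam*(v/c)^2 on line charge lam = n*e*A
# gives a Coulomb force per unit length equal to the magnetic one:
lam = n * e * A
F_coulomb = lam**2 * (v / c)**2 / (2 * math.pi * eps0 * d)
F_magnetic = mu0 * I**2 / (2 * math.pi * d)   # ~2e-7 N/m

assert abs(F_coulomb - F_magnetic) / F_magnetic < 1e-3
```

The point is exactly the one above: the per-charge effect is of order v^2/c^2 ~ 1e-25, but it multiplies an enormous line charge (~1e4 C/m), and electrostatics is strong enough that the product is a perfectly measurable force.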
Perhaps this is part of it? Tens of thousands of lines of code seems like a very small repo to me.