aghilmort's comments | Hacker News

would poke around Vopson <> Verlinde or other primal / dual / gauge correspondences for space-time / gravity, especially vs. info theory, etc.:

https://pubs.aip.org/aip/adv/article/15/4/045035/3345217/Is-...


Really appreciate the pointers. I'll definitely look into them and use them as references.

Use cases: connect screenless devices, e.g., an Echo Dot; extend weak wireless range in a hotel; screen share or network between multiple devices, e.g., traveling with two laptops and a virtual KVM, where you only have to do the captive portal on one (many hotels limit the number of devices); extra security buffer; phones can't bridge wifi for headless devices like this; etc.

there’s decent work on computational reasoning power of transformers, SSMs, etc.

some approximate snippets that come to mind are that hard-attention decoder-only transformers sit in AC^0 while soft-attention / log-precision ones are in TC^0, that encoder-decoders are strictly more powerful than decoder-only, etc.

Someone with the last name Miller, IIRC, if you poke around on arXiv, and a few others; it's been a while since this was top of mind, so YMMV on the exact correctness of the above snippets.


You are probably thinking of Merrill (whose work is referenced towards the end of the article).

ah yes Merrill thx!

interesting. like Excel Solver? or OpenSolver, Gurobi, other optimizers? or different objective?


Never used any of those, so I don't know! I'd be curious to read a comparison from anyone who knows about them.

I think what's pretty unique about the bidicalc solver that I made is that it does not depend on the previous input values to update backwards. It's truly solving the root finding problem. The advantage is that there are never any "stuck in a local optimum" problems with the solver. So you can solve difficult problems like polynomials, etc.
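To make "root finding, not local optimization" concrete, here's a tiny generic bisection sketch in Python. It is not how bidicalc works internally, and the cubic and bracket are made up for illustration, but it shows why a bracketing root finder can't get stuck in a local optimum:

    # Hedged sketch, not the bidicalc implementation: bisection root finding.
    # As long as the bracket [lo, hi] straddles a sign change, it converges.
    def bisect(f, lo, hi, tol=1e-12):
        assert f(lo) * f(hi) < 0, "bracket must straddle a sign change"
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(lo) * f(mid) <= 0:
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2

    # e.g. x^3 - 2x - 5 = 0 has a root near 2.0945514815
    print(bisect(lambda x: x**3 - 2*x - 5, 2.0, 3.0))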


Excel Solver allows you to create a target function with different variables and describe limits for them. Then you can try to find the maximum, minimum, or an exact value for the target function.
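For comparison, here's a minimal sketch of that idea (a target function, limits on the variables, and a constraint) in Python with scipy; the objective, bounds, and constraint are invented purely for illustration:

    # Illustrative only; not tied to any particular spreadsheet model.
    from scipy.optimize import minimize

    # target function: minimize (x - 3)^2 + (y + 1)^2
    objective = lambda v: (v[0] - 3) ** 2 + (v[1] + 1) ** 2

    result = minimize(
        objective,
        x0=[0.0, 0.0],                      # starting guess
        bounds=[(0, 10), (-5, 5)],          # limits for each variable
        constraints=[{"type": "ineq", "fun": lambda v: v[0] + v[1] - 1}],  # x + y >= 1
    )
    print(result.x)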

link doesn’t work



awesome, had thought about doing this / great to see, will try!


wait why subscription?


Interesting. Modular manifolds are precisely what hypertokens use for prompt compiling.

Specifically, we linearize the emergent KVQ operations of an arbitrary prompt in any arbitrary model by way of interleaving error-correcting code (ECC).

ECC tokens are out-of-band tokens, e.g., from Unicode's Private Use Area (PUA), interleaved with raw context tokens. This construction induces an in-context associative memory.

Any sort of interleaved labeling basis, e.g., A1, quick brown fox, A2, jumped lazy dog, induces a similar effect, letting you chain recall & reasoning more reliably.

This trick works because PUA tokens are generally untrained, hence their initial embedding is still random Gaussian w.h.p. Similar effects can be achieved by simply using token combos unlikely to exist, and these are often more effective in practice, since PUA characters (like emojis or Mandarin characters) are often 2, 3, or 4 tokens after tokenization vs. codeword combos like zy-qu-qwerty every k content tokens, where k can be variable.
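A minimal sketch of the interleaving, for the curious; the interval k, the PUA code points, and the helper name here are illustrative, not the actual hypertokens codeword scheme:

    # Insert one distinct out-of-band marker (Unicode PUA character) every
    # k content tokens; illustrative only, not the real ECC construction.
    def interleave_markers(content_tokens, k=8, pua_start=0xE000):
        out = []
        for i, tok in enumerate(content_tokens):
            if i % k == 0:
                out.append(chr(pua_start + i // k))  # unique marker per block
            out.append(tok)
        return out

    toks = "the quick brown fox jumped over the lazy dog".split()
    print(" ".join(interleave_markers(toks, k=3)))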

Building attention architectures using modular manifolds in white / gray-box models, as this new work shows, vs. prompt-based black-box injection is a natural next step, so we can at least anecdotally validate what they're building ahead of the next paper or two.

Which is all to say, absolutely great to see others building in this way!


Wot? Is this what AI-generated nonsense has come to? This is totally unrelated.


Nope. The construction induces ECC-driven emergent modular manifolds in latent space during the KVQ math. You can't use any ol' ECC, which is the crux of why it works. More in another reply.


The original article discusses techniques for constraining the weights of a neural network to a submanifold of weight space during training. Your comment discusses interleaving the tokens of an LLM prompt with Unicode PUA code points. These are two almost completely unrelated things, so it is very confusing to me that you are confidently asserting that they are the same thing. Can you please elaborate on why you think there is any connection at all between your comment and the original article?


Our ECC construction induces an emergent modular manifold during KVQ computation.

Suppose we use 3 codeword lanes per codeword, which is our default. Each lane of tokens is based on some prime p, so collectively they form a CRT-driven codeword (Chinese Remainder Theorem). This is discretely equivalent to labeling every k tokens with a globally unique indexing grammar.
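A quick, hedged illustration of the CRT labeling (the primes 3, 5, 7 here are stand-ins, not our actual lane parameters):

    # Three lanes, one small prime each; by CRT the residue triple is a
    # globally unique index for any block position below 3*5*7 = 105.
    from math import prod

    PRIMES = (3, 5, 7)

    def lane_labels(block_index):
        return tuple(block_index % p for p in PRIMES)

    labels = [lane_labels(i) for i in range(prod(PRIMES))]
    assert len(set(labels)) == prod(PRIMES)  # no two blocks share a label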

That interleaving also corresponds to a triple of adjacent orthogonal embeddings, since those tokens still retain random Gaussian embeddings. The net effect is that we similarly slice the latent space into a spaced chain of modular manifolds every k content tokens.
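A quick numerical check of the "random Gaussian embeddings are roughly orthogonal" intuition; dimension 4096 is an arbitrary stand-in for a real model's embedding width:

    # Cosine similarity between independent Gaussian vectors concentrates
    # around 0 at rate ~1/sqrt(d), i.e. they are nearly orthogonal w.h.p.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 4096
    u, v, w = rng.standard_normal((3, d))

    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    print(cos(u, v), cos(u, w), cos(v, w))  # all roughly O(1/sqrt(d)), i.e. ~0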

We also refer to that interleaving as Stiefel frames, for reasons similar to those in the post. We began work this spring or so to inject that net construction inside the model, with early results in a similar direction to what the post describes. That's another way of saying this sort of approach lets us make that chained atlas (wc?) of modular manifolds as tight as possible within the dimensional limits of the embedding, floating-point precision, etc.

We somewhat tongue-in-cheek refer to this as the retokenization group at the prompt level, re: the renormalization group / tensor nets / etc. A relayering group (or perhaps reconnection group) at the architecture level is the same net intuition.


I'm sorry, but even if I am maximally charitable and assume that everything you are saying is meaningful and makes sense, it still has essentially nothing to do with the original article. The original article is about imposing constraints on the weights of a neural network, during training, so that they lie on a particular manifold inside the overall weight space. The "modular" part is about being able to specify these constraints separately for individual layers or modules of a network and then compose them together into a meaningful constraint for the global network.

You are talking about latent space during inference, not weight space during training, and you are talking about interleaving tokens with random Gaussian tokens, not constraining values to lie on a manifold within a larger space. Whether or not the thing you are describing is meaningful or useful, it is basically unrelated to the original article, and you are not using the term "modular manifold" to refer to the same thing.


hmm / hear you. my point wasn't that we are applying modular manifolds in the same way; it was that we are working on model reliability from two extremal ends using the same principle. there are various ways to induce modular manifolds in a model at various levels of resolution / power. we started at the outside-working-in level, so it works with any black-box model out of the box with zero knowledge needed; you don't even need to know the token dictionary to show the effect.

We're already working on pushing the construction deeper into the model, both architecture and training. Currently that's for fine-tuning, and ultimately full architecture shrinkage / pruning and raw training vs. just fine-tuning, etc.

& it was just great to see someone else using modular manifolds, even if they're using them at the training stage vs. the inference stage. they're exploiting the modular form at training time; we're doing it at inference. cool to see.


oy, clicked thinking this was a Bell Inequality meets Schrödinger's cat post


switching models is a great best practice whether you get stuck or not

can look at it as primal (check the mean) or dual (get out of a local minimum)

in all cases, the model, tokenizer, etc. are just enough different that it will generally pay off quickly in most spaces


read something new every day before going to bed

journal before you start your day

buy some sort of electric kettle


The fact that there's an entire country mostly unaware of the utility and ubiquity of a simple electric kettle blows my mind. But then again, I'm a product of the Empire (British), not a North American.

But while the idea of using a stove-top kettle (I have done so in the past) is fine, the thought of using a microwave to heat up a cup of water for tea seems abhorrent (although it's really not).

I guess it came about because 110V isn't as efficient? Or because more Americans are coffee drinkers?


an intern suggested it years ago, and now an electric kettle is pretty much the first thing I buy anytime I stay somewhere longer than a couple of weeks, if the place doesn't already have one


I won't be able to get enough sleep then.

