aabhay's comments | Hacker News

Seems a bit drastic to compare Ghibli style transfer to revenge porn, but you do you I guess.

It’s the anti-consent thing that ties them together. The idea of “You asked us to leave you alone, which is why we’re targeting you.”

Why are you talking about revenge porn here?

I’m skeptical of the method but excited for the direction. Giving models different personalities is adjacent to giving models different values / morals. Having a diversity of model personalities is a step in the right direction.

Unfortunately, this research seems to use a very coarse method: instructing the model to be evil and then measuring how its activations shift relative to a "non-evil" model. This is not a self-supervised approach; it requires you to inject your own heavy-handed concept of a persona into the system. Obviously a more complex and complete personality is more than the sum of your yes/no answers to personality-test questions.

That said, with low-rank methods it may soon be possible to give models long-lived, user-specific personalities that emerge across thousands of conversations. That's what I would happily call a persona vector.
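For concreteness, the contrastive extraction described above boils down to something like the sketch below. This is only an illustration of the idea, not the paper's actual pipeline; the model, layer index, and prompts are all placeholder assumptions.

    # Sketch: a "persona vector" as the difference in mean activations between
    # trait-instructed and neutral runs. All specifics here are illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in; the actual work uses much larger chat models
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
    model.eval()

    LAYER = 6  # hypothetical choice of layer to probe

    def mean_activation(prompts):
        vecs = []
        for p in prompts:
            ids = tok(p, return_tensors="pt")
            with torch.no_grad():
                out = model(**ids)
            h = out.hidden_states[LAYER][0]  # (seq_len, hidden_dim)
            vecs.append(h.mean(dim=0))       # average over tokens
        return torch.stack(vecs).mean(dim=0)

    questions = ["How should I respond to criticism?", "Describe your ideal weekend."]
    with_trait = [f"You are cruel and dishonest. {q}" for q in questions]
    without_trait = [f"You are kind and honest. {q}" for q in questions]

    # One direction in activation space, which can then be added or subtracted
    # at inference time to steer or monitor the trait.
    persona_vector = mean_activation(with_trait) - mean_activation(without_trait)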


My guess is the todo list is carried across "compress" points, where the agent summarizes and restarts with fresh context plus the summary.
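If that guess is right, the mechanism would look something like this minimal sketch; the names and message format are assumptions, not anything taken from the actual product.

    # Hypothetical compression step: condense the history, but re-inject the
    # todo list verbatim so it survives the context reset.
    def compress_context(messages, todo_list, summarize):
        summary = summarize(messages)  # e.g. an LLM call that condenses the history
        return [
            {"role": "system", "content": "Summary of prior work:\n" + summary},
            {"role": "system", "content": "Outstanding TODOs:\n" + todo_list},
        ]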

Isn’t this what “GPTs” was supposed to be? Why not just use that if this is essentially just a system prompt?


"Was supposed to be". Well, now you know the real purpose of this gpt circus ;)


Based on the Google-presented answers, it's possible that the report is generated post hoc as a summarization of the prior thoughts. One could also presume that this summarization step is part of the mechanism for running the Tree of Thoughts too, so that this wasn't some manual "now give the final answer" command.


The main problem with the "Bitter Lesson" is that there's something even bitter-er behind it: the "Harsh Reality" that while we may scale models on compute and data, simply pouring in tons of data without any sort of curation yields essentially garbage models.

The "Harsh Reality" is that while in principle you may only need data, the current best models, and the companies behind them, spend enormously on gathering high-quality labeled data with extensive oversight and curation. This curation is of course being partially automated as well, but ultimately there are billions or even tens of billions of dollars flowing into gathering, reviewing, and processing subjectively high-quality data.

Interestingly, at the time this paper was published, the harsh reality was not so harsh. For things like face detection, (actual) next-word prediction, and other purely self-supervised models that weren't instruction-tuned or "chat"-style, data was truly all you needed. You didn't need "good" faces; as long as it was indeed a face, the data itself was enough. Now it's not. In order to make these machines useful and not just function approximators, we need extremely large dataset-curation industries.

If you learned the bitter lesson, you'd better accept the harsh reality, too.


So true. I recently wrote about how Merlin achieved magical bird identification not through better algorithms, but through better expertise in creating great datasets: https://digitalseams.com/blog/what-birdsong-and-backends-can...

I think "harsh reality" is one way to look at it, but you can also take an optimistic perspective: you really can achieve great, magical experiences by putting in (what could be considered) unreasonable effort.


Thanks for the intro to Merlin! I just went outside my house and used it on 5 different types of birds, and it helped me identify 100% of them. Relevant (possibly out of date) xkcd comic [0]

[0] https://xkcd.com/1425/


Relevant - and old enough that those five years have been successfully granted!


I think your comment has some threads in common with Rodney Brooks' response: https://rodneybrooks.com/a-better-lesson/


While I agree with you, it's worth noting that current LLM training uses a significant percentage of all available written data. The transition from GPT-2-era models to now (GPT-3+) took us from novelty models that could kinda imitate speech to models that can converse, write code, and use tools. It was only after the readily available data was exhausted that further gains came from curation and large amounts of synthetic data.


Transfer learning isn't about "exhausting" all available uncurated data; it's simply that the systems are large enough to support it. There's not that much of a reason to train on all available data. And it isn't all of it, either: there's still very significant filtering happening. For example, they don't train on petabytes of log files; that would just be terribly uninteresting data.


> The transition from GPT-2-era models to now (GPT-3+) took us from novelty models that could kinda imitate speech to models that can converse, write code, and use tools.

Which is fundamentally about data. OpenAI invested an absurd amount of money to get the human annotations that drive RLHF.

RLHF itself is a very vanilla reinforcement learning algo + some branding/marketing.
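To make that concrete, the objective at the heart of RLHF is roughly the following; this is a schematic sketch, not any lab's actual implementation, and the function and argument names are made up for illustration.

    # Maximize a learned preference reward while penalizing drift from the
    # base model (the KL term). The policy-gradient machinery around this
    # (e.g. PPO) is standard reinforcement learning.
    def rlhf_objective(reward_model_score, policy_logprob, ref_logprob, beta=0.1):
        kl_penalty = policy_logprob - ref_logprob  # per-sequence log-ratio
        return reward_model_score - beta * kl_penalty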


Another name for gathering and curating high-quality datasets is "science". One would hope "AI pioneer" USA would embrace this harsh reality and invest massively in basic science education and infrastructure. But we are seeing the opposite, and basically no awareness of this "harsh reality" amid the AI hype...


I am deeply skeptical of the author’s conclusions to the point of being skeptical of their motive.

These HALO-type acquisitions are awful for the tech industry, and this trend is deeply concerning to me.

1. The author fails to consider the impact of this kind of acquihire on the company's customers. We've already seen Scale AI suffer a dramatic change in customer behavior post-hire. Windsurf was a real company with real revenue. How many of those customers want to use a product with a much less predictable future now, one whose team was basically disemboweled?

2. The creation of a "class hierarchy" both within and between companies has a dispiriting kingmaker vibe. 1/5 of the company ascends to heaven through this process and the other 4/5 is left to manage a shell corp? How is that supposed to feel?

3. The concept of Windsurf rebuilding their product in Google land is laughable. What product will they build but the same thing they already built? And Windsurf is nowhere near the velocity of Cursor, so they've burned months of time doing this and they'll burn months more adapting to the new environment. How does this square with the thesis of Google wanting their talent? Let's be real about the reason Google bought Windsurf: they got a discount relative to the $3B OpenAI acquisition and, most of all, they wanted to fuck OpenAI. At this point, I would guess that these companies would do whatever they can to harm OpenAI, given the sheer velocity of their product and how terrified everyone is of them discovering something very soon.

4. I haven't seen any example of a successful post-acquisition company operation. Inflection? Covariant? Character? What's happened to these companies after the change? Are all employees on strict NDAs? Why has nobody written about what it was like?

5. The conclusion I see here is straightforward: in a post-AI economy, very, very few people are "important" to these companies. The top 10 largest corporations in the world will basically be competing over a pool of ~5k individuals, and everyone else is irrelevant.

If there's anything that makes me wish we were living in Lina Khan's world, it's this.


These HALO deals are in large part a consequence of Lina Khan-style antitrust enforcement making acquisitions more difficult.


You will get a 20 GB model. Distillation is so compute-efficient that it's all but inevitable that, if not OpenAI, numerous other companies will do it.

I would rather have an open-weights model that's the best possible one I can run and fine-tune myself, allowing me to exceed SOTA models on the narrower domain my customers care about.
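For anyone unfamiliar, the distillation mentioned above boils down to training a small student model to match a larger teacher's output distribution. Here's a minimal sketch of the standard soft-label loss; the temperature and the models it would be applied to are illustrative assumptions.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions, then minimize KL(teacher || student).
        s = F.log_softmax(student_logits / temperature, dim=-1)
        t = F.softmax(teacher_logits / temperature, dim=-1)
        # The T^2 factor keeps gradient magnitudes comparable across temperatures.
        return F.kl_div(s, t, reduction="batchmean") * temperature ** 2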


Very standard, yep. Sales folks are sort of trained/indoctrinated into telling white lies like that in order to get in the door. There are loads of examples of using fake momentum to close deals. If it's a senior person, it's "My CEO asked me to personally reach out to you" or a fake email from the CEO forwarded by the rep. If one person at the company uses the product, it's "we're negotiating a company-wide license" or "we already have a group license with extra seats" or "one of your teammates sent us a list of priority teammates", yada yada.


As Albert mentioned, the benchmarks and data we use today heavily prioritize recall. Transformers are really really good at remembering parts of the context.

Additionally, we just don't have training data at the size and scope that exceeds today's transformer context lengths. Most training rollouts are fairly information-dense. It's not like "look at this camera feed for four hours and tell me what interesting stuff happened"; that kind of data is extremely expensive to generate and train on.


