For a while now, ChatGPT has been able to reference your entire chat history. In my opinion it was one of the most substantial improvements in the product's history. I'm sure we'll continue to see this feature improve over time, but your first item here is already partially addressed (maybe fully).
I completely agree on the third item. Carefully tuned pushback is something that even today's most sophisticated models are not very good at. They are simply too sycophantic. A great human professional therapist provides value not just by listening to their client and offering academic insights, but more specifically by knowing exactly when and how to push back -- sometimes quite forcefully, sometimes gently, sometimes not at all. I've never interacted with any LLM that can approach that level of judgment -- not because they lack the fundamental capacity, but because they're all simply trained to be too agreeable right now.
There's limited or no evidence of this in other domains where astonishing pay packages are used to assemble the best teams in the world (e.g., sports).
There's vast social honour and commensurate status attached to activities like being a sports / movie star. Status that can easily be lost, and cannot be purchased for almost any amount of money. Arguably that status is a greater motivator than the financial reward -- see, e.g., the South Korean idol system. It's certainly not going to be diminished as a motivator by financial reward. There's no equivalent for AI researchers. At best the very best may win the acclaim of their peers and a Nobel prize. It's not a remotely equivalent level of celebrity / access to all the treasures the world can provide.
Top AI researchers are about the closest thing to celebrity status that has ever been attainable for engineering / CS folks outside of winning a Nobel Prize. Of course, the dopamine cycle and public recognition and adoration are nowhere near the same level as professional sports, but someone being personally courted by the world's richest CEOs handing out $100M+ packages is still decidedly not experiencing anything close to a normal life. Some of these folks still had their hiring announced on the front pages of the NYT and WSJ -- something normally reserved for top CEOs or, yes, sports stars and other legitimate celebrities.
Either Meta makes rapid progress on frontier-level AI in the next year or it doesn't -- there's definitely a feedback loop here, measured in tangible units of time. I don't think it's unreasonable to assume that when Zuck personally hires you at this level of compensation, there will be performance expectations, and you won't stick around for long if you don't deliver. Even in top-tier sports, many underperformers manage to stick around for a couple of years or even a half-decade at seven- or eight-figure compensation before being shown the door.
In reality, all frontier models will likely progress at nearly the same pace, making it difficult to disaggregate this team's performance from other teams'. More importantly, it'll be nearly impossible to disaggregate any one contributor's performance from the others', making it basically impossible to enforce accountability without many, many repetitions to eliminate noise.
> Even in top-tier sports, many underperformers stick around for a couple years or a half-decade at seven or eight figure compensation before being shown the door.
This can happen in the explicit hope that their performance improves, not because it's unclear whether they are performing -- and generally not simply because they're still under contract.
There are plenty of established performance management mechanisms to determine individual contributions, so while I wouldn't say that's a complete nonissue, it's not a major problem. The output of the team is more important to the business anyway (as is the case in sports, too).
And if the team produces results on par with the best results being attained anywhere else on the planet, Zuck would likely consider that a success, not a failure. After all, what's motivating him here is that his current team is not producing that level of results. And if he has a small but nonzero chance of pushing ahead of anyone else in the world, that's not an unreasonable thing to make a bet on.
I'd also point out that this sort of situation is common in the executive world, just not in the engineering world. Pretty much every top-tier executive at top-tier companies is making seven or eight figures as table stakes. There's no evidence I'm aware of that this reduces executive or executive team performance. Really, the evidence is the opposite -- companies continue paying more and more to assemble the best executive teams because they find it's actually worth it.
> There are plenty of established performance management mechanisms to determine individual contributions
"Established" != valid, and literally everyone knows that.
The executives you reference are never ICs and are definitionally accountable to the measured performance of their business line. These are not superstar hires the way that AI researchers (or athletes) are. The body in the chair is totally interchangeable so long as the spreadsheet says the right number, and you expect the spreadsheet performance to be only marginally controlled by the particular body in the chair. That's not the case with most of these hires.
I'd say execs getting hired for substantial seven- and eight-figure packages, with performance-based bonuses / equity grants and severance deals, absolutely do have a lot more in common with superstars than with most other professionals. And, just like superstars, they're hired based off public reputation more than anything else (just the sphere of what's "public" is different).
It's false that execs are never ICs. Anyone who's worked in the upper echelons of corporate America knows that. Not every exec is simply responsible 1:1 for a business line. Many are in transformation or functional roles with very complex responsibilities spanning many interacting areas. Even when an exec is responsible for a business line in a 1:1 way, they are often only responsible for one aspect of it (e.g., leading one function); sometimes that holds all the way up to the C-suite, with literally a single exception in the entire company (e.g., Apple). In those cases, exec performance is not 1:1 tied to the business they are attached to. High-performing execs in those roles are routinely "saved" and banked for other roles rather than being laid off / fired when their BU doesn't work out. Low-performing execs in those roles are, of course, very quickly fired / re-orged out.
If execs really were so replaceable and it's just a matter of putting the right number in a spreadsheet, companies wouldn't be paying so much money for them. Your claims do not pass even the most basic sanity check. By all means, work your way up to the level we're talking about here and then report back on what you've learned about it.
Re: performance management and "everyone knowing that", you're right of course -- that's why it's not an interesting point at all. :) I disagree that established techniques are not valid -- they work well and have worked for decades with essentially no major structural issues, scaling up to companies with 200k+ employees.
I did not say their performance is 1:1 with a business line, but great job tearing down that strawman.
I said they are accountable to their business line -- they own a portfolio and are accountable for that portfolio's performance. If the portfolio does badly, that means nearly by definition that the executive is doing badly. Like an athlete, they aren't immediately thrown out on the street, but it's also not ambiguous whether they are performing well or not.
Which also points to why performance management methods are not valid -- i.e., not a high-sensitivity, high-specificity measure of an individual executive's actual personal performance: there are obviously countless external variables that bear on the outcome of a portfolio. But for the business's purposes, it doesn't matter, because the real purpose of performance management methods is to provide a quasi-objective rationalization for personnel decisions that are actually made elsewhere.
Perhaps you can mention which performance management methods you believe are valid (high-specificity and high-sensitivity measures of an individual's personal performance) in AI R&D?
"Pretty much every top-tier executive at top-tier companies is making seven or eight figures as table stakes". In this group, what percentage are ICs? Sure there are aberrational celebrity hires, of course, but what you are pointing to is the norm, which is not celebrity hires doing IC work.
> If execs really were so replaceable... companies wouldn't be paying so much money for them
High-level executives within the same tier are largely substitutable -- any qualified member of this cohort can perform the role adequately. However, this is still a very small group of people ultimately responsible for huge amounts of capital, so they can collectively maintain market power over compensation. The high salaries don't reflect individual differential value. Obviously there are some remarkable executives, and by definition they tend to concentrate in remarkable companies; equally by definition, the vast majority of companies and their executives are totally unremarkable, yet those executives earn high salaries nonetheless.
Lack of differentiation within a tiny elite circle of candidates does not imply that salaries do not reflect individual differential value broadly. While these people control a large amount of capital, they do not own that capital -- their control is granted due to their talent and can be instantly revoked at any moment. They have no leverage to maintain control of this capital except through their own reputation and credibility. There is no "tenure" for executives -- the "status" of the role must essentially be re-earned constantly over time to maintain it, and those who don't do so are quickly forced out.
The researchers being hired here are just as accountable as the execs we're talking about -- there is a clear outcome that Zuck expects, and if they don't deliver, they will be held accountable. I really, genuinely don't see what's so complicated about this.
Accountability to a business line does not imply that if that business does poorly then every exec accountable to it was doing poorly personally. I'm actually a personal counter-example and I know a number of others too. In fact, I've even seen execs in failing BUs get promoted after the BU was folded into another one. Competent exec talent is hard to find (learning to operate successfully at the exec level of a Fortune 50 company is a very rarefied skill and can't be taught), and companies don't want to lose someone good just because that person was attached to a bad business line for a few months or years.
Something important to understand about the actual exec world is that executives move around within companies constantly -- the idea that an executive is tied to a single business and if something goes wrong there they must have sucked is just not true and it's not how large companies operate generally. When that happens, the company will figure out the correct action for the business line (divest, put into harvest mode, merge into another, etc., etc.), then figure out what to do with the executives. It's an opportunity to get rid of the bad ones and reposition the top ones for higher-impact work. Sometimes you do have to get rid of good people, though, which is true of all layoffs -- but even with execs there's a desire to avoid it (just like you'd ideally want to retain the top engineers of a product line being shuttered).
I disagree. If you want similarly tight feedback loops on performance, pair programming/TDD provides them. And even if you hate real-time collaboration or are working in different time zones, delightful code reviews on small slices get pretty close.
It's not the end of observability as we know it. However, the article also isn't totally off-base.
We're almost certain to see a new agentic layer emerge and become increasingly capable for various aspects of SRE, including observability tasks like RCA. However, for this to function, most or even all of the existing observability stack will still be needed. And as long as the hallucination / reliability / trust issues with LLMs remain, human deep dives will remain part of the overall SRE work structure.
What this tells me is that xAI / X / Grok have together become a much bigger threat to OpenAI than OpenAI anticipated. And I believe it's true -- Grok's progress has been much faster than ChatGPT's over the past year, even if we can nitpick evals over which is technically "better" at any moment. The fact that it's hard to tell is itself a huge statement considering the massive advantage OpenAI and ChatGPT had not too long ago. I mean, Grok was basically a joke when it first launched!
That's a crazy statement. It is clearly not true that every single person in the US capable of making board games now or in the future is instead already making high-grade aerospace and medical components.
Depends -- do you want the US to become a vassal state of China? That's the trajectory we were on. China is going to catch up rapidly on technology, AI, and services, and until a few months ago the US was set to keep falling behind in every other conceivable area.
That’s a hilarious thing to say considering our behavior towards trade lately. We’ve burned bridges with our closest trading partners and made everyone else uncomfortable to trade with us because they don’t know what the eventual tariff rate will be, or if it will change tomorrow. We’re retreating from the world stage, and guess who’s sitting there ready to take the reins. It’s genuinely the opposite of what you seem to want.
I believe the point is power, and through that lens everything makes perfect sense. Trump is exercising available levers of global influence -- for good or for bad -- in a way that hasn't occurred since Hitler initiated World War II.
Tariffs are appealing to him because they are incredibly forceful blunt instruments over which he alone has almost complete control. They give him immense, immediate influence over the entire world. What we're seeing is that the US President today, if the full capacities of that office are pushed as far as possible without violence, is arguably one of the most powerful human beings ever, if not the most powerful.
Beyond this, Trump has said that one of his greatest weapons is uncertainty. He wants to be feared. Having people genuinely afraid of you is the next step of power that he is already flirting with by posting videos of people being blown up in warfare on social media.
And what will he do when China and the rest of the world, tired of his untreated narcissistic personality disorder, start selling US debt -- trillions of dollars in bonds in, say, less than a week? I bet someone finally explained that to him today, which is why he reversed the policy.
I don't have much sympathy for this. This country has long expected millions and millions of blue-collar workers to accept and embrace change or lose their careers and retirements. When those people resisted, they were left to rot. Now I'm reading a sob story about someone throwing a fit because they refuse to learn to use ChatGPT and Claude, to the point that the CEO had to sit them down and essentially hold their hand. Of all the skillset transitions that history has required or imposed, this is one of the easiest ever.
They weren't fired; they weren't laid off; they weren't reassigned or demoted; they got attention and assistance from the CEO and guidance on what they needed to do to change and adapt while keeping their job and paycheck at the same time, with otherwise no disruption to their life at all for now.
Prosperity and wealth do not come for free. You are not owed anything. The world is not going to give you special treatment or handle you with care because you view yourself as an artisan. Those are rewards for people who keep up, not for those who resist change. It's always been that way. Just because you've so far been on the receiving end of prosperity doesn't mean you're owed that kind of easy life forever. Nobody else gets that kind of guarantee -- why should you?
The bottom line is the people in this article will be learning new skills one way or another. The only question is whether those are skills that adapt their existing career for an evolving world or whether those are skills that enable them to transition completely out of development and into a different sector entirely.
> These are rewards for people who keep up, not for those who resist change.
lol. I work with LLM outputs all day -- like, it's my job to make the LLM do things -- and I probably ask some LLM to answer a question for me between 10 and 100 times a day. They're kinda helpful for some programming tasks, but pretty bad at others. Any company that tried to mandate that I use an LLM would get kicked to the curb. That's not because I'm "not keeping up"; it's because they're simply not good enough to put more work through.
Wouldn't this depend a lot on how management responds to your use? For example, if you just kept a log of prompts and outputs with notes about why the output wasn't acceptable, that could be considered productive use in this early stage of LLMs, especially if management's goal was to have you learning how to use LLMs. Learning how not to use something is just as important in the process of adapting any new tool.
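As a minimal sketch of what that kind of log could look like (assuming a Node.js/TypeScript setup; the file name and record shape here are just illustrative, not a prescribed format):

```typescript
import { appendFileSync } from "fs";

// One record per LLM attempt: what was asked, what came back, and why
// the result was or wasn't usable. All field names are hypothetical.
interface LlmTrialRecord {
  timestamp: string;
  model: string;
  prompt: string;
  output: string;
  accepted: boolean;
  notes: string;
}

function logTrial(record: LlmTrialRecord): void {
  // Append as JSONL: one JSON object per line keeps the log easy to grep.
  appendFileSync("llm-trials.jsonl", JSON.stringify(record) + "\n");
}

logTrial({
  timestamp: new Date().toISOString(),
  model: "some-model",
  prompt: "Write a migration script for the orders table",
  output: "...model output here...",
  accepted: false,
  notes: "Hallucinated a column that doesn't exist in our schema",
});
```

Even a low-tech log like this gives management something concrete to evaluate, and gives you a record of where the tools actually fall down.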
If management is convinced of the benefits of LLMs and the workers are all simply refusing to use them, the main problem seems to be a dysfunctional working environment. It's ultimately management's responsibility to work that out. But if management isn't completely incompetent, the people tasked with using these tools could do a lot to help the situation by testing them and providing constructive feedback, rather than taking a stand by refusing to try, or offering grand narratives about damaging the artistic integrity of something that has been commoditized since its inception, like video game art. I'm not saying that video game art can't be art, but it has existed in a commercial crunch culture since the 1970s.
Anything with even vaguely complicated TypeScript types, hallucinating modules, writing tests that are useful rather than just performative, as recent examples…
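To give a concrete flavor of the first one, here's a hypothetical but representative shape of type that I've seen models mangle -- nothing exotic by TypeScript standards (all names invented for illustration):

```typescript
// A discriminated-union result wrapper plus a mapped conditional type.
type ApiResult<T> =
  | { status: "ok"; data: T }
  | { status: "error"; message: string };

// Turn every method of a service into an async version returning ApiResult.
type Asyncify<S> = {
  [K in keyof S]: S[K] extends (...args: infer A) => infer R
    ? (...args: A) => Promise<ApiResult<R>>
    : never;
};

interface UserService {
  getUser(id: string): { id: string; name: string };
  deleteUser(id: string): boolean;
}

// getUser becomes: (id: string) => Promise<ApiResult<{ id: string; name: string }>>
type AsyncUserService = Asyncify<UserService>;
```

In my experience, asking a model to refactor code behind types like this is exactly where it starts silently dropping the wrapper or inventing fields that don't exist.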
If you're not doing the work, you're not learning from the result.
The CEOs in question bought what they believed to be a power tool, but got what is more like a smarter copy machine. To be clear, copy machines are not useless, but they also aren't going to drive the 200% increases in productivity that people think they will.
But because management demands the 200% increase in productivity they were promised by the AI tools, all the artists and programmers on the team hear "stop doing anything interesting or novel, just copy what already exists". To be blunt, that's not the shit they signed up for, and it's going to result in a far worse product. Nobody wants slop.
Having spent hours upon hours with image synthesis for artistic hobby purposes, I can say it is indeed an awesome tool. If you get into it, though, you learn about its limitations.
Real knowledge here is often absent from the strongest AI proselytizers; others are more realistic about it. It still remains an awesome tool, but a limited one.
AIs today are not creative at all. They find statistical matches. They perform different work than artists do.
But please, replace all your artwork with AI-generated pieces. I believe the consequences of that forced "adapt" phase would make themselves felt rather quickly.
> It still remains an awesome tool, but a limited one.
And that's enough to drive significant industry-wide change. Just because it can't fully automate everything doesn't mean companies aren't going to expect (and, indeed, increasingly require) their employees to learn to use the technology effectively. The CEO of Shopify recently made it clear that refusal to learn to use AI tools will factor directly into performance evaluations for all staff. This is just the beginning. The wise move is to skate to where the puck is headed.
The article gives several examples of where these tools are used to rapidly accelerate experimentation, pitches, etc. Supposedly this is a bad thing and should be avoided because it's not sufficiently artisan, but no defensible argument was presented as to why these use cases are illegitimate.
In terms of writing code, we're entering an era where developers who have invested in learning how to utilize this technology are simply better and more valuable to companies than developers who have not. Naysayers will find all sorts of false ways to nitpick that statement, yet it remains true. Effective usage means knowing when (and when not) to use these tools -- and to what degree. It also, for now at least, means remaining a human expert about the craft at hand.
They're not bragging. They're pointing out that the ceiling is dramatically too low, which has caused elite universities to spend decades creating increasingly elaborate, detached, and meaningless gatekeeping mechanisms. The national average does not matter in the context of the nation's most competitive schools.
Just because it doesn't surprise you doesn't mean it's okay. You have to acknowledge that as an admissions consultant you're part of a small gatekeeping community bubble. Even though I attended one of these schools, I can recognize that universities have been rapidly losing their credibility, and this is only going to accelerate that trend. And by the way, this person is probably more accomplished than I am, even though I am now quite a bit older and my essay was apparently good enough to tick off the checkboxes.
The question you need to be asking is how the university system made an enemy out of someone who is clearly one of the most talented members of his age cohort in the nation. That's a failure no matter how hard you try to explain or justify the status quo. It's time for some real accountability and soul-searching from the system, not excuses. Nitpicking the essay and pointing out how he should have done X or Y instead is completely missing the point.