Instead of having my own computer be the one running Claude Code and executing tasks, I might prefer to offload that to my other homelab servers: they would run agents for me, working pretty much like traditional CI/CD, except with LLMs working on various tasks in Docker containers, each on the same or different codebases, each with its own branch/worktree, submitting pull/merge requests to a self-hosted Gitea/GitLab instance or whatever.
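A minimal sketch of what that dispatch could look like, assuming a hypothetical `agent-runner` image with Claude Code and git preinstalled (the image name, paths, and task wiring are all made up):

```bash
#!/usr/bin/env bash
# Hypothetical dispatcher: one throwaway container per task, each task
# isolated on its own git worktree and branch.
set -euo pipefail

REPO=/srv/repos/myproject   # made-up path to the host clone
TASK_ID=$1                  # e.g. "fix-flaky-tests"
PROMPT=$2                   # task description handed to the agent

# Give the task an isolated worktree on its own branch.
git -C "$REPO" worktree add "/srv/worktrees/$TASK_ID" -b "agent/$TASK_ID"

# Run Claude Code headless (-p, print mode) inside the container,
# passing the API key through from the host environment.
docker run --rm \
  -v "/srv/worktrees/$TASK_ID:/work" -w /work \
  -e ANTHROPIC_API_KEY \
  agent-runner \
  claude -p "$PROMPT" --dangerously-skip-permissions

# A real setup would then commit, push the branch, and open the PR/MR
# against the self-hosted forge.
```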
However, you're not really supposed to use it with your Claude Max subscription; instead you're meant to use an API key and pay per token (which doesn't seem nearly as affordable compared to the Max plan). Nobody would probably mind if I ran it on homelab servers, but if I put it on work servers for a bit, technically I'd be in breach of the rules:
> Unless previously approved, Anthropic does not allow third party developers to offer claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead.
It just feels a tad more hacky than simply copying an API key the way you would when using the API directly. There is stuff like https://github.com/anthropics/claude-code/issues/21765, but also `claude setup-token` (which you probably don't want to lean on too much, given the token lifetime?)
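For reference, the two auth paths look roughly like this. The `CLAUDE_CODE_OAUTH_TOKEN` variable name comes from the Claude Code GitHub Actions setup; it's an assumption that the same variable works in other headless environments:

```bash
# Option 1: the sanctioned path, a pay-per-token API key from the console.
export ANTHROPIC_API_KEY=sk-ant-...
claude -p "run the test suite and summarize failures"

# Option 2: a long-lived token minted from your subscription login.
# `claude setup-token` prints it interactively; it does eventually
# expire, hence the lifetime caveat above.
export CLAUDE_CODE_OAUTH_TOKEN=...
claude -p "same task, billed against the subscription"
```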
Good idea! We're already showing the transaction price per share from the SEC filing. Are you thinking more along the lines of showing the current stock price alongside it, or maybe a price chart showing the stock's performance since the insider trade?
An anecdote: on one project, I use a skill + custom CLI, `/babysit-pr`, to assist in getting PRs through a sometimes long and winding CI process.
This includes regularly polling CI checks using `gh`. My skill/CLI are broken right now:
`gh pr checks 8174 --repo [repo] 2>&1`
Error: Exit code 1
Non-200 OK status code: 429 Too Many Requests
Body:
{
"message": "This endpoint is temporarily being throttled. Please try again later. For more on scraping GitHub and how it may affect your rights, please review our Terms of Service (https://docs.github.com/en/site-policy/github-terms/github-terms-of-service)",
"documentation_url": "https://docs.github.com/graphql/using-the-rest-api/rate-limits-for-the-rest-api",
"status": "429"
}
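One way to make the skill tolerate this is to wrap the `gh` call in a backoff loop instead of treating any non-zero exit as fatal. A rough sketch; the 429 detection is a plain grep over the error text, which is brittle by nature:

```bash
#!/usr/bin/env bash
# Retry `gh pr checks` with exponential backoff when GitHub throttles us.
pr=8174
repo="owner/repo"   # placeholder, as in the original command
delay=15

for attempt in 1 2 3 4 5; do
  if out=$(gh pr checks "$pr" --repo "$repo" 2>&1); then
    echo "$out"
    exit 0
  fi
  if echo "$out" | grep -q "429"; then
    echo "Throttled (attempt $attempt), sleeping ${delay}s..." >&2
    sleep "$delay"
    delay=$((delay * 2))
  else
    # Note: gh also exits non-zero when checks are failing or pending;
    # those cases land here and get surfaced immediately.
    echo "$out" >&2
    exit 1
  fi
done

echo "Still throttled after 5 attempts" >&2
exit 1
```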
1. Sit out and buy the tech you need from competitors.
2. Spend to the tune of ~$100B+ in infra and talent, with no guarantee that the effort will be successful.
Meta picked option 2, but Apple has always had great success with option 1 (the search partnership with Google, hardware partnerships with Samsung, etc.), so they are applying the same philosophy to AI as well. Their core competency is building consumer devices, and they are happy to outsource everything else.
This whole thread is about whether the most valuable startup of all time will be able to raise enough money to see the next calendar year.
It's definitely rational to decide to pay wholesale for LLMs given:
- consumer adoption is unclear. No vendor has yet shipped the "killer app" for OS integration.
- owning SOTA foundation models can put you into a situation where you need to spend $100B with no clear return. This money gets spent up front regardless of how much value consumers derive from the product, or if they even use it at all. This is a lot of money!
- as Apple has "missed" the last couple of years of the AI craze, there have been no meaningful ill effects on their business. Beyond the tech press, nobody cares yet.
I mean, they tried. They just tried and failed. It may work out for them, though — two years ago it looked like lift-off was likely, or at least possible, so having a frontier model was existential. Today it looks like you might be able to save many billions by being a fast follower. I wouldn’t be surprised if the lift-off narrative comes back around though; we still have maybe a decade until we really understand the best business model for LLMs and their siblings.
I think you are right. Their generative AI was clearly underwhelming. They have been losing many staff from their AI team.
I’m not sure it matters though. They just had a stonking quarter. iPhone sales are surging ahead. Their customers clearly don’t care about AI or Siri’s lacklustre performance.
> Their customers clearly don’t care about AI or Siri’s lacklustre performance.
I would rather say their products just didn’t lose value for not getting an improvement there. Everyone agrees that Siri sucks, but I’m pretty sure they tried to replace it with a natural-language version built from the ground up and realised it just didn’t work out yet. Yes, they have a bad but at least kinda-working voice assistant with lots of integrations into other apps. Replacing that with something that promises to do stuff and then does nothing, takes long to respond, and has fewer integrations due to the lack of keywords would have been a bad idea if the technology wasn’t there yet.
We do know that they made a number of promises on AI[1] and then had to roll them back because the results were so poor[2]. They then went on to fire the person responsible for this division[3].
That doesn't sound like a financial decision to me.
They tried to do something that probably would have looked like Copilot integration into Windows, and they chose not to do that, because they discovered that it sucked.
So, they failed in an internal sense, which is better than the externalized kind of failure that Microsoft experienced.
I think the nut that hasn't been cracked is: how do you get LLMs to replace the OS shell and the core set of apps that folks use? I think Microsoft is trying by shipping stuff that sucks and pissing off customers, while Apple tried internally and declined to ship it. OpenClaw might be the most interesting stab in that direction, but even that doesn't feel like the last word on the subject.
Well, they tried and they failed. In that case, maybe the smartest move is not to play. It looks like the technology is largely turning into a commodity in the long run anyway, so sitting this out and letting others make the mistakes first might not be the worst of all ideas.
Sure, Siri is, but do people really buy their phone based off of a voice assistant? We're nowhere near having an AI-first UX a la "Her" and it's unclear we'll even go in that direction in the next 10 years.
I think Apple is waiting for the bubble to deflate, and then they'll do something different. And they have a ready-made user base to provide whatever they can make money from.
If they were taking that approach, they would have absolutely first-class integration between AI tools and user data, complete with proper isolation for security and privacy and convenient ways for users to give agents access to the right things. And they would bide their time for the right models to show up at the right price with the right privacy guarantees.
They apparently are working on, and are going to release, 2(!) different versions of Siri. IDK, that just screams "leadership doesn't know what to do and can't make a tough decision" to me. But who knows? Maybe two versions of Siri is what people will want.
It sounds like the first one, based on Gemini, will be a more limited version of the second ("competitive with Gemini 3"). IDK if the second is also based on Gemini, but I'd be surprised if that weren't the case.
Seems like it's more a ramp-up than two completely separate Siri replacements.
For CC, I suspect it would also need to test and label separate runs against subscription, public API, and Bedrock-served models?
It’s a terrific idea to provide this. Something like isitdownorisitjustme for LLMs would be the canary in the coal mine that could at least inform the multitude of discussion threads about suspected dips in performance (beyond HN).
What we could also use is similar stuff for Codex, and eventually Gemini.
Really, the providers themselves should be running these tests and publishing the data.
Availability status alone is no longer sufficient to gauge service delivery, because the output is by nature non-deterministic.
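A minimal version of such a probe, assuming an API key in the environment; the model id is a placeholder, and a real harness would score the outputs rather than just log them:

```bash
#!/usr/bin/env bash
# Canary probe: send the same fixed prompt on a schedule and log
# latency + output so quality dips become visible over time.
model="claude-sonnet-4-5"   # placeholder model id
start=$(date +%s%3N)

resp=$(curl -sS https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
        "model": "'"$model"'",
        "max_tokens": 64,
        "messages": [{"role": "user",
                      "content": "Reply with exactly: OK-CANARY-7"}]
      }')

latency=$(( $(date +%s%3N) - start ))
echo "$(date -Is) latency_ms=$latency resp=$(echo "$resp" | head -c 200)" >> canary.log
```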
This is a great question; would love some feedback on this.
I assume they stuck with RealSense for proper depth maps. However, those are both limited to about a 6 meter range, and their depth imaging can't resolve features smaller than the native resolution allows (it gets worse past 3 m too, as there is less and less parallax, among other issues). I wonder how they approached that as well.