
> The ChatGPT integration, powered by GPT-4o, will come to iOS, iPadOS, and macOS later this year.

Jensen Huang must be having the time of his life right now. Nvidia's relationship with Apple went from pariah to prodigal son, real fast.



Didn’t Apple say they’re using their own hardware for serving some of the AI workloads? They dubbed it ‘Private Cloud Compute’. Not sure how much of a vote of confidence it is for Nvidia.


not for GPT-4o workloads they aren't


Right, but are those going to run on Apple-owned hardware at all? It seems like Apple will first prioritize their models running on-device, then their models running on Apple Silicon servers, and then bail out to ChatGPT API calls specifically for Siri requests that they think can be better answered by ChatGPT.

I'm sure OpenAI will need to beef up their hardware to handle these requests - even as filtered down as they are - coming from all of the Apple users that will now be prompting calls to ChatGPT.


they're going to be using nvidia (or maybe AMD if they ever catch up) to train these models anyways


not necessarily so, in terms of tflops per $ (at Apple's cost of GPUs, not consumer prices) and tflops per watt, their Apple silicon is comparable if not better


> and tflops per watt their apple silicon is comparable if not better

If Apple currently ships a single product with better AI performance-per-watt than Blackwell, I will eat my hat.


flops/$ is simply not all (or even most) that matters when it comes to training LLMs.... Apple releases LLM research - all of their models are trained on nvidia.


Which is only a subset of requests Apple devices will serve, and only with explicit user permission. That's going to shrink over time as Apple continues to advance their own models and silicon.


Plus even if Apple is using their own chips for inferencing, they're still driving more demand for training, which Nvidia still has locked down pretty tight.


Apple said they’re using their own silicon for training.

Edit: unless I misunderstood and they meant only inference.


without more details hard to say, but i seriously doubt they trained any significantly large LM on their own hardware

people on HN routinely seem to overestimate Apple's capabilities

e: in fact, iirc just last month Apple released a paper unveiling their 'OpenELM' language models and they were all trained on nvidia hardware


Interesting, I thought Apple Silicon mainly excelled at inferencing. Though I suppose the economics of it are unique for Apple themselves since they can fill racks full of barebones Apple Silicon boards without having to pay their own retail markup for complete assembled systems like everyone else does.


They trained GPT-4o on Apple Silicon? I find that hard to believe, surely they only mean that some models were trained with Apple Silicon.


Not GPT-4o, their own models that power some (most?) of the “Apple Intelligence” stuff.


They're even explicitly saying:

> These models run on servers powered by Apple silicon [...]

That doesn't mean that there are no Nvidia GPUs in these servers, of course.


They say user data remains in the Secure Enclave at all times, which Nvidia GPUs would not be able to access. I am quite certain that their private cloud inference runs only Apple silicon chips. (The pre-WWDC rumors were that they built custom clusters using M2 Ultras.)


> They say user data remains in the Secure Enclave at all times

No they don't. They say that the Secure Enclave participates in the secure boot chain, and in generating non-exportable keys used for secured transport. It reads to me as though user devices will encrypt requests to the keys held in the Secure Enclave of a subset of PCC nodes. A PCC node that receives the encrypted request will use the Secure Enclave to decrypt the payload. At that point, the general-purpose Application Processor in the PCC node has a cleartext copy of the user request for doing the needful inference, which _could_ be done on an NVidia GPU, but appears to be done on general-purpose Apple Silicon.

There is no suggestion that the user request is processed entirely within the Secure Enclave. The Secure Enclave is a cryptographic coprocessor. It almost certainly doesn't have the grunt to do inference.


Not that it matters anyway, since Apple refuses to sign Nvidia GPU drivers for MacOS in the first place. So if they own any Nvidia hardware themselves, they must also own third-party (non-Apple) hardware to host it.


Maybe this is way too science fiction, but what are the chances Apple's GPU/AI engine designs on Apple Silicon were a testbed for full sized, dedicated GPU dies that could compete with Nvidia's power in their own data centers?


Very low? I guess anything is possible, but the M1 through M4 GPUs weren't really anything to write home about. It more closely resembles AMD's raster-focused GPU compute in my opinion, which is certainly not a bad thing for mobile hardware.

Nvidia's GPUs are complex. They have a lot of dedicated, multipurpose acceleration hardware inside of them, and then they use CUDA to tie all those pieces together. Apple's GPUs are kinda the opposite way; they're extremely simple and optimized for low-power raster compute. Which isn't bad at all, for mobile! It just gimps them design-wise when they go up against purpose-built accelerators.

If we see Apple do custom Apple Silicon for the datacenter, it will be a pretty radically new design. The first thing they need is good networking; a full-size Nvidia cluster will use Mellanox Infiniband to connect dozens of servers at Tb/s speeds. So Apple would need a similar connectivity solution, at least to compete. The GPU would need to be bigger and probably higher-wattage, and the CPU should really emphasize core count over single-threaded performance. If they play their cards right there, they would have an Apple Silicon competitor to the Grace superchip and GB200 GPU.


That quote is about their own LLMs, not about the use of ChatGPT.


Yes, but GP was talking about the AI workloads Apple will be running on their own servers (which are indeed distinct from those explicitly labeled as ChatGPT).


Not sure Nvidia is too happy with Apple.

They are the first ones to ship on-device inference at scale on non-nvidia hardware. Apple also has the means to build data center training hardware using apple silicon if they want to do so.

If they are serious about the OAI partnership they could also start to supply them with cloud inference hardware and strongarm them into only using apple servers to serve iOS requests.


> They are the first ones to ship on-device inference at scale on non-nvidia hardware

Which is neat, but it's not CUDA. It's an application-specific accelerator that's good at a small subset of operations, controlled by a high-level library the industry is unfamiliar with, and too underpowered to run LLMs or image generators. The NPU is a novelty, and today's presentation more-or-less confirmed how useless it is for rich local-only operations.
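
For context on the "high-level library" point, this is roughly what reaching the Neural Engine looks like today: everything goes through a Core ML conversion step rather than anything CUDA-like. A rough sketch, assuming coremltools' PyTorch converter (the toy model and names are just for illustration):

    import torch
    import coremltools as ct

    # Toy stand-in for whatever network you'd want on the NPU.
    net = torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.ReLU()).eval()
    traced = torch.jit.trace(net, torch.randn(1, 256))

    # The NPU is only reachable via Core ML; conversion is mandatory.
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(name="x", shape=(1, 256))],
        compute_units=ct.ComputeUnit.ALL,  # let Core ML schedule onto the Neural Engine where it can
    )
    mlmodel.save("toy.mlpackage")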

> Apple also has the means to build data center training hardware using apple silicon if they want to do so.

They could, but that's not a competitor against an NVL72 with hundreds of terabytes of unified GPU memory. And then they would need a CUDA competitor, which could either mean reviving OpenCL's rotting corpse, adopting Tensorflow/Pytorch like a sane and well-reasoned company, or reinventing the wheel with an extra library/Accelerate Framework/MPS solution that nobody knows about and has to convert models to use.

So they can make servers, but Xserve showed us pretty clearly that you can lead a sysadmin to MacOS but you can't make them use it.

> they could also start to supply them with cloud inference hardware and strongarm them into only using apple servers to serve iOS requests.

I wonder how much money they would lose doing that, over just using the industry-standard Nvidia servers. Once you factor in the margins they would have made selling those chips as consumer systems, it's probably in the tens-of-millions.


> reinventing the wheel with an extra library/Accelerate Framework/MPS solution that nobody knows about and has to convert models to use.

This is Apple's favorite thing in the world. They already have an Apple-Silicon-only ML framework as of a few months ago, called MLX. Does anyone know about it? No. Do you need to convert models to use it? Yes.
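
For anyone curious, here's roughly what MLX code looks like (a minimal sketch, assuming the current mlx.core API; arrays are lazy and only materialize on mx.eval):

    import mlx.core as mx

    a = mx.random.normal((1024, 1024))
    b = mx.random.normal((1024, 1024))
    c = a @ b        # lazy: builds a compute graph, nothing runs yet
    mx.eval(c)       # evaluates on the default device (the GPU on Apple Silicon)
    print(c.shape, c.dtype)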


You're approaching this from a developer's point of view.

Users absolutely don't care if their prompt response has been generated by a CUDA kernel or some poorly documented Apple-specific silicon that a poor team at Cupertino almost lost their sanity to while porting the model.

And haven't they already spent quite a bit of money on their pytorch-like MLX framework?


> Users absolutely don't care if their prompt response has been generated by a CUDA kernel or some poorly documented apple specific silicon

They most certainly will. If you run GPT-4o on an iPhone with MLX, it will suck. Users will tell you it sucks, and they won't do so in developer-specific terms.

The entire point of this thread is that Apple can't make users happy with their Neural Engine. They require a stopgap cloud solution to make up for the lack of local power on iPhone.

> And haven't they already spent quite a bit on money on their pytorch-like MLX framework?

As well as Accelerate Framework, Metal Performance Shaders and previously, OpenCL. Apple can't decide where to focus their efforts, least of which in a way that threatens CUDA as a platform.


Imho, the stronghold of cuda is slowly eroding.

Inference can run without it, and has been able to for years via ONNX. Now we are starting to see more back-ends becoming available.

see https://github.com/openxla
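
As one concrete example, ONNX Runtime lets you pick execution providers other than CUDA at session creation. A rough sketch ("model.onnx" and the input name "input" are placeholders for illustration):

    import numpy as np
    import onnxruntime as ort

    # See which non-CUDA backends this build actually ships (CPU is always available).
    print(ort.get_available_providers())

    sess = ort.InferenceSession(
        "model.onnx",
        providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
    )
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)
    outputs = sess.run(None, {"input": x})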


Bit of a detail, but where are you deriving “with hundreds of terabytes of unified GPU memory” from?


I was an order of magnitude off, at least in the case of NVL72: https://www.nvidia.com/en-us/data-center/gb200-nvl72/

But the point stands, these systems occupy a niche that Apple Silicon is poorly suited to filling. They run normal Linux, they support common APIs, and network to dozens of other machines using Infiniband.


> Apple also has the means to build data center training hardware using apple silicon if they want to do so.

> If they are serious about the OAI partnership they could also start to supply them with cloud inference hardware and strongarm them into only using apple servers to serve iOS requests

Apple addressed both these points in today’s preso.

1. They will send requests that require larger contexts to their own Apple Silicon-based servers that will provide Apple devices a new product platform called Private Cloud Compute.

2. Apple’s OS generative AI request APIs won’t even talk to cloud compute resources that do not attest to infrastructure that has a publicly available privacy audit.


I'm pretty sure those points do not apply to ChatGPT integration. ChatGPT is still running on Nvidia.


> I'm pretty sure those points do not apply to ChatGPT integration.

You’re absolutely right. I got too excited about Apple’s strategy to encourage developers to use Apple Private Cloud Compute.

The UX for ChatGPT as shown for iOS 18 makes it obvious that you are sending data outside the Apple Silicon walled garden.


I would say MS Copilot+ is shipping on-device inference a few months before Apple, although at 1000x lower volume.


> Apple also has the means to build data center training hardware using apple silicon if they want to do so.

i'm seeing people all over this thread saying stuff like that, it reads like fantasyland to me. Apple doesn't have the talent or the chips or suppliers or really any of the capabilities to do this, where are people getting it from?


Apple is already one of the largest (if not the largest) customers of TSMC and they have plenty of experience designing some of the best chips on the most modern nodes.

Their ability to design a chip and networking fabric that is fast/efficient at training a narrow set of model architectures is not far-fetched by any means.


It's worth noting that one of Apple's largest competitors at TSMC is, in fact, Nvidia. And when you line the benchmarks up, Nvidia is one of the few companies that consistently beats Apple on performance-per-watt even when they aren't on the same TSMC node: https://browser.geekbench.com/opencl-benchmarks


ChatGPT will only be invoked if on-device and Apple Intelligence servers can't handle the request.


To be useful, Apple has to share the data with OpenAI.


I can only imagine Apple has some kind of siloing agreement with OpenAI; Apple can easily afford whatever price to do so.


Yes, it was also covered explicitly in the keynote that Apple users' requests to OpenAI are not tracked. (Plus you have the explicit opt-in to even access ChatGPT via Siri in the first place.)


Surely Apple wouldn't simply market privacy while lying to their users about who can access their data: https://arstechnica.com/tech-policy/2023/12/apple-admits-to-...


There is a wide gap between complying with law enforcement requests and judicial orders and intentionally lying. Yes, if Apple can (trivially) read your data, then one must assume that at least the US government can access your data! Though if that's in your threat model, I have a couple of other bad news items for you. Apple actively reduces that surface by moving ~everything to e2ee storage with keys held on customer devices. This is pretty transparently an attempt to be able to say "sorry, can't do that without changing OS code, and for _that_ discussion we have won in court. Really sorry that we can't help you." And yes, that's probably just to decrease compliance costs. Still, same result.


Apple's put ChatGPT integration on the very edge of Apple Intelligence. It's a win for OpenAI to have secured that opportunity, and Nvidia wins by extension (as long as OpenAI continues to rely on them themselves), but the vast majority of what Apple announced today appears to run entirely on Apple Silicon.

It's not especially big news for Nvidia at all.


If we know anything about Apple, they're going after Nvidia. If anyone can pull it off, it's going to be them.


Why do you think that?

You seem to be positioning this as a Ford vs Chevy duel, when (to me at least) the comparison should be to Ford vs Exxon.

Nvidia is an infrastructure company. And a darned good one. Apple is a user-facing company and has outsourced infrastructure for decades (AWS & Azure being two of the well-known ones).


Apple outsourced chips to IBM (PowerPC) for a long time and floundered all the while. They went into the game themselves w/ the PA Semi acquisition and now they have Apple Silicon to show for it.


But Apple is vertically integrating. That's like Ford buying Bridgestone.

The only way it hurts Nvidia is if Apple becomes the runaway leader of the pc market. Even then, Apple hasn’t shown any intent of selling GPUs or AI processors to the likes of AWS, or Azure or Oracle, etc.

Nvidia has a much bigger threat from Intel/AMD or the cloud providers backward-integrating and then not buying Nvidia chips. Again, no signs that Apple wants to do this.


i would strongly take the other side of that bet


Personally, I'm taking _both_ sides of that bet.

I think Apple is going to make rapid and substantial advancements in on-device AI-specific hardware. I also think nVIDIA is going to continue to dominate the cloud infrastructure space for training foundational models for the foreseeable future, and serving user-facing LLM workloads for a long time as well.


edge inference? sure - but nvidia is not even a major player in that space now so i wouldn't really count that as 'taking on nvidia'.


Nvidia obviously has an enormous, enormous moat but I do think this is one of the areas in which Apple may actually GAF. The rollout of Apple Intelligence is going to make them the biggest provider of "edge" inference on day one. They're not going to be able to ride on optimism in services growth forever.


Apple simply does not have the talent pool to take on either nvidia or the big LLM providers anywhere on the stack except for edge inference.

If you're saying Apple is going to 'take on nvidia' in edge inference, then I don't disagree but I would hardly even count that as taking on nvidia.


I can't really dispute any of that.

It took almost a decade but the PA Semi acquisition showed that Apple was able to get out of the shadow of its PowerPC era.

Nvidia will remain a leader in this space for a long time. But things are going to play out wonky, and Apple, when determined, is actually pretty good at executing on longer-term roadmaps.


Apple could have moved on Nvidia but instead they seem to have thrown in the towel and handed cash back to investors. The OpenAI deal seems like further admission by Apple that they missed the AI boat.


Exactly. Apple really needs new growth drivers and Nvidia has a ~$3tn market cap Apple wants to take a bite out of. One of the few huge tech growth areas that Apple can expand into.


I am of course wrong frequently, but I cannot see how that would happen. If they create CPUs/GPUs that are faster/better than what Nvidia sells, but only sell them as part of Mac desktop or laptop systems, it won't really compete.

For that, they would have to develop servers with a mass amount of whatever it is, or sell the chips in the same manner Nvidia does today.

I dont see that future for Apple.

Microsoft / Google / or other major cloud companies would do extremely well if they could develop it and just keep it as a major win for their cloud products.

Azure is running OpenAI as far as I have heard.

Imagine if M$ made a crazy fast GPU/whatever. It would be a huge competitive advantage.

Can it happen? I dont think so.


Well, good luck to Apple then. Hopefully this attempt at killing Nvidia goes better than the first time they tried, or when they tried and gave up on making OpenCL.

I just don't understand how they can compete on their own merits without purpose-built silicon; the M2 Ultra can't hold a candle to a single GB200. Once you consider how Nvidia's offerings are networked with Mellanox and CUDA unified memory, it feels like the only advantage Apple has in the space is setting their own prices. If they want to be competitive, I don't think they're going to be training Apple models on Apple Silicon.


S&P 500 average P/E - 20 to 25

NASDAQ average P/E - 31

Nvidia's P/E - 71

That's a market of 1 vendor. That's ripe for attack.


It's ripe for attack. But Nvidia is still in its growth phase, not some incumbent behemoth. The way Nvidia ruthlessly handled AMD tells us that they are ready for competition.


Let's check in with OpenCL and see how far it got disrupting CUDA.

You see, I want to live in a world where GPU manufacturers aren't perpetually hostile against each other. Even Nvidia would, judging by their decorum with Khronos. Unfortunately, some manufacturers would rather watch the world burn than work together for the common good. Even if a perfect CUDA replacement existed like it did with DXVK and DirectX, Apple will ignore and deny it while marketing something else to their customers. We've watched this happen for years, and it's why MacOS perennially cannot run many games or reliably support Open Source software. It is because Apple is an unreasonably fickle OEM, and their users constantly pay the price for Apple's arbitrary and unnecessary isolationism.

Apple thinks they can disrupt AI? It's going to be like watching Stalin try to disrupt Wal-Mart.


> Let's check in with OpenCL and see how far it got disrupting CUDA.

That's entirely the fault of AMD and Intel fumbling the ball in front of the other team's goal.

For ages the only accelerated backend supported by PyTorch and TF was CUDA. Whose fault was that? Then there was buggy support for a subset of operations for a while. Then everyone stopped caring.

Why I think it will go differently this time: nVidia's competitors seem to have finally woken up and realized they need to support high-level ML frameworks. "Apple Silicon" is essentially fully supported by PyTorch these days (via the "mps" backend). I've heard OpenCL works well now too, but I have no hardware to test it on.
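
Device selection for the mps backend is a one-liner these days (a minimal sketch):

    import torch

    # Fall back to CPU if the Metal backend isn't available on this machine.
    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

    model = torch.nn.Linear(512, 512).to(device)
    x = torch.randn(8, 512, device=device)
    print(model(x).device)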


> That's a market of 1 vendor. That's ripe for attack.

it's just a monopoly [1], how hard can it be?

/s

- [1] practically, because of how widespread cuda is


cuda is x86. the only way from 100% market share is down.

…though it took two solid decades to even make a dent in x86.



nono - I don't mean cuda works on x86. I mean cuda is x86 - for gpgpu workloads - as in a de facto standard.


Eh, it seems from the keynote that ChatGPT will be very selectively used, while most features will be powered by on-device processing and Apple's own private cloud running apple silicon.

So all in all, not sure if it's that great for Nvidia.


If OpenAI is furiously buying GPUs to train larger models and Apple is handing OpenAI cash, then this seems like a win for Nvidia. You can argue about how big of a win, but it seems like a positive development.

What would not have been positive for Nvidia is Apple saying they've adapted their HW to server chips and would be partnering with OpenAI to leverage them, but that didn't happen. Apple is busy handing cash back to investors and not seriously pursuing anything but inference.


given their history, he would only be satisfied when apple is forced to directly rely on nvidia hardware.

current situation is like nvidia devs using macs at work giving mr cook some satisfaction or something.



