More

thomasskis · 2025-08-28T06:50:54 1756363854

I run 4 Mac Studio ultras at work (they’re pricy when maxed out), for local-first AI dev services. But there’s a few things that make me want to switch to the Spark. Networking is the biggest one, the Macs have Thunderbolt and Ethernet, but if I run distributed inference with EXO over Thunderbolt; the drop in tokens/second is massive. These Sparks get RDMA and can stack nicely. The other big one is access to CUDA, MLX has come a long way but being able to have CUDA and GPU access in containers would simplify the stack so nicely. If I had a USB-C/Thunderbolt backplane it might compare, but scaling with the Spark is likely a lot more straightforward.

I call the stack with Mac Studios “MacAIver” because it feels like a duct tape solution, but the Spark equivalent would likely be more elegant.

aurareturn · 2025-08-28T08:48:20 1756370900

You'd have to stack 16 of these to get 2TB of VRAM, equivalent to 4 Mac Studios 512GBs chained together.

16 compared to 4. Surely even much faster networking in the Spark would degrade with that many devices?

Biggest problem with Macs is that they don't have dedicated tensor cores in the GPU which makes prompt processing very slow compared to Nvidia and AMD.

themgt · 2025-08-28T09:31:05 1756373465

n.b. there's been a little speculation that Apple adding TensorOps to Metal 4 suggests M5/M6 may get tensor cores.

https://x.com/liuliu/status/1932158994698932505

https://developer.apple.com/metal/Metal-Shading-Language-Spe...

aurareturn · 2025-08-28T11:23:47 1756380227

Nice. I hope so. That would make Macs the best local LLM machines for the masses by far.

thomasskis · 2025-08-28T15:00:07 1756393207

It’s $12k for each Mac Studio, and the networking makes them only effective individually (it’s like less that 15 tokens/s with EXO) while NVLINK is very effective. The Spark is definitely more scalable, but the MLX and metal teams are cooking, so honestly either way is still winning.

thomasskis · on Jan 29, 2025

EXO is also great for running the 6bit deepseek, plus it’s super handy to serve from all your devices simultaneously. If your dev team all has M3 Max 48gb machines, sharing the compute lets you all run bigger models and your tools can point at your local API endpoint to keep configs simple.

Our enterprise internal IT has a low friction way to request a Mac Studio (192GB) for our team and it’s a wonderful central EXO endpoint. (Life saver when we’re generally GPU poor)

thomasskis · on April 17, 2024

IQ doesn’t seem to be an ideal metric for macro level civilisation intelligence. A lot of what it optimises for is replaced by technology. If we’re measuring intelligence at that level I’d hope for something like theory of mind/spiral dynamics, testing how wide someone’s perspective is.

Is their world model limited to themselves, their family, their village, their country, their race, their world, their species, their system? Combined with their understanding of social dynamics at each level.

I think the average GenZ in the US is at beginning of global scale, which is great for acceptance of different cultures; but without understanding the social dynamics yet. Giving us “woke” because they understand different cultures have different contexts, this is good but incomplete. E.g. some cultures aren’t aligned with the entire species, like controlling women in a way that leads to bad outcomes as technology adds complexity to our social dynamics.

Measuring this perspective level would help us understand which cultural models work best in adapting with growth and are more likely to be sustainable post-scarcity.

thomasskis · on Nov 19, 2023

Fine-tuning mistral for tool use like metasploit is effective, but even default mistral with a basic system prompt is very capable and doesn’t often say it can’t do things. ChatGPT obviously needs a lot of coaxing, “my job depends on this” is a hilarious way to get it to be helpful here. But for cyber security tooling I think we’ll see things more akin to David Shapiro style swarms with small models that are domain specific coordinating with each other (very basic discovery focused models communicating with a more complex reasoning model to validate findings, then remediation)

The tricky part here is (football metaphor) so far I’m having to train the “strikers” before I can effectively train the “goalie”. Which feels bad for AI safety. I think this is why we’re not seeing a lot of work in the open here.

But we’re planning to open source the goalie, which will look more like Markov/monte-Carlo traditional ML on specific bits, like infrastructure as code.

If you want to work on this stuff, especially in EU, DM me; we’re hiring ;)

insanitybit · on Nov 20, 2023

Haha, I have done the "my job depends on this" kind of thing. I think I did something like "someone's life depends on this". I just feel like that's flaky.

Sounds very cool :D I'm US based and currently taking a long time off, but I wish you the best of luck.

thomasskis · on Nov 19, 2023

I’m really curious how about how good it used to be? It feels to me like there’s a cultural resistance to be self critical of the whole, and that is slowing down wide-scale progress.

I moved to Sweden 2 months ago (from Seattle, where I grew up) but over the last decade I’ve lived in Seoul, Singapore, Bangkok, and Reykjavik. The government here feels most similar to the US, worse in some ways. Paperwork is like a hobby and doing it slowly is almost glorified.

I love it in many ways, but it’s baffling that I still don’t have a government ID, and am months away from getting a bank account. (Which would block me from getting a paycheck if I wasn’t high enough to at my company to get an exception and wire to my U.S./EU accounts. But for most relocated employees, they just don’t get paid for the first few months). The HR here says this is something they can’t change because it’s legal requirements from the government? But they just shrug and say that’s the Swedish way it works because it’s been working. And now I’m too embarrassed to hire the people I need to until I have good workarounds for these processes (there are many more issues).

As a Swede with history of what used to be good, and what’s working in other places, what would it take to get Sweden to correct these aspects of the way things are run?

codr7 · on Nov 20, 2023

Yeah, tell me about it. The thing that sickens me the most is that Swedes in general are still very keen to pat themselves on the shoulder about what a great country they live in; even though it's mostly memories by now.

Been there, done that. Getting a bank account once I moved back was a major pita. I still haven't been able to renew my drivers license since I've been moving around without a fixed address. That part isn't new though, it's been like that as far back as I can remember.

But at least back in the days, the system worked, you got something back for your troubles.

whstl · on Nov 19, 2023

I have seen these issues here in Germany too. Fintech-banks like N26 or Revolut used to fill the gap and give you a quick bank account before you could settle down in a permanent address.

The problem is that they require a work permission for the account, which used to be quick, but since the immigration system is now overworked, they're giving one year Visas and taking their time to give the work permission. So N26 or Revolut are only really available for European citizens (that was the reality for a few years now, dunno if it changed).

Funny enough, more traditional banks have picked up the slack and dropped the strict requirements, and you can open an account from day one of being here with them.

CalRobert · on Nov 20, 2023

Is n26 an option? A newcomer should be able to get a German account pretty easily from them, and with SEPA it doesn't matter which country your IBAN is in. Or is the currency an issue?

thomasskis · on Nov 15, 2023

How about using an LLM to help you write a MuZero-like model designed for a specific task? (Also MuZero took like 12 hours to train on old hardware, so my MacBook might be good enough here) obviously we’re not here yet, but it doesn’t seem far away. Hell, you could train a small LLM specifically just to do this.

thomasskis · on Oct 23, 2023

I think the subjective time here is especially relevant. The post previous mentioned having a conversation with ChatGPT about the topic. ChatGPT probably had multiple human lifetimes of conversations during that one conversation. Would it think of humans the same way we think of trees? Too slow to have meaningful behaviour? Maybe on H100s not just yet.

justinclift · on Oct 24, 2023

Interesting angle. Hadn't thought of that, and it's probably good to be aware of. :)

thomasskis · on Aug 10, 2023

Hey, this is a great idea! I’ve found myself using dating apps for this kind of purpose and I think this could be a better approach, especially from the support side. (I’ve had a few instances where being supportive but not wanting a romantic relationship with that person to be really frustrating for that person (which is understandable given the context of a dating app))

A feature that would be great here though is to have realtime conversations, I think you can connect a lot more when it’s realtime. Also supporting text replies could be really helpful.

Anyway, love what you’re building here and hope it goes well!

jaybhum · on Aug 10, 2023

Thank you for your kind words, and thank you for sharing your past experience on dating apps! It is interesting that you have used them to support people, I have never heard of anyone use it for that purpose, but that is very altruistic of you haha

In fact, we tried making real-time conversation feature in our previous mini-MVP, and the difficulty there was that 1. people were too hesitant to jump in a phone call 2. people were never online at the same time 3. conversations could not be enjoyed by other listeners in different times.

I have tried using Clubhouse for a while, and did not like the dynamics there, where you would only have a few speakers talking at the same time like MCs while the audience can only listen. I think by focusing on asynchronous voice messages, this platform can help individuals make connections with other individuals, which is not offered by any platforms out there.

I hope you continue to enjoy using it! And we would appreciate any feedback you have :)

thomasskis · on April 1, 2023

Ha I just commented above at the pattern of people in this camp using “GTP” fairly consistently.

What a curious psychological study, maybe dyslexic people feel more threatened by a large language model so clearly understanding words that they’re more likely to attempt to discredit it?

thomasskis · on April 1, 2023

Weird behaviour I’ve noticed is a lot of folks on the unimpressed/doomism side of AI consistently say GTP instead of GPT, I wonder why this pattern exists?