
This is exactly what I do. I have two 3090s at home, with Qwen3 on it. This is tied into my Home Assistant install, and I use esp32 devices as voice satellites. It works shockingly well.


I run Home Assistant on an RPi4 and have an ESP32-based Core2 with mic (https://shop.m5stack.com/products/m5stack-core2-esp32-iot-de...), along with a 16GB 4070 Ti Super in an always-on Windows system I only use for occasional gaming and serving media. I'd love to set up something like you have. Can you recommend a starting place, or ideally, a step-by-step tutorial?

I've never set up any AI system. Would you say setting up such a self-hosted AI is at a point now where an AI novice can get an AI system installed and integrated with an existing Home Assistant install in a couple hours?


I mean - the AI itself will help you get all that set up.

Claude Code is your friend.

I run Proxmox on an old Dell R710 in my closet that hosts my Home Assistant VM (amongst others). I've also set up my "gaming" PC (which hasn't done any gaming in quite some time) to dual boot (Windows or Deb/Proxmox), and I just keep it booted into Deb as another Proxmox node. That PC also has a 4070 Super that I pass through to a VM, and on that VM I've got various services using the GPU. Some of these are used by my Hetzner bare metal servers for things like image/text embeddings, as well as local LLM use (though rather minimal due to VRAM constraints) and some image/video object detection with my security cameras (I'm slowly working on a remote water gun turret to keep the raccoons from trying to eat the kittens that stray cats keep having in my driveway/workshop).

Install Claude Code (or opencode, it's also good), use Opus (get the Max plan), and give it a directory that it can use as its working directory (don't open it in ~/Documents and just start doing things). Then prompt it with something as simple as this:

"I have an existing home assistant setup at home and I'd like to determine what sort of self-hosted AI I could setup and integrate with that home assistant install - can you help me get started? Please also maintain some notes in .md files in this working directory with those note files named and organized as you see appropriate so that we can share relevant context and information with future sessions. (example: Hardware information, local urls, network layout, etc) If you're unsure of something, ask me questions. Do not perform any destructive actions without first confirming with me."

Plan mode. _ALWAYS_ use plan mode to get the task set up. If there's something about the plan you don't like, say no and give it notes - it will return with a new plan. Agree to the plan once it's right, then work through it in regular mode. If it gets off the plan, go back into plan mode to settle on a plan again, then let it go and just steer it in regular mode.


> I mean - the AI itself will help you get all that set up.

Or, ask somebody who already has it set up and working.

That way you get dependable results, without guessing why it works for them and not for you.

(I, too, am interested in the grandparent poster's setup.)


>use opus (get the max plan)

I don't have the Max plan, but on the Pro plan, which I tried for a month, I was able to blow through my 5-hour limit with a single prompt (with a 70k-context codebase attached). The idea of paying so much money to get a few questions per "workday" seems insane to me.


I just wanted to touch on this, despite it being days later, in hopes you see it. I've seen this sort of feedback about the Pro plan quite a bit. I skipped it and went for Max, so I don't have any experience with it, but I can tell you that I've _never hit my/any usage limit_ with the Max plan.

Like, I don't know if my account is broken or everyone else just uses things differently. I use Claude Code, and I have it hard-stuck to Opus 4.1 - I don't even touch Sonnet. I _abuse_ the context. I used to /compact early or /clear often depending on the task, but these days (Opus seems much better with nearly full context than Sonnet was), if I'm still on the same task/group of tasks or I think the current context would be useful for the next thing/task/step, I don't even /compact anymore. I've found that if I just run it right up to full and let it auto /compact, it does a _really_ good job picking up where it left off (which wasn't always the case). Point being - I'm exclusively using Opus 4.1 while also constantly cycling through and maxing out context, only to restart with /compact'd context, so it's not even starting empty, and I just keep going.

Hours a day like this. Never hit a limit. (I've said elsewhere that I do believe the general time I work, which is late evening and early morning in North America, has something to do with this, but I don't actually know.)


Sonnet blows through the limit much slower, and is often great tbh


That's great to hear. I was mostly impressed with Qwen3 Coder on my 4090, but am hobbled by the limited memory of the single card. What motherboard are you using with your 3090s? Like the others, I too am curious about those esp32s and what software you run on them.

Keep up the good hacking - it's been fun to play with this stuff!


I actually am not using the 3090s as one unit. I have Qwen3-30B-A3B as my primary model and it fits on a single GPU, then I have all the TTS/STT on the other GPU.
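
A rough back-of-the-envelope check of why a model of this size can fit on a single 24 GB card, sketched in Python. The numbers here are assumptions for illustration, not measurements from this setup: roughly 30 billion total parameters (taken from the model name), about 4.5 bits per weight for a typical 4-bit quantization, and a flat 2 GB allowance for KV cache and runtime buffers. `estimate_vram_gb` is a hypothetical helper.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate: quantized weights plus a fixed allowance.

    overhead_gb is an assumed allowance for KV cache and runtime buffers;
    real usage depends on context length and the inference runtime.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 1e9 + overhead_gb


GPU_VRAM_GB = 24.0  # a single RTX 3090

# Assumed: ~30B total parameters, ~4.5 bits per weight after quantization.
needed = estimate_vram_gb(30, 4.5)
print(f"~{needed:.1f} GB needed, fits on one card: {needed < GPU_VRAM_GB}")
```

The same arithmetic at 16 bits per weight comes out well above 24 GB, which is why quantization matters for keeping the model on one GPU and leaving the other free for TTS/STT.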


Ooo interesting, I'd love to hear more about the esp32's as voice satellites!


For the physical hardware I use the esp32-s3-box[1]. The esphome[2] suite has firmware you can flash to make the device work with Home Assistant automatically. I have an esphome profile[3] I use, but I'm considering switching to this[4] profile instead.

For the actual AI, I basically set up three docker containers: one for speech to text[5], one for text to speech[6], and then ollama[7] for the actual AI. After that it's just a matter of pointing Home Assistant at the various services, as it has built-in support for all of these things.
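
Before pointing Home Assistant at the three services, it can help to confirm each container is actually listening. Here is a minimal Python sketch of that check. The ports are assumptions based on commonly used defaults for these projects (10300 for wyoming-faster-whisper, 10200 for wyoming-piper, 11434 for ollama), and the host `127.0.0.1` is a placeholder; adjust both to match how the containers are published.

```python
import socket

# Assumed ports (commonly used defaults); change these to match your containers.
SERVICES = {
    "speech-to-text (wyoming-faster-whisper)": 10300,
    "text-to-speech (wyoming-piper)": 10200,
    "LLM (ollama)": 11434,
}


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str, services: dict[str, int]) -> dict[str, bool]:
    """Map each service name to whether its port accepts TCP connections."""
    return {name: is_reachable(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # Placeholder host: replace with the machine running the containers.
    for name, ok in check_services("127.0.0.1", SERVICES).items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

This only verifies that something accepts TCP connections on each port, not that the service behind it is healthy, so treat it as a first sanity check.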

1. https://www.adafruit.com/product/5835

2. https://esphome.io/

3. https://gist.github.com/tedivm/2217cead94cb41edb2b50792a8bea...

4. https://github.com/BigBobbas/ESP32-S3-Box3-Custom-ESPHome/

5. https://github.com/rhasspy/wyoming-faster-whisper

6. https://github.com/rhasspy/wyoming-piper

7. https://ollama.com/


> 1. https://www.adafruit.com/product/5835

The nails in the video made me laugh


I assume it's very similar to what Home Assistant's backing commercial entity Nabu Casa sells with the "Home Assistant Voice PE" device, which is also esp32-based. The code is open and uses the esphome framework, so it's fairly easy to recreate on custom HW you have lying around.


Seems like an interesting setup. Do you have it documented anywhere? I'm thinking of building one!


Can you tell me about these voice satellites?


He is referring to the M5 Atoms, I believe. I strongly recommend the ESP32 S3 box now. You can fire up Bobbas' special firmware for it (search on GitHub), and it's a blast with Home Assistant.


I'm actually using the esp32 s3 boxes myself!


omg, this is something I've had in mind for quite some time, I even bought some i2s devices to test it out. Do you have some pointers on how to do it?


Do you also add custom tools to turn on/off the lights?



