Hacker News | thedangler's comments

How did you do this locally? Tools? Language?

I just followed the Quickstart[1] in the GitHub repo, which was refreshingly straightforward. Using the pip package worked fine, as did installing the editable version from the git repository. Just install the CUDA version of PyTorch[2] first.

The HF demo is very similar to the GitHub demo, so it's easy to try out.

  pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
  pip install qwen3-tts
  qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-Base --no-flash-attn --ip 127.0.0.1 --port 8000

That's for CUDA 12.8; change the PyTorch install accordingly.
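
For anyone on a different CUDA version, here is a rough sketch of how the index URL changes. It assumes the `cu<major><minor>` suffix pattern seen in the command above holds for your version; the helper names are made up for illustration, and not every CUDA version has wheels published, so check the PyTorch install page[2] before relying on the result.

```python
def pytorch_index_url(cuda_version: str) -> str:
    """Build a PyTorch wheel index URL from a CUDA version string like '12.8'.

    Assumption: the index follows the 'cu<major><minor>' naming used in the
    command above. Wheels may not exist for every version.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{int(major)}{int(minor)}"


def pip_install_command(cuda_version: str) -> str:
    """Return the matching pip command as a string (it is not executed here)."""
    url = pytorch_index_url(cuda_version)
    return f"pip install torch torchvision --index-url {url}"


if __name__ == "__main__":
    print(pip_install_command("12.8"))
```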

Skipped FlashAttention since I'm on Windows and I haven't gotten FlashAttention 2 to work there yet (I found some precompiled FA3 files[3] but Qwen3-TTS isn't FA3 compatible yet).

[1]: https://github.com/QwenLM/Qwen3-TTS?tab=readme-ov-file#quick...

[2]: https://pytorch.org/get-started/locally/

[3]: https://windreamer.github.io/flash-attention3-wheels/



It flat didn't work for me on mps. CUDA only until someone patches it.

Demo ran fine, if very slowly, with CPU-only using "--device cpu" for me. It defaults to CUDA though.

Try using mps, I guess. I saw multiple references to code checking whether the device is not mps, so it seems like it should be supported. If not, fall back to CPU.
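
The fallback order discussed above (CUDA, then mps, then CPU) can be sketched like this. It only uses PyTorch's own availability checks and says nothing about whether Qwen3-TTS itself runs on mps, which the comments above disagree on. The function names are hypothetical; only `--device cpu` was confirmed to work in this thread.

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Prefer CUDA, then Apple's mps backend, then plain CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch what is available; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps only exists on newer PyTorch builds, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

If the demo rejects the printed value or crashes on it, passing "--device cpu" is the option reported to work above.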


Kind of a noob, how would I implement this locally? How do I pass it audio to process? I'm assuming it's in the API spec?

Scroll down on the Hugging Face page, there are code examples and also a link to GitHub: https://huggingface.co/Qwen/Qwen3-TTS-12Hz-0.6B-Base

I wanted to try this locally as well, so I asked AI to write a CLI for me: https://github.com/daliusd/qtts

There are some samples. If you have a GPU you might want to fork and improve this; otherwise it's slow, but still usable on CPU.


Dollar milkshake theory

Isn’t this type of software illegal?

If I tried to sell it, I’d be arrested.


But who's gonna arrest the police? Seems like more and more people are testing/stretching the limits of what's actually enforced.

Similarly, hearing about the Epstein files makes me sick:

- deadlines? not met
- limited redactions? full documents redacted
- redactions explained? not at all


I highly doubt that…

Thank you! I'll give it a shot.

If it's delicious and I don't know what it's made from, but I can confirm it's healthy, I'll eat it. Ignorance is bliss sometimes.

How can you confirm that a food is healthy, if you don't know what it's made from?

Tailwind is nice and all, but it’s crazy verbose; I'm still a fan of Bootstrap. In the days of AI and tokens, Tailwind classes and styling chew through tokens. lol


I've only started, but I mostly use Claude Code for building out code that has been done a million times. So it's good at setting up a project to get all the boilerplate crap out of the way.

When you need to build out a specific feature or piece of logic, it can fail hard. And the best is when you have something working, and it fixes something else and deletes the old code that was working, just in a different spot.


I had a nice little app that I would run once in a while; it would take all the Weekly Playlists Spotify built for me, remove duplicates, and make one playlist.

Spotify-built playlists are no longer accessible in the API.

I do not like them now.
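
The merge-and-dedupe step of an app like that is small. Here is a minimal sketch, assuming each playlist has already been fetched as a plain list of track IDs; the IDs and the function name are made up for illustration, and no Spotify API calls are shown.

```python
from typing import Iterable


def merge_playlists(playlists: Iterable[Iterable[str]]) -> list[str]:
    """Combine playlists into one, keeping the first occurrence of each track."""
    seen: set[str] = set()
    merged: list[str] = []
    for playlist in playlists:
        for track_id in playlist:
            if track_id not in seen:
                seen.add(track_id)
                merged.append(track_id)
    return merged


if __name__ == "__main__":
    # Hypothetical track IDs standing in for three weekly playlists.
    weekly = [["a", "b"], ["b", "c"], ["a", "d"]]
    print(merge_playlists(weekly))  # ['a', 'b', 'c', 'd']
```

Tracking seen IDs in a set while appending to a list keeps the original order, which a plain set() conversion would not guarantee.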


Bring back Pebble Steal. I lost my original and got the Pebble Time 2, but it's just not the same.


You can get a Pebble Steal next time you see someone wearing one in public...


eBay has them (used).

