
"I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes." (C) authorjmac



This AI stuff is overhyped, has resulted in the creation of a lot of slop and spam, and is fraught with unresolved ethical issues, but AI is just computers and computers are just automation, which has been used to accelerate art pipelines for decades. It doesn't really have to be all or nothing either: I've made a few AI-generated songs using Udio and Suno, and all of them used my own lyrics, written with no generative AI assistance.

The main problem I have with generative AI tools in an artistic sense is when they lack the ability to convey specificity of intent. Word prompts alone aren't good enough.


I agree, and it's frustrating that there's so much fixation on this "single text prompt to [other thing]" use case in how people are building these things out. I think that drives a lot of the "slop" feel of these things, because the target consumer of the tools isn't someone who wants to engage with an artistic process to create something, which to me is a process of refinement and a feedback loop with one's tools, no matter what those tools are.

I think this might be a good research-paper proof of concept for a model, and the lack of explanation of how it works is disappointing but expected. As a product, I think the target audience for this thing isn't people who want to make art, but people who like the idea of generative AI per se. Maybe it'll go more toward being a tool artists can use in the future, but I don't think that's what gets you funded in this environment, and it seems much harder to make things that work that way. The coolest uses of and tooling for generative image models have been created by the open-source communities around them, and I think the same will be true of audio.


> a process of refinement and a feedback loop with one's tools...

Yes! While technically impressive, these "text prompt to finished song" AI tools currently only solve low-value problems for already over-saturated markets. I just don't see a good path to a real business from "finished song" as the use case.

* With Spotify, SoundCloud, etc., music consumers already have access to more new, human-created songs than they can possibly listen to - all at historically low cost.

* Buyers of custom-created music, such as video makers and game studios, already have more stock music library choices and custom creation options (from Fiverr etc.) than ever - also at historically low costs.

These are already low-value, commoditized markets and, once the novelty wears off, can't generate VC-level returns. And, no, I don't think AI is going to take a meaningful part of the high-end music market from the likes of Taylor Swift. It's not that I doubt AI will eventually make music that good - it's that high-earning pop stars like Taylor Swift, Beyonce, etc are much more than their songs. They are global brand businesses that generate more revenue from touring, merch and product tie-ins than the music itself.

However, there is a potentially profitable market for AI music tools that no one's targeting yet. It's a smaller market but it's accessible, scalable and immediately viable for even a beta-level, "research-to-product" solution. Don't generate finished songs. Instead, make an interactive tool which collaborates with human music makers in a much more granular way by generating the elements and components of music (called stems) as well as the underlying MIDI data. There's a whole industry selling human-created element libraries consisting of stems, loops, backing tracks, samples and style-based construction kits. These are used in a lot of the human-created music we hear. But they aren't interactive, adaptive or collaborative.

AI can provide a superior solution right now and it doesn't even need to be 'top human' quality to be useful. Pop stars like Taylor Swift can afford to hire the best proven human producers, studio musicians and mixing engineers to collaborate with, but there's a significant market of people, from students and hobbyists to indie producers and semi-pro musicians, who can't afford human collaborators.

To me this looks like a pretty rare thing in AI: A classic "Two Pizza"-type startup opportunity where a modest seed round can get to product-market fit and real cash flow. You also won't have to out-market Taylor Swift, outspend FAANG or target fickle consumers.

I'm just a long-time music-making hobbyist and I consistently spend several hundred dollars a year buying such libraries, stems, loops and samples. It's far more than I pay for all my subscriptions to 'finished music' combined. And I have no aspirations to make money with my music. Hell, no one outside family and a few friends ever even hears it. Making music is just an extremely enjoyable creative activity I like to spend time (and money) on. But, as a potential customer, I have no use for a tool that generates finished songs. However, an AI that takes text prompts along with some MIDI chords and musical phrases I provide, and then generates a variety of suggestions in the form of separate stem tracks with MIDI which I can further mix and modify, would be an 'instant buy' for me. It doesn't need to be as good as a human collaborator because it's better in other ways: always available, non-judgemental, infinitely patient and yet has no opinions or emotional needs of its own.



Image gen is streets ahead of music in terms of control, as long as you stick to the FOSS stuff, as DALL-E is too limited. I’m only an observer for now and haven’t actually used it much, but both Stable Diffusion and SDXL have ControlNet and a bunch of other things that let you, for example, draw a stick man in a specific pose and have the AI generate a realistic man in that pose. Or edit one specific part of the generation and continue iterating from there.

The day we get a similar level of control with AI music will be a dream come true for me. We really need stems or at least MIDI files for these tools to be more than just soulless jingle generators imo.


I've been using Krita with the Stable Diffusion plugin, it's pretty amazing to use at times. I often read critics say things like 'you can't do layers with generative AI' and, uh, nuh? Though you can't, say, generate a shadow with adjustable alpha transparency, this doesn't seem like something that's impossible to do with the technology eventually. To think the tools won't improve would've been like looking at MacPaint and saying that digital art will never be a thing because it's always going to be low resolution and monochrome.

What I'd love is Suno/Udio as a VST plugin. Being able to supply MIDI or audio samples to pull melodies from, to generate from arbitrary audio on a timeline.


To that end, look up LayerDiffusion. It works amazingly well.


True, but for me, as an indie game developer with neither musical talent nor the financial resources to pay an artist for unique music, this is extremely valuable.

- Is it as good as N targeted music tracks that fit together to match my game? No.

- Is it better than something I can create myself? By far.

- Is it better than a random few open-source or cheap tracks that you can buy on any random storefront? Sometimes.

So at the very least it has a foot in the door as far as I'm concerned.


An extremely valuable and helpful tool for making good things.


> I want AI to do my laundry and dishes

Machines for those tasks are already commonplace. Is loading/unloading them really that much effort?


Despite these machines, a person can easily spend an hour or two every day on common household tasks like cleaning the kitchen and doing laundry. That’s time that many people would love to get back.


This reminds me of the episode in the game Detroit: Become Human where an android was making a drawing.


That quote has been attributed to many different people in similar forms over the last year or two.



