This is primarily architecturally interesting in my opinion. Output songs have unusual noticeable artifacts, and I would guess they become more noticeable the more you listen.
That said, wow. An end to end FAST architecture that can infer a 4.5 minute song in 10 seconds is a compelling thing. I didn’t see if we got open weights, but my guess is that this is not crazy challenging to train, and some v2/v3 versions of this are likely to be good-to-very-good.
The huge missing issue is direction. Songs are way more than just a 10 second style reference and lyrics. Even the most generic pop song from the 90s had recognizable choruses, some repeated bars, and some ebb and flow to the song that connected to the lyrics to make it interesting to the human ear. Right now the generated songs are, as you noted, somewhat glitchy lyrics over a bland backing track that just sort of goes at one speed and note for the whole of the lyrics.
"Electronic music aren't real songs there's no real instruments involved". Let's be a bit creative with these tools. Sure the pure output isn't always plesant or listenable but there's probably an interesting genre to carve out here
If I am to retain any interest as an amateur music writer without pro audio engineering skills and equipment, but with a day job, I want tools that help me bring MY vision to reality. That means multi-tracking, the ability to hum or score a melody and have it transfer to a musical instrument, the ability to enter existing tracks, provide a temporal segment for diffusion, and ask it to 'generate a counterpoint to the melody with strings', etc. The most exciting possibility here is enabling talented writers with day jobs, not one-click songwriting.
As an amateur musician, I'd like tools that help me be more productive musically - those that complement my skills (whatever they may be).
All the things you mentioned above, namely, ability to score a melody via a simple hum, transfer to various instruments, generate proper responses to calls, generate melodies within a framework, etc., all these would be super valuable to me.
I'm an OK guitar + bass + keyboard player, I'd LOVE to have an AI assistant that accompanies along. That would make my own jammin' so much richer.
I don't think we have seen the end of AI-driven tools in music-tech yet. I'm cautiously hopeful.
I definitely see this happening. Music generation has lagged behind image generation but is following more or less the same path. Early image generation models were completely unconditional; all you could do was sample an image. Then coarse conditioning methods such as text prompts and depth images came along; then additional tooling to tune images in a more fine-grained way.
That said, there is a difference to images in that music also has a "symbolic" level to it that is closer to text than images [1]. There's other work out there that uses LLM-type tools for direct melody generation (no audio). And of course, there's lyrics. I do expect commercial tools to start integrating all these capabilities gradually, it's just a matter of time.
[1] I guess there's also vector images (like SVG) - I've seen work in generating those as well, though it's less mature than directly generating pixels.
The request is valid; you just need the right tools for the job.
Story Jam lets you design chord progressions without needing to know about music theory, instead offering intuitive terms like "lightness", "darkness", "drifting" and "roaming". They mean about what you think they mean.
Yeah, I'd think that it will take commoditized generation tools that existing or new composition/multi-tracking tools could incorporate, e.g. an FL Studio plugin.
The people writing the one-shot tools are living a pipe dream and are riding the hype wave. One-shot AI music will have a short amount of interest based on its novelty, but the very next generation of humans will revolt against it as a cringe decision of the old guard. From there it might finally be applied more realistically as an aid to human expression instead of a replacement.
Goodness, the music that is produced has almost no discernible time signature. I don't know if my brain is faulty, but I find it extremely annoying to listen to.
Cool. Obviously needs some work. Lots of artifacts. Something to build on though.
Lots of sour grapes comments from folks. Too bad. Not what I expect out of Hacker News. Glad people are pushing the technological envelope and exploring this space despite the strong negative emotions.
I get it, but the anguish of people who were paid dramatically less and treated dramatically worse than software developers whose livelihoods are being deliberately harpooned by the software business is not "sour grapes." I'd expect the software culture to be tolerant of the upper class being indifferent to that anguish, but I see people around here all but dancing on their graves with no real push back. I think y'all generally do a good job with this place but frankly, the utter lack of consideration for the people most affected while being quick to make sure they don't get too touchy about it says everything there is to know about SV culture: all the talk of ethics and fairness gets thrown right out the window the second it starts to conflict with the bottom line. Fortunately American culture as a whole seems to be following suit so at least it won't seem weird. I don't think I can even passively participate in this culture anymore. I bid you well.
I'm not disagreeing with you! We just need people to make this sort of point thoughtfully, in keeping with the site guidelines. I know that's not always easy when strong feelings are present. (It happens to me too.)
I also don't disagree with you about people's lack of consideration for others, but I think the scope is much too narrow to talk about that as a HN thing or a SV thing or even a national thing. It's present way beyond those levels and may just be a fact of human nature (though fortunately not the only fact).
I'm friends with plenty of starving artists or folks working day jobs and forced to keep their music on the side. If you're sad about not making money in music then there are dozens of other things you can blame besides AI that doesn't actually work well enough to take the job you currently don't have because of reasons not controlled by the evil software nerds.
I’m talking about problems a lot more broad than music, hobby artists are a completely different entity than commercial artists and conflating them is targeted willful ignorance. And frankly, developers comprise the most coddled, overprivileged, out-of-touch professional sector south of the c-suite in any industry: your musings about income for anyone outside of the software business are meaningless.
Enjoy your mass layoffs because they aren’t stopping in my sector. Oh, are you one of the special ones too talented to be affected? That’s good news because all developers are according to them. Good luck.
It's fine that you don't like it. It's odd that it's such a popular opinion on a site frequented by "hackers" and "disruptors". Why is this opinion more prevalent on posts for synthetic music generation and not other synthetic output? Why does synthetic music generation stop you from making music the old way? Are you upset that a business model that is already dead will be pushed further into irrelevance?
This is why we never should have invented the phonograph. People who want to listen to music can just buy a record, making it literally impossible for them to perform an activity humans ENJOY doing. Without it everyone would surely be making all their own music, and nothing valuable would be lost
The ability to record has led to the greatest expansion in musical artistry in human history.
You don’t think peasants were listening to Bach, do you? Only the extraordinarily wealthy could afford to have music as anything like an everyday thing.
And with more affordable and easier-to-learn tools, the creation of music will be similarly made much more accessible?
DAWs and virtual instruments running on a regular laptop were one step; generative AI models will be another?
You're debating two different things, two different experiences
Creation is a human activity, charged with emotions, efforts, which are their own rewards, as much as the end-product, which is invested of this human (sometimes collective and not instant) effort and intention and creative loopbacks. Let's call that some kind of history (because the process did happen).
Generation short-circuits that entirely, as it happens at non-human speeds, and non-human scales. It's something _else_ entirely. You do get an end-product. It may be fun and useful for some; it sometimes is. However, you don't get the process, the collaboration and the inner transformation it comes with.
Adding: with two different end-products, the issue is then how they are perceived, received, appreciated and valued by those not "in the know" of how they were made. And that is both an artistic, aesthetic and economic problem. Generating soulless shit that isn't invested with a human sentiment miseducates people and destroys taste.
I agree with your overall description of creation. But I do not agree that generative models are something else entirely. They are tools, and while their affordances do influence what people do with it, in the end the responsibility is on the creator. You can make "soulless shit" or "thoughtful commentary" or anything else you put your mind to, by using these tools in combination with all the existing ones.
Models that are oriented around one-shot, text-only direction are pretty limiting in creative flow. This will hopefully continue to improve.
To make what I consider a halfway decent song with these current easiest-to-use services (like Suno and Udio) takes a few hours in my experience.
To get there one has to work with the text, the song structure, find a decent style, and then do corrections on sections where the model goes off track.
To make something that is closer to "good", I would go and re-record all the lead vocals myself, and then mix this in a DAW.
The tools and knowledge for making music are already unbelievably accessible. Anyone with an internet connection and a decent computer can read about music theory, learn to use a DAW, and get some basic virtual instruments. The same goes for producing art, which doesn't even require anything digital.
This does not augment the music making process in any way, it simply replaces it with what might as well be a gacha game. There's no low-level experimentation, no knowledge acquisition, no growth, and you can't even truly say you made whatever comes out.
It's not a tool for music creators, it's a tool for people who want slop that's "good enough".
Sure, with several hundred hours to spare one can make some songs in a DAW. Now one can make something as good/bad in maybe 1/10x the time. Or, given the same time investment, one can possibly make something better!
The goal of AI automating labor should be to give us more leisure time to pursue hobbies, not to fill our limited leisure time with low quality substitutes for those hobbies.
Making an activity in which the primary limiting factors for most people are the time, knowledge, and effort required (as opposed to expensive tools) into an effortless slot machine pull is enfeebling to human creativity and agency. Who will spend the hours of making bad music to get to the point where they become good if they can just rely on something else to generate music that's "good enough"?
There's something to be said about all this which is related to AI generated images that I rarely see brought up: people with specific skills play roles within groups, so AI making their hobby that they dedicated so much time to more easily accessible makes them lose social value, which might make them quit altogether.
The common response that "people should make art because they love it, not for attention" is a prescriptive statement that supposes there are more or less "pure" forms of performing an activity and also ignores that art is a form of communication.
"low quality substitute" and "effortless" are value judgements on your behalf. Many made similar judgements about DAWs and VSTs. And that is your right. But not everyone sees it in the same way - for some generative models are opening up a new world of possibilities.
I agree that the slot machine pull of current models is tedious and boring. I look forward to models/systems which better facilitate more creative control, directed exploration and iterative refinement.
Yes, there are a TON of free tools and endless instruction on using them. If you move your budget up to making one-time payments for things that cost less than one month using a subscription service, you get an astonishing breadth of new options. Beyond that, so many of the more expensive music making tools are one-time payments rather than subscription services. Buy Ableton once? You own it. You can get the latest version at a discount, but there's absolutely nothing stopping you from using the version you bought, in perpetuity.
Lots of common people did listen to Bach, because he wrote many works for church organ. Church attendance was almost universal, and even small churches had (small) pipe organs.
His work was not commonly performed in his lifetime, and I think you're rather proving my own point? Yes, they could perhaps occasionally listen to Bach, if the organist at their church was aware of him (most would not have been, not until hundreds of years later), had the music, were willing to perform it, and you happened to be in attendance when they did. That's a lot of chained ands.
There are like 6 core activities that bind humans together: shared creation of food, myth and music; co habitation, protection, child rearing.
We've done these things ourselves for hundreds of thousands of years. As we are increasingly convinced to buy them for convenience we lose the very things that make us know our connectedness.
So ya, there are real problems caused by the convenience of technology
People will still enjoy making music. Musicians will make music quite regardless of whether anyone is listening or whether there’s recordings or AI available.
I don't think there's any stopping it, unfortunately. The internet is too good at "optimising" content. The future is Mr Beast, Instagram hotties and 6 pack guys, tiktok morons and onlyfans. Be happy, the market has spoken.
People that never considered the value of artistic process until it was the topic du jour unilaterally decided that it was inefficient, oppressive, complex, frivolous, and unfairly inaccessible to those that hadn't put any sustained effort into developing theirs. If you didn't understand what they don't, you'd realize that companies spending billions of dollars to create tools that make cheap simulacra of artists' work to sell them at a loss to crush them in their own markets was merely the natural progression of artistic praxis. Despite it being economically unsustainable and clearly only cheap until it craters the value of artistic skill, these tools have democratized creativity. Instead of creation only being available to those with the interest and willingness to practice and develop their artistic sense, process, and skill, they're now broadly available to anyone willing to pay money for a subscription service that will obviously soon be a hell of a lot more expensive, or shell out a few thousand dollars for a top-tier video card that you almost certainly already have in your gaming rig, anyway. This is silicon valley progress and if you don't like it, you're a communist.
Totally with you. But it's the trend we get to re-balance in a good way:
> People that never considered the value of artistic process until it was the topic du jour unilaterally decided that it was inefficient, oppressive, complex, frivolous, and unfairly inaccessible to those that hadn't put any sustained effort into developing theirs.
This is eerily reminiscent of what's happening inside the USA government & administration today...
It's incredibly elitist to gatekeep people having their plans, actions, opinions, and philosophical ideas taken seriously just because they haven't trudged through the onerous process of considering what humanity has already learned about those things. Do these people expect everybody that wants to profit must try to predict the damage that their actions could cause among people that will obviously be affected? Some people just don't like ethics that much, and expecting them to be beholden to their boundaries is pretty old fashioned.
For sure! After all, what could be more democratic than a monthly subscription that could get snatched away at any moment - and clearly there's nothing more creative than pressing a button and waiting for 20 seconds!
I like the part where you confuse being sarcastic with being intelligent. A language model somewhere is taking notes.
> People that never considered the value of artistic process
One certainly learns of crazy things on HackerNews. Apparently people have never considered the value of artistic process, and not only that, but you also happen to know that exactly.
> the topic du jour unilaterally decided
You're literally in this thread disagreeing.
> it was inefficient, oppressive, complex, frivolous, and unfairly inaccessible
Very interesting claims, too bad they were only stated in your imagination. That being said, your imagination I think is surprisingly close to my opinions! Let's discuss each point:
- it is very time-intensive to produce creative works of any kind, and indeed to perform any kind of mental work at all
- it does get pretty complex too, and because of this, some mental efforts are even shot down for being too frivolous (such as that bit of automation that is not worth making because it would never pay itself off)
- oppressive is a bit of an odd one, but if I think hard enough, I guess I can see how having to use the output of e.g. my work (software) can be oppressive
- same for unfairly inaccessible - lately there's been a trend where various services would only be available online, and the only contact you'd get is a self-service form or two. Maaaybe you'd get an AI chatbot to chat with. Certainly, to those with minimal to no tech literacy, this will be inaccessible and it will feel unfair.
> was merely the natural progression of artistic praxis
If only there was a way to disagree with this without being a dickhead!
> these tools have democratized creativity
How does one democratize an innate property of people? Surely you mean that they have democratized the production of creative works rather, and even of those only the less high-art ones, which I'm sure you never fail to point out when shown one?
> they're now broadly available to anyone willing to pay money for a subscription service that will obviously soon be a hell of a lot more expensive, or shell out a few thousand dollars for a top-tier video card that you almost certainly already have in your gaming rig, anyway.
And what happens after that? Artists will be like "oh gee, well I'm not doing this again!"?
> This is silicon valley progress
And also Hangzhou and Shenzhen, China.
> and if you don't like it, you're a communist
Are you? You seem to be more of a raging idiot than anything to me at least.
Related: why are programmers racing to make the perfect AI coding tool? It's an activity many programmers enjoy, and more importantly, if the pace continues, they will likely be automating themselves (or at least a large portion of programmers globally) out of a job.
Granted, many people are benefiting from these tools (myself included) but at some point a lot of us are going to have to find a new job (assuming the progression continues unabated), and I'm not sure what new jobs are going to exist when LLM coders replace many or most of us.
Not everyone enjoys composing music, and for a large group of people paying an artist is not an option. There's a lot to criticize about current AI tech, but saying of all things that this has no net benefit seems like the wrong thing to call out, and incredibly short-sighted for HN.
You're not composing music with an AI generator either: you're pushing a button with a few, limited instructions, and expect something that rewards your perception of what makes good music for your intention.
If you don't enjoy composing music, just don't do it, and give it to someone who does, and has the experience/knowledge/culture/practice/gut to do it.
> If you don't enjoy composing music, just don't do it
This supposes that the music is the end goal, and the very point of my comment is that it doesn't always have to be, and in those cases "just don't do it" also means not doing whatever comes after.
Just as you state below, this doesn't replace creating music for the creation's sake. I don't believe it will, or should. It merely replaces having nothing at all, or having the 100,000th video with the same upbeat stock sound.
What an incredibly elitist, smug attitude. You're basically saying people only have the right to hear the music that professionals think they should hear.
That's not smug at all. That's not what I'm saying either.
It's just that.. you can't master something you don't practice and understand. It's true in every single thing in life you do, sports, literature, maths, music, cuisine, kindness, etc.
If you don't like to compose music, why suffer this and even submit to the randomness of some computer program, rather than giving the opportunity to another fellow human to open your ears and your mind to what they appreciate doing?
You can generate your music if you like. It just cannot compare to something a human, even a beginner, really did on her own and invested with her desire, time, practice, and research.
It's not a matter of being professional or not. The best musicians I know are not professionals, they all have a day job.
For every famous star for one given instrument, you have 10s of undiscovered/local better musicians that just are carpenters, cooks, painters, drivers, factory workers.
I don't understand the "give it [the task?] to someone who does" part. Obtaining a hobbyist composer who is available at short notice and obeys instructions for free is not usually an option. Maybe there's a website for this, but it would have to be humming with idle composers in order to offer quick and satisfactory results.
I think "stop not enjoying it" is a better line to take. Like with AI illustrations (where I'd much rather see a blog author's crappy biro drawings instead), terrible amateur efforts with some online 808 emulator or whatever would be more entertaining and interesting than AI output.
"Stop not enjoying it" is indeed a way better take!
"Giving it someone who does" is also an opportunity to socialise and grow a mutual understanding of said music desire. That's sometimes even how collaborations start. But that's not on short notice...
Perhaps generators could be also seen as some kind of introductory instruments to wet the appetite of becoming musicians?
So putting paid humans out of business is your position then? Please explain why you believe in the long sighted view AI reducing already poverty level wages to zero is beneficial.
Do you not see how your argument could be applied to steam engines putting human laborers out of work? Or computers putting (human) calculators out of work? Do you think inventing the steam machine or computers was a mistake too?
What new jobs are going to be created when AI does everything humans currently do? In the past when new technologies were created, new jobs and industries were also created, but with AI, jobs are already being lost but I don't see many new jobs being created to replace them, other than "people who know how to talk to an AI to get what they want" and I have a feeling this will be a rather minuscule number of jobs.
Steam engines and computers solve problems to improve human life. They more efficiently perform tasks to free up time for humans to do other things they would rather be doing.
These API composers perform a task many humans want to do. And there are roughly zero consumers of music saying "you know what's missing from the music market? music made with no human input."
This strictly serves capital. The goal is to destroy more artists' livelihoods to marginally increase the wealth of already wealthy people.
If you're trying to maximize employment, composers aren't the first, second, or tenth place to go looking. If you're trying to say artists will bleed income, they already have for decades, and will continue to. The ones that make a living out of it mostly get their income from live performances and merch, and maybe adtech on social media platforms.
By the same logic synthesizers shouldn't have been invented that allowed people to make advanced sounds without tediously learning an instrument first, consumers should remain priced out of microphones and editing software, etc.
Like I said, I am not trying to feign ignorance on the drawbacks of the tech, which are very real and far from negligible. I am not a tech bro AI maximalist. I just do believe that hyperbole will not put the djinn back into the bottle, and pretending like there isn't a real market between nothing and paying or being a composer isn't adding anything to the conversation.
In this particular case it is totally black and white.
Prove me wrong.
Tell me one example how music gen in any way benefits anybody to the level that is worth putting out of business the last few artists that make ends meet?
The difference between today and the hypothetical case of not one artist making ends meet from their music is what, 0.1%? 0.01%?
We would be better off if the other 99.9% didn't have to worry about making ends meet, than if we do whatever it takes to keep the status quo of the 0.1% intact. That does not only go for artists.
Yes, we've taken a wrong turn into a hellish dystopia.
We've created machines to replace humans doing things humans enjoy doing. Leaving the drudgery machines were supposed to eliminate to be done by humans.
>You are automating an activity humans ENJOY doing.
There's at least an order of magnitude more people who enjoy making music than there are people with the actual skill/talent to make music. Music generation AI is an absolute blessing to the untalented among us who'd love to make a song in a certain style or with certain lyrics but lack the time, talent or ability to do it ourselves.
But don't mistake one thing for the other: how is it different than, say, being Emperor Joseph II asking Mozart in Vienna to write an opera for him?
Mozart wrote the music, not Joseph.
Similarly, you can hike across France, from South to Britain for several days. Or you can take the train. Or a car, alone, or with a driver. Or a plane, in the pilot or the passenger seat.
You'll get in the same place in the end. The experience will be totally, fundamentally different for you, as well as for others.
Why make this? Because people like music? I want to use it to make my own music. According to you, I can't because it deprives some /real/ musician of making money? What an insane argument. Ban singing unless you're in a choir?
Some humans honestly enjoy automating stuff. We wouldn't want to be taking away something that humans enjoy, would we?
I'm a musician myself, but I sadly suspect that most music made today "benefits humanity" very little... Is music making always a net positive? If nothing else, these tools will allow more music to be made.
Yes because the act of making music, even not very good music, is what has value. Music generated without human input has no discernible value.
> If nothing else, these tools will allow more music to be made.
By machines that, as far as we can tell, take no enjoyment from making it. And eliminates any possibility of emotional connection between the artist and the listener. Which is the entire source of music's value.
There's already a huge library of royalty free music available and even just AIing some music up doesn't fully protect you from strikes if it hallucinates something close enough to an existing song.
Doesn't seem like there's much you could actually tweak beyond the short style cue and lyrics. To really be customizable you'd need a method to tweak the generation to inform the tone, inflection, and flow of the music at any given point to satisfy what you wanted to pair it with.
Songs have lulls and swells to go with the tone and emotion you're trying to create/communicate; these are just strings of lyrics over boring, barely connecting backing tracks.
Rap producers are running out of samples to use for new tracks. Everything vintage that can be used has been used, sampling real estate is pretty much dried up. Why would you take away a project that brings a net benefit to humanity?
None of this is music. It is noise that sounds like music. Pretty analogous to how AI slop is not information, but just words that are arranged to look like information.
Business doesn’t hate creatives, and is not specifically targeting creatives to automate them away. Any job that can be done as good for a lower price or better for the same price is going to be a target.
Let’s follow the AI and automation craze to its eventual conclusion - automations everywhere, humans are either employed in automation industry, or are unemployed at a massive scale.
Stable jobs are replaced by ever-optimized gig economy for some, and chronic poverty for others. For there to even be economy - the massive underemployed population subsists on government welfare.
Cynic in me thinks that all of the wealth generated by enormous productivity gains resulting from automation will not find its way towards population displaced by it. Those cashiers, toll booth, and warehouse workers did not find themselves in much more lucrative careers - I don’t see why it will be any different for truck and cab drivers who will be joining them in the near future.
If you see a future where these people who suddenly found all this extra leisure time on their hands and no income - are somehow blossoming in creative directions and realizing their own potential - I’d like to have it painted for me, as it all looks pretty bleak to me. Just not quite sure of the timeline.
Best I can come up with is an emergence of some kind of counter-cultural protest market where people buy and sell “made by humans” products, and are continuously attacked by various regulations originating from mega corporations who captured the government.
> Cynic in me thinks that all of the wealth generated by enormous productivity gains resulting from automation will not find its way towards population displaced by it.
Empirically, that's not true.
Unemployment was at an all-time low after most of those jobs were eliminated, and wages after adjusting for inflation continued to rise in real terms.
I am inclined to doubt the sources of these empirical observations. Statistics are funny like that, “average patient temperature in the hospital” effect and frequent inability to correctly attribute confounding factors outside of observed window.
Equally bad is anecdotal evidence, but I’ll drop some anyway. For a while now I have been observing a crisis that’s, admittedly subjectively, easy to see - but is somehow absent in those empirical sources citing economic accomplishments. Indirect evidence of what I am talking about is the crushing defeat of democrats/establishment in the last election, following, among other reasons, quite a backlash for boasting about said accomplishments.
But rather than picking issue with one of my points - I still would like someone to describe the counterpoint to my dystopian expectations - where, for example, would all those professional drivers I mentioned earlier go?
Ps. Oh speaking of statistics - remember Greenspan’s “there’s no real estate bubble, there’s froth in individual markets” right before 2008 financial crisis? It be funny like that, sometimes much derided common sense is all you need /shrug.
> where, for example, would all those professional drivers I mentioned earlier go?
Wherever all the cashiers, toll booth operators, and farmers went after automation took their jobs.
New jobs are created, the people displaced have to migrate to them.
Is it fun for them? No.
Is it how the world works? Yes.
Technology thus far has a VERY VERY long and established role of creating more jobs than it eliminates.
See >95% of the population being employed in agriculture for tens of thousands of years and being reduced to about 5% over the course of 100 years (and civilization being FAR FAR better off for it).
Will that trend one day end? Probably.
Will it be doomsday for the plebs? Who knows.
Is it happening within a timeframe worth worrying about? Unlikely.
That's right, they don't just hate creatives. They'll go after anyone.
I wonder what the hyper-capitalist's end game looks like. One giant company that covers everything with one man sitting at a dashboard, tweaking parameters? Is that one man even necessary?
I wonder what our plans are for when "the economy" prefers to do its thing without us. Writing poems all day? What capitalist instrument will provide "money" for us to spend in this giant machine?
I don't think it's at all extremist to look at that picture, realize it will have stopped making sense for the majority of the people on the planet well before it gets to that point, and conclude that some type of major global revolution will prevent it from happening.
Yes, this has always been the case. This is why capital holders are actively hostile to labor organizing and tend to back fascism when liberalism falls into crisis.
They don't hate at all. They are just maximising profit (which they have an obligation to do). If they didn't replace you with more efficient things, they would be outcompeted and die.
So, feel free to criticise capitalism and how inhumane it is, but don't anthropomorphise it by ascribing human emotions to the system.
Whatever can be automated isn't "true" creativity. These models merely generate average music, but the outputs of creative musicians always stand out.
If I were a business I'd "hate" creatives too, and I'd also want to automate them away. The costs of producing (truly) creative works are utterly bonkers, and so are the risks associated.
That's why corporations that have made creative products have traditionally never gone anywhere. They all just went out of business. And all the artists got rich.
It’s just combining sample WAV files without human coordination; talk about a lame-ass achievement. It’s already easy enough to set BPM and load in files in Ableton and warp them into unison. From what I heard, this is basically just that with “HOORAY FOR AI” slathered as a veneer on top.
If you think I’m being harsh, I have my reasons as a professional musician to critique these things in an unflattering light, because they are my competition. Thankfully, actually “generated” AI music is trash. Copyright is problematic in the US, I admit, but tech bros using copyrighted material to train programs to put us out of business, without paying a penny (which even Spotify doesn’t pay per stream)? Yeah, I’ll have some disdain about this scenario, and I feel it’s justified.
Sorry no. Here on HN, your having a vested interest in some market makes your opinion entirely invalid. That is, unless you're interested in one of the correct markets such as software or AI services.
One thing that strikes me about almost every AI-generated track (from academic or commercial generators) is that even if it's often "competent", in that it has reasonable melodies, chord progressions, etc., it is strikingly average. Mediocre, taking the term literally. In a way that also highlights cliches and crutches that are common in human-made music. Somewhat reminiscent of GPT text that drones on and on in a grammatically correct way but conveys little of interest. This is of course not unexpected, given how these models are trained. I wonder if this will have the effect of pushing (human) musicians to be more experimental, moving away from the conventions that are now just a click away for anyone.
Yeah-- in a professional workflow, at best, these tools are for getting ideas rather than creating output that will be used directly. Lots of folks use them for actual creation because they're just so enamored with the ability to create vaguely technically competent output from text, but they're all pretty much a beeline to mediocre, and overcoming mediocrity is absolutely the most difficult part of working with AI output. The same is true with text, as you mentioned, and image generators. As Charles Eames said, "The details are not the details. They make the design." Well, these tools suck with details, and details convey character, perspective, message, meaning, etc. Surely the tooling will improve on this in years to come, but it certainly hasn't yet.