I really wish this trend of prompting gen AI models with text would stop. It's really meaningless. Musicians need gen AI they can prompt with a melody on their keyboard. Or a bit of whistling into the microphone. Or a beat they can tap on the table. That is what allows humans to unleash their creativity. Not AI generating random bits that fit a distribution of training data. The English language is not the right input for anything except information retrieval tasks.
Agreed! Those will be much more fun and we plan to support them. However, right now we're focused on making the base model slightly better; then we can easily add all of those controls (à la ControlNet with Stable Diffusion).
But this is not easy; it's the real challenge here, as there are lots of text-to-audio models out there. It is far from solved for Stable Diffusion as well, and ControlNet is pretty bad. Just try taking a photo of an empty room and asking an image model to add furniture. Or to change a wall colour. Or to restyle an existing photo in the style of another, and so on. We are very far from being able to truly control the output generated by AI models, which is something a DAW excels at. I'd start with an AI-powered DAW rather than text-to-audio and try to add controls to it. It's like Cursor vs Lovable, if you get my drift.
> Not AI generating random bits that fit a distribution of training data
How is that specific to text prompting? If you tap your fingers to a model and it generates a song from your tapping, it's still just fitting the training data as you say.
I may be an outlier, but I don't get it: I much prefer the visual interface of Google Flights, which gives me all the options. I'd hate to have to explain to a real PA all my preferences for each and every journey (every flight has a very different set of trade-offs). It's like asking a realtor to show me the 3 houses she thinks would be best for me. Thank God for technology that allows us to see all the options.
The real value is in the actual booking aspect: fill in all the forms on the airline's website and check out for me!
It seems to me that we've stumbled upon this method of GPU-heavy matrix multiplications in deep neural nets, and have only scratched the surface of alternative methods that are actually optimized for current CPU architectures, such as Tsetlin Machines, Hyperdimensional Vectors, etc.
Google's AdWords business is worth about 10X that of OpenAI (~$1.5T) and does $250B of revenue, a P/S of about 6X. If everyone stops googling and starts talking to ChatGPT instead, and GPU costs continue to fall such that an ad-supported ChatGPT becomes feasible, it's not unreasonable to assume that OpenAI can grow revenues by 100X from here.
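The arithmetic in that comment can be sanity-checked with a quick back-of-the-envelope script (all figures below are the comment's own stated assumptions, not reported financials):

```python
# Sanity check of the comment's numbers (assumptions, not reported facts).
google_ads_value = 1.5e12      # ~$1.5T attributed to the ads business
google_ads_revenue = 250e9     # ~$250B annual revenue
price_to_sales = google_ads_value / google_ads_revenue
print(price_to_sales)          # -> 6.0, matching the "P/S of about 6X" claim

# "worth about 10X that of OpenAI" implies an OpenAI valuation of ~$150B.
openai_value = google_ads_value / 10
# At the same 6X P/S multiple, that valuation would imply ~$25B of revenue.
implied_openai_revenue = openai_value / price_to_sales
print(implied_openai_revenue / 1e9)  # -> 25.0 (billions of dollars)
```

This only checks internal consistency of the stated multiples; whether the 100X growth assumption is reasonable is a separate question.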
Which assumes that Google will stand still, instead of cannibalizing its own business model.
The numbers look extremely bad if you're expecting OpenAI to take a significant chunk of Google's business. OpenAI's revenue is a rounding error compared to the numbers search does.
"GPU costs continue to fall" is a laughable prediction. They need more powerful GPUs, not less, as time goes by. These companies need monster GPUs in monster clusters that consume monster amounts of energy. OpenAI is a bankrupt company as of now; their valuation means nothing as long as they keep losing money the way they do.
Google and Bing have been crawling the web for decades. Perplexity's crawler is just for marketing purposes; there's no way they have even 1% of Google's index. So yeah, in reality they just query the Bing API.
Forget bytes, go for bits: a vocab of size 2. At a theoretical level, all of AI comes down to a classifier that can predict the next bit given a string of bits. Check out Tsetlin Machines. At some point we will be doing it in hardware.
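To make the "predict the next bit given a string of bits" framing concrete, here is a minimal illustrative sketch: a plain frequency counter over fixed-length bit contexts (not a Tsetlin Machine, just the simplest possible next-bit classifier):

```python
# Illustrative sketch only: a tiny next-bit predictor that counts how
# often each k-bit context is followed by a 1, then predicts the
# majority outcome for that context.
from collections import defaultdict

def train(bits: str, k: int = 3) -> dict:
    """Count next-bit outcomes for every k-bit context in `bits`."""
    counts = defaultdict(lambda: [0, 0])  # context -> [zeros_seen, ones_seen]
    for i in range(len(bits) - k):
        context, nxt = bits[i:i + k], bits[i + k]
        counts[context][int(nxt)] += 1
    return counts

def predict(counts: dict, context: str) -> str:
    """Predict the more frequent next bit for `context` (default '0')."""
    c0, c1 = counts.get(context, [1, 0])
    return "1" if c1 > c0 else "0"

# A repeating "011" pattern is perfectly predictable from 3 bits of context.
model = train("011011011011011011", k=3)
print(predict(model, "110"))  # -> "1": "110" is always followed by "1"
```

Anything more interesting (a Tsetlin Machine, a transformer) is, at this level of abstraction, a more expressive replacement for the counting table.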
My worry is that going down the rabbit hole of what counts as plagiarism and copyright infringement is not productive. Humans are inherently rip-off engines by this definition: everything we create is some remixed version of acquired knowledge created by someone else. Where do you draw the line? How much novelty must there be? Can you police this remixing at scale?
Tough questions, when a machine can create novels in seconds that are as good as novels humans spend years writing. The value of knowledge is about to plummet.
> Where do you draw the line? How much novelty must there be? ...
> Tough questions, when a machine can create novels in seconds that are as good as novels humans spend years writing. The value of knowledge is about to plummet.
You draw the line in a different way, different regimes for humans and machines: a friendly one for human creativity, a prejudicial one for machine "creativity". Sort of like how it's murder to physically crush and dispose of one of your human employees, but it's fine to crush and dispose of an old computer or car.
It's probably a pipe dream, because our society is cruel and indifferent, but it's the way it should be if you actually care more about people than systems.
Maybe we should start talking in terms of scale, then. Yes, we humans are inherent rip-off machines, and melding that with our conscious experience is where creativity is born. But we aren't capable of ripping off at industrial scale: we get inspired by other work and by a few people we look up to, and over time we develop our sense of taste and use our creativity to transform all these experiences into something new.
An industrial rip-off machine ingesting all of creation to spit out something else is not something we've seen or dealt with before; if we ignore the scale aspect of it, we are very ill-equipped to deal with the consequences.
Scale always matters. A pretty good example is social media: we've always had the village cranks, the conspiracy theorists, etc., but given the scale at which social media amplifies them, we are now dealing with issues never seen before.
Humans have the capability to draw upon their influences to re-interpret and innovate in ways that lead to a new, unique interpretation, moving standards forward. AI always mimics, nothing more.
Eh, the rabbit hole already leads to you being in the wrong when copying other humans. Why would it be any different if you as a human get an AI to copy another human compared to if you as a human use a paintbrush to copy another human?
> leads to you being in the wrong when copying other humans
How so?
> Why would it be any different if you as a human get an AI to copy another human compared to if you as a human use a paintbrush to copy another human?
No difference indeed. AI is just a tool, like Photoshop or a paintbrush. Unfortunately, it's practically impossible to argue in court at scale that two works are "substantially similar". That only happens for extremely high-profile cases atm; most artists are copied and never see a dime, because the definition of "substantially similar" isn't black and white.