
When I first used ChatGPT's voice assistant I was like "Wow, this one is clearly Scarlett Johansson from Her, they even copy her mannerisms."

No amount of unverifiable "records" (just pieces of paper provided by somebody who has a multimillion dollar incentive to show one outcome) will change my mind.

But if they can produce the actual voice artist I'd be more open-minded.



Funny, I'm the opposite. I saw clips from the film after the controversy (it's been ten years since I saw the film itself) and Sky sounds nothing like Johansson to me. No amount of unverifiable "records".


1. The Sky voice currently available in the app is a different model from the one they presented: the old one is pure TTS, while the new one in GPT-4o is a proper multimodal model that can do speech in and out, end to end (a rough sketch of the difference is below).

2. Look at these images and tell me they didn't intend to replicate "Her": https://x.com/michalwols/status/1792709377528647995
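
To illustrate the distinction in point 1, here's a minimal sketch only; none of these function names are OpenAI's real API, they're hypothetical placeholders. The old setup chains speech-to-text, a text-only LLM, and a TTS voice preset, while the GPT-4o style is one model mapping speech to speech.

    # Hedged sketch of the two architectures being contrasted; every function
    # here is a hypothetical stand-in, not OpenAI's actual code or API.

    def speech_to_text(audio: bytes) -> str:
        return "hello"                          # placeholder ASR

    def text_llm(prompt: str) -> str:
        return f"you said: {prompt}"            # placeholder text-only LLM

    def text_to_speech(text: str, voice: str) -> bytes:
        return f"[{voice}] {text}".encode()     # placeholder TTS with a fixed voice preset

    def cascaded_assistant(audio_in: bytes) -> bytes:
        # Pre-4o style: the "Sky" voice is a TTS preset bolted onto the end of a pipeline.
        return text_to_speech(text_llm(speech_to_text(audio_in)), voice="sky")

    def end_to_end_assistant(audio_in: bytes) -> bytes:
        # GPT-4o style: a single multimodal model maps audio to audio, so prosody
        # and emotion come from the model itself rather than a separate TTS stage.
        return f"[multimodal] {len(audio_in)} bytes in, speech out".encode()

    print(cascaded_assistant(b"..."))
    print(end_to_end_assistant(b"..."))

The practical upshot is that comparing the app's old TTS Sky with the 4o demo voice is comparing two different systems.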


Which one are we saying sounds like Johansson? I'm talking about the TTS voice in the app; is everyone else talking about the multimodal voice from the 4o demos?

Also, whether they *intended* to replicate Her and whether they *did* in the end are very different.



OK, I watched this expecting to be convinced.

I think they might have mimicked the style. The voice, though, is not even close. If I heard both voices in a conversation, I would have thought 2 different people were talking.


Truthfully, you can no longer trust yourself (whichever side you're on in this debate). We're all primed now, and we'll pick up on any distinguishing characteristics. You'd have to listen to them in a blind test, and do it with several clips that don't reveal which ones are OpenAI and which are from a movie or something else that spoils it (a rough sketch of such a test is at the end of this comment).

And I wouldn't put the metric at 50/50, i.e. require that they be indistinguishable. The bar would be a reasonable amount of people saying it sounds __like__ her, which could hold even while identifying the chatbot 100% of the time! (e.g. what if I just played a roboticized version of a person's voice?) Truth is that I can send you clips of the same person[0], tell you they're different people, and a good portion of people will be certain that these are different people (maybe __you're different__™, but that doesn't matter).

So use that as the litmus test either way. Not whether you think they are different, but rather "would a reasonable person think this is supposed to sound like ScarJo?" Not you, other people. Then ask yourself whether there is sufficient evidence that OpenAI either purposefully intended to clone her voice OR got so set in their ways (maybe after she declined, but having already hyped themselves up) that they tricked themselves into only accepting a voice actor who ended up sounding similar. That last part is important because it shows how such a thing can happen without the requirement ever being stated explicitly (and maybe without anyone even recognizing it themselves). Remember that we humans do a lot of subconscious processing (I have a whole other rant about people building AGI -- a field I'm in, fwiw -- not spending enough time understanding their own minds or the minds of animals).

Edit:

[0] I should add that there's a robustness issue here, and it's going to be a distinguishing factor for people deciding whether the voices are different. Without a doubt, those voices are "different", but the question is in what way. The same way someone's voice might change day to day? The way someone sounds on the phone vs. in person? Certainly the audio quality is different, and if you're expecting a 1-to-1 match where we can overlay waveforms perfectly, then no, you wouldn't ever be able to do this. But that's not a fair test.
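
For what it's worth, here's a rough sketch of the kind of blind test I mean (the file names and source labels are made up for illustration; nothing here is tied to the actual clips):

    import random

    # Hypothetical clip pool: paths and source labels are placeholders.
    clips = [
        ("clip_a.wav", "openai_sky"),
        ("clip_b.wav", "movie_her"),
        ("clip_c.wav", "openai_sky"),
        ("clip_d.wav", "other_actress"),   # decoys keep listeners from guessing the framing
    ]

    def run_blind_trial(clips, listener_guess):
        # Shuffle the clips, hide the labels, collect a guess per clip, then score.
        # `listener_guess` is any callable taking a file path and returning a label
        # (in practice: play the clip to a human and record their answer).
        order = random.sample(clips, k=len(clips))
        correct = sum(listener_guess(path) == truth for path, truth in order)
        return correct / len(order)

    labels = sorted({src for _, src in clips})
    dummy_listener = lambda path: random.choice(labels)    # stand-in for a human listener
    print(f"accuracy: {run_blind_trial(clips, dummy_listener):.2f}")

Accuracy near chance would mean the sources are genuinely indistinguishable, but as argued above, "sounds like" can hold even when listeners identify the chatbot every single time.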


I agree they don't sound the same. But, since it's a subjective test, OpenAI was pretty Twitter-foolish to push the "Her" angle after being explicitly rejected by SJ. It's just inviting controversy.


Without commenting on the debate at large, it’s a bit funny to read this comment.

I mean, voice cloning was basically science fiction a year or two ago; now we're talking about whether voices are distinguishable as proof that one isn't cloned from, sourced from, or based on someone.

FWIW I also thought it was supposed to be the Her/SJ voice for a long time, until I heard them side by side. Not sure where to stand on the issue, so I'm glad I'm on the sidelines :)


Thank you for providing a nice side-by-side. This makes it clear to me the voices are not very similar at all. If Johansson had agreed, I have to imagine they would've been able to make a much closer (and less annoying!) voice.


The cadence and speed in Her are much too fast for any mass-market consumer product.


I keep reading in the media that Sky was introduced as part of ChatGPT-4o, but that's incorrect. Sky's been around since they introduced the mobile iOS app.

While Sky's voice shares similar traits to SJ, it sounds different enough that I was never confused as to whether it was actually SJ or not.


I don’t think you understand. 4o introduces a new multimodal Sky replacing the old one. They have only released clips of the new voices. It’s never been in the iOS app. The one you refer to is the old voice model. If you listen to the linked video above it’s very obviously not the same voice (I use Sky on iOS btw)

To be honest the new sky is obnoxious and overly emotive. I’m not trying to flirt with my phone.


I've listened to the clips and yes, while 4o Sky is more emotive, it's just that - a more emotive Sky. All the elements that people are pointing to - the husky/raspiness - were present in the pre-4o Sky.


Well, I thought it would be similar, but at least with how the Sky voice sounds through the phone speakers, I can hardly find any resemblance.


Those don't sound anything alike, except being two female voices. Sky is clearly a bit lower and with a lot more vocal fry.


Are you using this as an argument for how similar they are? The voices sound distinctly different; I have no problem discerning between the two.


I am of two minds here. Regardless of the "closeness", there is a whole field of comedy that does impressions of others. That is what is so difficult about the AI discussion: clearly, there are plenty of humans who can mimic other humans visually, in prose, in voice and mannerisms, etc.

Leaving the IP issue aside, they could clearly have hired a voice actor to closely resemble Johansson, maybe without any additional tweaks to the voice in post-processing. If they did do that, I am not totally sure what position to take on the matter.


The important thing is that they never said it was Johansson. They were not pretending to be her. They are not imitating her likeness whatsoever.


Some employees were definitely thinking of Scarlett Johansson, even ignoring the reference to the film "Her":

https://x.com/karpathy/status/1790373216537502106


Karpathy doesn't work for OpenAI anymore tho.


The OpenAI one is audio recorded from a phone, whereas the movie version was recorded directly into a mic. They will sound different, but there are elements that are the same. Anyone using these to compare, though, and drawing conclusions from the differences they hear isn't comparing apples to apples.

However, the fact that there is a debate at all proves there should be more of an investigation done.


Holy Crappyness Batman! The OpenAI clip is so bad. Homeboy keeps stepping on "her" lines. So from this I come away with: either he's just a rude asshat who doesn't know how to socially interact with people, she's just too damn chatty and doesn't know when to shut up, or maybe it was just really bad editing? Either way, it's not an intriguing promo to me in the least.


I think the whole thing was scripted beforehand and approved by Sam Altman, of course.


That doesn't really make it better, because now a) it was a horrible script, and b) they didn't try to clean up the audio from "her" with anything more than a fade. If you told me this was just some intern making a video, then maybe, but telling me it was scripted just makes it so much worse to me.


Genuine question, what's wrong with trying to replicate in real life an idea from a SciFi movie?

I understand that it could be problematic if OpenAI did one of two things:

- imitated Scarlett Johansson's voice to impersonate her

- misled people into believing that GPT-4o is an official by-product of the film Her, like calling it “the official Her AI”

The first point is still unclear, and that's precisely the point of the article.

For the second point, the tweets you posted clearly show that the AI from Her served as an inspiration for creating the GPT-4o model, but inspiration is not trademark infringement.

Will Matt Damon receive royalties if a guy is ever stuck on Mars?


> Genuine question, what's wrong with trying to replicate in real life an idea from a SciFi movie?

The thing is, there are several cases where a jury found this exact thing to warrant damages.

But honestly, that is irrelevant. The situation here is that OpenAI is facing a TON of criticism for running roughshod over intellectual property rights. They are claiming that we should trust them, that they are trying to do the right thing.

But in this case, they're dancing on the edge of right and wrong.

I don't mind when a sleazy company makes "MacDougals" to sell hamburgers. But it's not something to be proud of. And it's definitely not a company that I'd trust.


Pretty sure the CEO of OpenAI tweeted "Her." after the reveal of the voice.

Isn't that a suggestion that what they're doing is similar to "the Her AI"?


Yes, the unprecedented conversational functionality of the GPT-4o demo could be compared to the AI in the movie. Why assume that the tweet was about the voice sounding like Scarlett Johansson?


It's a suggestion that they were inspired by the movie, not that they are releasing a product under the "Her" trademark.

It's a movie, not a patent on female-voiced AI assistants.


Imagine if Facebook came to you and wanted an exclusive license to white-label whatever you work on, then after you rejected them they went and copied most of your code but changed the hue or saturation of some of the colors and shipped it to all of their customers. (There are definitely hours of Scarlett Johansson talking in the dataset that GPT-4o was trained on.)

Would that be ethical?

EDIT: or even better, imagine how OpenAI would react if some company trained their own model by distilling from GPT4 outputs and then launched a product with it called “ChatGPC”. (They already go after products that have GPT in their name)


> then after you rejected them

The article shows the timeline would make this: them already licensing a similar product to your more famous one, then you saying no, and them continuing to use the existing similar one.

> But while many hear an eerie resemblance between “Sky” and Johansson’s “Her” character, an actress was hired to create the Sky voice months before Altman contacted Johansson, according to documents, recordings, casting directors and the actress’s agent.


Facebook does do this, and Google, and Microsoft, and Apple. I believe they call it "Getting Sherlocked."


Same here. In the demo it never sounded like SJ to me. After the story broke I listened to clips from Her and the 4o demo. It doesn't sound like SJ.


And then there's me, and I'm somewhere in the middle. When I first heard that voice, I didn't really think anything of it. But retrospectively, given the media reporting, from Sam Altman tweeting about the movie to the reports of approaching Scarlett Johansson, I can make that connection. But I would not have without the context. And without real reporting I would have dismissed it all as speculation.


Yeah, I can hear the resemblance, but it's not the same. I actually said they should copy SJ's voice for a bigger "her" effect when I saw the demo.


The voice artist put out a statement through her lawyer. She also stated her voice has never been compared to Scarlett's in real life by anyone who knows her.


That's because Scarlett's voice is a pretty generic upper-middle-class white woman's voice with a hint of vocal fry and a slight hint of California (pretty typical given the pervasiveness of media from California).

She's not exactly Gilbert Gottfried or Morgan Freeman.


Now I'm just sad that it doesn't respond in a flirty Gilbert Gottfried style voice.


This is Gilbert Gottfried reading 50 Shades of Grey.

https://youtu.be/XkLqAlIETkA?si=8nLtWaBwq3Swum1i


I heard this comment in my mind in a flirty Gilbert Gottfried voice.

Thank you for the laughter.


I'd like to hear her raw voice compared to the polished product. Listen to famous singers' acoustic recordings vs. their heavily audio-engineered final cuts: big difference. I think if you played this OpenAI "Sky" voice to a sample population and said it was a famous person's voice, SJ would come up frequently.


This is just Scarlett Johansson trying to destroy some small voice actor. I greatly dislike what OpenAI is doing, but this is just ridiculous.


Scarlett Johansson is apparently so devious she managed to get OpenAI to reach out to her to license her voice and likeness.

She even set up the CEO by having him directly negotiate with her, which I’m sure he also did with the alleged small voice actor. Then she perfected her scheme by having that same CEO publicly tweet “her” - timed with the release of the voice product - referencing SJ’s movie of the same name, in which she voiced a computer system with a personality.

She even managed to get OpenAI to take down the voice in OpenAI’s words “out of respect” for SJ while maintaining their legal defense publicly that the voice was not based on hers.


Is it illegal to hire a voice actor that sounds like Darth Vader? No. Is it illegal to hire a voice actor that sounds like Her? No. Would it be appealing to have SJ voice act for them? Sure. Does that mean it's illegal for another voice actor to (according to some) sound similar to a character from a popular movie? No. All of these things can be true together.


The issue isn’t hiring a voice actor to imitate someone; that can be fine. The issue is what you can do with the recordings after you have them.

Making a YouTube instructional video on how to imitate voices that includes clips of a film, for example, would be fine. Reuse the exact sounds from that YouTube video in a different context and you’re in legal trouble.


Right, but making a YouTube instructional video on how to imitate voices, where you only use the imitation voice, is fine. Which is closer to what happened here it seems like.


Illegal?

It probably isn’t criminal, which is what you seem to be asking, although it very well might be depending on the facts.

More importantly, under the available facts SJ likely has a claim for civil damages under one or more causes of action. Her claims are so strong this will likely end up with a confidential settlement in an undisclosed sum without even needing to file a lawsuit. If she does file a lawsuit, there is still greater than 90% likelihood OpenAI settles before trial. In that less than 3% chance the case proceeds to a verdict, then you’ll have your answer without having to make bad arguments on HN.


> In that less than 3% chance the case proceeds to a verdict, then you’ll have your answer without having to make bad arguments on HN.

From the HN guidelines: Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.


You should reread the comment of yours I was replying to, where you asked and (incorrectly) answered multiple rhetorical questions, and reflect on the HN policy you cited.

Allow me to help you correct your answers:

>Is it illegal to hire a voice actor that sounds like Darth Vader? No.

Actually, yes it can be.

>Is it illegal to hire a voice actor that sounds like Her? No.

Once again it can be.

>Does that mean it's illegal for another voice actor to (according to some) sound similar to a character from a popular movie? No.

Yet again it can be.

As a lawyer who’s been practicing for over 10 years, I can tell you that IP law and contract law are far more complex and nuanced than your rhetorical questions and answers suggest.


From the HN guidelines:

Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.

Please don't fulminate. Please don't sneer, including at the rest of the community.

Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.

Have a nice day, or actually don't, since you work in one of the most evil professions there is ;)


You asked yourself questions and answered your own questions wrong; it’s not a big deal. Read the HN guidelines and, as they say, try to be more curious; there’s really no need to walk through life so angry and bitter when you are wrong.

Also, rest assured evil lawyers are not the source of your problem. Maybe one day, if you and your IP/likeness get ripped off like OpenAI did to SJ, you’ll find lawyers aren’t so evil after all. Maybe you will come to realize that for every evil lawyer there is always a lawyer fighting on the other side against that evil.

Once again, what do I know; I’m an evil lawyer who does pro bono legal work for children who have been abused, abandoned, and neglected, as well as for victims of torture at the hands of foreign governments, as part of my evil profession.

Good luck to you!


SJ doesn't know who the voice actor is. Her objection is with OpenAI's actions.


Why would she?


I can't read minds unfortunately.


This shows how bad it is. If you're proactively sharing a package of docs with the Washington Post, you're toast.

Altman's outreach, his tweet, and the thousands of tweets and comments talking about how similar Sky is to ScarJo are enough to win the case in California.


The Washington Post comprehensively refuted the story. This is like the "this is good for Bitcoin because ____" meme, but in reverse.


They literally didn't question any of OAI's claims. They just regurgitated them.

They were desperate for a non-union-only actor in their casting, yet repeatedly kept hitting up a union actor.

What fears for the actress's safety have been portrayed such that not only does she need to stay anonymous, but her agent does too?

"Altman was not involved"... yet he personally reached out to SJ to try to close the deal?


They refuted it based on select documents handed to them by OpenAI.


Then we can add this to the long list of insane lawsuits going the wrong way in California.

They asked SJ, she said no. So they went to a voice actor and used her. Case closed: they didn't use SJ's voice without her permission. That doesn't violate any law to any reasonable person.


Likeness rights are a real thing, and it's not far-fetched to have infringed on them by going to a famous person to use their likeness, getting denied, then using another actor telling them to copy the first actor's likeness.

This is why all Hollywood contracts now have actors signing over their likeness in perpetuity, which was one of the major sticking points of the recent strikes.


>> "then using another actor telling them to copy the first actor's likeness"

Assumes facts not in evidence


And in fact clearly rebutted by the evidence that the actor says they never told her to copy anyone or ever mentioned Johansson or Her.


At a bare minimum, they would have given her direction to sound the way she does. Voice actors have lots of range, and that range would have been on her demo reel.


Agreed. I was married to a voiceover actress. Their range can be quite large :)


The anonymous actor, as reported by the anonymous agent, "fearing for her safety".


It even assumes the opposite, since they asked SJ after recording the original voice.


It's nice of you to clearly state what reasonable persons should believe violates the law. Alas, your contention about what reasonable people believe about the law isn't actually the law.


> They asked SJ, she said no. So they went to a voice actor and used her.

My guess is they would have gone with that voice actor either way. They had four different female voices available (in addition to multiple male voices) - 2 for the API, and I believe 2 for ChatGPT (different API voices are still available, different ChatGPT ones aren’t). If Johansson had said yes, it’s likely they would have added a fifth voice, not gotten rid of Sky.


This has echoes of Crispin Glover and Back to the Future 2. They didn't rehire him and got someone else to play his character.


> That doesn't violate any law to any reasonable person.

Midler v Ford is already precedent that using a different actor isn't inherently safe legally.


I predict the case will have parallels with Queen's lawsuit against Vanilla Ice: the two songs (Under Pressure and Ice Ice Baby) are "different" in that one has an extra beat, yet the latter is an obvious rip-off of the former.

Perhaps merely having person A sound like person B isn't enough, but combined with the movie and AI theme it will be enough. Anyway, I hope he loses.


You have no idea what they did, unless you work there.

All you know is what somebody being sued for multi-millions of dollars (and whose trustworthiness is pretty much shot) claims they did. And frankly, given the frequency and ease of voice cloning, there are very few people who can say with confidence that they know 100% that nobody at the company did anything to that effect.

What employee, if any, could say with 100% confidence that this model was trained with 100% samples from the voice actress they allege and 0% samples from Scarlett Johansson/Her? And even if an employee knew otherwise, would they rat out their employer and lose their job over it?


It's not (or shouldn't be) about things that have some finite probability (no matter how small) of being true, but rather about what can be proven to be true.

There's no doubt a very small (but finite) probability that the voice sounds like a grey alien from Zeta Reticuli.

That doesn't mean the alien is gonna win in court.


I'm not saying they'll necessarily win in court; all I'm saying is I'd wager my life savings that they intentionally created a voice that sounded like Scarlett's character from Her.

Anybody on this forum who says that it's entirely impossible or that it's conclusive that they didn't use her voice samples simply isn't being logical about the evidence.

TBH I really like the voice and the product, but I'm having a lot of trouble wrapping my head around the number of people who seem rather tribal about all this.


If they did clone her voice, they did a poor job of it. Other than that the voice is female there's not a whole lot of resemblance in tone and timbre.


"Reasonable" is doing a ton of work here.


"Reasonable" does a lot of work throughout the entire legal system.

If there's one constant that can be relied upon, it's that "things that are reasonable to a lawyer" and "things that are reasonable to a normal human being" are essentially disjoint sets.


> “Reasonable” does a lot of work throughout the entire legal system.

Yes, but here it’s not being invoked in the sense of “would a reasonable person believe based on this evidence that the facts which would violate the actual law exist” but “would a ‘reasonable’ person believe the law is what the law, indisputably, actually is”.

It’s being invoked to question the reality of the law itself, based on its subjective undesirability to the speaker.


>"Reasonable" does a lot of work throughout the entire legal system.

Yet it never becomes anywhere near the significant fulcrum you made it out to be here, filtering between the laws you think are good and the laws you think are bad. Further, you seem to mistake attorneys for legislators. I'd be surprised if a reasonable person thinks it is okay to profit off the likeness of others without their permission. But I guess you don't think that's reasonable. What a valuable conversation we're having.


No, it has nothing to do with "legislators". The "reasonable man" standard is all over case law, and there are about a bazillion cases where attorneys have argued that their client's behavior was "reasonable", even when it was manifestly not so by the standards of an actual reasonable man.

You can, as they say, look it up.

https://en.wikipedia.org/wiki/Reasonable_person


>No, it has nothing to do with "legislators".

You seem incredibly confused. Legislators pass legislation, not lawyers. So it was never a question as to what lawyers thought reasonable laws are. State representatives determined that it was a good idea to have right of publicity laws and that is why they exist in many large states in the US.

> The "reasonable man" standard is all over case law

Yes, as I already pointed out to you, and another poster did as well, this "reasonable man" standard has nothing to do with your prior use of the word reasonable as an attempt to filter out which laws are the ones you think are okay to enforce.

>You can, as they say, look it up.

You should take your own advice!


> You seem incredibly confused. Legislators pass legislation, not lawyers.

I'm not "confused" about anything.

Yes, legislators pass laws, but how those laws are actually applied very much depends on the persuasive skills of lawyers.

If your hypothetical held, where you could use the printed law as passed by legislators essentially as a lookup table, lawyers would serve no purpose.

But somehow people spend tons of money on them nonetheless.


>I'm not "confused" about anything.

You are very confused. The reasonable person standard has absolutely nothing to do with your initial post where you quoted it.

>If your hypothetical held, where you could use the printed law as passed by legislators essentially as a lookup table, lawyers would serve no purpose.

What the fuck are you talking about? The stuff I see people here say about the law is INSANE. You don't need a lawyer in the US if you are an individual person, you can represent yourself. What the hell does any of it have to do with a lookup table? I've never seen something so deeply confused and misguided.


> "things that are reasonable to a lawyer" and "things that are reasonable to a normal human being" are essentially disjoint sets.

In litigation, any question whether X was "reasonable" is typically determined by a jury, not a judge [0].

[0] That is, unless the trial judge decides that there's no genuine issue of fact and that reasonable people [1] could reach only one possible conclusion; when that's the case, the judge will rule on the matter "as a matter of law." But that's a dicey proposition for a trial judge, because an appeals court would reverse and remand if the appellate judges decided that reasonable people could indeed reach different conclusions [1].

[1] Yeah, I know it's turtles all the way down, or maybe it's circular, or recursive.


I don't think the mannerisms of a performance are something that's copyrightable, though. It sounded like they used a voice actor who was instructed to speak with a similar intonation to Her, but Scarlett Johansson's voice is more raspy, whereas Sky just sounds like a generic valley girl.


For a case to the contrary: Midler v. Ford -- a case in which Ford hired one of Bette Midler's ex-backup singers to duplicate one of her performances for an ad (after trying and failing to get Midler herself). Ford never said this was actually Midler -- and it wasn't -- but Midler still sued and won. https://law.justia.com/cases/federal/appellate-courts/F2/849...


Ford gave explicit instructions to imitate a copyrighted performance, because that specific recording was owned by a record studio.

If you can describe a woman's voice and mannerisms and the result sounds similar to a copyrighted performance, that is a natural circumstance.

If you want an example of purposefully imitating something with a copyright, look at GNU. Anyone who had looked at the UNIX code was realistically prevented from writing their own kernel with similar functions. But if a handful of folks describe what the kernel ended up doing and some <random> guy comes up, in his own head, with some C code and assembly that ends up providing the same high-level functions, well, that's just fine, even if you include the original name.

The details matter. There is absolutely enough vocal difference; it doesn't take an audiologist to hear that the two voices sound different, but very close. It would not be hard for the producers to describe "a" voice whose description overlaps heavily with ScarJo, and then the marketing team reached out to see if she would fill the existing requirements. When she said no, they found a suitable alternative. If the intent was to have ScarJo do the voice and she said no and they did it anyway, that's illegal.


Off topic to the thread and your point, but are you confusing GNU with the Compaq BIOS reverse-engineering and reimplementation? I hadn't heard this story about GNU (and which kernel)?


> Ford gave explicit instructions to imitate a copyrighted performance.

That case isn't copyright law, Ford had obtained rights to use the song itself.


Copyright isn't at issue here; it's instead likeness rights.


> I don’t think the mannerisms of a performance something that’s copyrightable though.

Yes, this discussion is about right of publicity, not copyright.

Copyright is not the whole of the law.


"Her" is one of my favorite movies of all time, and not once while watching the demo did I think that it sounded specifically like ScarJo. The whole concept, of course, made me think of "Her", but not the voice itself.


As a non-American I only hear Scarlett Johansson's voice in the examples I've heard, to me it clearly is an impersonation. Maybe state-side that specific voice sound is more common and thus less recognisable as Scarlett Johansson's.


They did produce the actual voice artist!


Where? Right now you have "An anonymous person says that an anonymous person said this to him in an email".

That's a pretty low bar for "produced the actual voice artist".


To the Washington Post, which verified it. The Post doesn't much care if you can verify their work, because no reasonable person believes they're making this up.


Words are important. The WaPo didn't verify the voice actor at all:

- "The agent said the actress confirmed..."

- "In a statement from the actress provided by the agent..."

The WaPo hasn't spoken to or verified who the voice actor is.


I don't see that here: https://openai.com/index/how-the-voices-for-chatgpt-were-cho...

Is my google-fu failing me and I'm just not looking in the right place?


If you read the WaPo article that's the topic of this thread, you'll see that the actual voice artist is quoted in the article.


No. You'll see that the anonymous artist's anonymous agent supplied WaPo with a quote he got in an email.



