> No wake words: it should listen to everything, process it, and understand when...

lxe · 2024-12-21T19:15:25 1734808525

Wake words are different from "listen to everything until a name is called". A wake word is needed for both privacy and technical reasons: you can't just have Alexa beaming everything it hears to Amazon. So instead it uses a local, lightweight "dumb" system that listens for specific words only.

That's exactly why there's massive latencies between command recognition, processing, and execution.

Imagine if it had sub-ms response to "assistant, add uuh eggs and milk to the shopping list... actually no just eggs sorry"

danparsonson · 2024-12-22T11:10:22 1734865822

Sure OK, maybe it's a beneficial side effect then. However you look at it, trying to get the computer to decide when you are addressing it, without using a name of some sort, could be a very challenging problem to solve, one that even humans struggle with. Surely you've been in a situation where you say something to a room and multiple people think you're talking to them? To borrow an example from elsewhere in the thread, if you say "turn on the lights", are you talking to the computer controlling the room lights, or the human standing next to the Christmas tree?

> Imagine if it had sub-ms response to "assistant, add uuh eggs and milk to the shopping list... actually no just eggs sorry"

Could you elaborate on that? What if that were true?

antonyt · 2024-12-20T05:07:18 1734671238

Yeah, I’m having a hard time imagining how no-wake-word could work in practice.

lukifer · 2024-12-20T19:06:17 1734721577

This is one advantage of a system with a constrained set of commands/grammars, as opposed to the Alexa/Siri model of trying to process all arbitrary text while in active mode. It can simply ignore/discard any invocations which don't match those specific grammars (and no need to wait to confirm that the device is awake).

"Computer, turn lights to 50%" -> "turn lights to fifty percent" -> {action: "lights", value: 50}

"My new computer has a really beefy graphics card" -> "has a really beefy graphics card" -> {action: null}
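A minimal sketch of that idea in Python, assuming the speech-to-text step has already produced a transcript. The single lights grammar, the `NUMBER_WORDS` table, and the `parse_command` name are all made up for illustration; a real system would have many grammars and a proper number parser.

```python
import re

# Hypothetical lookup for spelled-out levels; only a few values for illustration.
NUMBER_WORDS = {
    "ten": 10, "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90, "hundred": 100,
}

# One constrained grammar: "turn [the] lights to <level> %|percent".
LIGHTS = re.compile(
    r"\bturn (?:the )?lights to (\w+)\s*(?:%|percent)", re.IGNORECASE
)


def parse_level(token):
    """Convert '50' or 'fifty' to an int, or return None if unknown."""
    token = token.lower()
    if token.isdigit():
        return int(token)
    return NUMBER_WORDS.get(token)


def parse_command(transcript):
    """Return an action dict if the transcript matches a known grammar.

    Anything that matches no grammar is discarded as {"action": None},
    so no separate "is the device awake?" check is needed.
    """
    match = LIGHTS.search(transcript)
    if match:
        value = parse_level(match.group(1))
        if value is not None and 0 <= value <= 100:
            return {"action": "lights", "value": value}
    return {"action": None}


print(parse_command("Computer, turn lights to 50%"))
print(parse_command("My new computer has a really beefy graphics card"))
```

The point of the design is that rejection is the default: speech only has an effect if it fits one of a small number of patterns.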

ethbr1 · 2024-12-20T13:18:04 1734700684

Like that really annoying friend who jumps in every other sentence with "Well actually..."

marcosdumay · 2024-12-20T15:56:15 1734710175

I have a coworker who set up an Alexa a year or so ago. I don't know what the issue was, but it would jump into Teams meetings after every noise in his house.

fragmede · 2024-12-20T06:44:21 1734677061

After setting up the system, if I say "turn the ceiling lights to 20%", who else would be changing the lights?

A postfix wake word would also be natural if it were recording all the time: "turn on the lights, Google", for instance.
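A rough sketch of what postfix handling could look like in Python, assuming the device is already transcribing complete utterances locally. The `extract_command` helper and the choice of "google" as the wake word are just for illustration.

```python
import re

WAKE_WORD = "google"  # taken from the example above; any name would do


def extract_command(utterance, wake_word=WAKE_WORD):
    """Return the command text if the utterance starts or ends with the wake word.

    Returns None when the wake word is absent or nothing else was said,
    so speech not addressed to the device is simply dropped.
    """
    words = re.findall(r"[\w%']+", utterance.lower())
    if not words:
        return None
    if words[-1] == wake_word:  # postfix: "turn on the lights, Google"
        return " ".join(words[:-1]) or None
    if words[0] == wake_word:  # prefix: "Google, turn on the lights"
        return " ".join(words[1:]) or None
    return None


print(extract_command("turn on the lights, Google"))
print(extract_command("turn on the lights"))
```

The catch is the one already mentioned: the words before the name have to have been recorded, which a wake-word-first device avoids.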

danparsonson · 2024-12-20T23:21:44 1734736904

Sure, if the system is set up to only respond to very specific commands that humans would not respond to, I guess that could work. I was thinking more about the other way around, where a person might speak to someone else in the room and be overheard and acted upon - "turn on the lights!" could be a command for the computer controlling the room, or the human standing next to the Christmas tree, for example.

TheCoelacanth · 2024-12-20T17:04:10 1734714250

Someone in a TV show that you're watching?

joshstrange · 2024-12-21T12:53:03 1734785583

I’ve never had Alexa control a device via a TV show’s audio, but playing back a video of me testing my home automation (“Alexa, do X”) triggered my lights.

I’d love a no-wake-word world where something local was always chewing on what you said, but I’m not sure how well it would work in practice.

I think it would only take 1-2 instances of it hearing “Hey, who turned off the lights?” in a show and turning off my lights for real (and scaring the crap out of me). Doctor Who isn’t particularly scary, but if I were watching Silence in the Library and that line turned off my lights, I’d be spooked and it would take me a hot minute to realize what happened.