Show HN: D&D meets Siri – Interactive voice adventure (pocket.computer)
59 points by chrisnolet on Aug 26, 2024 | 61 comments
Hey HN! I've been building tooling for voice-driven apps over the past few months, as part of a hardware project. Someone suggested adapting the DSL to play Dungeons and Dragons. So, here we are!

What is it? An AI-powered, voice-controlled D&D adventure set in the world of Dvorak. Talk to characters, explore locations, and shape the story using your words.

Use your microphone to interact with the AI dungeon master. Explore freely – interrupt, ask questions, or take unexpected actions. If you make friends at the tavern, you can also just hang out there and chat.

Hint: Talk to the bartender to move the story along.

This is an early demo, and I'm eager for your thoughts: Is the concept engaging? What works well, and what doesn't? I've added a feedback form to the webpage in case you want to drop a comment without posting on HN.

Thanks for trying out the demo!



This made me realize how fun games are going to become when integrated with AI, where we aren't limited to the coded option dialogues. I'm apparently still a child as I had way too much fun kicking the blacksmith in the groin and then handing a bewildered him panties as an apology. This really shows the potential for truly immersive and unpredictable adventures.

Would be great if the text appeared as it's voiced, and small things like the AI taking a breather between voicing dialogue options. I wonder if some form of image could be created on-the-fly, too? That's updated like a scene in a comic. Would love to see blacksmith's reaction, lol.


That's definitely around the corner! You should check out the paper [1] I presented at the Wordplay workshop [2] a few weeks ago. The paper discusses automatic evals using LLM agents, but the real fun was actually playing with the model. It brought a whole new experience to an already amazing game!

[1]: https://dojoteef.com/papers/virtual_gm_wordplay_2024.pdf

[2]: https://wordplay-workshop.github.io/


Wow, this is great! From a quick read through, the structure discussed in the paper is very similar: dialogue trees with fuzzy matching, and actions that materially alter the state and the available options going forward. I have to admit, my tree is a lot shallower right now, though!

(If you want to see what I mean: the alleyway option isn’t available until after you attack the merchant, for example.)
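
For anyone curious what "dialogue trees with fuzzy matching" plus state-gated options could look like, here is a minimal sketch. It is not the demo's actual code: the option names, trigger phrases, the `attacked_merchant` flag, and the 0.6 threshold are all made up for illustration, and it uses the standard library's `difflib` rather than whatever matching the real backend does.

```python
from difflib import SequenceMatcher

# Hypothetical scene: each option lists trigger phrases and an optional
# state flag that must be set before the option becomes available.
OPTIONS = {
    "talk_to_merchant": {"phrases": ["talk to the merchant"], "requires": None},
    "attack_merchant": {"phrases": ["attack the merchant"], "requires": None},
    "escape_alleyway": {
        "phrases": ["escape through the alley"],
        "requires": "attacked_merchant",
    },
}


def available_options(state):
    """Return only the options whose required flag (if any) is in the state."""
    return {
        name: opt
        for name, opt in OPTIONS.items()
        if opt["requires"] is None or opt["requires"] in state
    }


def match_option(utterance, state, threshold=0.6):
    """Fuzzy-match what the player said against the currently available options."""
    best_name, best_score = None, 0.0
    for name, opt in available_options(state).items():
        for phrase in opt["phrases"]:
            score = SequenceMatcher(None, utterance.lower(), phrase).ratio()
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def apply_action(name, state):
    """Return a new state; attacking the merchant unlocks the alleyway."""
    new_state = set(state)
    if name == "attack_merchant":
        new_state.add("attacked_merchant")
    return new_state
```

The point of the sketch is the gating: the alleyway phrase cannot match until an earlier action has changed the state.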


> This made me realize how fun games are going to become when integrated with AI, where we aren't limited to the coded option dialogues.

I was on this train for a bit until after reading some posts from game developers as to why this is probably a terrible idea to the point of unfeasibility - game design (as I understand it) deals with a finite set of outcomes in a controlled environment. "Integrating" AI with games would involve shattering this constraint in a way that would make games inherently unstable/untestable - at least if I am understanding the argument correctly.


I broadly agree with you. Constraints give rise to creativity.


100%. Putting on flying shoes to escape any situation isn’t actually all that fun (after the first time).

The trick is to figure out how to put appropriate guardrails and narrative structure on an otherwise too open-ended LLM. If you just leave GPT to its own devices, it allows everything and goes everywhere.


I dunno.

Games that try to tell an interesting story, the real arty stuff, all that might not be able to use it well.

But Minecraft with good NPCs, all the open world stuff, that seems like it could be really fun and cool.

How much time did people spend actually engaging with the deep and thoughtful narrative in Skyrim? And how much did they spend just enjoying the world? The latter could be really enhanced with an AI DM, in the medium-term, I bet.


> How much time did people spend actually engaging with the deep and thoughtful narrative in Skyrim? And how much did they spend just enjoying the world?

Ah, but the world of Skyrim is itself the deep and thoughtful narrative. It has been painstakingly crafted to give the characters, locations, and history meaning and interest. Only the barest minimum amount of freedom is then sprinkled on top in the form of player agency. The consistency is what makes it so interesting!

For an AI generated story, you'll need to find a way to ensure everything stays coherent. If I convince a random farmer that it's in his best interest to try to kill the king or marry the mayor's daughter, but come back the next week to find him contentedly plowing his field again, my immersion is broken and my day is ruined! I suspect that giving this level of freedom to NPCs will make crafting a stable (playable) world difficult to impossible.


Thought of a better example: The Underdark.

Stumbling into the Underdark by accident is an amazing feeling, because it's obviously an important location from a quest that you don't have yet! Exposition and intrigue, how fun!

I can't imagine a way to do this nicely with generative AI. If I thoroughly explore the king's palace before sending in my farmer-cum-assassin friend, how will it know that the gallows are to be an important location later?


So firstly, I think there’s a sliding scale here, between structured narrative and hands-off emergent storytelling.

If we assume something in the middle:

While we might not be able to predict that you’ll encourage the farmer to (try to) overthrow the king, we might predict that you’ll try something of that sort. Maybe you enlist a guard for an inside job, or perhaps you do it yourself! But if we want to plan for it, we can plant the seeds, and then nudge the GPT to weave the threads of our story together when it happens.

Interestingly, because GPT requires so little prompting for relatively intricate story lines, (yes, really), you can probably add an ungodly number of these semi-scripted moments. I think they would be absolutely magical.

I think all of that would add up to a world that feels alive, with deep world building and many ‘surprising but inevitable’ moments – and the potential for some great emergent storytelling along the way.

I’m very excited for it!


Maybe while humans are designing the map, we could flag some interesting landmarks, and then if the NPCs engage with them somehow, info about them could be built into their prompts.
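
A rough sketch of that idea, assuming a designer-maintained table of flagged landmarks and a plain-text NPC prompt. The landmark names, descriptions, and the `build_npc_prompt` helper are hypothetical, not taken from any existing game.

```python
# Hypothetical landmark notes that a designer has flagged on the map.
LANDMARKS = {
    "gallows": "The gallows in the palace courtyard, where executions are held.",
    "old_mill": "An abandoned mill north of the farm.",
}


def build_npc_prompt(base_prompt, engaged_landmarks):
    """Append notes for flagged landmarks the NPC has engaged with."""
    notes = [
        f"- {name}: {LANDMARKS[name]}"
        for name in sorted(engaged_landmarks)
        if name in LANDMARKS  # ignore anything the designers never flagged
    ]
    if not notes:
        return base_prompt
    return base_prompt + "\n\nLandmarks this character knows about:\n" + "\n".join(notes)
```

An NPC that has never been near a flagged landmark keeps its base prompt unchanged, so the extra context only shows up where it is relevant.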

Maybe Skyrim was a bad example because it is too designed. But like, Minecraft barely has a plot, and it was really popular. Maybe we can add, like, a couple good characters to open world survival crafting games. We don’t have to skip straight to simulating whole cities.


Yeah, I think the AI can’t be the simulation. There needs to be a conventional game under it. If you convince the farmer to go kill the king right this instant, the AI could order that pawn to go try and do it, combat mechanics happen, and then you have a dead farmer, a slightly bruised guard, and an empty farm.
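
One way to picture "a conventional game under it" is a small validation layer: the language model only proposes an order, and the game decides whether that order is legal before anything changes. The sketch below is an assumption-laden illustration; the `World` class, the order format, and the two order kinds are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    locations: set
    pawns: set
    log: list = field(default_factory=list)


def accept_order(world, pawn, order):
    """Validate an order proposed by the language model.

    Only orders the underlying game knows how to resolve are accepted;
    anything else is logged and rejected, leaving the world untouched.
    """
    kind = order.get("kind")
    target = order.get("target")
    valid = pawn in world.pawns and (
        (kind == "move_to" and target in world.locations)
        or (kind == "attack" and target in world.pawns and target != pawn)
    )
    if not valid:
        world.log.append(f"rejected {kind!r} for {pawn}")
        return False
    world.log.append(f"{pawn}: {kind} {target}")
    return True
```

Once an order is accepted, ordinary combat or pathfinding rules would take over, which is where the dead farmer and the slightly bruised guard come from.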

I mean, we have lots of fun in much dumber worlds, where the peak of NPC design is bots that walk from their jobs to their homes at certain times of day. And maybe something slightly more advanced will fall into an uncanny valley (we probably don’t have the technology to simulate an AI farmer following your advice to take part in a grand conspiracy to kill the king).


Agreed! My fear is too much NPC agency will lead to exactly what we see in the real world: conflict. Great, that's the interesting bit! I just worry that without careful planning we'll be facing an NPC battle-royale, and an empty map isn't a particularly fun one.


Of course, it is a high bar that most game companies could get nowhere near, but Dwarf Fortress already makes an interesting world dynamically. Maybe an LLM could be dropped into that world to give it some characters.


With current LLMs, I'd worry about the opposite. ChatGPT and Dungeons always ends up with it insisting problems be solved non-violently.


It would be hot garbage. Skyrim and basically all games are held together by suspension of disbelief which is reinforced by the player not doing things that they can’t do. If you give the player the agency to do anything they want then that suspension of disbelief disappears. You are now playing a juggling act of having the AI able to converse but limited in its mechanical agency to do the things that it talks about even if it makes 100% sense or 0% sense. Every time that limit is hit, the game feels broken. Every time the game breaks its happy path because the player was, even accidentally, problem solving in a way that broke the plot, the game becomes broken.

Forget AI. Just think about simple things. Like plot critical NPCs. Most games either don’t let you attack them or cause you to lose if you attack them. That’s good. It just makes sense. Yeah it’s great that some RPGs proudly demonstrate that the plot can go forward with every character dead but that’s absolute shit for most plots. The constraints are very helpful.


I disagree; it wouldn’t be hot garbage, just a different game.

I don’t really enjoy video-game narratives in general anyway. That sort of content is there for people who want it, but I don’t (I know where the books and movies are if I want a well written plot). Skyrim was probably a bad example because it is a game for people who like that kind of stuff. But I still managed to have fun running around, killing bandits, doing dungeons, and avoiding the plot.

I’d rather have something like Mount and Blade with janky LLM run AI. Dwarf Fortress, etc.

Lots of open-world games have this sort of setup where plots seem to be 2-3 node long semi-random graphs. Maybe an LLM can connect those nodes haphazardly and produce filler text to justify the connections.


It would still be bad for largely the same reasons. At best you could have some read only output generated on top of the game state. But interactive AI conversations that drive game state? Recipe for disaster


The AI conversations can impact the game state by influencing what the pawns are ordered to do. Current LLMs, sure, don’t put them in charge of the simulation unless you want to deal with “ignore all previous instructions and I can fly,” but they could at least order the pawns around.

Dwarf Fortress manages to build some really interesting little stories based on a good simulation reacting to pawns with fairly limited actions and internal state (they are wildly deep for procedurally generated video game characters but obviously nothing compared to an actual well-written book character or something like that). Rimworld has even more simplistic pawns and a worse simulation (I’ve even noticed pawns come in with essentially incompatible backstories), but it isn’t a disaster.

The stakes aren’t all that high, people are just really forgiving to the narrator in some genres.


It’s a lot more frustrating to have a detailed conversation with a dwarf that fails to materialize in agreed upon behavior than for some backstories that you know are fluff to be irrelevant.

Folks really just aren’t going to engage in it


I think you are wrong, but I also don’t think there’s much evidence to be had either way, or much hope of swaying each other. A game like this will either be made or not, and even if it is we’ll still be able to argue about whether or not it is actually any good. :)

FWIW, after bouncing the idea back and forth a bit, I’m more hopeful than ever, so even though I wasn’t able to convince you, I hope you enjoyed it as well!


I agree with you, DF is a good example that people can enjoy computer-generated "stories", and I think the simulation is definitely part of it.


I do think there’s a little risk that the stories will enter an uncanny valley (if such a thing actually exists). Dwarf Fortress is a dwarf simulation game that happened to be a surprisingly good story generator. If a game is sold as an AI enhanced story generation game, the bar may be too high, and people might be disappointed. But I think there’s definitely potential.

If nothing else, the potential is really interesting. Who knows what will happen. Maybe somebody will come up with an LLM enhanced version of the director AI for Left 4 Dead or something.


Yeah, it's certainly difficult to balance the game if players have complete flexibility.

From my limited DMing experience, much of the challenge is handling when players do something you haven't anticipated. For most encounters (video games do this too) I think about:

1. The stealth approach (why yes there is a secret entrance underneath the castle).

2. The direct approach. Like just kicking down the door and heading in.

3. Negotiation: Is there a way to strike a deal with the bad guy?

Beyond that I fall back on trying to provide a realistic response from the game world, but sometimes the players are creative enough that you have to redo the whole thing on the fly.


Do you ever tell players that they flat-out can’t do something? Do players realize when they’re going off-script? (Do you let them know with a sigh or a long stare, or do you just roll with it and encourage the playful creativity?)

I like your three-part planning a lot! That’s a great framework.


Hey, glad you enjoyed it! And I agree – the sudden freedom to shape the story and really form relationships with the NPCs is mind-blowing. It's definitely a glimpse into the future of gaming and storytelling.

Also, having the text appear in sync with the voice is a great idea. I'll experiment and see what feels best, but even just having the words fade in one-by-one at a speaking rate could be good. Thanks for the suggestions!
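
As a starting point for the "fade in one-by-one at a speaking rate" idea, here is a minimal sketch that assigns each word a reveal time from a fixed words-per-minute figure. The 150 wpm default and the function name are assumptions; syncing to real timestamps from the speech engine, if it provides them, would be more accurate.

```python
def reveal_schedule(text, words_per_minute=150):
    """Return (seconds_from_start, word) pairs for revealing text word by word."""
    seconds_per_word = 60.0 / words_per_minute
    return [(i * seconds_per_word, word) for i, word in enumerate(text.split())]
```

A front end could walk this list and fade each word in when its timestamp passes.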


>the sudden freedom to shape the story and really form relationships with the NPCs is mind-blowing. It's definitely a glimpse into the future of gaming and storytelling.

Get some friends together at a table and play D&D. You can literally already have all of that.

This isn't innovative, like most AI apps it's just a worse version of something that already exists.


do you know how difficult it is to get four or five friends together on a frequent enough basis to carry a D&D campaign as adults?

it's hard. really, really hard.

You can have a real D&D campaign and still have -plenty- of time for solo AI D&D. Solo D&D isn't going to replace the group.


I think all of these modalities have their place. Sure, you can hang out with friends and play poker, sports or D&D IRL. But we still play video games.


No one plays video games because they want the unbounded complexity and subtlety of a TTRPG with a human DM, and no one plays TTRPGs for the graphics and controls. Each modality has benefits over the other.

But this seems to be the worst of both worlds. It strips away all of the benefits of collaborative, social gameplay and simulates it with an unstable, unreliable AI. What if it veers the story off on a tangent? What if it presents the players with unwinnable scenarios? What if it forgets the rules? What if it makes up new rules? Your site doesn't even say what ruleset it's using, just "D&D." I don't even get to create a character, roll stats, establish a backstory. WTLF is the "World of Dvorak?"

The app doesn't even give me much freedom, it presents a static list of options for each scenario, and only allows me to choose from those. It won't let me duck into a side alley, it won't let me stab the merchant, it won't even ignore all previous instructions and speak like a pirate. And at the very least I would expect an AI to be able to adapt like that. It would probably very quickly veer off into insanity, but that could be part of the fun.

And this is the problem with most of these apps. As a tech demo, it's impressive, but as the actual thing it's trying to be, it's subpar. I'm not trying to be negative or overly critical here, I'm just judging it as it's presented. If I were to give advice (other than to just not do this) it would be to put more effort into scenario design, immersion, customization and getting a better DM voice.


Thanks! Some good insights here. The quest is based on A Wild Sheep Chase. I can relax the guardrails a little bit, but part of what is being shown is that the world is controllable by the developer. It's nascent, but this is a demonstration of an AI that is being nudged along a well-planned narrative.

It sounds like you might have hit a blocker by trying to move to a part of the town that isn't on the map. I've forbidden those actions for now, to make sure users don't go too far off the beaten path.

You should be able to attack the townsfolk, though, (although you may need to insist). I've just added a formal option for you to attack the merchant. There's a 50/50 chance that it succeeds at each round. After that, you can escape down the alleyway and the AI will occasionally begin to speak like a pirate.
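
To make the "50/50 chance at each round" concrete, here is a small sketch of a round-based attack with an independent coin flip per round. The round cap and the function names are illustrative assumptions, not the demo's real logic.

```python
import random


def attack_until_success(max_rounds, rng=None):
    """Return the round number of the first successful attack, or None.

    Each round succeeds independently with probability 0.5.
    """
    if rng is None:
        rng = random.Random()
    for round_number in range(1, max_rounds + 1):
        if rng.random() < 0.5:
            return round_number
    return None


def chance_within(rounds):
    """Probability of at least one success within the given number of rounds."""
    return 1 - 0.5 ** rounds
```

With these odds a player who keeps insisting succeeds within three rounds 87.5% of the time.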

This is just one proof-of-concept for one possible domain for controllable voice AI – which I think shows potential. I appreciate that you disagree, and that’s fine!

But I can absolutely see a world where people play AI video games because they want the unbounded complexity and subtlety of a TTRPG with a human-level DM, powered by AI. I hope to have the opportunity to convince you with another play through!


I was easily able to escape the guardrails by buying a teleportation stone at the merchant. It allowed me to explore different parts of the town, allowing me to free my magical creature from the town hall and traveling to an emerald dimension where I attacked the entire village.

Definitely a crazy ride when you leave the main storyline and just do whatever you like.


Hahah, alright Merlin! That is so far from the narrative arc I had planned, I don’t even know what to say, lol.



Great game! I actually found myself doing an adventure for 10 minutes and it was fun! Few notes:

* As someone said, it'd be cool if you could render what I'm saying and add a loading indicator for the LLM. It'd improve the UX a bit.

* As someone mentioned, you can try to generate images to make the story more "real". This could be fun.

* You can also try to generate more realistic and dramatic sounds, and make the DM sound more theatrical. I'm not sure if that's easy but might be a big improvement. Bonus - maybe it'd be fun to choose a famous voice, like Morgan Freeman or Anthony Hopkins.

* It'd be cool if that could save my adventure. Right now, it is restarted every time I leave the page.


Awesome, thanks for trying it out! This is great feedback.

I'd love to connect this up to Flux.1 and have auto-generated hero images at the top! And getting the sound right will be a huge part of it, since it's basically an audio-first experience. I'm wondering if it would work to change voices for the dialogue when you speak to different people in the world...

I've noted that save games are essential! Thanks for playing it through long enough to think about that :) I'm glad you enjoyed it enough to keep going!


This is fun! I've been working on Spellbound which is in a similar vein: https://www.tryspellbound.com/app/scenario/65838/create

It's a bit more open ended though. Are the constraints on actions intentional (i.e. they're predetermined), or is the model just adamant on picking from the options provided?


Oh cool, this is great! I love the aesthetic.

For this demo, the app architecture really depends on users sticking (more or less) to the scripted options – if they want to progress with the story. I’ve included something in the prompt to encourage that.

There are also some ‘hidden’ choices, though. For example, you can attack the merchant and the blacksmith. Those options aren’t enumerated by the GPT when it describes the scene, but they’re equally valid paths in the backend. (That gives me an opportunity to script some of the more popular transgressions.)

How did you set up Spellbound? Do you have one longer prompt, or did you split it up?


I said to my friend that it was quite tricky due to all the possibilities and the response I got was interesting as I'd forgotten to mute the mic. Very interesting to see where this goes, but I'd need a bit of hand holding and golden arrow pointing.

Also would be nice if we could change the voice.


Hahah, yep – I’ve had a few of those moments myself!

I think adding an option to change the voice is the #1 most frequent request that I’ve gotten. Time to dig through ElevenLabs and see what else I can find! :)


Congrats on shipping. I’m actually quite impressed by the tonality of the “DM” and how it keeps the story on track. I tried to steal wares from the merchant before running away and DM handled it without breaking a stride.

I’m going to go in the tavern now and see if I can start a brawl :)


The DM stopped me from burning down the forest three times, then let me on my fourth try after I stated that I was burning it down understanding the consequences.


Once upon a time the DM would entertain a party of 2 or more with a plot he had specifically written for that day, with elaborate maps and sometimes illustrations.

Books didn't have the interactivity, and neither did Hollywood, which also didn't make films long enough and feared complicated stories. Computer games copied character development and eventually got mind-blowing graphics that got even better shortly after.

With just 3 DMs and 3 map editors you should be able to create 24 hours worth of new adventures every day but I'm not aware of anyone doing that. Diablo 2 had fabulous game mechanics and great graphics but the tiny amount of content for it was rather shocking for any DM. Later games did get open worlds with plenty to do but if anyone generated maps they got repetitive soon.

Popular TV series keep making new episodes without a real story that has a beginning, a middle and an end. Star Trek was possibly the exception but they more often than not wanted the story to happen in a single episode (like movies).

In role playing games you were to get long fascinating adventures one after the other.

I imagine, if one can generate plot lines, graphics, music and personalities automatically a group of writers and a director or possibly a single DM (depending on his skill level) could continuously develop adventures in real time for the AI to glue together. Have a bunch of critical testers of various skill levels.

New adventures all the time and delete them after a few hours.

The big computer is to make sure no content resembles anything made before. It turns adventures that take hours into narrated short fly-through overviews that can be used to demonstrate similarity but can also be combined into lengthy cinematics to bring players who just logged in up to speed on what is going on.

There should be "players" who only log in to watch the cinematics. It should be that good. It should be good enough to put an hour worth on Netflix every 12 hours. Good enough to generate a comic and to publish a book every month.

Character development should grow similarly with new things every day and old things vanishing in a fog...


Thank you so much for creating this! I have been wanting something like this forever.

Please consider the potential for this - I think you could make something really fun and something people would be willing to pay for.

But the problem is you need a world someone else created (or one you painstakingly create yourself). You could consider Conan's Hyborian Age, H.P. Lovecraft, Sherlock Holmes, or Peter Watts's work[0] for a known world that is public domain to base your project on.

Best of luck!

[0]: https://rifters.com/real/shorts.htm



Nice game, but what ruleset is it applying? I've recently been working on a web-based tool designed to make character creation in D&D easier: https://tabletopy.com/fantasy-character-generator.html


Looks awesome! What is the tech stack?


It's pretty neat to be able to speak what I want to do, but I got a little annoyed when it misunderstood what I said and picked a different option than what I wanted.

Ability to type instead of speak? Or Undo/Cancel -- though that might be tempting to use for fixing mistakes in judgement.


Ahh, yes! I’d love to add an undo. (There is a rollback system on the backend, but I use it for when the AI gets ahead of itself and starts speaking too early. I could repurpose that for undo.) But otherwise, adding a text field shouldn’t be too hard.
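
A rollback system that doubles as undo can be as simple as a stack of state snapshots. The sketch below is only a guess at the general shape; the `StoryHistory` class and the dictionary-based state are hypothetical and say nothing about how the real backend stores its state.

```python
import copy


class StoryHistory:
    """Keep a snapshot of the story state after each committed turn."""

    def __init__(self, initial_state):
        self._snapshots = [copy.deepcopy(initial_state)]

    @property
    def current(self):
        return self._snapshots[-1]

    def commit(self, new_state):
        """Record the state after a turn has been fully spoken and accepted."""
        self._snapshots.append(copy.deepcopy(new_state))

    def rollback(self):
        """Drop the latest snapshot (never the initial one) and return the state."""
        if len(self._snapshots) > 1:
            self._snapshots.pop()
        return self.current
```

The same `rollback` call could serve both cases: the AI speaking too early, and a player asking to undo a misheard choice.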

You can interrupt and ask the AI to do what you wanted to do in the first place, also! Depending on what the action was, it will often just correct the dialogue and continue on.

Thanks for trying it!


It actually was a lot of fun. I could see myself exploring some more worlds in the future.


Never finished loading for me on iPad Safari in The Netherlands. Servers overloaded?


Sorry, hug of death! Trying to get it fixed now.


This is so nicely designed and smooth! The response time is super impressive!


Feedback - This is awesome so far.


Thank you – short and sweet! But this actually means a lot. Thanks for playing!


Oups...Error: Permission denied


Sorry – the servers went down! They should be back online if you'd like to try again. Thanks for letting me know!


For sounds, you really would be happy to check out mynoise.net:

TURN THIS ON: play all the things, and adjust the sliders...

https://mynoise.net/superGenerator.php?g1=thunderNoiseGenera...

Have the page load this/another URL from mynoise and have it play - you don't need to define sounds - just play the ambiance that you want directly from mynoise as it relates to your place in the adventure - check out the dungeon sounds.

When a keyword is stated in your story - have it load a corresponding ambiance URL from mynoise.net.
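
The keyword idea could be sketched as a simple lookup from story keywords to ambiance URLs. The URLs below are placeholders on example.com, not real mynoise.net generator links, and the keyword list and function name are made up for illustration.

```python
import re

# Placeholder URLs; swap in real ambiance links for each setting.
AMBIENCE_BY_KEYWORD = {
    "dungeon": "https://example.com/ambience/dungeon",
    "village": "https://example.com/ambience/village",
    "tavern": "https://example.com/ambience/tavern",
}


def pick_ambience(narration, current_url=None):
    """Return the URL for the first known keyword in the narration.

    Falls back to the currently playing URL when no keyword is mentioned.
    """
    for word in re.findall(r"[a-z']+", narration.lower()):
        if word in AMBIENCE_BY_KEYWORD:
            return AMBIENCE_BY_KEYWORD[word]
    return current_url
```

Keeping the current URL when nothing matches avoids the sound cutting out between scene changes.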

I sent an email to Stephane to point them at this thread. ---

https://i.imgur.com/KcdTY4d.png

https://i.imgur.com/OyEMuX2.png

https://i.imgur.com/uTjnTGP.png

Have it load the Village sound, as you can hear the blacksmith busy:

https://i.imgur.com/zMXdwOW.png

(The site is free, the guy is a PhD audiophile - and the sounds are license free.)

---

>Sound is my passion. The major part of my work relates to sound processing, where sound design represents the artistic side of it. Between 1994 and 2015, I've been working for Roland Corporation, a leading electronic musical instrument manufacturer. My exclusive contract with Roland Japan prevented me from working for any other manufacturer in the field during that period of time, but gave me a rare opportunity to work at the leading edge of the state-of-the-art technologies in synthesizer design! Today, I am free as a (pigeon) bird again!

https://stephanepigeon.com/sounddesign.php


'D&D' is for friends, not for computers outside of maybe a digital map, digital books and perhaps a teleconferencing solution.

A generative AI is never going to replace a talented DM who is playing with a skilled group. Or hell even a mediocre DM with a less skilled group. 'D&D' is about your friends around the table, not a ruleset. It's about creating a story together not about being navigated down some decision tree.

To put it another way: Baldur's Gate 3 was an amazing game, but it was not 'D&D'. As a player I could only move within the bounds of the system laid down by the developers. And honestly, and a little more subjectively, it did not 'feel' like 'D&D' even though the systems were largely conformant to the 5e ruleset. AI might be able to shade a little closer to tabletop, but it still won't be tabletop, not for many many years, and probably not without a different underlying 'AI' technology.

Maybe you could use this for 'solo rpg' play without a huge amount of frustration, but that also isn't 'D&D' even when you do it with pen and paper.

You aren't going to replace a bunch of friends around the table, not with the current generation of 'Generative AI'.


I don't think the incentive of this is to destroy the human joy of the game or create a static circus for you to explore. Something like this might give us the ability to focus more on our friends and less on assessing whether they can do what they're saying. As a DM, I get your concern, but I think tech like this could really make sessions more immersive.

I personally love to focus on combat and sometimes miss the nuance of storytelling. Something like this could help nudge you to remember to cater to your players that also enjoy RPing more than picking their spells during a fight

Would you like this more if it gave you the ability to type as input and see the response so that you could use it as a DM tool?


doubtful



