Hacker News
AI Restaurant Menu with RAG (wandb.ai)
61 points by byyoung3 12 months ago | 53 comments



> Physical menus are fine. They do the job. But if we're looking to actually improve upon them, current approaches miss the mark. They offer limited interactivity and flexibility.

Sigh. Am I the only one who likes to hold a physical menu and browse it? I never go into a restaurant knowing what I’m gonna get… I like to see how the menu is themed and decide from there. The UI on a physical menu is a vast improvement over any digital menu I’ve seen.


You are not at all the only one. I don’t mean to diminish the project in TFA; it’s a cool application at a sensible scale. But your point resonates with a broader frustration of mine that’s percolated since COVID.

Dining out is a human experience, and I deeply resent any systems that demand my attention be funneled away from my dining companions and staff, into a wholly inadequate digital kludge. There should never be a reason for an electronic device to be at the dinner table.

QR-code-based menus are absolutely and uniformly unfit for purpose. It’s a question of information and navigation: it’s intuitive and natural to navigate the physical layout of a printed menu, associated as it is with the stages of the meal and complete detail of all options at once. As you point out, I’m going into the menu not with the intention of seeking a specific taste I have in mind, but with the intention of hearing the full range of the restaurant’s proposition and recognizing what resonates with me. It’s a recognition task more than a retrieval task.

Compare to tapping and swiping through tabs full of pages of digital listings to remember what you meant to order when the time comes.

I admire the creativity and technical effort involved with this project, but I can’t imagine anything worse than having to play a guessing game with an LLM to hide-and-seek what the kitchen serves.

I feel like the whole enterprise understands the value of hospitality differently than I do: a human hospitality professional can apply the full range of their judgment, context, intuition, and humanity to something like this type of menu recommendation. And a skilled menu designer can provide an efficient survey of what’s on order and set the tone. Even in its ideal implementation, what problem does this AI-ification actually attempt to solve or add value to?


For a menu with a large number of small dishes (e.g. Sushi) I kind of like the tablet approach. You used to write numbers on a list and hand it to the waiter, but handling the large menus was very unwieldy. Now you just tap a few buttons on a tablet.

I can’t imagine how you’d factor AI into that though. Ultimately I generally know what I want, and writing a story about it isn’t going to help.


I sometimes like using a tablet to order at sushi places in Japan. The important parts are:

- the tablet is provided by the restaurant and knows your table number, etc
- doesn't require registration or personal information
- isn't used for payment


Indeed, this is a retrieval task, not a recognition task.


The only improvement over physical menus would be an ingredients list IMO. As someone with food sensitivities, it's really, really annoying when a dish's description isn't inclusive of everything in it.


Even for that you could add an extra QR code at the end.

The only reason online menus are taking off is cost reduction and the fact that they can change the menu more easily (and the prices of course)


For the same reason I prefer a well stocked physical library. University libraries tend to be the best for browsing. Something about walking slowly down an aisle and glancing over the spines of so many books. And maybe one catches your eye, or you just stop and arbitrarily decide that the Annals of Quantum Mechanics: Proceedings of the 1989 Convention, Vol II is the right book to pull out.


Yeah, I’m not sure which restaurants would use something like this. Maybe food trucks or casual places that already do the QR PDF menus.

The menu is a huge portion of setting the ambience and theme of a nice restaurant.


And beyond the setting and ambiance, it’s a huge portion of the restaurant’s value proposition: “Here are all the things we do and are proud of, what sounds best?”

At the risk of being extra curmudgeonly, have you ever dined someplace where they wheel a dessert cart round? There’s kind of a lightweight vicarious thrill you get just from eyeing the physical tantalizing plates, even if you don’t end up eating one.

Compare that to “here’s an empty text box. Whisper your very specific wishes to the genie and maybe it’ll materialize something in the neighborhood of your demand, probably not exactly what you asked for; and it’ll hide all the cool stuff we do that isn’t directly relevant to the demand we made you work to formulate (and probably can’t fill exactly).”


I think Animats has it right: this doesn’t make sense for almost any restaurant - even a place with a huge menu doesn’t have enough entries that a regular search engine wouldn’t be equivalent quality at orders of magnitude better cost & performance – but it could make sense for a company like Uber Eats or Yelp, and I think especially so if the language parsing could handle really imprecise queries across languages.

Based on my dinner earlier, a good example would be if you could take “who has that Ethiopian dish with chicken and bits of bread” and return a place listing only “Doro fitfit”, which would require more effort with a traditional search engine since none of those words are commonly present in the menu and you have to have more than basic Solr-style synonym expansion to expand “Doro” to “Doro wat”, which itself expands to “chicken stew”, and expand “fitfit” to “shredded injera”, which expands to “a type of bread”. Obviously that _could_ be solved with the traditional stack but that starts to get more involved so I could see an advantage for making it smoother.
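To make the "more involved" part concrete, here is a minimal sketch of chained synonym expansion in plain Python. The synonym table and both function names are hypothetical, filled in only with the chain described above; a real search stack would keep this in its analyzer configuration rather than a dict.

```python
# Hypothetical synonym table mirroring the chain described in the comment.
SYNONYMS = {
    "doro": ["doro wat"],
    "doro wat": ["chicken stew"],
    "fitfit": ["shredded injera"],
    "shredded injera": ["bread"],
}


def expand(term):
    """Follow synonym links transitively and return every reachable term."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for synonym in SYNONYMS.get(current, []):
            if synonym not in seen:
                seen.add(synonym)
                stack.append(synonym)
    return seen


def matches(dish_name, query_words):
    """True if every query word appears in the expansion of the dish name."""
    expanded = set()
    for word in dish_name.lower().split():
        expanded |= expand(word)
    # Split multi-word expansions so "chicken stew" also matches "chicken".
    vocabulary = {w for phrase in expanded for w in phrase.split()}
    return all(word in vocabulary for word in query_words)
```

With this table, `matches("Doro fitfit", ["chicken", "bread"])` is true even though neither word is on the menu. The hard part is maintaining the table, and that is where an embedding model might plausibly save effort.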


You aren’t alone, no, however an awful lot of people would be thrilled if they could walk into any restaurant, anywhere, and automatically have mac and cheese or whatever their preferred dish is delivered to them.

The majority of humans aren’t adventurous, and prefer convenience and consistency, and not having to think.


Personally, I don’t care whether a physical menu is available or not. But my smallest pet peeve is not having the online menu available, because sometimes they take the menu with them after placing the first bunch of orders, then I gotta ask for the menu again… then the dance continues.


Just say, as I do, "I'm going to hold onto one of these menus thanks :)". A normal human interaction.


Oh, I do! Sometimes the place is loud and it doesn’t get heard. Nothing wrong that a server or someone else did, but having it available on my phone is just a big convenience thing for me.

Other thing is, when the menu isn’t available, there’s a decent chance I won’t even go there when I’m craving something specific. Like — I’m craving wings, but not every pub near me has it, so I do a very quick check before I go there.


I'm with you. I thought it was a joke at first


Being able to filter by allergen is pretty useful. A chain of custody for ingredients would be cool too. Of course, one can just ask. Does this have nuts in it? Was this ethically sourced?

But, to your point, the nuance that some chefs place in their food descriptions tells a lot. Only a human can tell you which pairings are important to a particular dish, a machine can't. A bit of both would be nice.


You don’t need a language model for that, and arguably shouldn’t use one if there’s a possibility of the model hallucinating the wrong answer. And the model is only as good as the information it is given, so if there is allergen info available, why not just display it directly to the customer? If there’s not, your model will almost certainly be wrong.
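As a rough illustration of "just display it directly", here is a sketch of a deterministic allergen filter. The `Dish` class, the menu entries, and `safe_dishes` are all made up for the example; the point is only that a dish with no recorded data gets reported as unknown rather than guessed at.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Dish:
    name: str
    # None means the kitchen has not recorded allergen data for this dish.
    allergens: Optional[FrozenSet[str]] = None


# Hypothetical menu data, for illustration only.
MENU = [
    Dish("Pad thai", frozenset({"peanuts", "egg", "fish"})),
    Dish("Green salad", frozenset()),
    Dish("Chef's special"),  # no allergen data recorded
]


def safe_dishes(menu, avoid):
    """Split the menu into dishes known to be safe and dishes with no data."""
    avoid = set(avoid)
    safe = [
        d.name
        for d in menu
        if d.allergens is not None and not (d.allergens & avoid)
    ]
    unknown = [d.name for d in menu if d.allergens is None]
    return safe, unknown
```

Here `safe_dishes(MENU, ["peanuts"])` returns the salad as safe and lists the chef's special separately as "ask your server", which is the behavior you would want and the one a language model cannot guarantee.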


>Was this ethically sourced?

If you’re at a restaurant and it is uncertain, the answer is likely no, purely based on the scale that restaurants require.


I prefer the physical menu to look at, but it is a huge convenience to use Toast or similar apps to order on your phone and then pay whenever you’re ready to leave. No need to wait for the servers for who knows how long in a busy restaurant.


Surely you are not the only one. I hate going into a restaurant being forced to use my phone to read the menu. What the hell? Is printing menu cards that expensive? Also our kids can read but don't own phones yet.


Maybe eink screens will become cheap enough to dish out as dynamic restaurant menus in the future?


When everyone has a phone in their pocket, this would be a folly.

In Malaysia, it’s common for the menu to be entirely digital - you scan a QR code, and you’re then presented with the menu, from which you then order and often pay directly on your phone. Some places remember who you are and show your previously ordered dishes at the top of the menu.


There was an uptick in this behavior during covid in American restaurants, but everyone I know expressed dislike for it. I agree it's more practical, but I thought my suggestion was a nice way to hedge engrained customer expectations against a desire for digital advancements.


I think there’s also a basic QA issue. I use that heavily at places which have systems which are fast and reliable but most of the places which stopped had these terrible systems which wanted you to install their low-quality app which demanded lots of permissions and account creation, or seeming-parodies of modern web apps which needed 25mb of JavaScript to display a menu (not hyperbole - I checked one of them in WPT after being surprised by how bad it was).

Unfortunately, I think a lot of restaurants drew the conclusion that customers don’t want online ordering rather than that customers want fast and smooth ordering. I suspect some of that was also unrealistic expectations from the ad-tech industrial complex promising additional revenue from things customers don’t want, too, based on how aggressively some of those pushed you to create accounts and allow tracking before you could order anything. Someone not blinded by greed would make the pitch to do that kind of thing after ordering when you’re not trying to do something else and have an idea about whether you’ll even want to come back.


Love the physical menu as well


The rather complicated prompt to force the recipes to be in JSON w/ constraints can be substantially simplified using ChatGPT's function calling/structured data capabilities: https://news.ycombinator.com/item?id=38782678

Additionally, a full vector store is overkill for a menu with <100 items. You can just keep a numpy array in memory with vectors for that and use a matmul for RAG.


> You can just keep a numpy array in memory with vectors for that and use a matmul for RAG.

... what? Tell me more.


The model they use (BAAI/bge-small-en-v1.5) produces embeddings 384 wide; at float32 that equals 1536 bytes each. The size of all the vectors for 100 items is about 153 KB. Calculating the dot product of that against a query will be measured in nanoseconds, even with a naive implementation.
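A minimal sketch of the in-memory approach, assuming NumPy is installed. The vectors here are random stand-ins rather than real output from the embedding model, and `top_k` is a hypothetical helper name; only the 384 width and the 100-item count come from the numbers above.

```python
import numpy as np

DIM = 384        # embedding width quoted above for BAAI/bge-small-en-v1.5
N_ITEMS = 100    # menu size used in the estimate above

# Stand-in vectors: random data instead of real model output (assumption).
rng = np.random.default_rng(0)
menu_vectors = rng.standard_normal((N_ITEMS, DIM)).astype(np.float32)
# Normalize rows so that a dot product equals cosine similarity.
menu_vectors /= np.linalg.norm(menu_vectors, axis=1, keepdims=True)


def top_k(query_vector, k=3):
    """Return indices of the k menu items most similar to the query."""
    query = query_vector / np.linalg.norm(query_vector)
    scores = menu_vectors @ query          # one matrix-vector product
    return np.argsort(scores)[::-1][:k]    # highest score first


bytes_per_vector = menu_vectors[0].nbytes  # 384 floats * 4 bytes
total_bytes = menu_vectors.nbytes          # 100 vectors * 1536 bytes
```

In a real setup the query vector would come from the same embedding model as the menu vectors, and the returned indices would be used to pull the matching menu entries into the prompt.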


This is actually what llama-index does if you don't use a vectordb integration


How else would they sell managed vector dbs and llm app frameworks? /s


Aha interesting. I would guess it is missing a bit of nuance, e.g. a real-world query would not be "appetizers with chicken" but something closer to "I'm waiting for my friends, not too hungry, what do you have that's spicy but not too filling? Oh, and I don't eat red meat".

This would be much harder with basic vector database+RAG. You would at least need HyDE (hypothetical document embeddings) to act like a real waiter.

I find your project very interesting though, and the walkthrough is nice!


I feel like HyDE would just cause hallucinations.

Imagine the generated document based on that query was like “Our shepherd’s pie is the perfect spicy meal for sharing with friends without being filling”. (Obviously this is wrong, but it’s an example of a statistically possible document.)

Then the following similarity search would also be searching for shepherd’s pie.

I don’t think throwing more LLMs at RAG problems is the answer.


Agreed. RAG isn't ideal for every scenario. Putting the contents of the menu in the prompt could even be feasible with a smaller model.


For a single menu, this is silly. As a search engine for Doordash or Uber Eats, though, it might lead to this guy being acquired.


I mean even as a single menu, I'm not a big fan of sifting through the menu of a restaurant, especially if I'm eating because I have to. If I could type (or even better speak) what I'm into and what I'm willing to spend and if there pops up the options, it would be a quality of life improvement for me.


1m obo


Semantic search with vector embeddings is great... as long as you maintain an inverted index for traditional full-text search.

As a user I sometimes want a "buffalo chicken quesadilla" not disparate buffalo wings and quesadilla recommendations.

I believe Elastic has a mechanism that supports both? Haven't tested it yet.
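For what it's worth, here is a toy sketch of the idea: an inverted index for exact term matching, merged with a semantic ranking by reciprocal rank fusion. This is not a claim about how Elastic implements it. The menu, the function names, and the semantic ranking passed in are all hypothetical, and `k=60` is just a commonly used default for the fusion constant.

```python
from collections import defaultdict

# Hypothetical menu; names are illustrative only.
MENU = [
    "buffalo wings",
    "cheese quesadilla",
    "buffalo chicken quesadilla",
    "chicken caesar salad",
]


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents that contain every query token."""
    postings = [index.get(token, set()) for token in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked id lists; each list adds 1 / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


INDEX = build_inverted_index(MENU)
```

`keyword_search(INDEX, "buffalo chicken quesadilla")` returns only item 2. If a vector search had ranked the wings and the plain quesadilla ahead of it, say `[0, 1, 2, 3]`, then fusing the two lists still puts item 2 first, because it is the only item that appears in both.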


> As a user I sometimes want a "buffalo chicken quesadilla" not disparate buffalo wings and quesadilla recommendations.

Curse you! I just started eating dinner, but now I want what you listed instead.


Uhh, let's see here

Roasted buffalo chicken thighs, smoked gouda, and sautéed onions, bell pepper (and jalapeño or other pepper is optional, there's a smoked canned pepper -- I cannot remember the name -- that could be especially good here).

This is pretty straightforward:

1. Place chicken thighs (bone in, skin on) on a roasting sheet, liberally salt. Roast in an oven at 400F for 30 to 40 minutes (start checking at the 30 minute mark).

2. While the chicken roasts, dice an onion, a red bell pepper, and a green bell pepper and sauté in oil of your choice (bonus points for using the drippings from the chicken thighs). Don't cook them all the way through.

3. Cook down canned black beans with a bay leaf, 1/4c of fat of your choice, salt (to taste), cumin (to taste). They're done about the time this concoction resembles a spreadable paste, about 10 minutes over medium heat.

4. Shred the thighs in a bowl, consume or set aside the skin. Add your buffalo sauce of choice. Bonus points if you made your own Buffalo sauce.

5. Heat a cast iron skillet over medium high heat, or a George Foreman style press. I'm assuming these are nonstick; add enough oil that there is a fine sheen. We don't want to fry the quesadillas (maybe next recipe).

6. Assemble your quesadillas directly in the pan, in this order: cheese (shredded or sliced thin is fine), veg, meat, beans, cheese, other tortilla. The last layer of cheese is important and will help glue the thing shut. If you're using a skillet, put a lid on it to trap heat and encourage the top cheese to start melting.

7. When the cheese on the bottom has melted and the tortilla is starting to brown, we flip. Get a plate and slide the quesadilla onto it. Then take the skillet and put it on top and then rotate both. If you're using the press, get a second plate and flip.

8. Wait for the new bottom side to brown and melt. Finish with cilantro, lime juice, and something approaching queso fresco. Greek yogurt, sour cream, goat cheese, anything like that will do.

Enjoy!

now I also want Buffalo chicken quesadillas

[edit] Chipotle! It only took an hour to remember.


Using a chatbot to answer customer questions about menu items seems like a great way to kill someone with food allergies due to LLM hallucination, but hey.. AI!


RAG is probably the only thing you can do with LLMs nowadays that is useful, accurate, and cheap enough to deploy in a real business scenario. Every other use of LLMs so far is either too expensive to scale or produces too many hallucinations.


Different strokes for different folks but this is dystopian to me.

I prefer the approach of a standard French restaurant: 2-3 options for each category of appetizer, main course and dessert. Don't paralyze me with choices. Too many options also mean the establishment has no focus - they won't be all good so it's a lottery.

If I have a question I will ask the waiter. And I am as asocial and introverted as any common nerd.


This is way over complicated for querying a single menu. This article seems like a solution in search of a problem. You don't need a search engine when a menu is only going to have a handful of dishes.

Just use a picture of the menu as part of the prompt and you will make everyone's lives easier. I'm skeptical that asking questions about the menu has much real utility.


If we've fixed the 24/7 'free' internet with zero latency issue, 24/7 electricity, and cheap mobile devices – in short, ZERO FRICTION – then this kind of tech could all work. Otherwise, we're just adding new problems instead of fixing them.


There are already too many restaurants owned by small business tyrants who feel entitled to run a successful small business. If you don't want to make your restaurant menu by yourself, maybe you shouldn't be in the restaurant business?


Precisely


Is there any specific reason for using LLAMA embedding models instead of the new models OpenAI just released? Are they better?


Not an expert but I'd say prolly llama is preferred for privacy + open source reasons.


Not sure how much privacy is needed for a menu in a restaurant. Plus, it's all later sent to GPT which breaks privacy anyway.


There's no reason for any of this bullshit. Like cryptocurrency in 2021 this is all a solution in search of a problem.


With this particular solution, I agree.

But LLMs and RAG in general couldn't be further from crypto in terms of utility.


Nice idea



