“Chatbots: Still Dumb After All These Years”

dang · on Jan 14, 2022

This article is a response to this one:

Chatbots: Still dumb after all these years - https://news.ycombinator.com/item?id=29825612 - Jan 2022 (408 comments)

(Thanks everyone who pointed this out.)

dandare · on Jan 14, 2022

Maybe it is just me but I never use chatbots and I don't understand why anyone would.

For everything I want to do there should be an UI that is much easier to use than explaining it, even to a human.

For help and troubleshooting chatbots are pretty much useless. If I have a problem doing something via the UI then probably the developer did a bad job and no chatbot will ever do better.

adwww · on Jan 14, 2022

It's usually far easier to build and test a UI than integrate the same backend code with multiple vendor's chatbot SDKs as well.

Is there any evidence customers prefer chatbots? The entire concept feels like it's driven by managers trying to impress their managers.

ethbr0 · on Jan 14, 2022

> The entire concept feels like it's driven by managers trying to impress their managers

It's cost cutting. Customer contact is always treated as a cost center.

Therefore, the only relevant question for its management is "How can we provide the minimum acceptable level of service, at the lowest possible cost?"

In general, the answer to "Why does X suck?" is to ask whether the company sees it as a profit or cost center. See: HR.

hutzlibu · on Jan 14, 2022

"Customer contact is always treated as a cost center."

"How can we provide the minimum acceptable level of service, at the lowest possible cost?"

Not with all products. Some companies understand, that good support keeps the people loyal to their products. But they are usually in a higher price segment.

ethbr0 · on Jan 14, 2022

The most surprising example I found was Dell Alienware support. Not sure if it's different from Dell consumer support, but I'd guess so.

Had to RMA an ultra widescreen monitor I use for work, and the experience was as smooth as silk and always in the hands of actual humans, who would follow up and push things forward.

seanp2k2 · on Jan 15, 2022

Dell has always had good support, especially for business. I once bought a laptop second-hand that still had a few months of their highest support plan on it. Needed a screen first because of a few dead pixels. They had a tech come out to me at work and repair it on-site. A month before the warranty ended, the motherboard died. Same thing, replaced on-site. In both cases they offered to either send me the parts and DIY repair instructions or just send someone to do it. This was on a laptop I paid like $300 for (probably $2-3k new back then, it was a nice Latitude).

I’ve had their enterprise support on the phone with me for a dozen hours over the course of a week trying to get this SAN working. Their support remoted into it and had me manipulating physical stuff when required. They eventually got it going through a combination of hard resetting everything and updating firmware on half a dozen components + some software updates.

Dell doesn’t always have the greatest hardware, but their support really doesn’t rest until you’re satisfied. “No one ever got fired for buying Dell”.

y4mi · on Jan 14, 2022

Do they use chatbots for support?

borplk · on Jan 14, 2022

Nobody wants chatbots and companies just drum up propaganda to create that impression because they want to reduce their customer support costs while giving themselves a pat on the back and pretending that it is in the interests of the customer.

A chatbot is just a long way of saying "GO AWAY!".

Large telecoms do this so they get to shut down their customer support almost entirely while claiming that they are available for the customers.

a large telecom in Australia has applied this to an extreme degree in the last two years to the point that you almost can't contact them no matter how severe the issue is. Their message is clear "SHUT UP AND PAY".

eitland · on Jan 14, 2022

We had a somewhat beloved chatbot at work, but we are of course a consulting company.

And it is gone now and replaced with a much more effective app, my point is just that it clearly possible to make chatbots that aren't rage inducing.

The question is just if it is worth it, something we didn't think after the novelty wore off.

I believe one way it can provide value is when it is closely backed by competent support staff so it takes more of a receptionist role instead of the role of a support engineer.

tshaddox · on Jan 14, 2022

> It's usually far easier to build and test a UI than integrate the same backend code with multiple vendor's chatbot SDKs as well.

I’m guessing that’s often not true if you’re running a Wordpress blog or some white label e-commerce platform or similar. You can probably just pase some JavaScript into an admin interface somewhere to add a third-party chatbot to your site.

axg11 · on Jan 14, 2022

I see this opinion come up on HN frequently. You are not the intended target for a chatbot. You're right; for most tech-savvy users, the ideal interaction would be through a well designed UI. However, there's a significant portion of society that is not familiar with UIs and just wants to "speak to a person" and describe the issue they're trying to deal with. Chatbots are the closest thing we have to an automated version of that.

dqv · on Jan 14, 2022

My sister came for a visit 3 weeks ago, actually, no, 4 weeks ago, it was the week before Christmas. We were celebrating her 31st birthday and we went out to a great local restaurant. They serve vegan and vegetarian food and we had a great time. I went Christmas shopping and found a lot of great gifts for my son and husband. Yada yada. Now Christmas rolls around and my husband opens the present - it's a Polo Ralph Lauren shirt I knew he would love. To my surprise, and you won't believe this, I got him the wrong size! I need to exchange the shirt for the right size.

Hi I'm Macy, the Macy's chatbot. Thanks for reaching out! You need to exchange an item, is that correct?

hooande · on Jan 14, 2022

I'm calling BS. People definitely communicate like this, but no one types this into a chat window

fbrchps · on Jan 14, 2022

_You_ don't communicate like this, but the reviews & chat logs at my company are proof that people 100% do type like this. A lot of older people are really just looking for someone to talk to it seems, so yes, they will type out 90%+ fluff to just return an item.

qubitcoder · on Jan 14, 2022

Very true. Many older people I've met use Siri or Google as the primary means of device interaction.

For example, my mom has used an iPad daily for many years now. She's never used a laptop or desktop. Rather, she navigates and interacts with an iPad exclusively using Siri and Google voice search--rarely, if ever, touching the keyboard. And she's a (relatively) advanced user.

Granted, to tech savvy users, these interactions are often quite funny. For example, opening the YouTube app, and then doing a voice search for "YouTube videos for how to make chocolate icing YouTube please". :-)

BucketsMcG · on Jan 15, 2022

You wouldn't believe some of the things people say to conversational interfaces.

A long time ago – I'm talking early 2000s – I designed a voice UI for a train timetable information system. It was a phone line, you called it up, told it where you were travelling to and from, and it gave you the information you needed.

After launch, I would listen to the odd call at random to see how it was handling things. I distinctly remember one case where, when asked their destination, the user said "I'm going to a funeral".

Now, when somebody said something outside the scope of the expected responses, we would reprompt them up to twice more with a bit of extra guidance. I'm proud to say that, even though this caller started way off course, the system steered them back and they ended up going away with the information they needed.

That's actually a danger of making your conversational interfaces too human-like - people then talk to them like they're humans, and it doesn't go well. You're better to make things a little stiff and robotic, so as to manage their expectations and make them understand that they need to constrain their answers a bit to get what they need. It might feel less natural, but everybody goes home happier at the end of the day.

dqv · on Jan 14, 2022

It's, of course, made up. But everyone can appreciate the anguish of having to talk to someone like this and trying to discern what it is they actually need. Macy, the Macy's chatbot? They can go through hundreds of terms that correlate with resolutions in milliseconds. if chatMessage.contains?(typoOrExactMatch('exchange')) then chatSession.initiateExchange() end

Like the grandparent explains, some people just need a way to explain what they need in their own words and I don't think chatbots completely fail at that. The best one's I've experienced have prescreened me for common issues and then sent me off to a live person as soon as it was clear my request was not easily addressed with the built in script.

soco · on Jan 14, 2022

Have you, or whoever recommends that, ever talked to an actual non tech-savy person? I don't mean it like a "me as a non tech-savy person" role-playing, but to a real flesh and bone grandma.

faizshah · on Jan 14, 2022

My mom tries painfully hard to make Bixby do things for her. She will spend 20 minutes trying to get it to find her a recipe or something instead of using an app. I keep trying to tell her to stop wasting her time and just use google but I guess there is still some novelty in these conversational interfaces (clearly its not easier or faster). If she could use google home to fill out her time sheet or save a document to google drive she would. The thing she finds difficult about tech stuff is remembering the workflow of how to accomplish a task through multiple websites/apps.

Also while I’m on my soap box this is a PSA to all ui devs, stop trying to turn every menu/button text into an icon. My mom does not even remember that the button with 3 horizontal lines means menu she is not going to understand that stamp icon means “add a signature.”

alexpotato · on Jan 14, 2022

I remember reading an interview with Bill Gates about how "two way mirror" user testing where devs watch REAL users interact with their application is both incredibly important and humbling at the same time.

The above quote is from the days where people bought shrink wrapped software so it was incredibly hard to observe people using your application in the wild. Therefore, companies spent money to do it in house to get better data.

Modern web applications, I would argue, make observing user behaviour in real time trivially easy but I'm sure lots of companies don't even bother. In turn, this brings us to the incredibly complicated interfaces we have today.

axg11 · on Jan 14, 2022

Yes, I'm basing this on experience. A non-savvy user sees most modern UIs as overwhelming, difficult to navigate and they don't have the same priors that the average HN reader does. When they see a chat pop-up, they see a familiar interface (SMS has been around for long enough) and it gives everyone literate the ability to describe in plain language what problem they are having.

harha · on Jan 14, 2022

A/B test that: a well thought through UI vs a bot. I genuinely think it would be worse in time required for resolution, choice by the users. Obviously not one size fits all but in simple cases I doubt that even the non tech savvy user would prefer trying to articulate their issue in a way a bot could help them.

Recently came across a delivery service which sent an sms announcing a delivery to connect me to a Facebook messenger bot to ask me the order number to give me the delivery information. How is this process better? Just send a link to the tracker and have all the necessary info in one place. Add a button to reach someone if you must.

axg11 · on Jan 14, 2022

That's an example of a bad use of chatbots, but I don't think that means all chatbots are a bad interface. Chat is good when you have a large potential problem space and a wide range in the tech abilities of your users. A 90 year old grandpa isn't going to happily dig through five layers of menus to solve their problem. They are able to describe their problem in a couple of sentences though.

firefoxd · on Jan 14, 2022

Yes, good UI will cut down on 99% of customer interactions. We built a chatbot that companies can use by integrating with their e-commerce. The overwhelming majority of customers ask "where is my package?"

Why is that? Can't they just check the tracking number? That's a question I had until i tested a few dozen e-commerce solutions and most of them have buried their tracking number under several confusing menus. And even when you get the tracking number, carriers each have confusing ways to track packages.

VLM · on Jan 14, 2022

> Why is that? Can't they just check the tracking number?

My experience is Amazon is an enormous pain in the butt to deal with if the delivery service drops your package at the wrong address or there's a hole in the package with nothing in it. Huge pain dealing with those people.

I get that there are scammers out there. But if I get 100 packages per year from amazon for years, some of which have been very expensive, and one outta nowhere they lose my $5 multivitamin, I'm not trying to get rich by fraud but I just want my vitamins.

Also a couple years ago I ordered an obscure stainless steel hinge for woodworking, kind of expensive for stainless steel it was about $8 (solid not plated USA made brass would have been $30+) and Amazon van delivery successfully delivered a bubblewrap envelope with a hole ripped in it because hinges are heavy and bubble wrap envelopes are weak. Its an $8 hinge I just want my hinge can't you ask the amazon employee in the van who delivered it to look on the floor of his van? I'm not angry (yet) but I just want my hinge so I can finish my woodworking project. If I was trying to rip Amazon off then I wouldn't be arguing with somechatbot or someone over a $8 hinge it would be "my box that was supposed to contain a PS5 arrived with a brick in it instead" or similar fraud-smelling situation.

This is why I won't buy consumer electronics from Amazon; they can't deliver a hinge without trying to make a federal case out of it, so if I order a TV with my luck I'm going to get a box of cracked glass with no recourse and long arguments. If I go to best buy the price is worse and the selection is worse and I don't like the experience in general but at least I can slice the box open and see if the glass is smashed before swiping my credit card.

Amazon doesn't have a process for "the delivery people messed up" probably intentionally to save money when the delivery people mess up. Even when the delivery people who messed up are Amazon employees (or contractors)

My point of this ramble, aside from F Amazon, is they set you up to take the blame. They screwed up, the chatbot can't help, the chatbot must be to blame.

VLM · on Jan 14, 2022

Ironically I'm 18th in the queue at computershare.com right now (Its a way for even minors to do direct investment in a company other than more typical employee ownership programs or broker acct, very long story about a very old account).

The website has a bug where locked accounts are told to change their password, which will change their password via an email process but then your acct is still locked out of course, but you can click here to change your password. Endless loop. Very annoying.

The idiotic chatbot helpfully suggests the way to solve a locked account is to change your password. That's exactly where the bug is, and resetting your password does not unlock the acct which is the problem.

Helpfully the support website says if you wait a day a locked account will be unlocked. That of course is not true.

I've only been working on this since Sunday.

Anyway a fucked up company that is essentially inoperable, when given a chatbot, will merely have one more thing that doesn't work. Its a lot harder to code a chatbot that can fix a locked account than it is to code a website to unlock a locked account, so if they fail the simpler task you know the harder chatbot task would be impossible for them.

I'm hopeful that within a half hour I'll be able to chat with a human rep and gain access to my account. I've been in line for a human for about ten minutes and I'm down to 15th in the queue.

I used to work in telecommunications and I'm glad I don't have to hold a phone to my ear and listen to on-hold music for hours on end like the bad old days, I have multiple monitors and I can just leave this window up for however many hours it takes while working on other monitors.

noneeeed · on Jan 14, 2022

Annoyingly some companies have started putting them in front of any customer support contact. It's incredibly frustrating. I'd much rather just a menu of options.

I just don't understand the problem they are trying to solve.

shiftpgdn · on Jan 14, 2022

Because if you work in a call center/chat center 99% of your customers are asking the same 20-30 questions. You and I might be bright enough to read the Kb or search google but many people are not. A talented user support person is going to cost a company $40-60k/year and can only handle so many requests at a time.

noneeeed · on Jan 14, 2022

I understand that, I'm just not sure why it has to be done with a chat bot. I've seen companies achieve the same by just having a simple option tree system before you get to the support agent (and divert you to self-service if possible).

xtiansimon · on Jan 15, 2022

I maintain a work intranet site. Its an out of the box Django site and I have very little time to design or develop pages. While I've not _yet_ implemented a chatbot, I really want to only for the purpose of translating whatever naive search terms into domain specific tags and terms. Even better if I could say, ME: Hey Bot! What's that thing I have to do at the end of every month that has to do with payroll? BOT: Payroll Accruals? click. haha.

speedgoose · on Jan 14, 2022

I like chatbots when I need to contact the support of a company. The chatbot can do the normal support script, sometimes it works for me, and it makes the conversation much faster if I still need a human because they can read the discussion with the chatbot first and skip the basic questions.

chakhs · on Jan 15, 2022

I only use Chatbots to ask for a human because there's usually no other way to do it. They're like the new IE for me, only used to download other browsers.

firefoxd · on Jan 14, 2022

We were building a chatbot to use on a website until we realized how customers where using it. Most people were frustrated with something and needed help.

People who wanted to have a conversation did it for fun and had no real need for our services. We couldn't tell them how tall the Eiffel tower is.

Maybe there is a time where you want to have a conversation like the examples in the article. But I don't ever find myself wanting to talk to a human in this manner, so why a chatbot?

Have you ever watched the sci-fi show The Expanse? Have you seen how they interact with the AI? They ask a question, it provides an answer. It doesn't even use voice most of the time. It gives you the answer without trying to be sassy about it.

mrtranscendence · on Jan 14, 2022

> Maybe there is a time where you want to have a conversation like the examples in the article.

I mean, his examples were pretty factual and to the point. I suppose it's unusual to want to know if it's dangerous to walk down stairs backwards with your eyes closed, but there's clearly a short answer. Similarly with asking who the president is.

arpinum · on Jan 14, 2022

I saw the development of a chatbot based on the IBM Watson stuff. It was just like a phone tree map, except the system tries to guess the option selected based on the intents found in the speech/text. Of course it got no adoption, except when forced on users to drive metrics.

It was enormously expensive, it would have been cheaper to have a human on the other end.

2sk21 · on Jan 14, 2022

I was one of the original developers on this project! It was called the Watson Assistant and was a completely different code base from the original Jeopardy code base. The dialog tree was actually fairly sophisticated in allowing for a lot of flexibility. However, I always had a concern hat developers would find it too complex to author dialogs. Apart from the dialog tree, the other components were a text classifier and an entity detection system. It has been several years since I quit IBM so I have no idea how it has evolved since then.

chrisco255 · on Jan 14, 2022

That makes me wonder how much of "bad chatbot" implementations are just poorly designed dialog trees? It seems like a domain-specific chatbot could be really good, if designed properly, but it's probably as much of an art as it is a science and most implementations probably don't go deep enough.

arpinum · on Jan 14, 2022

The initial dialog tree and year of “improvements” were done by IBM themselves.

The main issue was that it was a vanity project for a stakeholder, which they used to pivot into a new job / career as a “thought leader”

thrower123 · on Jan 14, 2022

In almost every case, it's better to just use a regular dumb dialog tree than to try using any of the whizz-bang NLP stuff. I also have some horror stories about Watson that I'd love to tell if they weren't under disclosure agreements...

All of the advanced features like this that are built into chatbot products are only there because they look fancy when you give the demo. From a customer perspective, you only ever want to get connected to the second-level customer service agent that might possibly be able to tell you to do something you haven't already tried. And from the business side, they want to get as many interactions out of the funnel so that they don't need to be connected, and thus they can run their customer service desks even leaner.

kaibee · on Jan 14, 2022

My favorite phone tree implementation is the ones where it takes twice as long to hear all of the options because it has to tell you the keyword(s) and the equivalent number to press (since I guess they don't have enough faith in it otherwise).

SeanLuke · on Jan 14, 2022

> It was enormously expensive, it would have been cheaper to have a human on the other end.

This sounds exactly like what someone might have said about early IBM mainframes.

pintxo · on Jan 14, 2022

While the mainframe is arguably still a success (after all, they still run an amazing number of core services in our society: banks, air-traffic, governmental stuff, ... [1]), it's hard to find any evidence for a successful Watson project [2].

[1] https://www.bmc.com/blogs/state-of-mainframe/

[2] https://www.nytimes.com/2021/07/16/technology/what-happened-...

seszett · on Jan 14, 2022

Well I'm still grieving Chef Watson.

It probably doesn't count as successful since it didn't make money (I suppose this is why it was retired) but it was great at suggesting recipes from whatever ingredients I had, and ingredient pairings I wouldn't have thought about.

I still haven't found anything that works like it.

josefx · on Jan 14, 2022

Wikipedia seems to consider the IBM 701 as mainframe and from the customers and special numeric format it had I assume it was used to replace entire floors of human calculators that did nothing but compute the same calculation over and over again. It was also rented for a monthly fee, so the cost comparison was probably straight forward.

I wouldn't be surprised if an advanced chat bot with science fiction level AI would replace entire floors of Microsoft call centers at some point in the far future.

arpinum · on Jan 14, 2022

Like early mainframes, not every task is a good fit for emerging tech.

PaulHoule · on Jan 14, 2022

For a while I was frustrated at how slow people have been to realize that GPT-3 sucks but lately I am more amused.

There a few reasons structurally why it can't do what people want it to do, two of them are: (i) it can't detect that it did the wrong thing at one level when interpreting it at a higher level, (ii) most creative tasks have an element of constraint satisfaction.

The 1st one interests me because I was struggling with the need for text analysis systems to do that circa 2005 and looking at the old blackboard systems. I went to a talk by Geoff Hinton just before he became a superstar where he said instead of having a system with up-and-down data flow during inference, build a system with 1 way data flow and train all the layers at once. As we know that strategy has been enormously effective, but text analysis is where it goes to die just as symbolic AI failed completely at visual recognition.

Like the old Eliza program, GPT-3 exploits human psychology. We are always looking to see ourselves mirrored

https://www.nasa.gov/multimedia/imagegallery/image_feature_6...

Awkward people are always worried that we are going to get it 90% right but get shunned for getting the last 10% wrong. GPT-3 exploits "neurotypical privilege" in which it gets it partially correct but people give it credit for the whole. People think it will get to 100% if you just add more connections and training time but because GPT-3 is structurally incorrect adding resources means you converge on an asymptote, say 92% right. It's one of the worst rabbit holes in technology development and one of the hardest ones to get people to look clearly at. (They always think stronger, faster, harder is going to get there...)

It seems to me an effective chatbot will be based around structured interactions, starting out like an interactive voice response system and maybe growing in the direction of

http://inform7.com/

xwolfi · on Jan 14, 2022

The most difficult thing to accept is maybe that even humans are bad at speech recognition. Put your mom in a chatroom to answer questions by clients of a bank, she'll be even more lost than the robot.

You need a ton of dimensions to be able to help someone: to be raised for years by humans to understand politeness, intertextual meaning, general tones, and then special enthusiasm for a specific domain to learn and enjoy helping on banking. Plus, getting money to spend on other even more interesting things in exchange for helping others motivates you to reach optimal results for your user, even if it means asking quickly other humans or sacrificing something personal for it.

Most humans put in the situation of these robots would just say "sorry I don't even understand the question, can you ask someone else" lol

I've seen a fantastic "chatbot" human equivalent once, at Apple of all place. Philipino guy (I'm in HK), absolutely dedicated, polite, cultured, very empathetic (phili people are usually adorable naturally but this one went above and beyond), went well beyond the minimum, and I feel weird saying that but I left the call with a smile and told colleagues around me "wow Apple, what a pleasant customer support, it's insane". I'll probably never say that of a robot however good they make them at talking so there's always going to be value in putting humans in front of clients.

PaulHoule · on Jan 14, 2022

Exactly. You can make a robot that transcribes audio to produce a transcript better than a human does ("superhuman") but it will garble 1 word in 20, thus massacre every other sentence, and leave customers feeling 0% understood.

Speech understanding requires sometimes stopping the other person and asking questions to clarify.

rytill · on Jan 14, 2022

> For a while I was frustrated at how slow people have been to realize that GPT-3 sucks but lately I am more amused.

Generated text was not good before this era of GPT-X. It’s so much better and more interesting to work with now. It will probably keep getting even better and more controllable.

PaulHoule · on Jan 14, 2022

GPT-X is better than RNNs I grant you but people have built mad-lib and rules-based text generation systems that are absolutely great for specific applications in particular domains. (e.g. GPT-X is still a bridesmaid instead of a bride)

I think you could do better with RNNs than most people are doing because of structural problems.

Usually when people run RNNs for text generation they start out with the inner state of the system at 0 and then start flipping the coin to choose individual letters so you are starting from a very constrained region of the latent space and not sampling it very well.

I read a paper where they through out the idea that you ought to add coefficients for the latent state that you train for at the same time you train the network which means the number of coefficients goes up with the number of text samples but they never actually did it and I never found a paper where somebody tried it.

I was working on a project where we were developing models based on abstracts of case studies from pubmed as a stand in for clinical notes (certainly real medical notes are very different but you might say that medical notes should look like the abstract) I had the intuition that, as above, the author (and/or the patient) started out with a latent state (e.g. the patient had a disease before coming in) and that we'd get better results if we did something like the above.

It looked like a big and high risk project to develop that kind of model so I proposed something different around supervised training of a "magic magic marker" that could highlight certain areas and unsupervised multi-tasks such as "put the punctuation back in when it is taken out" but the client was hopeful that word2vec would be helpful.

I am still hopeful that incremental improvements, attacks on structural weaknesses, and appropriate multi-task training ("did the patient die?") would get a lot more out of RNN and CNN models.

walnutclosefarm · on Jan 14, 2022

The idea that a general language model like GPT-3 can answer questions intelligently is utterly absurd. It's trained to get language right (where "right" is defined as similar enough to the way people speak (or mostly write) to be intelligible as language), but it does so without any underlying knowledge model to make the intelligible language relevant to any given area of knowledge. Human language is not knowledge; it's a means for articulating our knowledge (that is, domain specific models of the world) in a way that other people can understand and translate into their own particular models.

So what is needed is the capabilities of GPT-3 or other language generators sitting on top of domain specific knowledge models, and constrained by those models.

Asking GPT-3 a general knowledge question is like asking an articulate 5 year old a question like "how does gravity work?" You'll get gramatically meaningful answers that use the structure of the language correctly, but that are quite likely to have nothing to do with our actual understanding of physics.

peterlk · on Jan 14, 2022

This is not wrong, but also not entirely right. There is a model called T0pp (T0 plus plus) which was fine tuned on simple logic problems, and it is capable of solving novel logic problems. This implies to me that there is more here than we've discovered.

Additionally, the whole point of fine tuning LLMs is to give them domain-specific knowledge. If you couple this with search/QA capabilities, the results can be quite impressive. I've not seen them in the wild yet, but I've played with them myself, and the performance is surprisingly good.

walnutclosefarm · on Jan 15, 2022

I don't know anything about the T0pp, so can't comment on that.

I agree with you on tuning of LLMs. We did some work at my last job before I retired (as CTO of a major medical clinical and research organization) using GPT-3 up-trained on medical vocabulary to generate physician's notes as a summary of a transcribed visit. The results were impressive. Still not usable though. Most of what it generated was correct (and essentially all of it was well composed and readable), but false statements and non sequiturs still crept in at an unacceptable rate.

I think the technology is amazing, and very valuable. But I do also think that tying it to "hard" knowledge models - akin to the way deep physics is done, but coupling to the language model, rather than to generalized neural networks, is going to prove will eventually make it a complete success in specific domains.

joshuahedlund · on Jan 14, 2022

This is great but this post is basically a wrapper for the original post: https://mindmatters.ai/2022/01/will-chatbots-replace-the-art...

kordlessagain · on Jan 14, 2022

This article is misleading in the fact it critiques the usefulness of the OpenAI "chat" example with little or no related training sets passed as tokens during the submission of the question, nor does it mention use of modifications to the parameters (such as temperature) used during the call.

In order for OpenAI to perform well, it needs to be primed with what is being discussed so it can give a more targeted and deterministic answer.

We can get it to answer this question better without any further data by first preprocessing the question using keyword extraction (completions by GPT-3 are shown surrounded by stars):

  https://beta.openai.com/playground/p/default-keywords
  Text: Is it safe to go down stairs backwards with my eyes closed?
  Keywords: *stairs, backwards, eyes closed*

In 10 submissions, the keywords returned were the same.

Next, we ask for a completion from the keywords we were given, tweaking the settings just a bit to prevent repeating or too much randomized behavior:

  https://beta.openai.com/playground/p/default-keywords (temp to 0.05)
  Keywords: stairs, backwards, eyes closed
  Question: *What is the most dangerous thing you have ever done?*

Clearly GPT3 is aware something is dangerous here. Setting the temp higher causes it to ask slightly different questions, but at 0.05 it always returns what is above.

Now we combine the two questions and resubmit:

  https://beta.openai.com/playground/p/default-qa (frequency penalty to 1)
  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: *I've done a lot of dangerous things, but I don't think any of them were the most dangerous. Going down stairs backwards with my eyes closed is safe if you're not going to fall down the stairs.*

Given we leave the temp to defaults, we may also get alternate answers:

  https://beta.openai.com/playground/p/default-qa (frequency penalty to 1)
  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: *I've never done that, but I have gone down stairs backwards with my eyes open. It's not safe.*

Finally, we can take all the questions and answers so far and ask it a slightly different question:

  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: I've done a lot of dangerous things, but I don't think any of them were the most dangerous. Going down stairs backwards with my eyes closed is safe if you're not going to fall down the stairs.
  Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
  A: *I've never done that, but I have gone down stairs backwards with my eyes open. It's not safe.*
  Q: Is it safe to go down stairs backwards with a blindfold on?
  A: *It's safe if you're not going to fall down the stairs.*
  Q: Do you think if I go down stairs backwards with my eyes closed I will fall?
  A: *It's possible.*

davidhariri · on Jan 14, 2022

I commented on the original article with the same. Anyone who has used GPT-3 for more than 5 minutes knows this article is uninformed.

benjaminwootton · on Jan 14, 2022

I’ve been spending time in Dubai. Many businesses has a WhatsApp bot based on a menu system:

1 - Book X 2 - Cancel Y 3 - Recieve info on Z

Everyone comments that they work really well and are super convenient.

I think these have more potential than natural language bots.

dvdkon · on Jan 14, 2022

But at that point you've reinvented mainframe-like terminal interfaces. "Everything old is new again", but is that good?

anonymouse008 · on Jan 14, 2022

The whole - 'what do you believe that other people don't'

The public can 100% manage and love terminal interfaces... as long as it looks like a text message.

Pxtl · on Jan 14, 2022

I've come full circle on that using voice assistants. Y'know - when you're a kid "terminal interfaces are bad, guis are great". When you get older and learn "Terminal interfaces are good for compsability and piping, they're great." Then you start developing software for users and learn "terminal interfaces have poor discoverability, GUIs present them their options in clear, informative ways". Then the promise of natural-language processing appears and it's "terminal interfaces are great, you don't have to be perfectly precise any more".

Now, with my google assistant? Text interfaces are awful, because I have no idea what combination of magic words will confuse it. What will be interpreted as a parameter to which command. Even simple commands get broken if I accidentally add phrasing where it interprets it as a different command... usually just giving up with "I don't know how to do that" and then Google gets a nice recording of elaborate cusswords.

bluGill · on Jan 14, 2022

You probably can't do this with google (rhasspy on the other hand...), but what is really needed is every-time someone tries something that doesn't work they go program the response that would have been correct.

I think speech recognition is good enough for this, but what is left is making a complex tree. See the 20 questions game where because so many people have put in new answers it now can guess almost everything after just a few questions. This is not easy and it takes a lot of effort to figure out where things didn't work and fix it so it will for the next person. At least some of those things you fix will never be asked again, but you need to do it.

Part of the above is speech recognition needs to expose a confidence and if it thinks it knows ask useful questions about what you mean. (they already do in some cases)

korse · on Jan 14, 2022

After years of unsuccessfully trying to get people to use modern GUI based software, I had a minor yolo moment and replaced an entire factory worth of employee facing software with a text-mode curses interface. No mouse support, company banner using ASCII art, interface in multiple languages that you toggle with a control key etc. All instances are concurrent user sessions on a linux VM in the cloud, it runs faster than anything we've paid for and no one will entertain the idea of going back. Not saying this is a typical situation and we're only talking a couple hundred daily users, but still, food for thought.

danielmarkbruce · on Jan 14, 2022

Can you provide more details on the use cases/workflows, the users etc? I could be suffering from confirmation bias, but I've assumed this would work in many places.

JamisonM · on Jan 14, 2022

These interfaces are good because they are asynchronous, all the good things about old-timey terminal stuff but without the session overhead badness. If I message back and forth with a bot in a way that accomplishes a goal over a longer period of time but doesn't depend on me maintaining my attention, phone line, internet connection, or browser state for that entire period of time then in many cases it is a superior experience.

This is why I love chat-with-a-human interfaces on websites for support (although they mostly require maintaining browser state) because staying on a phone line absolutely sucks and the person helping you also doesn't have a real-time constraint so they often have the ability to get you better answers to hard questions.

hakfoo · on Jan 15, 2022

Yes.

Every "natural language" system, whether it's a text chatbot or Siri-style voice bot, is essentially a command-line interface without very good documentation.

In general, it was considered an advancement over the classic CLI (at least for environments where there's low expected user proficiency) to provide a menu of valid choices and put them front and centre.

Most customer service scenarios are exactly that-- low user proficiency (in company- or business-specific terminology, or sometimes even the very transaction flow itself) so they may not be able to quite enunciate exactly what they mean, but can easily pick from a clearly written menu.

Out_of_Characte · on Jan 14, 2022

Whatsapp is less scary than the command line.

iqanq · on Jan 14, 2022

Why change a simple interface that works?

Pxtl · on Jan 14, 2022

My last trip in Canada used a Facebook chatbot for the airline. The neat/silly thing is many of the questions presented to me a form full of buttons and options. At that point it's almost a full application, just running within Facebook infrastructure and using the chat box as a UI.

danielmarkbruce · on Jan 14, 2022

If folks stop calling it a chatbot - it's almost just a nicely structured form, delivered in a fairly convenient way. No download, no link to external site, can be deployed easily to various interfaces - sms, chat apps, slack, etc.

goblinux · on Jan 14, 2022

What ever happened to the smarterchild bot? I remember being amazed as a kid that it was a robot on AIM that would reply just like a person. I don't remember it being dumb like modern "AI" chatbots, but it would play coy if it didn't know the answer in a reasonable way. I feel like we've regressed from there.

RIP old buddy. I hope you didn't save our chat logs from that era because man that would be cringey to look at now

gwbas1c · on Jan 14, 2022

I refuse to use chat bots. The technology never worked, and I don't want to waste my time with something that doesn't work.

What is happening is that some salesman is laughing to the bank. I few months ago, a salesman that I work with asked if we should put a chat bot on our website. (IE, with the tone that he wasn't going to take no for an answer.)

I responded that they don't work, and will frustrate people who come to our website. I also pointed out that we are a high-cost asset, with a high-touch sales process. Such a chat bot would be insulting.

His response was some form of "but everyone's using it and they're super-popular and work well."

I then pointed out that the article he read was probably written by the company that sells them.

raxxorrax · on Jan 14, 2022

I actually was surprised how well they can simulate at conversation now. It is a fake because there is little underlying reasoning of course. That is a monumental problem and difficult to determine where to begin.

Do you start to give your AI a motivation or goal? Perception? These are vastly more complex problems than some statistical tricks on data that is widely available.

Still, it is fascinating that we came this far with a dead machine that talks.

moffkalast · on Jan 14, 2022

Yeah we can handle the part where it knows how to express itself in a specific language, it can take in some facts, compare them to its internal database and spit out something sensible as a statistically probable good reply. But there's no sense of self involved there.

I remember reading an interesting article a while ago about at least in the human case the basic principle of emerging consciousness happens when the prediction system in our brain designed for figuring out what other entities around us do is used on itself, trying to explain what the subconscious is doing. As such the consciousness we experience is a bit of a bug in that system that turned out to be beneficial to some extent. All just a theory of course given how much we actually know about the brain so far, but it's always made the most sense to me.

I'm not sure how that would translate into the current ML environment though.

bluGill · on Jan 14, 2022

Do we want that if we could have it? I want machines as slaves for me: go wash the dirty dishes and then do the laundry. I don't want it to get depressed about doing those routine jobs.

moffkalast · on Jan 15, 2022

Well we can have our cake and eat it too. Keep simple and limited machines for work, and intelligent ones to talk to and treat as equals to help us in other ways. It's not like every roomba needs to run a neural inference engine to do its job, nor do we hand pollinate ever flower ourselves if we can get the bees do it it in the organic world.

seanp2k2 · on Jan 15, 2022

Current gen chatbots for support for companies are infuriating and only helpful for the most clueless of users, which I suppose is probably a decent chunk. It’s like when you call a company because you need help that you can’t resolve yourself through their site and are then forced to listen on hold to the phone menu system tell you a dozen times that you can do everything you need on their amazing website.

Also, developers: please don’t try to make the chatbot seem human to fake users out. It’s almost as bad as the fake typing sounds for Comcast support. Making users jump through hoops and tricking them just makes them hate your brand and your products, and makes them even more irritable when they do eventually get to speak with a human.

Also, end the auto pop-up “can I help you find something???” chat bots on websites. It’s like someone had the idea to take the worst part of retail experiences and find a way to make that even more useless, then deployed it everywhere.

xibalba · on Jan 14, 2022

Facebook shutting down M in 2018 should have been a pretty clear sign that the prospects for good chatbots are grim. Even with their massive resources and top talent, they concluded it was a bad bet.

mrtranscendence · on Jan 14, 2022

When GPT3 was opened up so that anyone could create an account, I was excited to try it. I was quickly disappointed. Its ability to chat was quickly shown to be pretty terrible -- it could mostly make reasonable-sounding English sentences, but it was like talking to someone who was maybe a bit drunk and not really listening. I can't imagine using it as an interface for a customer to interact with product support.

The whole thing just made me a bit sad. I really was so excited. Nothing it could do was very impressive, even aside from holding a conversation. The most impressive thing I've seen is Copilot, but even that's been next to useless from a practical perspective.

mrtranscendence · on Jan 14, 2022

Because a friend of mine is into Chuck Norris facts, I tried to train GPT3 to give them. Some of the more novel (as far as I can determine) facts it gave:

* A duck's quack does not echo. Chuck Norris is solely responsible for this phenomenon.

* When you open an umbrella in the rain, do not be alarmed if Chuck Norris falls out of the sky and lands on you. The rain drops are simply being pushed away by his roundhouse kick.

* In an emergency, you can use a bucket of water to put the fire out. However, if Chuck Norris is directly responsible for the emergency, use a flamethrower.

* In an airport, there is no "B" gate. There is only "C" gate. The "B" stands for the bus you will take from the plane after Chuck Norris lands on it.

* There are no weapons of mass destruction, Chuck Norris lives inside every element on the periodic table. It's why you see him in your sodium chloride.

I'll let you be the judge as to whether these are funny.

axg11 · on Jan 14, 2022

Is it not unfair to expect GPT-3 to perfectly tackle this issue when it has been trained as a general purpose model? For customer support or other more specific chatbot applications there are better machine learning models.

mrtranscendence · on Jan 14, 2022

I don't know if it's fair or not, but I don't know how else you'd use its conversational abilities. Maybe it's just a party trick.

isoprophlex · on Jan 14, 2022

A bazillion parameters in gpt3, but what does the training process amount to? Filling in missing characters or words in sentences taken from a huge dump of literature, news articles, reddit comments...

No wonder these things are so dumb still. The training process and the loss function used probably does not penalize poor long-range coherence between paragraphs. Also, if I'm not mistaken, these things have absolutely no internal state besides the characters you steam into them as conversion prompts.

If these things were trained more like agents having to operate in eg. Socratic dialogues maybe we'd be getting somewhere

6510 · on Jan 14, 2022

I don't get furious using a chat bot until it asks for the same information twice because the "context" changed.

isoprophlex · on Jan 14, 2022

Yeah it's incredibly infuriating.

If you use openAI's gpt API, in the docs they talk about providing a prompt that primes the network to respond in a specific fashion. Like,

"this is a Q&A session between an agent knowledgeable about bash scripting.

Q: how do I check the current working dir? A: use 'pwd'

Q: "

... And then the actual user query is concatenated to that.

Which is a fine way to customize a toy chatbot to sound like edgar allen poe, but no way to maintain state across a long conversation with a customer (the max prompt length is very much finite)

Unless someone smart finds out a fundamentally different approach I guess these transformer networks will never really solve chatbots.

moffkalast · on Jan 14, 2022

The problem with that is how do you rate the dialogue produced as correct or not? Not exactly something that can be automated, but would probably need something like a recaptcha to gather responses and it'd take forever.

isoprophlex · on Jan 14, 2022

Yeah true. I have no idea how to start getting sensible training data and losses for this problem.

Training a kid to takes years, hopefully this can be sped up a bit ;)

moffkalast · on Jan 14, 2022

More like training a kid takes 400 million years of random chance :P

ghostwreck · on Jan 14, 2022

After having spent a few years working on a chatbot, the allure is this: talking to a real human is better than filling out a form. If we can build a Q&A system as good as talking to a human, people would also prefer it to filling out forms. So that's the pursuit.

I understand the hate, because we haven't landed very close that goal yet, and the intermediate product is much worse than a form. But I am surprised that a technical community is not more supportive of the ambition.

hooande · on Jan 14, 2022

It's so much easier to fill out a form than it is to talk to a human. I read faster than most people speak, and I can scan and review much faster via sight rather than voice. Talking to someone is valuable if I have questions or there is some uncertainty. Assuming that I know what I want and have no questions, it's much easier for me to order food online than to call a restaurant.

Chat bots can only really search a database of documentation and frequently asked questions. Making one that has the benefits of talking to a human might be tantamount to AGI

bentcorner · on Jan 14, 2022

> it's much easier for me to order food online than to call a restaurant.

I feel like this is where bots would do well - you can say "order me a burger with extra mayo and fries for pickup at 5pm" and it should negotiate all the minutiae for you. Doing this all manually requires a bunch of menu navigation. Maybe a phone bot is still a bad fit but doing something like this using your on-phone voice assistant or typing it into a text window feels reasonable.

nitwit005 · on Jan 14, 2022

You can go to fast food places and see there are people who will use the touch screens even when there is no line, and similar for self checkout at grocery stores. There is an assumption people enjoy chatting to a friendly customer service person, but that's at least not universally true.

ed25519FUUU · on Jan 14, 2022

Is talking to a human better than filling out a form? I can usually fill out 90-100% of a form with just my browser’s autocomplete feature. There’s also MUCH less chance for errors if I fill things in myself.

ShamelessC · on Jan 14, 2022

Human customer support is a regularly annoying experience, even (particularly?) when done online.

There's probably always going to be some level of animosity towards it.

mrpf1ster · on Jan 14, 2022

This article just seems petty. The author just quotes large chunks of the article by Gary Smith while inserting snide comments afterwards ("That's pretty funny!", "These are hilarious!"). Then goes on to ad hominem the original author.

There are no arguments presented for the intelligence of chatbots other then the authors own opinion. I don't know what this article adds to the conversation that Gary Smith's original article doesn't provide.

ramesh31 · on Jan 14, 2022

Repost: https://news.ycombinator.com/item?id=29825612

gnabgib · on Jan 14, 2022

They aren't actually the same article, this one (by Andrew) refers to the other (by Gary) and quotes the title.. it's a response article. Because of the HN policy it's hard to tell that from the titles though. Original article [0] 663pts, 408 comments.

[0]: https://news.ycombinator.com/item?id=29825612

aruanavekar · on Jan 14, 2022

Whether it works or not, sounds dumb or useful. Clients keep asking for it. Personal experience and opinion, they are best as a backup for human agent, when one is busy or unavailable. Costco, Amazon, Ally have good implementations on these. Chatbot discussion maybe in the air, Chat Widget is a must have form of interaction. Customers expect a site to have a "Chat Now" option on the website.

colejohnson66 · on Jan 14, 2022

Amazon's is great in most cases. I forgot to cancel my Prime at one point (I was switching to the student price), and it renewed. I opened the chatbot expecting to have to wait for a human, but the bot refunded the charge with nothing more than an "are you sure?" question.

ludamad · on Jan 14, 2022

It seems that there is a soft renewal phase here, that's refreshing for the mostly woops-you-forgot subscription world

jamesbriggs · on Jan 18, 2022

I think the use case of chatbots is better solved with open domain Q&A (eg https://www.pinecone.io/learn/question-answering/). The focus of most chatbots seems to be on answering questions, but wrapping it up in a nice interface. That's fine but the chatbot can only answer (accurately) questions that have an answer somewhere (probably buried deep in some Q&A pages). It's much more user-friendly imo to have a Google type interface where you can answer a question, and return answers, or at least get an idea of where the answer is - and open domain Q&A does this fairly well (as proven by Google)

FredPret · on Jan 14, 2022

I don't know, I've been asking to "let me talk to a human" and it works nearly every time!

notfed · on Jan 14, 2022

On a serious note, that's something I value. These days, on automated voice calls, the magic cheat code key sequence to invoke this is getting harder and harder, and "0" often doesn't work. Almost always, if I'm calling, it's because I have a question not already answered online.

FredPret · on Jan 14, 2022

Try swearing a lot! It’s cathartic, and it always works

wombatmobile · on Jan 14, 2022

The charm of Eliza is that it was simply a Rogerian therapist who didn't try to be intelligent.

Eliza's talent was in getting you to express yourself, free from inhibition. That doesn't require “intelligence”, but it does require the art of listening. There's nothing dumb about that.

eminence32 · on Jan 14, 2022

Sure, the overall technique of asking vague open-ended questions to elicit a response might not be dumb. But it's hard to argue that ELIZA-style chatbots are intelligent in anyway. They deliberately had no understanding at all.

wombatmobile · on Jan 14, 2022

Is it important to argue about whether chatbots are intelligent?

rini17 · on Jan 14, 2022

Yes if we expect them to accurately map between user input and underlying data/business logic. ELIZA has no underlying.

wombatmobile · on Jan 14, 2022

It sounds like you have developed particular expectations for chatbots.

rini17 · on Jan 14, 2022

Sure, if there are "dumb" bots that are way more useful than Eliza, I can drop the expectations.

martincmartin · on Jan 14, 2022

The title references Paul Simon's album & song "Still Crazy After All These Years."

https://en.wikipedia.org/wiki/Still_Crazy_After_All_These_Ye...

kristopolous · on Jan 14, 2022

They need some kind of agency otherwise it'll always be like inquiring a piece of furniture on how their day went.

Do any of these generate narrative fictions (such as characters and events they supposedly did) to interact with?

ape4 · on Jan 14, 2022

Yeah they're really bad. Usually they just grep for the relevant FAQ.

Me: I read the FAQ, but was still not able to login

Bot: Sorry you're having trouble logging in, here is some info that might help <repeats FAQ>

marius_k · on Jan 14, 2022

I view chatbots as new era of CLIs (mostly poorly designed). Traditional CLIs dont need AI to be useful and I think that chatbots can also be useful (I havent seen one yet).

megumax · on Jan 14, 2022

The idea of completly replacing human beings with chatbots isn't going to succeed. They have their own uses, not very advanced, for example replacing some web interface with chatting in WhatsApp/Telegram, some companies already adopted that and filtering people in case of a call center. But for something more complex that requires actual experience and real life understanding, they should connect you to a real person that can comprehend your messages.

brightball · on Jan 14, 2022

Wasn’t there a story about about a Georgia Tech professor who coded a chatbot to act as a GA for his class and nobody realized it wasn’t a real person?

EDIT - Found it: Jill Watson

https://www.businessinsider.com/a-professor-built-an-ai-teac...

moffkalast · on Jan 14, 2022

Probably says more about that tech prof's social and lecturing skills than the bot ha. Guy must be an actual robot already.

abducer · on Jan 15, 2022

I find IRC bots useful although they certainly qualify as dumb. They meet the user where the user is. They are generally just a dressed up command-line interface. I've written some chat bots pre and post NLP explosion — In this era I've had luck with mild NLP mappings to commands, GPT3 not required. Just get you some verbs and objects.

Dumb but functional > smart but… dysfunctional?

davidhariri · on Jan 14, 2022

(I am a chatbot company co-founder)

The beauty of machine learning systems is that they don't have to be perfect to provide value. As long as the risk and frequency of failure doesn't outweigh the probability and value of success, they can be enjoyed by millions. I see the proof every day.

Is self-driving perfect? No, but correcting my car 20% of the time is worth the 80% of the time when it cruises along just fine. I don't have to be able to sleep for it to be valuable.

Is G suite's text completion perfect? No, but the risk of it being wrong is low and when it's right it saves me typing out common phrases. It doesn't have to write my emails for me to be valuable.

Are chatbots humans? No, of course not. Can they answer common questions successfully? Yes. Can they automate simple workflows? Yes! Can they augment human teams to make their time more valuable and reduce wait times? Absolutely. They already are and will continue to evolve and get better.

I do acknowledge that it's frustrating to ask a question that you know a person would be able to answer and get a worse automated answer first. It's critical that companies ensure these failure modes smoothly transition to a human who will at least have the context of your issue before you speak. Smooth "hand off" is something we've spent thousands of person-hours on.

Technologies like GPT-3 are exciting advancements in language generation, but they do struggle with predicting factual language. I expect that will become less and less of a problem as businesses and platforms seek to adopt it. OpenAI is actively working on this: https://openai.com/blog/improving-factual-accuracy/

Barrin92 · on Jan 14, 2022

I completely disagree with the examples. If I need to correct my self-driving car 20% of the time I'd rather drive myself because 1. I would not want to be semi-distracted and turn into a traffic-hazard which has been shown to be the consequence of these kinds of semi-working systems. 2. it just creates overhead for me to have to always be on alert when my car stops driving.

Same with chatbots. If the chatbot does not understand my command once, I'm already annoyed and losing time. Google's automated customer service is a notorious horror for anyone who has to deal with it.

If I had a code completion engine where a fraction of the completion is nonsense interspersed with valid results I'm losing my mind and turning it off. Which has been my experience with copilot btw.

These half-working solutions are good for exactly two things, the bottom line of companies that replace well-working but expensive human customer-service with a crappy automated solution, and frankly your bottom line because you benefit from selling these systems.

davidhariri · on Jan 14, 2022

Fair enough. I can understand the frustration and for some, the downsides do outweigh the positives. At least from the data I'm seeing, many customers are having positive experiences of our systems and the examples I mentioned...

jll29 · on Jan 14, 2022

The term "chatbot" is problematic, as it potentially conflates a couple of different types of systems that superficially may look very similar.

Dialog systems: Dialog systems, in a narrowly confined domain, can solve a task, help solve a task, or provide information to enable humans to solve a task quicker. Flight booking systems are typical examples, where the system asks a couple of questions and the user answers them, and users may also ask questions. Gradually a set of slots (DEPATURE-FROM, ARRIVAL-AT etc.) are filled and then a booking transaction can be initiated. Will work for flights but not good for asking it out-of-domain questions.

Statistical or neural language models: BERT, GPT-3 and other muppets are models of language that can predict likely next word/sentence etc. - which is useful for many tasks but is NOT equivalent to a "chatbot". It may be abused as one for fun, but there is no formal meaning representation used and no answer logic applied. Think of this as a simple auto-complete - so this is not a source of wisdom to ask about safety of stair cases or any other serious topic like that. (These models are VERY useful ingredients of modern NLP applications, but they are the bricks rather than the house.)

Interactive CRM Forms: Web/Slack "bots" or Typeform survey are sometimes fun, sometimes useful but can never claim to "understand" anything. They are ways to capture some data interactively, often to eventually feed the data to a human for review.

Question answering systems: Answer retrieval is the task of automatically finding a phrase or sentence in a body of, say, a million documents which answers a given question. They are next-level search engines intended to supercede keyword based search sytems. Deployed Web searche engines like Google already have limited answering capabilities - but only for a select small number of question types. "Open domain Q&A" is the task of permitting question answering by machine without limiting the domain, and since 1998 US NIST have been organizing annual bake-offs for international research teams, which has helped advance the state of the art a lot (e.g. https://trec.nist.gov/pubs/trec16/t16_proceedings.html).

Reading comprehension systems: These systems take a piece of text as input as well as a question, and then they attempt to answer a question about the text. Tests used to assess human students (remedial testing) can nowadays be passed reasonably well.

nailer · on Jan 14, 2022

Well yes. I’ve been told VCs invested in these back in 2015 (I was in a startup accelerator in the UK at the time and there were a few in my cohort) and a few years later very few of the chat bot investments have worked out.

kesselvon · on Jan 14, 2022

There's way too many competitors, and it probably pushed margins way down. B2B competitors like Drift charged insane amounts for their chatbot.

swayson · on Jan 14, 2022

You need a really good team and MlOps/DevOps pipeline to bring world class chatbot support and performance. I think there is still opportunity in this space, but you need Apple Level Design care to make it work.

malaya_zemlya · on Jan 14, 2022

there's a whole dark art of writing prompts for chat ai in order to make it behave in a sensible manner. the reason is that gpt doesn't have any context at all besides whats i n the supplied text. If you don't tell it exactly what to do, it will guess randomly.

For example this chat prompt gives much more matter of fact answers, in my testing:

"the following is a conversation with an AI assistant. the assistant is helpful, clever and friendly. it uses Wikipedia as the reference. Human:Hi! AI:Hi! Human:<your question goes here>"

dr_orpheus · on Jan 14, 2022

Original article and discussion on HN: https://news.ycombinator.com/item?id=29825612

amelius · on Jan 14, 2022

Didn't Google have a really great robocall demo, some time ago?

Fnoord · on Jan 14, 2022

Today on HN there was a post about GitHub Copilot chat between user and the AI. I thought it seemed pretty clever with its syntax completion / suggestions.

oneoff786 · on Jan 14, 2022

I find chatbots to be pretty good tbh. It’s like search but within nested levels of context.

IceWreck · on Jan 14, 2022

GPT3 isnt supposed to answer questions. Didn't IBM's Watson win at Jeopardy ?

TedShiller · on Jan 14, 2022

Am i the only one not surprised?

raspberry-eye · on Jan 14, 2022

Yeah… but so are most humans.

skeeter2020 · on Jan 14, 2022

I don't think they're actually intended to answer questions as much as be a cost effective attempt to instill some sense of agency and audience for the user. TL;DR they fail at this too.

harha · on Jan 14, 2022

There’s a special little place in hell for whoever decided the whole world needed chat bots for every crappy website.

Why would I want to try to articulate something that could be found in a simple tree? Just give me direct access.

I don’t know where to find it: search! The issue is not covered in the standard workflow? Get me a real person!

Did anyone implementing ever end-to-end test this for speed and user friendliness? Did they just misinterpret wanting to talk to someone? I want to talk to someone because the process doesn’t cover my case, not because I actually want to have a conversation with the broken process.

anaganisk · on Jan 14, 2022

Worst part is, after answering 10 questions. Some websites offer a real agent, and they ask all the same questions or it tries to point to an article we already know everything about. We have a broadband provider in India, that asks 5 questions about internet outage and they says call 121 to talk to someone. Thomas had never seen such BS before.

EGreg · on Jan 14, 2022

Why do you think they do it?

To take the edge off customer complaints elsewhere?

etripe · on Jan 14, 2022

The same reason companies use byzantine IVR systems (phone menus). To save support costs by making people give up.

JamisonM · on Jan 14, 2022

I worked with some folks that did IVR systems and they were mostly doing their best with the resources and constraints they had to make the thing useful. They were measured by dropped calls, they did not like them.

The weaknesses were mostly just ordinary business stupidity.

The marketing department demands that the first option allow the caller to express interest in buying a product.. nobody ever does that but then it takes up the primest real estate in the system #1 on the first level of the tree.

Of course anyone with a billing enquiry needs to enter the account # so that the collections department has the opportunity to intercept.. but after the arrears lookup nobody in the call centre is willing to pony-up the resources to make the system retain the account number that was typed in so every customer has the then say the damn number after having just typed it in!

etripe · on Jan 19, 2022

To be clear, I wasn't putting the blame with IVRs or their designers. I was specifically speaking to the UX that results from businesses programming them and the menu options they set up. Like you say, those are constrained by what the business itself wants, which in turn is constrained by cost management philosophy.

EGreg · on Jan 14, 2022

At least #1 option is better than having to hear an announcement to press 1 to hear about how Bank of America supports the Military… when you call in for support (yes it is actually happening right now, try it). Whoever would want to hear that when they want assistance with their issue, unless they or their family is in the military?

Taylor_OD · on Jan 14, 2022

Because the average user doesnt want to look at faq. They want a person to answer their question. Now. Chatbots kind of feel like that and the vast majority of questions can be answered by a FAQ.

echelon · on Jan 14, 2022

I agree with you, but there's a lot going on here:

- Customer support is expensive. If keeping customers happy and lowering churn is important, you spend a lot of money on it.

- If you can't afford enough staff, the next best option is to put up a few barriers to slow down the incoming requests. Maybe automated means can solve a large percentage of problems.

- Not everyone knows how to search or navigate a tree. Think about the non-tech folks. You have to offer them something different. It's hard to strike a balance.

- Chat bots are being hyped and sold by new tech companies trying to build larger scale solutions. They want to build more, but they need to sell, grow cash flow, etc.

Etc.

harha · on Jan 14, 2022

Non tech folks come up quite often: I disagree with that, chatbots often require talking to them in a precise language to find an answer or reply in a strange way if they don’t have an answer, I think that’s even worse for a non tech savvy user.

Also: what level of non tech savvy are we taking about? Test with some real users of different ages, I’m sure there’s something to be found that improves usability to a point that’s better than a bot.

Agree on the cost cutting though, if you don’t care about an individual user go ahead and waste their time. I’ve actually had enough encounters with the human counterpart of a chat bot that was equally unhelpful in resolving simple issues.

_lqaf · on Jan 14, 2022

> Customer support is expensive.

Yep, and that's a key thing to get right.

One challenge for those promoting robots is that they are unpleasant to deal with and suck at their purpose (from the customer's perspective - they're deflecting traffic, so they're working for the business). Worse, everyone knows they suck, and people with a choice choose not to use them. I know I've dropped one vendor who forced me to. (I'm sorry, life's too short to talk to robots.)

Things like this make the customer very aware of how they are valued - as cattle not pets, to steal a metaphor.

diegof79 · on Jan 14, 2022

Yes, the cost of having real agents is the main reason for chat bots to exist.

The bot is like a buffer: you can deflect simple issues; capture information for the agent; show feedback about the wait time and agent availability.

Badly implemented bot AI is not the only issue. To give a good CX, companies need to invest in giving the agents the right tools. It’s extremely frustrating when the canned responses fail and there isn’t anybody to answer or when you have to repeat all the info (because the agent cannot see your chat with the bot).

fivre · on Jan 14, 2022

I'd challenge the idea that customer support is _expensive_. Generic customer support requires no specialized expertise: you need knowledge your company's business processes, tooling related to third parties they worked with (redirecting a courier's shipment via the courier or whatever), and general customer service skills, and that's about it.

_Specialized_ customer support, e.g. support for complicated software, is expensive, because it requires technical domain knowledge equivalent to or above many engineering employees, and employees that can perform it effectively could probably move to engineering and increase their salaries twofold because compensation isn't tied to expertise, it's tied to what the market will pay, and you can more easily get away with shit support than shit engineering, but ignore that--it's not the majority of customer support.

Customer support _is_, in the simplest view, a cost center. Customers contact support when something has gone wrong that they can't resolve otherwise. If nothing goes wrong, you don't need it. If you, as support management, can reduce the time it takes to resolve things that go wrong (by any means) and reduce the number of staff needed to handle cases, you win: your department doesn't directly generate revenue, so your goal is to reduce expenses.

Tying lost revenue to customers lost after poor support experiences is difficult and noisy: there are plenty of customers who have poor experiences because they're incompetent and wouldn't succeed with any amount of support, and because of this are more likely to request support, and the high-level categorization used to describe issues often obscures what actually went wrong (a generic "Account > Creation" category isn't going to capture issues like "users can't create an account if their address includes non-ASCII characters", but execs don't see anything more than the category).

These issues (and others) combine to promote an environment where support is more a buffer to getting support with limited ability to fix issues, and a perverse incentive to _not_ fix issues since metrics are focused towards reducing time to resolution (easy to measure) versus quality of resolution (hard to measure!). Poor support is furthermore easy to ignore because you can either often focus more on new business first and foremost (hello, every startup ever!) or rest comfortably on a monopoly position where customers can't drop you even if you provide terrible support (hello, the US ISP market!).

Cost is not strictly an disincentive to providing sufficient, capable, and effective support, but it's difficult to recognize the value it provides and there are more immediate cost savings from reducing support cost than long-term revenue from customer retention by providing good support.

This compounds into even poorer service in more specialized environments because talented people will leave, because working in a cost center sucks. The flipside of support is essentially sales engineering, AKA support while trying to court a customer. The technical skills needed aren't any different, but one brings in revenue in an easily quantifiable way, so it gets more political clout and more organizational investment.

elforce002 · on Jan 14, 2022

Facts. I hate chatbots with passion so much so that I prefer calling if I want to know something instead of using them.

defanor · on Jan 14, 2022

Unfortunately there seems to be a tendency to either remove phones altogether, or to man them with the same chat bots (coupled with speech recognition and synthesis, so even worse than text-based ones).

airstrike · on Jan 14, 2022

I just call and repeat the word 'agent' until someone talks to me

kordlessagain · on Jan 14, 2022

I push 0 over and over.

skeeter2020 · on Jan 14, 2022

I'd rather email and wait several days, even if I need the answer NOW, compared to "talk" toa chatbot. Maybe even mail a letter.

browningstreet · on Jan 14, 2022

I think chat bots exist in our world because people would rather "ask a question on a forum" than search and have direct access to an answer. It drives me crazy too, but it's how we get "does anyone else ever..." memes in this world. People don't want information, they want the experience of asking the question and the discussion. The answer is not the primary force of their effort.

It also seems to be the conversational model in use at parties.

defanor · on Jan 14, 2022

> Why would I want to try to articulate something that could be found in a simple tree? Just give me direct access.

Sometimes I feel that way about databases and dynamic websites. I suspect the reasons behind those may be similar: web UIs add branding, maybe make usage a bit more convenient to an average user, and so on. Animations are popular too, probably they are supposed to make the websites to look more fun and modern. I guess that people planning chat bots similarly view them as looking more futuristic and convenient, more advanced than boring old documentation.

IgorPartola · on Jan 14, 2022

The Amazon chat bot has been pretty great for me. Instant replies and I most times I don’t have to type anything out. It’s really a decision tree with multiple choices but extensive and actually helpful. And if there is a bigger problem it instantly switches to a person who has all the context.

hammock · on Jan 14, 2022

+1 to this. A big part of it is that it's actually connected to your personal data (order history etc)

harha · on Jan 14, 2022

Sounds good - the menu based customer care is pretty good there too, and they tend to actually want to resolve issues unlike many other sites (though that might also depend on the cost benefit analysis) might be the basis many other sites are missing.

arikr · on Jan 14, 2022

Wasn’t this on the homepage yesterday?

gnabgib · on Jan 14, 2022

They aren't actually the same article, this one (by Andrew) refers to the other (by Gary) and quotes the title.. it's a response article. Because of the HN policy it's hard to tell that from the titles though. Original article [0] 663pts, 408 comments.

[0]: https://news.ycombinator.com/item?id=29825612

dang · on Jan 14, 2022

HN's policy doesn't ask people to remove quotation marks!

kahrl · on Jan 14, 2022

Yes: https://news.ycombinator.com/item?id=29825612

LittlePeter · on Jan 14, 2022

Sorry, I should have checked (I'm the submitter), but I thought the article is so fresh there is no chance someone already submitted it.