The term "chatbot" is problematic, as it potentially conflates a couple of different types of systems that superficially may look very similar.

Dialog systems: In a narrowly confined domain, dialog systems can solve a task, help solve a task, or provide information that lets humans solve a task more quickly. Flight booking systems are typical examples: the system asks a few questions, the user answers them, and the user may also ask questions in return. Gradually a set of slots (DEPARTURE-FROM, ARRIVAL-AT etc.) is filled, and then a booking transaction can be initiated. This works for flights, but such a system cannot handle out-of-domain questions.
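To make the slot-filling idea concrete, here is a minimal sketch in Python. It is only an illustration: the DATE slot, the prompts and the function names are my own assumptions, and a real dialog system would also have to parse free-form answers and handle user questions.

```python
from typing import Dict, List, Optional

# Hypothetical slots and prompts for a toy flight-booking dialog.
# DEPARTURE-FROM and ARRIVAL-AT come from the text above; DATE is an assumed extra.
SLOTS = {
    "DEPARTURE-FROM": "Where are you flying from?",
    "ARRIVAL-AT": "Where are you flying to?",
    "DATE": "On which date?",
}


def next_unfilled(filled: Dict[str, str]) -> Optional[str]:
    """Return the name of the first slot that has no value yet, or None."""
    for slot in SLOTS:
        if slot not in filled:
            return slot
    return None


def run_dialog(answers: List[str]) -> Dict[str, str]:
    """Fill slots in order from a scripted list of user answers."""
    filled: Dict[str, str] = {}
    for answer in answers:
        slot = next_unfilled(filled)
        if slot is None:
            break  # every slot is filled; ignore surplus answers
        filled[slot] = answer.strip()
    return filled


def ready_to_book(filled: Dict[str, str]) -> bool:
    """A booking transaction can start only once every slot is filled."""
    return next_unfilled(filled) is None
```

Calling `run_dialog(["Berlin", "Boston", "2024-05-01"])` fills all three slots, after which `ready_to_book` returns `True`. Anything outside these slots simply has nowhere to go, which is why such systems fail on out-of-domain questions.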

Statistical or neural language models: BERT, GPT-3 and other muppets are models of language that can predict a likely next word, sentence etc. - which is useful for many tasks but is NOT equivalent to a "chatbot". Such a model may be abused as one for fun, but there is no formal meaning representation used and no answer logic applied. Think of this as a simple auto-complete - so it is not a source of wisdom to ask about the safety of staircases or any other serious topic like that. (These models are VERY useful ingredients of modern NLP applications, but they are the bricks rather than the house.)
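The "auto-complete" view can be illustrated with a toy bigram model. This is a deliberately tiny sketch with assumed function names, nowhere near what BERT or GPT-3 do internally, but it shows the same principle: the prediction comes from counting which words follow which, without any meaning representation or answer logic.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = train_bigrams("the cat sat on the mat the cat ate the fish")
print(predict_next(model, "the"))  # "cat" follows "the" most often in this text
```

The model will happily complete any prompt it has statistics for, whether or not the completion is true, which is the point made above.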

Interactive CRM Forms: Web/Slack "bots" or Typeform surveys are sometimes fun, sometimes useful, but can never claim to "understand" anything. They are ways to capture some data interactively, often to eventually feed the data to a human for review.

Question answering systems: Answer retrieval is the task of automatically finding a phrase or sentence, in a body of, say, a million documents, that answers a given question. Such systems are next-level search engines intended to supersede keyword-based search systems. Deployed Web search engines like Google already have limited answering capabilities - but only for a small, select number of question types. "Open domain Q&A" is the task of permitting question answering by machine without limiting the domain, and since 1998 US NIST have been organizing annual bake-offs for international research teams, which has helped advance the state of the art a lot (e.g. https://trec.nist.gov/pubs/trec16/t16_proceedings.html).
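As a rough sketch of what answer retrieval means, the following Python ranks sentences by how many content words they share with the question. The stopword list and function names are assumptions for illustration only; real Q&A systems use far more sophisticated retrieval and answer extraction than plain word overlap.

```python
import re

# Small assumed stopword list; a real system would use a proper one.
STOPWORDS = {"the", "a", "an", "is", "of", "in", "to", "was", "did",
             "what", "which", "who", "where", "when"}


def tokens(text):
    """Lowercased content words of a text, as a set."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def best_sentence(question, documents):
    """Return the sentence sharing the most content words with the question, or None."""
    q = tokens(question)
    best, best_score = None, 0
    for doc in documents:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            score = len(q & tokens(sentence))
            if score > best_score:
                best, best_score = sentence, score
    return best


docs = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany.",
]
print(best_sentence("What is the capital of Germany?", docs))
```

Unlike a keyword search, which would return whole documents, this returns a single candidate answer sentence; with a million documents one would first narrow the candidates with an index rather than scan everything.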

Reading comprehension systems: These systems take a piece of text and a question as input, and then attempt to answer the question about the text. Tests used to assess human students (remedial testing) can nowadays be passed reasonably well by such systems.