Building bot-ready knowledge bases (our first AI project, 2020)

We are currently exploring the role of local knowledge bases (KBs) in retrieval-augmented generation (RAG) AI processing. This post is the first in a series documenting our “sandbox” knowledge bases (created over a period of about 20 years) and how we’re using them in various Generative AI (GenAI) projects.

For this experiment we used our “shopping for groceries” knowledge base, which was both a standalone project and part of our DITAinformationcenter.

Overview of the “building bot-ready knowledge bases” project (2020)

In 2020 we used our grocery knowledge base in an experimental initiative to prototype a bot-ready information solution using Google’s Dialogflow. It may have been groundbreaking for its time, but it is primitive by 2024 standards!

GROCERYbot

We programmed our bot to:

  • Chat with a user at a minimal level
  • Answer from a set of prescribed questions (user intents) and answers (mini-knowledge bases or components)
  • Refer users to knowledge articles (knowledge bases) for more information
  • Defer if the user’s queries are beyond the bot’s restricted domain
GROCERYbot in the Dialogflow console
GROCERYbot knowledge components

Correcting defects in Google Dialogflow: Adding webhook fulfillment

Price table for canned goods

The original DITA grocery shopping files contained price tables, which had two major problems:

  1. The information was static.
  2. The tables were difficult to display on the web.

To solve the pricing issues, we added a “webhook”:

  • Webhook fulfillment allowed the bot to “look up” dynamic pricing information from an external data repository and return it in response to a user query about price.
  • We added a new user intent to handle user queries like “tell me the price of large black olives.”
  • Every time the new intent was selected, a call was made to the pricehook.php script running on an external web server.
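
The original webhook was a PHP script (pricehook.php) whose source is not shown here. As a rough illustration only, the following Python sketch shows the general shape of such a fulfillment handler, assuming the Dialogflow ES webhook format (parameters arrive under queryResult.parameters and the reply is returned in fulfillmentText). The PRICES table, the "product" parameter name, and the function name are hypothetical stand-ins for the external data repository and the intent configuration.

```python
import json

# Hypothetical price data; the original bot queried an external data repository.
PRICES = {
    "large black olives": 2.49,
    "diced tomatoes": 1.19,
}


def handle_price_webhook(request_body: str) -> str:
    """Return a Dialogflow ES-style webhook response for a price query."""
    request = json.loads(request_body)

    # "product" is an assumed parameter name defined on the price intent.
    params = request.get("queryResult", {}).get("parameters", {})
    product = str(params.get("product", "")).strip().lower()

    price = PRICES.get(product)
    if price is None:
        text = f"Sorry, I don't have a price for {product or 'that item'}."
    else:
        text = f"The price of {product} is ${price:.2f}."

    # Dialogflow ES reads the bot's reply from the "fulfillmentText" field.
    return json.dumps({"fulfillmentText": text})
```

In a real deployment this function would sit behind an HTTPS endpoint that Dialogflow calls whenever the price intent is matched. Because the lookup happens at request time, the answer stays current without editing the source tables.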

Increasing “bot-readiness”: Turning the DITA metadata into a bot training kit

In a staged effort, we added metadata to the DITA files and made it available to the bot programmatically as training material. We reasoned that the source files should be as complete as possible, so that:

  • The most critical and definitive content and metadata would be produced by the content’s authors at the time of creation
  • A well-structured and relatively complete content collection in the early stages of the bot-building project would save time in training, testing, and putting the bot-based KB set into production
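
To make the idea of a metadata-driven “training kit” more concrete, here is a minimal sketch of how author-supplied metadata might be pulled out of a DITA topic. It is not the tooling we used in 2020. The sample topic, its id, and the record layout are assumptions made for illustration; the element names (title, shortdesc, prolog, metadata, keywords, keyword) are standard DITA.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA topic, used only to illustrate the extraction step.
SAMPLE_TOPIC = """<topic id="canned_goods">
  <title>Buying canned goods</title>
  <shortdesc>How to choose and price canned goods.</shortdesc>
  <prolog>
    <metadata>
      <keywords>
        <keyword>canned goods</keyword>
        <keyword>olives</keyword>
      </keywords>
    </metadata>
  </prolog>
  <body><p>Check the label and the unit price.</p></body>
</topic>"""


def topic_to_training_record(dita_xml: str) -> dict:
    """Collect the author-supplied metadata a bot could be trained on."""
    root = ET.fromstring(dita_xml)
    # Gather every <keyword> in the topic, skipping empty elements.
    keywords = [k.text.strip() for k in root.iter("keyword") if k.text]
    return {
        "id": root.get("id"),
        "title": (root.findtext("title") or "").strip(),
        "shortdesc": (root.findtext("shortdesc") or "").strip(),
        "keywords": keywords,
    }


record = topic_to_training_record(SAMPLE_TOPIC)
print(record["title"], record["keywords"])
```

Records like this could then be turned into intent names, training phrases, or article teasers, so the richer the authors' metadata, the less hand-tuning the bot needs later.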

Integrating GROCERYbot with Telegram

Finally, we integrated GROCERYbot with Telegram, a cloud-based messaging app, to refer the user to additional information in KB articles:

  • Based on a relevant user query, select one or more knowledge articles.
  • Use the article titles and short descriptions as “teaser text.”
  • Provide the user with a link to the article that seems to satisfy their query.
GROCERYbot integrated into Telegram
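
The article-referral steps can be sketched as follows. This is a toy illustration rather than the logic Dialogflow or Telegram actually ran: it scores articles by simple word overlap with the query, which is an assumed stand-in for the real matching. The ARTICLES list, the example.com URLs, and the function names are hypothetical.

```python
import re

# Hypothetical knowledge articles; the URLs are placeholders.
ARTICLES = [
    {
        "title": "Buying canned goods",
        "shortdesc": "How to choose and price canned goods.",
        "url": "https://example.com/kb/canned-goods",
    },
    {
        "title": "Choosing fresh produce",
        "shortdesc": "Tips for picking ripe fruit and vegetables.",
        "url": "https://example.com/kb/produce",
    },
]


def _words(text: str) -> set:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_articles(query: str, articles: list, limit: int = 2) -> list:
    """Return up to `limit` articles that share words with the query."""
    query_words = _words(query)
    scored = []
    for article in articles:
        overlap = len(query_words & _words(article["title"] + " " + article["shortdesc"]))
        if overlap:
            scored.append((overlap, article))
    # Highest overlap first; the sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in scored[:limit]]


def teaser(article: dict) -> str:
    """Format the title and short description as teaser text with a link."""
    return f'{article["title"]}: {article["shortdesc"]}\n{article["url"]}'


for match in select_articles("canned goods", ARTICLES):
    print(teaser(match))
```

The teaser reuses the title and short description already written by the content authors, which is why complete source metadata pays off at this stage.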

Lessons learned (2020)

We were convinced that our experimental initiative validated the potential synergy between DITA-based structured writing and AI technology, but…

  • The Dialogflow tool was still lacking in important features.
  • Effective project guidance and roadmaps were in short supply.
  • Creativity and customization were required to achieve a viable solution. This involved considerable manual effort on our part.