We are currently exploring the role of local knowledge bases (KBs) in retrieval-augmented generation (RAG) AI processing. This post is part of a series documenting our “sandbox” knowledge bases (created over a period of about 20 years) and how we’re using them in various GenAI projects.
For this experiment we used our El Dorado Hills Handbook local knowledge base.
Updating down-level content in local knowledge bases
The El Dorado Hills Handbook was created and published in 2001-2003, when we were living in El Dorado Hills, California.
Late in 2003, we moved away, and our book about the community has not been updated since then. In spite of the down-level state of the information, we decided to transform relevant parts of the book into DITA/XML topics and add the subsetted collection of topics to our current GenAI projects.
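For readers unfamiliar with DITA, each converted passage becomes a small, self-contained XML topic. A minimal sketch of the shape, using the standard DITA concept DTD (the id, title, and body text below are illustrative placeholders, not the actual Handbook markup):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="edh_rocks">
  <title>The El Dorado Hills rocks</title>
  <conbody>
    <!-- 2002-era description carried over from the Handbook -->
    <p>A community landmark known locally as “the El Dorado Hills rocks.”</p>
  </conbody>
</concept>
```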
Content topics and prompts
We focused our experiments on the following content topics, which obviously contained out-of-date information:
- Real estate activities in El Dorado Hills
- Cultural resources in our former neighborhood (in 2002 these were in a sorry state – what has happened to them in the meantime?)
- Newspapers serving the community and the region
- A community landmark called “the El Dorado Hills rocks”
- Events celebrating the founding of El Dorado Hills in 1962 (we had attended the 40th anniversary event in 2002 and wondered whether there had been additional events since then)
We wrote prompts for each topic, referring the AI assistant to the topic and requesting more up-to-date information.
Example: Albert’s shrine topic and prompt
Here are the KB topic and prompt for “Albert’s shrine,” one of the cultural resources:


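In outline, the pairing looked like the following sketch; the wording is our paraphrase for illustration, not the verbatim topic and prompt:

```
Topic: a short DITA concept topic describing Albert’s shrine, a
cultural resource in our former neighborhood, as it existed in 2002
(when it was in a sorry state).

Prompt: “The attached topic describes Albert’s shrine as we documented
it in 2002. What is the current condition of this cultural resource,
and what has happened to it in the meantime?”
```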
Example: Newspapers topic and prompt
Here is the KB topic and prompt for the “newspapers” information in the Handbook:


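The newspapers pairing followed the same pattern (again an illustrative paraphrase):

```
Topic: a short DITA concept topic listing the newspapers that served
El Dorado Hills and the surrounding region as of 2002.

Prompt: “The attached topic lists the newspapers serving our community
and the region as of 2002. Which newspapers serve the area today, and
are any of the 2002 publications still publishing?”
```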
AI assistants and large language models (LLMs) used in our project
We posed our questions to several AI assistants, including:
- Perplexity
- Google’s Gemini
- Anthropic’s Claude
- Microsoft’s Copilot
The LLMs our AI assistants were querying included the following:
- OpenAI’s GPT-3.5
- Google’s Gemma 2 open model
- Anthropic’s Claude 3.5 Sonnet
- Meta’s Llama 3
- Meta’s Llama 3.1-405B
Results of our experiments
Our experiments were by no means exhaustive, but here are a few observations:
- Claude was probably the best AI assistant at acknowledging our KB topic input and considering it in its response to our query. In a number of cases, the other assistants simply parroted back our own information and declared that they could find nothing in addition. We had two reactions: (1) We were looking for an acknowledgement that the information came from our topic and not from the assistant’s own search, and (2) We suspected either that the assistant hadn’t tried hard enough or that we should be looking for an assistant with more specialized knowledge.
- Perplexity seemed to do the best job overall with the task, and its answers were detailed and well organized. It also provided source references with no additional prompting.
- Gemini provided useful information about the current newspapers serving Sacramento and El Dorado counties, and its answers were also well organized.
- Perplexity’s “related questions” were especially helpful. These were similar to those provided by Google Search, but they were much more thoughtful and sophisticated.
- For no apparent reason, Llama 3 returned information about Albert’s shrine in French.
- There are so many AI assistants and LLMs participating in this space that it’s hard to remember which ones have which capabilities, and which won’t do an external search without a “pro” subscription. Preparing for a serious search would take more time and money than we were willing to devote to this experimental effort. However, we are strongly motivated to keep trying at our current level of effort.
- One important realization we gained from this effort was how much time and effort would be required to establish a trustworthy AI information source and seamless workflow for a more serious project.