We are currently exploring the role of local knowledge bases (KBs) in retrieval-augmented generation (RAG) AI processing. This post is part of a series documenting our “sandbox” knowledge bases (created over a period of about 20 years) and how we’re using them in various GenAI prototyping projects.

RAG processing in early 2025
In this kickoff post for 2025, we circle back to a couple of key hypotheses from our 2024 projects, along with the related observations and takeaways, and summarize them below.
We plan to use a similar summary template for our upcoming RAG processing projects.
Our next project for 2025 is to move from our small language model (SLM) processing environment back into the large language model (LLM) world (see Hypothesis #2, below). We’ve also dusted off our Colab notebooks as better programming environments for learning and teaching purposes.
And we’ve already started some multimodal RAG experiments using knowledge bases containing images.
Watch this space for new posts!
Hypothesis #1
“What synergy (if any) is there between DITA topics and RAG processing?” (3 Jan 2025)
Observations and Takeaways:
- Our belief in the potential synergy between DITA-based structured writing and GenAI technology was validated.
- However, effective RAG project guidance, roadmaps, and tools were in short supply, so creativity and customization were required to arrive at a viable solution.
- HTML output with the kind of “disconnected” tagging typical of DITA projects seemed no more effective than simple narrative output (e.g., PDF) that relies solely on the LLM and its internal tools to “discover” keywords and relationships.
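One way to see why the tagging might not help: in a typical RAG pipeline, markup is stripped from chunks before they are embedded or matched, so a DITA-derived HTML chunk and a plain narrative chunk with the same wording retrieve identically. The sketch below is our own illustration (not code from the posts), using a toy word-overlap score in place of a real embedding model; the helper names are assumptions.

```python
import re
from html.parser import HTMLParser

class HTMLStripper(HTMLParser):
    """Collects only text content, discarding all tags and attributes."""
    def __init__(self):
        super().__init__()
        self.parts = []
    def handle_data(self, data):
        self.parts.append(data)
    def text(self):
        return " ".join(self.parts)

def strip_tags(html: str) -> str:
    s = HTMLStripper()
    s.feed(html)
    return s.text()

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, chunk: str) -> float:
    """Toy retrieval score: fraction of query words found in the chunk.
    A real pipeline would use embeddings, but the point is the same."""
    q = tokens(query)
    return len(q & tokens(chunk)) / max(len(q), 1)

# A DITA-derived HTML chunk and a plain narrative chunk, same content.
html_chunk = ('<section id="t1"><title>Install the agent</title>'
              '<p>Run the installer as root.</p></section>')
plain_chunk = "Install the agent. Run the installer as root."

query = "how do I install the agent"
# After tag stripping, both chunks produce the same retrieval score,
# so the structural tagging contributes nothing at this stage.
print(score(query, strip_tags(html_chunk)) == score(query, plain_chunk))
```

This is consistent with the takeaway above: unless the pipeline explicitly exploits the markup (for example, to drive chunk boundaries or attach metadata), tagged and untagged output look alike to the retriever.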
RAG Processing Post (date published on this site):
Comparing the effectiveness of DITA/xml output types in RAG processing (19 Aug 2024)
Hypothesis #2
“If you have the GPU compute power, should you do RAG processing locally?” (3 Jan 2025)
Observations and Takeaways:
- Being able to observe the small language model (SLM) process locally can be a good educational experience.
- However, frequent updates to currently available GenAI tools, as well as inter-tool incompatibility, make it difficult to manage the software stack.
- Also, performance can be slow except where top-of-the-line local hardware with GPUs is available.
- We are switching back to our large language model (LLM) environment.
RAG Processing Posts (date published on this site):
RAG processing: Small language model (SLM) with chat and system prompt (21 Nov 2024)
RAG processing: Small language model (SLM) with chat (15 Nov 2024)
RAG processing: Small language model (SLM) with single query (11 Nov 2024)
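The three SLM configurations covered in the posts above (single query, chat, and chat with system prompt) all reduce to assembling retrieved context and conversation state into one prompt for the local model. Here is a minimal sketch of that assembly step; the function name, template, and sample strings are our own illustrative assumptions, not code from the posts.

```python
def build_rag_prompt(question, context_chunks, system_prompt=None, history=None):
    """Combine retrieved KB context, an optional system prompt, and optional
    chat history into a single prompt string for a locally hosted SLM."""
    parts = []
    if system_prompt:                      # "chat with system prompt" variant
        parts.append(f"System: {system_prompt}")
    parts.append("Context:\n" + "\n\n".join(context_chunks))
    for role, text in (history or []):     # "chat" variant keeps prior turns
        parts.append(f"{role}: {text}")
    parts.append(f"User: {question}\nAssistant:")
    return "\n\n".join(parts)

prompt = build_rag_prompt(
    question="Which output type worked best?",
    context_chunks=["HTML and PDF outputs were compared in RAG tests."],
    system_prompt="Answer only from the supplied context.",
)
# The assembled string would then be sent to the local model
# (e.g., via an HTTP endpoint or a Python binding).
print(prompt.startswith("System:"))
```

The single-query variant is the same call with `system_prompt` and `history` omitted, which is why the three posts build on one another.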