Common characteristics of our sample knowledge bases (KBs)

Our eight “sandbox” knowledge bases, described in more detail and linked below, have the following common characteristics:

Contain information (e.g., grocery shopping, community and family facts and history) that is “timeless” and generally understood by almost any target audience
Contain specific information (“ground-truth”) that is already known to (and likely written by) us, so we can better formulate prompts and queries to the model and understand and evaluate its responses
Are multimodal, which allows us to challenge the technical capabilities of various models

In addition, the first two (“Shopping for Groceries” and “Cleaning the Garage”) are simple and small, which makes them perfect for prototyping.

KB #1: “Shopping for groceries”

Facts, features, flaws

Structured document in DITA/xml; illustrates basic DITA concepts and features
Super-small project size (7 topics that include concept, task, reference material)
Content lacks interest, relevance

Sample published content

“Shopping for Groceries” (WebHelp Responsive)
“Shopping for Groceries” (PDF)

KB #2: “Cleaning the garage”

Facts, features, flaws

Structured document in DITA/xml; illustrates more sophisticated DITA features like multiple ditamaps and filtering using a ditaval file
Small project size (20 topics)
Content lacks interest, relevance
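
For readers unfamiliar with ditaval filtering, the following is a minimal sketch of the idea in Python, not the DITA Open Toolkit's implementation. The topic fragment, its id, the step text, and the helper names (load_excludes, is_excluded, apply_filter) are hypothetical and do not come from the “Cleaning the Garage” source files. The sketch assumes a simplified rule: an element is dropped when every value in one of its filtering attributes is marked for exclusion.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITAVAL rules: exclude content marked for the "expert" audience.
DITAVAL = """<val>
  <prop att="audience" val="expert" action="exclude"/>
</val>"""

# Hypothetical task topic; the id and step text are invented for illustration.
TOPIC = """<task id="sweep_floor">
  <title>Sweeping the floor</title>
  <taskbody>
    <steps>
      <step><cmd>Move the cars out of the garage.</cmd></step>
      <step audience="expert"><cmd>Degrease oil stains before sweeping.</cmd></step>
      <step audience="novice expert"><cmd>Sweep toward the door.</cmd></step>
    </steps>
  </taskbody>
</task>"""


def load_excludes(ditaval_xml):
    """Return the set of (attribute, value) pairs whose action is 'exclude'."""
    root = ET.fromstring(ditaval_xml)
    return {
        (prop.get("att"), prop.get("val"))
        for prop in root.iter("prop")
        if prop.get("action") == "exclude"
    }


def is_excluded(elem, excludes):
    """True when every value of some filtering attribute is excluded."""
    for att in {a for a, _ in excludes}:
        tokens = (elem.get(att) or "").split()
        if tokens and all((att, t) in excludes for t in tokens):
            return True
    return False


def apply_filter(elem, excludes):
    """Remove excluded descendants of elem in place."""
    for child in list(elem):
        if is_excluded(child, excludes):
            elem.remove(child)
        else:
            apply_filter(child, excludes)


topic = ET.fromstring(TOPIC)
apply_filter(topic, load_excludes(DITAVAL))
remaining = [cmd.text for cmd in topic.iter("cmd")]
print(remaining)
```

In this sketch the step tagged only for "expert" is removed, while the step tagged "novice expert" is kept because one of its values is not excluded. A real build would pass the ditaval file to the publishing tool rather than filter by hand.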

Sample published content

“Cleaning the Garage” (WebHelp Responsive)
“Cleaning the Garage” (PDF)

KB #3: “DITAinformationcenter”

Facts, features, flaws

DITA-based structured information project of moderate complexity containing approximately 350 source files
From 2006 to 2011, contained “ground-truth” information about DITA and the DITA Open Toolkit
Now (2025) the content is outdated and sometimes misleading
We are no longer SMEs in this space

Sample published content

“DITAinformationcenter” (PDF)

KB #4: “Computer history”

Facts, features, flaws

The source files are in multiple formats: some are in DITA/xml and others were written in HTML and originally published as website posts
The narrative content is based mostly on our experiences volunteering at the Computer History Museum
Adding research-based information to the knowledge collection would be difficult and impractical

Sample published content

“My high-tech adventure” (PDF)
“Computer games from the past” (PDF)
“Anker-Werke banking machine” (PDF)

KB #5: “Astronomy images”

Facts, features, flaws

DITA-based structured information project published to WebHelp Responsive
Collaborative project (human/AI)
Images are displayed by category (e.g., galaxy, nebula, planet), and each image includes a short description
Object descriptions are relatively “standard” and “timeless”
Indexed to help people find particular images
Ideal way to share a hobby-based collection of images with friends and family

Sample published content

“Astronomy images” (WebHelp Responsive)

KB #6: “Community information project” (El Dorado Hills, California)

Facts, features, flaws

Written in structured DITA/xml; published to WebHelp Responsive
Collaborative project (human/AI)
Illustrates how to update out-of-date information with the help of AI assistants and agents
Example files show how to set up a project and prepare for future automation
Includes a style guide and glossary
Model for similar group/community projects
Major issues include: (1) challenge of morphing a book into a web-based topic collection, (2) challenge of updating inherently volatile information after 25 years of inactivity, (3) lack of a current human owner or verifier

Sample published content

WebHelp Responsive site containing information available to general audiences:

“EDH Community Information (2026-01-13)” (website external)

WebHelp Responsive site containing the external information plus “behind the scenes” information relevant only to the creation and collaboration team:

“EDH Community Information (2026-01-13)” (project internal)

PDF file of the original (2003) El Dorado Hills Handbook:

“EDH Handbook (2003)” (PDF)

Knowledge bases #7 and #8: Family history, genealogy

These two multimodal knowledge bases contain “ground-truth” information about ourselves and our ancestors; for example:

Typical genealogical facts, charts and diagrams from family trees
Photocopies of original records and record indexes
Family history books, papers and posts
Photographs of individuals and events

Facts, features, flaws

The source files are not well integrated (e.g., some are in genealogy apps, others are narratives written by a number of people, and many are images in various formats)
Dozens or perhaps hundreds of collaborators have contributed to the collections, and some of the content files contain factual contradictions
We have had the most success in asking our AI assistants for contextual information to supplement our current collections
In 2026 we’re hoping to define and create a WebHelp Responsive genealogy and family history collection

Sample published content

“Pedigree chart” (PDF)
“Family history book” (PDF)
“Paper: compiled lineage” (PDF)
“Web post: birthday tribute” (PDF) (AI-enhanced)
“Web page: family page” (PDF)

VRJ Associates, LLC

Knowledge bases for prototyping

Local knowledge base creation, curation, and transformation for AI/RAG processing