Recently we’ve been heavily involved in preparing one of our prototype knowledge bases (#4: El Dorado Hills CA community information) for a major reorganization and content update. We described our project objectives and major activities in this recent post:

Using DITA/xml, GenAI tools and RAG processing to establish a community information collection

In this post we talk about some of the specific work we’re doing on the images in the knowledge base collection.

Sample image from the EDH community information collection

Images in our original El Dorado Hills (EDH) print knowledge base

Our EDH information collection has a lot of images, mostly photographs we took ourselves, and also a few line drawings and maps.

The images have been and continue to be an important part of the knowledge base for several reasons:

To contribute to the general appeal of the collection.
To provide information (for example, a map showing the location of an important landmark).
To add to the historical record of the El Dorado Hills community.

The original images, which were part of our 2003 printed document, were in either the TIF or GIF format.

Preparing the images for our current-and-future knowledge base

Our original images were not suitable for our current-and-future, online-only knowledge base (for example, they needed to be converted from TIF or GIF to JPG). This week we did some image reorganization, renaming, and resizing.

Renaming the images by hand

We renamed the images by hand, but that task would be a good candidate for automation sometime in the future.

Here are some of our naming conventions:

Images are grouped consistently and meaningfully, and each file name indicates its group and subgroup
Names are all lower case and are “spaced” with hyphens, not underscores
The file types (jpg) are in lower case

For example, here is the file name of the fire station image above. It indicates that it is in the community section (vs. social history or environment), and that it is part of an FAQ about incorporation.
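Since renaming is a candidate for future automation, the conventions above could be applied by a small helper along these lines. This is only a sketch, not part of our current program: the function name normalize_image_name, the group-subgroup-description ordering, and the sample values are all assumptions made for illustration, and the output is not the actual file name of any image in the collection.

```python
import re


def normalize_image_name(group: str, subgroup: str, description: str,
                         ext: str = "jpg") -> str:
    """Build a lower-case, hyphen-separated image file name.

    Hypothetical helper: the part ordering (group, subgroup, description)
    is an assumption, not a documented rule of the collection.
    """
    slugs = []
    for part in (group, subgroup, description):
        # Lower-case the part, then turn every run of characters that is not
        # a letter or digit (spaces, underscores, punctuation) into a hyphen.
        slug = re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        if slug:
            slugs.append(slug)
    # The file type is lower case too, with any leading dot removed.
    return "-".join(slugs) + "." + ext.lower().lstrip(".")


# Illustrative input values only.
print(normalize_image_name("Community", "FAQ Incorporation", "Fire_Station", "JPG"))
# prints: community-faq-incorporation-fire-station.jpg
```

A helper like this only builds the new name. An automated version would still need a mapping from each old file to its group and subgroup, which is the part we currently decide by hand.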

Resizing the images and converting them to jpg using the Python Pillow library

We developed our Python image-resizing program using programming tips from the Perplexity AI assistant. Its advice saved us considerable time that we would otherwise have spent wading through the extensive online documentation for the Pillow library.

Below is the Python code snippet that calls the Pillow library and resizes the images. We also downsampled the images with a high-quality filter to reduce file size and allow them to load more quickly.
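For readers who want to try something similar, here is a minimal sketch of the resize-and-convert step using Pillow. It is not a copy of ImageWrangler.py. The function names, the 1024-pixel bounding box, the JPEG quality setting, and the choice of the LANCZOS filter as the "high-quality filter" are all assumptions made for illustration.

```python
from pathlib import Path

from PIL import Image

MAX_SIZE = (1024, 1024)  # assumed bounding box in pixels, not our actual setting
JPEG_QUALITY = 85        # assumed quality setting


def convert_to_jpg(src: Path, dest_dir: Path, max_size=MAX_SIZE) -> Path:
    """Downsample one image to fit within max_size and save it as a .jpg."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.stem.lower() + ".jpg")
    with Image.open(src) as img:
        # JPEG has no palette or alpha support, so convert GIF/TIF data to RGB.
        rgb = img.convert("RGB")
        # thumbnail() shrinks in place, keeps the aspect ratio, and never
        # enlarges an image that is already smaller than max_size.
        rgb.thumbnail(max_size, Image.LANCZOS)
        rgb.save(dest, "JPEG", quality=JPEG_QUALITY, optimize=True)
    return dest


def convert_folder(src_dir: Path, dest_dir: Path) -> list:
    """Convert every TIF or GIF file in src_dir, returning the new paths."""
    converted = []
    for src in sorted(src_dir.iterdir()):
        if src.suffix.lower() in {".tif", ".tiff", ".gif"}:
            converted.append(convert_to_jpg(src, dest_dir))
    return converted
```

Calling convert_folder(Path("originals"), Path("resized")) would process a folder in one pass. Both folder names are hypothetical. Using thumbnail() rather than resize() means the aspect ratio never has to be computed by hand.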

For more information

Below is a link to a PDF of our Python program (ImageWrangler.py) to resize images.

Python program to resize images (PDF file)


VRJ Associates, LLC

Resizing images using the Python pillow library
