
How Pure Bible Search Performs Its Searches

If you’ve used other Bible search programs, whether commercial or
free/open-source, you’ve probably discovered that the search method of
King James Pure Bible Search is very different.

Most Bible search programs treat the text as a large blob of unknown content,
as if it were some arbitrary, even changing, document. They require you to type
in a word or phrase, click ‘search’, and wait and wait while they scan the
entire database for it. Along the way they apply special matching algorithms to
catch similar text, because they assume you’ve probably typed the phrase wrong;
the programs themselves don’t know what is in the text and what isn’t.

But in reality, the King James Bible text is neither unknown nor arbitrarily
changing. King James Pure Bible Search takes advantage of this and treats the
text as a known list of words, indexed by their exact positions and word forms.

For the character set, King James Pure Bible Search treats all letters, the
hyphen, and the apostrophe as the characters that compose a word. It treats
regular Arabic numerals the same way, but the King James Bible doesn’t have any
numbers written as numerals; they are all written as words. It’s the “Holy Word
of God”, not the “Holy Numbers”.
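
To make the character set concrete, here is a minimal sketch of a tokenizer in Python. It is only an illustration, not the program's actual code: the names WORD_RE and tokenize are hypothetical, and it assumes plain ASCII letters, digits, the hyphen, and the straight apostrophe. Everything else (spaces, commas, periods) is treated as a separator.

```python
import re

# Assumed character set for illustration: ASCII letters, digits,
# the hyphen, and the straight apostrophe. Anything else separates words.
WORD_RE = re.compile(r"[A-Za-z0-9'-]+")


def tokenize(text):
    """Return the words of `text` in order, dropping punctuation and spaces."""
    return WORD_RE.findall(text)


words = tokenize("In the beginning God created the heaven and the earth.")
print(len(words))  # 10
```

Because the hyphen and apostrophe belong to the character set, a possessive or a hyphenated word stays together as one word instead of being split in two.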

Using this character set, all 12838(*) unique words (ignoring case) were
extracted, and a concordance was created mapping each word to its exact
positions in the text. For example, Genesis 1:1 yields the following unique
words and index mappings:

and : 8

beginning : 3

created : 5

earth : 10

God : 4

heaven : 7

In : 1

the : 2, 6, 9
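
The mapping above can be sketched in a few lines of Python. This is an illustrative sketch only: build_concordance is a hypothetical name, and the real database format is not shown here. It assumes the verse has already been split into words and numbers them starting from 1.

```python
from collections import defaultdict


def build_concordance(words):
    """Map each distinct word to the 1-based positions where it occurs."""
    concordance = defaultdict(list)
    for position, word in enumerate(words, start=1):
        concordance[word].append(position)
    return dict(concordance)


genesis_1_1 = "In the beginning God created the heaven and the earth".split()
concordance = build_concordance(genesis_1_1)

# Print the entries alphabetically, ignoring case, as in the listing above.
for word in sorted(concordance, key=str.lower):
    print(word, ":", ", ".join(str(p) for p in concordance[word]))
```

Running it prints the same eight entries as the listing, ending with "the : 2, 6, 9".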

This is what is stored in the database, except that words appearing with
varying case are combined into a single entry. Combining them minimizes the
number of comparisons: words are first matched as lowercase, and their original
case is compared only if the user happens to be doing a case-sensitive search.
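
One way to picture this two-step comparison is the Python sketch below. It is a simplified illustration under stated assumptions, not the program's actual storage layout: build_folded_index and find_word are hypothetical names, and each lowercase entry is assumed to keep the original spelling next to every position. The sample words are Genesis 1:1 and the start of 1:2 with punctuation removed.

```python
from collections import defaultdict


def build_folded_index(words):
    """Map each lowercase word to a list of (position, original spelling)."""
    index = defaultdict(list)
    for position, word in enumerate(words, start=1):
        index[word.lower()].append((position, word))
    return dict(index)


def find_word(index, query, case_sensitive=False):
    """Look up by lowercase key; compare exact case only when requested."""
    entries = index.get(query.lower(), [])
    if not case_sensitive:
        return [position for position, _ in entries]
    return [position for position, original in entries if original == query]


words = ("In the beginning God created the heaven and the earth "
         "And the earth was without form and void").split()
index = build_folded_index(words)

print(find_word(index, "and"))                       # [8, 11, 17]
print(find_word(index, "And", case_sensitive=True))  # [11]
```

Both "and" and "And" share one entry, so a case-insensitive search needs a single lookup, and the exact-case check runs only over that entry's few occurrences.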

As the database is loaded into memory, an inverse table is created mapping

position back to individual words:

1: in

2: the

3: beginning

4: god

5: created
