What exactly is a search index?

Spitze & Co.
June 22, 2021

A better title might be: how do we best train employees in search skills so that everyone can find knowledge effectively? It is relatively easy to find training that covers the technical side: algorithms, architecture and the like. That is great if you have a development team and need to build a search engine, but if you are reading this blog or already use Responza, it is less relevant. So how do we solve the problem? The content team or the marketing team must understand the basics of search if they are to get the best results and equip their colleagues in the best way.


So how do we make sure your teams have the knowledge they need to be effective? First, let’s go back to the beginning. Before we had search engines running on computers, we had other ways of searching for information. Your local library and its staff may have been your first stop - and if you were looking for something specific, they might have consulted an index, probably on paper cards. The index is the basis of all modern search engines. Another good way to think about it is the index at the back of a book: using specific keywords or phrases (called 'terms' in search engines), it lets you look up where in the book the information you want is found. This relatively simple concept is essential to explain to the team, and that can be done without long training days or software.



A handy exercise - the paper-based search index


An effective exercise for understanding the structure of an index, and what a search engine actually does for us, is to work through a text. Here we have borrowed the definition of a door from Wikipedia:


A door is an opening or an entrance with a door in a door frame with hinges (possibly with a door step). A door can be open, closed or jammed. Doors often have a locking mechanism as well as a door handle. The door panel can be decorated and provided with a letter slot (for mail and newspapers) and a "door spy" that lets the resident see who is outside without even being observed.

The combination of a door and a door frame is a multi-joint mechanism. A door and a door frame (or other fortress) only meet the door property jointly.

A large door is often called a gate, this is located in a gate opening that can be without a gate (door).

 

If we wanted to find this text using an index, we could try building a paper index for it. We can immediately identify the title of this knowledge element:


title: door


With that, we have already built the first simple version of the index. If you know that the topic you are looking for is a door, it is natural to start with a title list and find the articles (or books) whose titles contain 'door'. But we do not always know exactly what the topic is, so we can also try to deduce the keywords an article contains. These become the "terms" for the search engine.

terms: door, entrance, gate (etc.)
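
For teams that would like to see the paper exercise in code, the following is a minimal sketch in Python. The shortened sample texts, the second "window" article, the stop-word list and the name `build_index` are all assumptions made for illustration; it is not how Responza or any particular search engine is implemented.

```python
import re
from collections import defaultdict

# Hypothetical mini knowledge base: title -> text (shortened from the door example).
ARTICLES = {
    "door": "A door is an opening or an entrance with a door in a door frame. "
            "A large door is often called a gate.",
    "window": "A window is an opening in a wall that lets in light.",
}

# Assumed stop-word list: common words we would not write on an index card.
STOP_WORDS = {"a", "an", "is", "or", "with", "in", "the", "that", "often", "called", "lets"}


def build_index(articles):
    """Map each term to the set of article titles that contain it."""
    index = defaultdict(set)
    for title, text in articles.items():
        # Split the text into lowercase words, like reading it word by word.
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(title)
    return index


index = build_index(ARTICLES)
print(sorted(index["gate"]))     # titles of articles mentioning "gate"
print(sorted(index["opening"]))  # titles of articles mentioning "opening"
```

Each entry plays the role of one index card: the term on top, and underneath it the articles where the term appears.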


These terms can also be word stems, or be treated in other more advanced ways, e.g. with synonyms. Carrying out this exercise on your own knowledge content is a really good way to get a better understanding of how the different parts of an article, the index and the words fit together.
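
As a rough illustration of word stems and synonyms, here is a small Python sketch. The synonym table and the very naive `stem` function are assumptions for the sake of the example; real search engines use far more careful stemming and curated synonym lists.

```python
# Assumed synonym table: maps a variant to the term we file it under.
SYNONYMS = {"entrance": "door", "gate": "door"}


def stem(word):
    """Very naive stemmer: strips a plural 's' only. Real stemmers do far more."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(word):
    """Lowercase, stem, then map synonyms onto one shared term."""
    term = stem(word.lower())
    return SYNONYMS.get(term, term)


print([normalize(w) for w in ["Doors", "gates", "Entrance", "hinges"]])
```

With these assumed rules, "Doors", "gates" and "Entrance" all end up filed under the same term, while "hinges" is simply reduced to its singular form.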

 

From here, you can move on to discussing another important part of the search engine: how do we build a search based on what the user writes? In the example above, you can ask the team to consider how to answer the user's question 'What is a small gate': how to analyze the question (splitting it into words, etc.), how to find the article in the index, and how to assess whether the document is a relevant result.
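
The discussion can be supported with a small Python sketch of the query side. The hand-built `INDEX`, the stop-word list and the scoring rule (count how many query terms an article contains) are assumptions chosen to keep the example short; they are one crude way to judge relevance, not the method any specific search engine uses.

```python
import re

# Hypothetical hand-built index: term -> titles of articles containing it.
INDEX = {
    "door": {"door"},
    "entrance": {"door"},
    "gate": {"door"},
    "opening": {"door", "window"},
    "window": {"window"},
}

# Assumed question words and fillers that carry no search meaning.
STOP_WORDS = {"what", "is", "a", "an", "the", "how"}


def analyze(query):
    """Split the query into lowercase words and drop the stop words."""
    return [w for w in re.findall(r"[a-z]+", query.lower()) if w not in STOP_WORDS]


def search(query, index):
    """Rank articles by how many query terms they contain (a crude relevance score)."""
    scores = {}
    for term in analyze(query):
        for title in index.get(term, set()):
            scores[title] = scores.get(title, 0) + 1
    # Highest score first; ties are broken alphabetically by title.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(analyze("What is a small gate"))
print(search("What is a small gate", INDEX))
```

In this sketch the word "small" matches nothing in the index, which is a useful talking point for the team: should the search ignore it, or should it count against the door article?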

 

Once you and your team have tried the exercise a few times, it quickly becomes clear what is actually in an index (and how much the search engine does for us in terms of word stems, synonyms, keyword translation, etc.) and how to better teach end users to apply search.

 

/Team Responza  

 

Behind Responza is a bunch of eager customer service experts who are passionate about helping you optimize your customer service. We are part of the consulting house Spitze & Co. Within Knowledge Management we are nerds on a whole new level. If you want to be a nerd as well, you can participate in the Danish Knowledge Management forum, or follow us on LinkedIn.