Enterprise Search: Still Crazy after All These Years
Author: Stephen E. Arnold – Beyond Search
…(from source) The next step is [to] look at each result and decide whether it is relevant enough to click on to view the associated content. Simple! Or is it? As with many aspects of enterprise search, there seems to be no research on how snippet length and design support making informed decisions on relevance…

…(from source) Also under-considered is the issue of snippet length. A bit of research has been performed, but it involved web pages, which are themselves more easily scanned and assessed than content found in enterprise databases. Those documents are often several hundred pages long, so ranking algorithms often have trouble picking out a helpful snippet. Some platforms serve up a text sequence that contains the query term, others create computer-generated summaries of documents, and others reproduce the first few lines of each document…
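The three approaches the excerpt names (a text window around the query term, a generated summary, and the document's opening lines) are easy to sketch. Here is a minimal Python illustration; the function names, window widths, and the crude sentence-scoring "summary" are my own assumptions for demonstration, not how any particular platform actually implements snippeting.

import re

def lead_snippet(text: str, width: int = 160) -> str:
    """Strategy 3: reproduce the first few lines of the document."""
    snippet = text[:width].strip()
    return snippet + ("..." if len(text) > width else "")

def keyword_snippet(text: str, query: str, width: int = 160) -> str:
    """Strategy 1: a text sequence centered on the first query-term hit."""
    m = re.search(re.escape(query), text, re.IGNORECASE)
    if m is None:
        return lead_snippet(text, width)  # no hit: fall back to the opening lines
    start = max(0, m.start() - width // 2)
    end = min(len(text), m.end() + width // 2)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix

def summary_snippet(text: str, query: str, n_sentences: int = 2) -> str:
    """Strategy 2 (very crudely): keep the sentences with the most query hits."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    def hits(s: str) -> int:
        return len(re.findall(re.escape(query), s, re.IGNORECASE))
    best = set(sorted(sentences, key=hits, reverse=True)[:n_sentences])
    # Re-emit the chosen sentences in their original document order.
    return " ".join(s for s in sentences if s in best)

Even this toy version surfaces the trade-off the source article points at: the keyword window guarantees the reader sees the query term in context, while the lead and summary variants can return text that never mentions it, which makes the relevance decision harder, not easier.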
While I like Mr. Arnold's highlights and commentary, I also encourage readers to review the source article, Scanning and Selecting Enterprise Search Results: Not as Easy as it Looks. I see interesting parallels to how we test and tune preservation and collection scope criteria. These issues with the presentation of search and cluster results came up in my recent briefing with the Agnes Intelligence team, who were rightly proud of their success at generating meaningful cluster names. If you are familiar with the ubiquitous concept wheel visualization found in most analytics platforms, you know that the top three to five terms do not always tell a meaningful story.
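For readers who have not looked under the hood: a concept wheel is typically fed by something like the sketch below, which labels each k-means cluster with the highest-weighted TF-IDF terms from its centroid. The toy corpus and all parameter choices are mine, purely to show where bag-of-terms labels come from; I am not suggesting this is how Agnes Intelligence generates its names.

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for an enterprise document set.
docs = [
    "quarterly revenue forecast and budget review",
    "budget variance report for quarterly revenue",
    "employee onboarding checklist and HR policy handbook",
    "HR policy update: remote work and onboarding",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
terms = vectorizer.get_feature_names_out()

# Naive cluster "name": the five highest-weighted centroid terms.
for i, centroid in enumerate(km.cluster_centers_):
    top = centroid.argsort()[::-1][:5]
    print(f"Cluster {i}:", ", ".join(terms[j] for j in top))

On a four-document corpus the top terms happen to tell a coherent story; on real enterprise collections they are often generic boilerplate ("agreement, date, party, section, page"), which is exactly why generating genuinely meaningful cluster names is something worth being proud of.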