Wonderful Statement about Baked In Search Bias
Author: Stephen E. Arnold – Beyond Search
Hacker News post for this article: “Google’s Millions of Search Results Are Not Being Served in the Later Pages Search Results.”
Second, optimization is a fancy word that translates to one or more engineers deciding what to do; for example, change a Bayesian prior assumption, trim content based on server latency, filter results by domain, etc.
Net net: Search and retrieval systems manifest bias, from the engineers, from the content itself, from the algorithms, and from user interfaces themselves. That’s why I say in my lectures, “Life is easier if one just believes everything one encounters online.” Thinking in a different way is difficult, requires specialist knowledge, and a willingness to verify… everything.
Stephen Arnold does a good job reality-checking cloud system search and reviewing an interesting article on Google search. In the early days of eDiscovery, I learned the hard way that not all search engines or collection types returned accurate, complete results. eDiscovery marketing has long highlighted the gaps and limitations of native enterprise and archive search. Today I see those same eDiscovery platforms using Microsoft’s Graph API and other cloud source APIs to brag about their ‘integrations’ for direct collection. What I do not see are published acceptance-testing case studies, or customers running their own tests before they rely on these systems to place holds or certify the completeness of discovery on these cloud repositories. My motto continues to be ‘trust but verify’.
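For readers wondering what a customer-run acceptance test might look like, here is a minimal sketch in Python. It assumes you have seeded a test repository with documents whose IDs you know and then exported the IDs your collection tool returned for a matching search. The function name, the document IDs, and the idea of exporting results as a simple list are all hypothetical illustrations, not any vendor's actual API or workflow.

```python
def recall_check(seeded_ids, returned_ids):
    """Compare the documents we planted against what the search returned.

    Returns (recall, missing): the fraction of seeded documents found,
    and a sorted list of seeded documents the search failed to return.
    """
    seeded = set(seeded_ids)
    found = seeded & set(returned_ids)
    missing = sorted(seeded - found)
    # An empty seed set cannot miss anything, so treat it as full recall.
    recall = len(found) / len(seeded) if seeded else 1.0
    return recall, missing


# Hypothetical test data: four planted documents, one never comes back.
seeded = ["doc-001", "doc-002", "doc-003", "doc-004"]
returned = ["doc-001", "doc-003", "doc-004", "doc-999"]

recall, missing = recall_check(seeded, returned)
print(f"recall={recall:.2f}, missing={missing}")
```

Anything short of full recall on a seeded set is a reason to ask questions before relying on the same search to place a hold or certify completeness.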