Migrated from eDJGroupInc.com. Author: Greg Buckles. Published: 2014-07-22 20:00:00. Format, images and links may no longer function correctly.

My recent research and surveys have made it abundantly clear that ‘Analytics’ means many different things to the broad eDiscovery market. Even terms like Predictive Coding (PC), Technology Assisted Review (TAR) or Machine Learning can cover countless combinations of technologies, sampling strategies, training iterations and much more. Every provider seems to have a different take on how to leverage analytics in review. So what flavor of PC/TAR is right for you? For most consumers, the first priority is to differentiate between the major approaches that are offered as alternatives to traditional unorganized linear review. Here is my simplistic breakdown based on my active research, though I am excluding most pre-review culling methods.

Prioritization/Acceleration – The ESI is organized or clustered by a wide variety of manual or automated methods to improve review speed and quality. Although reviewers can bulk mark clusters/stacks/threads, the general market assumption is that every item has had at least one set of eyes on it, or on another document so similar as to make no difference. The technique was introduced to the market over 10 years ago by Attenex, Stratify, Cataphora and others. Providers report up to 300% improvement in review speed, depending upon the collection composition. This approach is considered fairly conservative and has low adoption resistance.

Propagation – Reviewer decisions from seed and training sets are used to train an algorithm on a single-issue or multiple-issue basis. That ‘engine’ then classifies unreviewed documents into the training categories (relevant, non-relevant, privileged, etc.). Some systems use random training samples, while others select samples across the categories/clusters for what some providers call “active learning”. Everyone seems to have their own ‘secret sauce’ in sampling strategies. The propagation approach offers the highest potential review savings, but also faces the highest resistance according to my interviews. Biglaw firms and many providers of contract reviewers perceive these systems as a direct threat to their revenue. Too many of these systems are considered ‘black box’ technologies that require a subject matter expert to validate, operate or explain. The latest generation of systems has worked hard to simplify and visualize the feedback metrics that the user relies on to decide when the training process can stop. You will hear providers talk about stability, confidence, recall, precision and other concepts for measuring the effectiveness of the system.
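To make two of those terms concrete, here is a minimal Python sketch of how recall and precision could be computed from a validation sample that reviewers have coded and the engine has classified. The function names, the `window` and `tolerance` parameters and the stopping rule are illustrative assumptions, not any provider's actual method.

```python
def precision_recall(human_labels, engine_labels):
    """Compare engine calls to reviewer calls on a validation sample.

    Both arguments are equal-length lists of booleans (True = relevant).
    """
    pairs = list(zip(human_labels, engine_labels))
    true_pos = sum(1 for human, engine in pairs if human and engine)
    false_pos = sum(1 for human, engine in pairs if not human and engine)
    false_neg = sum(1 for human, engine in pairs if human and not engine)
    # Precision: share of engine "relevant" calls that reviewers agreed with.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of reviewer "relevant" documents that the engine found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


def is_stable(recall_history, window=3, tolerance=0.02):
    """Hypothetical stopping check (an assumption, not a standard):
    the last `window` recall estimates differ by at most `tolerance`."""
    if len(recall_history) < window:
        return False
    recent = recall_history[-window:]
    return max(recent) - min(recent) <= tolerance


# Reviewer decisions versus engine decisions on an eight-document sample.
human = [True, True, True, False, False, False, True, False]
engine = [True, True, False, True, False, False, True, False]
precision, recall = precision_recall(human, engine)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In practice the sample size and the acceptable recall level are judgment calls for the matter at hand, and a simple check like `is_stable` would be only one input to the decision to stop training.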

Recommendation – This approach uses the same kinds of training algorithms as propagation systems and frequently prioritizes review batches based on similarity, concepts and other ESI aspects. The system displays recommendations, and sometimes weights, for review designations based on prior decisions. You could consider this a dynamic machine learning feedback system to improve review speed and consistency. Since the system itself is not applying review decisions, it may face less resistance from new users.
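As a rough illustration of a recommendation with a weight, the sketch below suggests a designation for an uncoded document from the most similar documents that reviewers have already coded. It assumes a simple word-overlap (Jaccard) similarity and a hypothetical `recommend` function. Commercial systems use their own, typically far richer, models.

```python
import re
from collections import Counter


def tokens(text):
    """Return the set of lowercased words in a document."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(doc_text, coded_docs, k=3):
    """Suggest a designation and weight from the k most similar coded documents.

    coded_docs is a list of (text, designation) pairs decided by reviewers.
    The weight is the share of similarity mass behind the suggestion.
    Returns (None, 0.0) when nothing similar has been coded yet.
    """
    target = tokens(doc_text)
    scored = sorted(
        ((jaccard(target, tokens(text)), label) for text, label in coded_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = Counter()
    for score, label in scored:
        votes[label] += score
    total = sum(votes.values())
    if total == 0:
        return None, 0.0
    label, mass = votes.most_common(1)[0]
    return label, mass / total


coded = [
    ("merger agreement draft terms", "relevant"),
    ("merger agreement final terms", "relevant"),
    ("lunch menu friday", "non-relevant"),
]
print(recommend("merger agreement revised terms", coded))
```

The reviewer still makes the call. The suggestion and its weight are only displayed, which matches the point that the system itself is not applying review decisions.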

Quality Control/Pattern Analysis – The goal of these systems is to improve the quality and consistency of the review rather than decrease the cost. The engine uses similarity, concepts, word clusters and more to compare the decision patterns and spot potential false negatives/positives. This is especially valuable in matters with heavy privilege or confidentiality issues.
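A minimal sketch of this kind of consistency check, assuming word-overlap (Jaccard) similarity and a hypothetical `conflicting_pairs` helper: it flags pairs of near-identical documents that received different designations so that a second reviewer can take a look. The 0.8 threshold is an arbitrary illustration.

```python
import re
from itertools import combinations


def tokens(text):
    """Return the set of lowercased words in a document."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def conflicting_pairs(coded_docs, threshold=0.8):
    """Return (id_a, id_b, similarity) for near-duplicates coded differently.

    coded_docs maps a document id to a (text, designation) pair.
    Results are sorted with the most similar conflicts first.
    """
    token_sets = {doc_id: tokens(text) for doc_id, (text, _) in coded_docs.items()}
    flagged = []
    for a, b in combinations(sorted(coded_docs), 2):
        if coded_docs[a][1] == coded_docs[b][1]:
            continue  # same designation, nothing to reconcile
        similarity = jaccard(token_sets[a], token_sets[b])
        if similarity >= threshold:
            flagged.append((a, b, similarity))
    return sorted(flagged, key=lambda item: item[2], reverse=True)


docs = {
    "D1": ("please call counsel about the privileged memo", "privileged"),
    "D2": ("please call counsel about the privileged memo today", "not privileged"),
    "D3": ("quarterly sales figures attached", "not privileged"),
}
for id_a, id_b, similarity in conflicting_pairs(docs):
    print(id_a, id_b, round(similarity, 3))
```

Comparing every pair is fine for a small sample but grows quadratically, so a real collection would need some form of indexing or clustering before a check like this is practical.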

Ranked Navigation – The goal of this approach is to identify the key documents from an opposing production, internal investigation or other early case assessment scenario. The solution interface and workflow should enable a user to identify a relatively small number of the most important documents from the collection through clusters, sampling, profiles, social networking and other features.

I hope that this quick overview helps you understand the different categories of solutions that my research has identified so far. I am certain that our quickly evolving eDiscovery market will provide us with new approaches in the coming months. The idea is to match your particular review goals, sophistication and risk tolerance against the solution categories so that you can compare potential solutions on an apples-to-apples basis. My forthcoming research report on analytics adoption will dig much deeper into differentiating features in the most common buying scenarios for review and other applications. The surveys will be open until the report is published, so take the five-question survey to get access to the raw survey results and to the report when it is published.

Greg Buckles can be reached at Greg@eDJGroupInc.com for offline comment, questions or consulting. His active research topics include mobile device discovery, the discovery impact of the cloud, Microsoft’s 2013 eDiscovery Center and multi-matter discovery. Recent consulting engagements include managing preservation during enterprise migrations, legacy tape eliminations, retention enablement and many more.

