Migrated from eDJGroupInc.com. Author: Greg Buckles. Published: 2014-05-18 20:00:00. Format, images and links may no longer function correctly.
An “eDJ Brief” is our short write-up of the free biannual product briefings that eDJ will do for any product that may impact the markets we cover. We have found that forcing a provider to speed-demo their product in half an hour generally makes them get to their differentiators. The eDJ Brief is meant to help you understand where the product fits and to share our top take-aways, rather than to serve as a traditional product review.
Company: Content Analyst Company
Offering: CAAT
Briefing by: John Felahi – Chief Strategy Officer
Steven Toole – VP Marketing
Date: May 15, 2014
Overview: Content Analyst is one of a very small number of companies that supply the OEM analytics embedded in the majority of early case assessment, review and processing platforms. Content Analyst has expanded on its original Latent Semantic Indexing (LSI) system with many additional text analytics algorithms. The full suite now spans conceptual search, dynamic clustering, auto-categorization, email threading, text near-duplicate identification, language identification and automatic summarization. As a consumer, you have only seen the CAAT functionality exposed through their partners’ interfaces. Most CAAT partners, such as kCura Relativity, iPro Eclipse, iConect Xera and Mindseye TunnelVision, leverage only selected analytics capabilities. Founded in 2004, Content Analyst has 13 patents and is one of the earliest analytic engines designed for integration into discovery products.
Key Notes:
- CAAT processes extracted text and metadata fields. This means that the quality of the analytics is related to the quality of the output of the processing engine.
- The LSI-based analytics capabilities are language agnostic. Email threading and text near-duplicate identification are English only.
- The CAAT index size averages 5-8% of original source files in native format.
- Ability to categorize documents in motion against the original index for predictive coding. See eDJ Note regarding pros and challenges of holding the index in memory.
- Automatic summarization is based on identifying and extracting the sentences most relevant to queries, concepts and other documents.
- SmartTrain™, which excludes documents unsuitable for conceptual clustering, is combined with the SmartSample™ training set optimization process to increase the quality and efficiency of PC/TAR workflows.
- Partners and customers with non-discovery use cases are driving text analytic visualizations and presentations.
- More than 18 eDiscovery platforms are powered by CAAT.
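For capacity planning, the 5-8% index size note above can be turned into a quick back-of-the-envelope estimate. The sketch below is only an illustration of that arithmetic: the function name and the 500 GB collection are hypothetical, and because the 5-8% figure is an average, actual index sizes for a given collection may fall outside this range.

```python
def estimate_index_size_gb(native_size_gb, low_ratio=0.05, high_ratio=0.08):
    """Return a (low, high) estimate of index size in GB.

    The 5% and 8% defaults come from the average range quoted in the brief;
    the function itself is a hypothetical helper, not part of CAAT.
    """
    if native_size_gb < 0:
        raise ValueError("native_size_gb must be non-negative")
    return native_size_gb * low_ratio, native_size_gb * high_ratio


# Hypothetical example: a 500 GB collection of native source files.
low, high = estimate_index_size_gb(500)
print(f"500 GB native -> roughly {low:.0f} to {high:.0f} GB of index")
```

The same ratios can be applied to any collection size, which makes it easy to compare the expected index footprint against available memory or disk before indexing.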
Greg Buckles can be reached at Greg@eDJGroupInc.com for offline comment, questions or consulting. His active research topics include mobile device discovery, the discovery impact of the cloud, Microsoft’s 2013 eDiscovery Center and multi-matter discovery. Recent consulting engagements include managing preservation during enterprise migrations, legacy tape eliminations, retention enablement and many more.