Migrated from eDJGroupInc.com. Author: Michael Simon. Published: 2014-09-07 20:00:00. Format, images and links may no longer function correctly.

Rose by any name –

is still the same.  PC-TAR?

Ya know, what-ever!

Why Aren’t Lawyers Using PC, or TAR, or PC-TAR, or CAR, or CAR-TAR, or Whatever We Call It?

It’s been a long time—too long—since I have written anything for the eDJ Group.  One of my excuses is that I’ve been spending some time writing for some other sites, or, put another way, exercising my inner snark.

Several weeks ago, Greg Buckles posted preliminary results of his survey on analytics usage in eDiscovery.  Greg then questioned his own results, or at least the raw statistics generated from his survey, presenting what he saw as the actual results:

My best estimation is that only 5-7% of matters that reach the review stage . . .  actually use some form of PC-TAR.

I’m not surprised at his general conclusion, though I do feel disappointed at those numbers.  We all should feel that way; just a few years ago, machine learning technology represented something genuinely new and amazing.  Even within the over-hyped, commercial blitz of the eDiscovery industry, predictive coding was genuinely promoted by many as the way that the ever-rising costs of eDiscovery, created by technology, could be solved by technology.

If you fast-forward to today, though, it seems that other, older and simpler technologies that can provide some savings but are not a “game-changer” like machine learning are instead the market focus.  In Greg’s words:

The interviews made it clear that the market has a much broader definition of ‘Analytics’ than the marketers pushing PC-TAR . . .  I sought out technology agnostic service providers and in-house lit support managers with deep eDiscovery experience who had all used PC-TAR on actual cases. Every one of them quickly differentiated between the use of analytics to prioritize or optimize collections prior to review versus the actual use of any kind of machine learning PC-TAR technology. The use of ‘Accelerated Review’ during or just post processing to prioritize, cluster or otherwise optimize the collection . . . seems to be how my survey respondents were interpreting my question.

Greg is not the only one to notice this trend.  Just last week, David Horrigan’s timely and thoughtful article in the National Law Journal, “Tech is Litigant’s Boon, Not Profession’s Doom,” also noted the unfortunate tendency of eDiscovery marketing to lump machine learning in with other, less sophisticated (and older) TAR technologies.  David cites industry experts who argue that this conflation means machine learning is underrepresented in the surveys showing low adoption rates.

Let’s return to that first quote that I used from Greg, but let me add back in the preface that I left out last time:

When you ask about the actual use of PC-TAR for real machine learning review, the numbers plummet. My best estimation is that only 5-7% of matters that reach the review stage . . .  actually use some form of PC-TAR.

Unlike Greg, none of the experts quoted by David seem to have actually asked their respondents to distinguish their use (or, more accurately, non-use) of machine learning from other forms of TAR.  Instead, the NLJ-quoted experts interpreted the lack of positive responses on machine learning as a positive sign, or perhaps even as evidence that merely asking about this specific technology is now pointless.  Is it significant that these experts work for companies that sell predictive coding technology and services?  Reasonable minds may differ on whether there’s any potential for bias here.

Even if I am wrongly suspicious (and it wouldn’t be the first time, by any means), I don’t think there are too many in the industry who would fail to share in my disappointment.  What happened to all that hope about the glorious future of machine-assisted document analysis?  I believe that a number of factors have contributed to machine learning malaise.  For starters, look at the terms being used to describe the technology: “predictive coding” . . . or is it “technology-assisted review” . . .  or maybe simply “TAR” . . . or maybe “CAR” . . . or is it “PC-TAR” or “CAR/TAR”?  This “whole host of other names” (quoting David Horrigan) has driven things to the point where every vendor in our industry can now claim with a clear conscience that their solutions include PC-TAR.  I wish I could claim this is my original thought, but Craig Ball made this observation in one of his articles several months ago (my apologies to Craig for not linking to the article, but I cannot seem to locate it now).

Consider the term “predictive coding.”  I have been told by a number of industry leaders that they won’t let their company use that term because its usage supports Recommind’s trade dress.  However, as David points out in his article, Recommind doesn’t own that term: “Attempts to trademark ‘predictive coding’ were unsuccessful.”  In fact, even Recommind appears to be distancing itself from the term, as can be seen in this recent article by Recommind’s Drew Lewis (“ . . . and the really good firms aren’t even using the term anymore”).  Yet, even when we get past the “we don’t want to use our competitor’s branding” problem, a greater issue is that nobody actually likes the term “predictive coding.”  I can’t count how many industry professionals have told me that they hate the term “predictive coding” as being inaccurate and even potentially misleading, because the process is neither “predictive” nor actual “coding.”

Next, consider “TAR,” a ghastly acronym indeed.  There’s an implicit connection to cigarette advertising.  Then there are the horrible puns “TAR” seems to engender (I hope that I will never, ever again have to see another article on the theme of “Escape the TAR pits of eDiscovery!”).  However, once you get past those small points, the real problem is that TAR is premised upon that incredibly broad word—the “technology” part of “technology-assisted review”—and that opens a giant loophole that lets every vendor claim their products incorporate TAR.  After all, looking at documents on a computer screen uses technology and constitutes Review, so all the necessary requirements of TAR have been met—right?

Finally, let’s consider “CAR.”  Actually, no, let’s not.  Not only does CAR lead to punnage so awful that it makes the puns used with TAR look mild by comparison, but CAR only pretends to clarify a dodgy definition.  Is there anyone, inside or outside the eDiscovery bubble, who requires clarification that the “technology” in TAR involves computers instead of some other technology, such as the steam engine or the inclined plane?

All ranting aside, here is my modest proposal: let’s call this technology-that-genuinely-revolutionizes-textual-analysis by a term that, while we might not love it, we can at least not hate—and that might be reasonably accurate, to boot.  How about the term that I have been sprinkling throughout this article: “machine learning”?  We know what the term means, and it distinguishes cutting-edge textual analysis from older technology approaches.  “Machine learning” might not be the best term, and I am open to anyone suggesting another, better term that we can use as an industry.  Until then, though, we have to start somewhere.

Plus, the General Counsel clients who ultimately pay for the costs of eDiscovery won’t hate the term, and I believe they would actually like the clarity that it brings.  Lawyers who fear being replaced by this technology may hate it, but review attorneys are not the ones making purchasing decisions (cruel of me to say, but true).  Senior attorneys who rely upon the hours billed by those review attorneys might be uncomfortable with the term (and they do make purchasing decisions), but that assumes that they have even heard of any of these terms anyway (again, as per David Horrigan’s article: “. . . the majority of lawyers . . .  have never even heard of predictive coding.”).  Besides, with adoption rates for machine learning, whatever it is called, in the single digits anyway, do we really have that much to lose?

Now that I have gotten this out of my system, I’d like to turn to the more substantive issues potentially holding back the adoption of machine learning.  This is for my next column, but as a bit of foreshadowing, I’ve come to strongly suspect that the real reason for the disappointing rate of machine learning adoption is something completely different than the “usual suspects,” ones we keep hearing about every time the issue is brought up.  Stay tuned for next time.

Michael Simon – eDiscovery Expert Consultant – Seventh Samurai 

Contact Michael at Michael.Simon@Seventhsamurai.com

eDJ publishes content from independent sources and partners. If you have great information, perspective or analysis to share, please contact us for details. 
