Rule-Based Categorization – Then and Now
A recent client discussion reminded me of my earliest attempts at rule-based categorization and the hard lessons of that experiment. Back in 2000, my general counsel (GC) asked whether it was possible to find and segregate all potentially privileged emails out of the hundreds of millions that we had to produce to many different parties. I took a couple hundred thousand emails and spent a week crafting search criteria and rules and running iterative sampling checks. I worked with our top paralegals and our long-standing firms to incorporate everyone’s input. I segregated approximately 18% of that collection as potentially privileged and put the remainder in the review queue without telling my contract attorneys that it had been cleansed. I felt pretty good about the exercise. I knew that my rules were overly inclusive, but the point was to determine the risk of privilege waiver if we gave all the regulators remote access to the ‘cleansed’ master collection while my review teams worked on the 15-18% at issue. In the middle of the review, my GC ‘volunteered’ to man a review station for a couple of hours to see for himself how it worked. After all, it was his question that kicked off this experiment. What do you think he found?