Migrated from eDJGroupInc.com. Author: Greg Buckles. Published: 2013-05-07 09:39:03

I cannot remember how many consulting clients have asked me to review their retention policy/schedule without having any hard data on their unstructured digital landfills. A typical corporate records manager will say, “We’ve been working on this retention schedule for over a year. We have over a hundred categories identified. We just need your help defining our software requirements and defining the user process.” Even well-intentioned clients frequently put the cart before the horse. Tell me about your data sources and their content profiles before we try to determine whether an archive, content management or other system is appropriate to enforce retention policies.

Last week I participated in a webinar with Jim McGann of Index Engines on Data Profiling to control risk and cost. Index Engines offers onsite and service options for inventorying and profiling tape collections, file shares, SharePoint and more at a relatively low cost compared to typical eDiscovery processing. We are hearing the big data players like Symantec, IBM and HP-Autonomy push the business intelligence message, but that CIO-level pitch can fly right over the heads of the legal, records management and IT admins who are struggling with the day-to-day data glut. So how can data profiling drain these corporate backwater data swamps?

Let’s start with a fast history lesson to understand why the backlog of legacy data has reached the breaking point. Judge Shira Scheindlin’s 2003 Zubulake v. UBS Warburg decision fired the first real eDiscovery shot that put corporate counsel on the spot for preserving ESI. The problem is that most of them just fired off hold notices to IT, which started the accumulation of backup tapes and halted any major file clean up initiatives. The 2006 amendments to the Federal Rules of Civil Procedure confirmed preservation obligations without providing any solid guidance on how to comply selectively.
I did my share of damage evangelizing for enterprise archives, global journaling and ‘better safe than sorry’ solutions during this period. Our eDJ clients are now paying in spades for that early retention strategy. I now see the results of those ‘keep it all and let Legal sort it out’ days with almost every new client. Fear of sanction or adverse inference created our problem, and understanding the composition and characteristics of those legacy repositories is the first step to cleaning them out.

Now this is where some of you smart practitioners will say that you need to put your legal holds in place before even considering expiry/deletion/migration/etc. Certainly, Mikki Tomlinson’s Legal Hold Boot Camps hammer home this message. My argument is that you cannot preserve what you have not first identified. So much of this legacy ESI is old fashioned junk. The quantity of ESI may be growing exponentially, but the volume of actual critical business content is on a tame, linear growth curve. Check the definition of Signal-To-Noise Ratio for a good analogy. You can only preserve selectively if you understand the content and characteristics of your ESI.

War story time. I flew in to a frozen Denver meeting with a small energy company that was trying to understand why its first round of hold searches had locked down a majority of its email archive. The goal was to educate the stakeholders and get their agreement on the right hold strategy. We used their tools to group email by direction, domain, conversations, attachment types and other data facets. The General Counsel (GC) diverted the meeting into a high speed triage of the profile report and was able to identify roughly 70% of the content for hold exclusion AND for immediate deletion by category. The key was enabling the decision makers to see the patterns that aligned with the system’s ability to act on that data.
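To make the idea of a facet profile concrete, here is a minimal Python sketch of the kind of grouping described in that war story. It is not the client's tool or any vendor's method. The `Message` record, the `INTERNAL_DOMAIN` value and the sample addresses are all hypothetical, and a real archive would supply this metadata through its own export or reporting interface.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical internal mail domain, used only for this illustration.
INTERNAL_DOMAIN = "example-energy.com"


@dataclass
class Message:
    """Hypothetical minimal email metadata record."""
    sender: str
    recipients: List[str]
    attachments: List[str] = field(default_factory=list)


def domain(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def direction(msg: Message, internal: str = INTERNAL_DOMAIN) -> str:
    """Classify a message as internal, outbound or inbound."""
    from_inside = domain(msg.sender) == internal
    all_inside = all(domain(r) == internal for r in msg.recipients)
    if from_inside and all_inside:
        return "internal"
    if from_inside:
        return "outbound"
    return "inbound"


def profile(messages: List[Message]) -> Dict[str, Counter]:
    """Count messages per facet so reviewers can triage by category."""
    facets = {
        "direction": Counter(),
        "sender_domain": Counter(),
        "attachment_type": Counter(),
    }
    for msg in messages:
        facets["direction"][direction(msg)] += 1
        facets["sender_domain"][domain(msg.sender)] += 1
        for name in msg.attachments:
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else "(none)"
            facets["attachment_type"][ext] += 1
    return facets


# Made-up sample data, not drawn from any real engagement.
SAMPLE = [
    Message("alice@example-energy.com", ["bob@example-energy.com"], ["budget.xlsx"]),
    Message("news@vendor-newsletter.example", ["bob@example-energy.com"]),
    Message("news@vendor-newsletter.example", ["alice@example-energy.com"]),
    Message("bob@example-energy.com", ["counsel@lawfirm.example"],
            ["contract.pdf", "notes.docx"]),
]

if __name__ == "__main__":
    for facet, counts in profile(SAMPLE).items():
        print(facet, counts.most_common())
```

A report like this is only the starting point. The value comes when counsel can look at a category, such as inbound newsletters from a single external domain, and make one hold or deletion decision for the whole group instead of reviewing message by message.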
The 2012 eDJ-ViaLumina Information Governance Survey had some telling data points. Over half of respondents had data clean up initiatives in their sights for 2013. Know it to manage it. Get your legal holds in place so that IT and the users can apply business retention without fear of legal consequences. Three years ago, we would have used Robocopy, Directory Lister and other desktop apps to extract raw file lists for analysis with SQL queries. A new generation of mature applications has improved the performance and flexibility of the inventory process and now presents the results in reports and interactive visual dashboards. So before you try to define your hold, retention or other policies, see what your data can tell you about itself with the right tools.

Greg Buckles can be reached at Greg@eDJGroupInc for offline comments or questions. His active research topics include mobile device discovery, the discovery impact of the cloud and Microsoft’s 2013 eDiscovery Center. Recent consulting engagements include managing preservation during enterprise migrations, legacy tape eliminations, retention enablement and many more.

The First Step – Know Your Data