Migrated from eDJGroupInc.com. Author: Greg Buckles. Published: 2013-06-17 06:30:07. Format, images and links may no longer function correctly.
My definition of Dark Data differs from Wikipedia's:
“Data relevant to a discovery request that is either never disclosed or is produced without contextual information that could affect the interpretation of that data.”
My first interview on cloud sources as discovery targets turned up surprising frustration from the savvy eDiscovery Counsel for a national plaintiffs firm. I expected to hear about immature collection capabilities and defendants who struggled to preserve or collect from Office 365, Salesforce or other cloud systems. I did not expect that requesting parties might be completely in the dark about where a production comes from or how it was collected. eDJ’s consultants have had too many recent engagements supporting the evaluation or migration of email and files to the cloud to doubt the trend. Microsoft has been touting the rapid adoption of Office 365 across corporate and public sector verticals. Many corporations seem to have moved critical ESI to the cloud without a clear plan to meet eDiscovery and Information Governance requirements.
My latest focus poll and interviews are intended to support the creation of a new market category on ‘Cloud Collection’ in the eDiscovery Matrix. Now I am wondering if we need a larger conversation about the lack of disclosure and transparency around discovery productions. Parties make requests and receive productions, but how much insight do they have into where the ESI resided or how it was collected? For a long time, we pretty much knew that email came from servers, PST files or archives. Only the last source played a ‘dark data’ role, when corporations used archive search and deduplication systems that were not designed for complete, validated retrieval in large enterprise environments. Email and files from Google Vault and Office 365 are being produced every day. How many requesting parties actually know where their productions came from? There are 20 products in my SharePoint Collection market category, but the majority of these products can only retrieve from document libraries. This turns SharePoint blogs, wikis, workflow, tasks, calendars and all the associated ‘context’ into Dark Data.
Now, no one really likes discovery battles. They are just not sexy. As I have been reminded, you don’t pay to dispute issues unless there is a clear strategic advantage at stake. Frankly, many attorneys neither understand nor care about the context of evidence or how it was collected. For many, the primary goal is to resolve the case as soon as possible at the least cost, which is their job.
So why even care about dark data? In the early days of eDiscovery, we essentially converted ESI to a near-paper format and pretended that the source did not matter. Some of us learned the hard way that paper and email have different discovery requirements. Some Luddite trolls are still hiding under bridges roaring, “Just print it out!” I am positing that ESI in the cloud and on mobile devices is intrinsically different from traditional unstructured enterprise ESI. If you don’t know where it came from, how can you make an informed decision about its potential impact on your case? I am not advocating for forensic colonoscopies or audits. Instead, I am hoping that the increasing maturity and education in the industry will result in better-written interrogatories, standardized disclosure templates and sharp practitioners who will ask the right questions. The new Dodd-Frank mandate for transparent information governance should also pressure public corporations into better data awareness. So have you ever stumbled into dark data and found something critical? I would love to hear about it.
Greg Buckles can be reached at Greg@eDJGroupInc for offline comments or questions. His active research topics include mobile device discovery, the discovery impact of the cloud and Microsoft’s 2013 eDiscovery Center. Recent consulting engagements include managing preservation during enterprise migrations, legacy tape eliminations, retention enablement and many more.
Join Greg for upcoming webinars or contact him to participate in his focus polls:
- The Impact of Dodd-Frank on eDiscovery, Governance & IT – July 11, 2013
- Poll & Interview – Cloud Sources as eDiscovery Targets – ongoing