Migrated from eDJGroupInc.com. Author: Greg Buckles. Published: 2010-05-07 05:23:40. Format, images and links may no longer function correctly.
According to the eDiscoveryJournal web analytics, many of you are corporate IT administrators trying to understand the new eDiscovery requirements that your legal department keeps talking about. You provide technology and infrastructure while balancing business requirements against the total cost of ownership. Legal is bound by a different balancing act: cost against risk. This is the fundamental difference between a business unit asking for a document management system and a legal team asking you to collect the electronically stored information (ESI) from twenty laptops for litigation. Many IT administrators tasked with executing collections from their systems have never been given a basic, plain-language explanation of the legal requirements for collecting identified ESI that may later be presented as evidence.
The first rule of collection is to do no harm to the content, format or metadata that you are going to collect. That means using the right tools and testing those tools before you touch the real ESI. I am going to limit this discussion to the ‘reasonable, good faith effort’ standard of civil litigation, but there are arguments for going the extra mile to meet the more stringent criminal forensic collection standards. There are three basic components of any piece of ESI that must be preserved during a collection:
- Content – The internal ones and zeros of a file blob (short for Binary Large Object) must not be altered when it is copied from the original location to your media location. The normal method of verifying the content of a file is to use a utility to compare the hash value (a digital fingerprint computed with an algorithm such as MD5 or SHA-1) of the original and the copied items; matching values indicate that the content was not altered. Most applications designed for legal collection will inventory this and other information prior to making the copy and then automatically verify afterward. The content of an item includes metadata fields that are stored inside the item, such as version dates, title and other fields.
- Container Metadata – The filename, owner and date fields for a file actually exist in the file system where it resides rather than inside the file. These fields must be captured and replicated on the copied item where possible. The Windows ‘DateLastAccessed’ field is particularly tricky to maintain, because running an inventory or copy on a file can itself update it. I hope that you can see why your counsel might want to know if a report had not been touched for years prior to your collection, especially if their witness claimed to have reviewed it recently. If your entire collection now has yesterday’s date and you are the owner/creator on every file, you can bet that someone on the other side will want to know why. It is not always possible or practical to recreate container metadata on the copy set. You should test a sample set with your tools and tell counsel what was or could be altered. Only they can determine if the potential changes are significant in the context of their matter. Specialized collection tools that reset dates and wrap collections in a forensic container have become much more mature and user-friendly. Most will provide a detailed inventory or load file that contains all of the captured metadata.
- Context – This is the most misunderstood and neglected aspect of an ESI collection. A perfect copy that cannot be authenticated is not evidence. In order for the court to recognize and admit a document or file into evidence, the producing party must be able to tell the court where the item came from, who collected it, how it was collected and where it has been at every step from origin to court. This is conventionally called the Chain of Custody. The corporate user who had ‘care, custody and control’ of the item is usually referred to as a custodian. At a deposition or when testifying at trial, custodians may be shown an item and asked if it is a true and correct copy of their file. This is called ‘authenticating’ the item so that it can go into evidence. Could you look at a spreadsheet years after you created it and swear that it was the exact spreadsheet that you made? Having clear documentation of where, when and how that item was collected, along with the inventory of metadata, should give you confidence in your reasonable belief that this is indeed your spreadsheet.
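
To make the three components concrete, here is a minimal Python sketch of a single-file collection. It is an illustration only, not a substitute for a purpose-built collection tool: the function names (`sha1_of`, `collect_file`, `write_manifest`) and the record fields are hypothetical, and it assumes that preserving only the access and modified dates is acceptable for the test at hand. It does not preserve the creation date or the file owner, which is exactly the kind of limitation you would report to counsel after testing.

```python
import hashlib
import json
import os
import shutil
from datetime import datetime, timezone


def sha1_of(path, chunk_size=1024 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def utc_iso(timestamp):
    """Format a POSIX timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def collect_file(source, dest_dir, custodian, collector):
    """Copy one file, verify the copy, and return an inventory record."""
    # Container metadata: read it BEFORE opening the file, because reading
    # the content can update the last-accessed time on some systems.
    info = os.stat(source)
    record = {
        "custodian": custodian,          # hypothetical field names
        "collected_by": collector,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        "source_path": os.path.abspath(source),
        "size_bytes": info.st_size,
        "source_modified_utc": utc_iso(info.st_mtime),
        "source_accessed_utc": utc_iso(info.st_atime),
    }

    # Content: hash the original, copy it, then hash the copy and compare.
    record["source_sha1"] = sha1_of(source)
    dest = os.path.join(dest_dir, os.path.basename(source))
    shutil.copy2(source, dest)
    record["copy_path"] = os.path.abspath(dest)
    record["copy_sha1"] = sha1_of(dest)
    record["verified"] = record["source_sha1"] == record["copy_sha1"]

    # Put the originally captured access and modified times on the copy.
    # Done last, because hashing the copy may itself touch its access time.
    # Creation date and owner are NOT handled in this sketch.
    os.utime(dest, ns=(info.st_atime_ns, info.st_mtime_ns))
    return record


def write_manifest(records, manifest_path):
    """Context: save the inventory as JSON to support the chain of custody."""
    with open(manifest_path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
```

Calling `collect_file` once per file and passing the resulting list to `write_manifest` yields a simple inventory recording who collected what, from where, and when, together with the hash values that show whether each copy matched its original. Run it against a sample set first and compare the copied dates with the originals before relying on it.
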
I hope that you now have a greater understanding and appreciation for the differences between restoring a backup for a user and collecting a network share for legal. Most basic network search and copy applications are just not designed to document and preserve context and metadata. This article has focused on the basics of loose files on desktop and network storage locations. I will follow up with discussions on collections from communication, archive, document management and other data sources.