Migrated from eDJGroupInc.com. Author: Barry Murphy. Published: 2010-04-05 12:16:23. Format, images and links may no longer function correctly.
From time to time, you get asked a question so often that it makes sense to just go find the answer. If only it were that easy, but I’m going to try. Lately, I’ve been asked a lot about the need for forensic collection. Organizations want to know how forensic collection fits into their overall information governance strategies. After all, why invest in an infrastructure to optimize eDiscovery when you’ll just have to image all custodians’ machines anyway? We’ve discussed some of the issues around this, especially forensic collection of SharePoint information, in previous posts. eDiscoveryJournal will be producing a report on the role of forensic collection in information governance strategies later this month, but we wanted to give you some of our thinking in the meantime – as well as ask for your feedback and any experiential contributions you might have.
Let’s first talk about the definition of “forensic collection.” According to Nolo, forensics “refers to the use of science or technology to discover evidence for a court of law.” For the eDiscovery market, the term “forensic collection” is about the combination of people, processes, and technology to defensibly collect information in response to an investigation. In the early days of digital information (before organizations made any real attempt to manage that information), products arose that could forensically capture information from computers. These products capture bit-by-bit copies of computer drives (known as disk imaging). The term “forensics” came to be virtually synonymous with disk imaging.
As systems for better managing information became available, the need for disk imaging started to diminish. What organizations learned was that a full copy of a custodian’s machine was not always required for civil litigation. No one wanted to pay to process full disk images when they could filter that information down to the relevant 1GB or 2GB (at 2005 processing prices of $2K / GB, filtering a 20GB hard drive down to 2GB saved $36K / custodian – pretty good savings, especially when you consider that cases could routinely have over 10 custodians). At the same time, consultants were telling organizations that it was safer to take full disk images for collection than to risk not finding all potentially relevant information via other collection mechanisms. This advice was not necessarily untrue, but it was at the very least biased – these consultants were typically from the EDD processing provider and wanted to increase revenue with more high-priced processing. But the consultants’ point stood: it is safer to take a full disk image if you don’t think other collection mechanisms will be able to get all the information necessary.
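To make the processing-cost arithmetic concrete, here is a minimal sketch in Python. The function name and constants are illustrative only; the figures (a 20GB drive, $2K / GB, 1GB or 2GB of relevant data, 10 custodians) are the ones mentioned above.

```python
def processing_savings(drive_gb, relevant_gb, price_per_gb):
    """Dollars saved by processing only the relevant data instead of the full image."""
    full_image_cost = drive_gb * price_per_gb
    filtered_cost = relevant_gb * price_per_gb
    return full_image_cost - filtered_cost


PRICE_PER_GB = 2_000  # 2005 processing price cited in the post
DRIVE_GB = 20         # hard drive size cited in the post

savings_at_2gb = processing_savings(DRIVE_GB, 2, PRICE_PER_GB)
savings_at_1gb = processing_savings(DRIVE_GB, 1, PRICE_PER_GB)

print(f"Filtered to 2GB: ${savings_at_2gb:,} saved per custodian")
print(f"Filtered to 1GB: ${savings_at_1gb:,} saved per custodian")
print(f"10 custodians at 2GB each: ${10 * savings_at_2gb:,} saved")
```

Filtering to 2GB gives the $36K per custodian figure; filtering to 1GB would save $38K, and a 10-custodian case at 2GB each comes to $360K.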
A “forensic collection,” then, is really a defensible collection – one that will stand up to rigorous questioning in court. Many collection tools can run a defensible collection by ensuring that all files and associated metadata don’t change. This brings us back to metadata and the challenges it presents for eDiscovery. To look more closely at how the courts view metadata, I turned to a colleague, John Patzakis – the founder of Digital Compliance Consulting, Inc. and former Chief Legal Officer at Guidance Software. John pointed me to Aguilar v. Immigration & Customs Enforcement Div. of U.S. Dep’t of Homeland Sec., 255 F.R.D. 350 (S.D.N.Y. 2008), a case that stated that metadata is no different than other forms of ESI and “thus is discoverable if it is relevant to the claim or defense of any party and is not privileged.” In addition, this case provided a good explanation of the different types of metadata, dividing it into three categories: substantive metadata (e.g., data reflecting “modifications to a document, such as prior edits or editorial comments, and data that instructs the computer how to display the fonts and spacing in a document”), system metadata (e.g., “the author, date and time of creation, and the date a document was modified”), and embedded metadata (e.g., “spreadsheet formulas, hidden columns, externally or internally linked files (such as sound files), hyperlinks, references and fields, and database information”). As Patzakis pointed out to me, “a defensible collection will include logging, reporting, and hashing where an organization shows it was able to get the metadata, log the search criteria, have transparency in the collection and preservation process.”
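To give a rough sense of what “logging, reporting, and hashing” can look like in practice, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not how any particular collection product works: the function names (`sha256_of`, `collect_file`) and the log fields are hypothetical, and a real tool would also have to handle name collisions, access timestamps, chain of custody, and much more.

```python
import hashlib
import os
import shutil
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_file(source_path, dest_dir, search_criteria):
    """Copy one file and return a log entry describing the collection.

    Hypothetical helper: the field names below are illustrative.
    """
    stat = os.stat(source_path)          # system metadata, read before copying
    source_hash = sha256_of(source_path)

    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(source_path))
    shutil.copy2(source_path, dest_path)  # copy2 also attempts to preserve timestamps

    copy_hash = sha256_of(dest_path)
    return {
        "source_path": os.path.abspath(source_path),
        "collected_path": os.path.abspath(dest_path),
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(stat.st_mtime, timezone.utc).isoformat(),
        "collected_utc": datetime.now(timezone.utc).isoformat(),
        "search_criteria": search_criteria,  # why this file was selected
        "sha256_source": source_hash,
        "sha256_copy": copy_hash,
        "verified": source_hash == copy_hash,  # True if the copy matches the original
    }
```

Each entry records what was collected, when, why (the search criteria), and a hash pair showing the copy matches the source. Written out as a report, entries like these are the kind of transparency Patzakis describes.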
The next question that comes up, then, is really about where full bit-by-bit forensic collection fits in. Even with this case, organizations are left wondering how to fit full forensic collection into information governance strategies. Here is what we’re learning so far:
- Forensic collection does not always mean disk imaging. A good forensic collection is defensible and includes all the files and the metadata necessary to prove how, where, when, and why the information was collected.
- Full disk imaging does play a role in information governance strategies. There will be times when a full disk image is required:
  - If criminal activity is suspected
  - If custodians might be non-cooperative
  - If the investigation is internal or due to regulatory requests
  - If the organization believes that other collection mechanisms will likely miss information
The key with disk imaging is to understand that it can be overkill while recognizing that there will be times when it’s required. As organizations plan to build out infrastructures that optimize eDiscovery, it’s important that disk imaging not derail the process should it come up. Rather, it should be integrated into the process so that potentially responsive information can be extracted from the images and then analyzed and reviewed alongside other information. We’re interested in hearing your perspectives on disk imaging and forensic collection. Please email me at barry@ediscoveryjournal.com if you would like to discuss further.