Everyone knows how to send an email and Blind Carbon Copy (BCC) certain recipients separately from the public TO/CC recipients. I responded to a recent question on the Yahoo! LitSupport list regarding the best practices for production of email with BCC information. My response kicked off several offline questions about the actual nature of BCC information, the preservation of such information, and the deduplication of email with and without BCC recipients. I have had to wrestle with this from both an audit and a product development perspective several times, so it seems worthwhile to try to write a decent overview of BCC and eDiscovery.
The first thing to recognize is that the internet email message format standards have changed continuously over time, and any collection of historical email is likely to have originated from different Mail User Agents (Outlook, Thunderbird, etc.). I am not even going to try to get into the RFC standards that comprise the current Multipurpose Internet Mail Extensions (MIME) formats, or all the different ways that email can originate or be transported. The fundamental point on BCC information is that it is handled by the Sender's system rather than by the Internet transport system or the Recipient's system. There are several options that a Sending system can use to handle BCC recipients:
- The BCC: field is actually removed at time of sending.
- All TO: and CC: recipients receive a copy with the BCC: line removed, but the BCC: recipient receives a separate copy with the BCC: information intact.
- The contents of the BCC: field can be stripped for all TO:/CC: recipients but the empty BCC: field will now indicate that a BCC: recipient may have been on the original email.
- The BCC: information is actually held in an envelope message. This allows compliance/Journal routing and archive capture before the inner message is routed to the recipients.
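The first three options leave a visible trace in the message headers: the BCC: field is either populated, present but empty, or missing altogether. Here is a minimal sketch, using Python's standard `email` module, of how you might classify a raw message into those three states during processing. The function name `bcc_state` and the sample messages are my own illustrations, not part of any particular platform.

```python
from email import message_from_string


def bcc_state(raw_message: str) -> str:
    """Classify the Bcc: header of a raw message as 'absent', 'empty' or 'populated'."""
    msg = message_from_string(raw_message)
    values = msg.get_all("Bcc")  # header lookup is case-insensitive
    if values is None:
        return "absent"      # field removed entirely at send time
    if all(not v.strip() for v in values):
        return "empty"       # field kept, contents stripped
    return "populated"       # sender copy or BCC recipient copy


# Hypothetical sample messages for illustration only.
SENDER_COPY = (
    "From: a@example.com\nTo: b@example.com\n"
    "Bcc: c@example.com\nSubject: Q3\n\nBody\n"
)
RECIPIENT_COPY = "From: a@example.com\nTo: b@example.com\nSubject: Q3\n\nBody\n"
STRIPPED_COPY = "From: a@example.com\nTo: b@example.com\nBcc:\nSubject: Q3\n\nBody\n"

if __name__ == "__main__":
    for name, raw in [("sender", SENDER_COPY),
                      ("recipient", RECIPIENT_COPY),
                      ("stripped", STRIPPED_COPY)]:
        print(name, bcc_state(raw))
```

Keep in mind that an "absent" result does not prove there were no BCC recipients; it only tells you that this particular copy carries no trace of them.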
On a practical level, I tend to focus on how this information manifests itself in corporate collections from different sources. You can immediately see that internal/outbound email will be more consistently formatted than inbound external email where the BCC: field was controlled by the external Sender. So that leaves us with a number of potential copies that may or may not have different information.
| Location | Copy | BCC preserved? | Notes |
| --- | --- | --- | --- |
| Sent Folder | Sender | Yes | May have send receipts |
| Mailbox Folder | TO/CC Recipient | No | May have empty BCC: |
| Mailbox Folder | BCC Recipient | Yes | May contain other BCC: addresses |
| Journal Mailbox | Internal/Outbound | Yes | No receipt time/actions |
| Journal Mailbox | Inbound TO/CC | No | |
| Journal Mailbox | Inbound BCC | Maybe | |
To make matters even more confusing, Microsoft Exchange 2000, 2003 and 2007 all differ in how they handle BCC: fields and message transport envelopes. Exchange 2003 journaling introduced envelope journaling and a true compliance capture of BCC: information on a corporate-wide basis. These message-within-a-message containers really challenged archiving and eDiscovery processing platforms for several years. I routinely saw review collections where EVERY email had multiple levels of attachment. The new Exchange 2010 Transport architecture has added granular categorization/rules for messages, but the BCC: and envelope handling seems pretty much the same as in Exchange 2007. I have run into several notes from last year indicating that the initial releases of Exchange 2010 had some kind of issue preserving the BCC: information in the mailbox, but I am betting that this was either an isolated behavior OR that Microsoft has since fixed the bug.
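To make the message-within-a-message idea concrete, here is a rough sketch of unwrapping an envelope-style journal message with Python's standard `email` module. It assumes the wrapper is a MIME message whose report text is a plain-text part and whose original email is attached as `message/rfc822`. The report wording, the addresses and the helper names (`unwrap_journal_report`, `build_sample`) are made up for illustration; real journal report layouts vary by Exchange version, so test against your own data.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def unwrap_journal_report(raw: bytes):
    """Return (report_text, inner_message) from an envelope-style wrapper."""
    outer = message_from_bytes(raw, policy=policy.default)
    report_text, inner = None, None
    for part in outer.walk():
        ctype = part.get_content_type()
        if ctype == "message/rfc822":
            inner = part.get_payload(0)  # the original, unmodified email
            break
        if ctype == "text/plain" and report_text is None:
            report_text = part.get_content()  # envelope report body
    return report_text, inner


def build_sample() -> bytes:
    """Build a hypothetical wrapper so the sketch is self-contained."""
    inner = EmailMessage()
    inner["From"] = "a@example.com"
    inner["To"] = "b@example.com"
    inner["Subject"] = "Q3 numbers"
    inner.set_content("See attached.")

    outer = EmailMessage()
    outer["From"] = "journal@example.com"
    outer["To"] = "archive@example.com"
    outer["Subject"] = "Journal report"
    # Assumed report wording; not the exact text any Exchange version emits.
    outer.set_content("Sender: a@example.com\nBcc: c@example.com\n")
    outer.add_attachment(inner)  # attached as message/rfc822
    return outer.as_bytes()


if __name__ == "__main__":
    report, original = unwrap_journal_report(build_sample())
    print(report)
    print(original["Subject"])
```

Note that the BCC address lives only in the wrapper's report text here, while the inner email has no BCC: field at all. That is exactly why a processing platform that discards the envelope loses the BCC information.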
So the next question I received was how to reconcile the different versions of email collected from Custodial Mailboxes, PSTs and the corporate Journal Archive. On the practical side, I think that this is the kind of detailed question on which you want agreement, or at least acknowledgement, from the requesting party prior to production. If they have requested native versions and intend to process the email, then resyncing origin/BCC/User Action metafields from an accompanying load file could be daunting if they are not ESI savvy. If they have given you load file specs and just intend to load directly to their review platform, then they will probably want this kind of information in the load file.
Bottom line is communication. Know what they intend to do with your production to prevent unnecessary hassles and inadvertent accusations of deliberate spoliation. Just be careful of any process that ‘reinserts’ data back into the native email. Reassembling the envelopes is not common, but can be done.
On the dedup side, I always use BCC as a criteria field, but only so that I can prioritize the BCC copies and make sure that we keep them first if possible. The main concern here is that you do not want to lose or discard the BCC information. It is fine to extract it into the load files, but BCC is usually important. I usually like to segregate it for separate review, just like all the potential privilege items. Deduplication criteria and handling do not yet have firm case law or any kind of industry standard. I encourage you to test your chosen platform or service provider so that you understand how duplicates are determined, designated and handled for review and production. I love hearing about your experiences with different BCC and email format/version issues, so throw me a note if you have run into something that I did not cover here: Greg@eDiscoveryJournal.com.
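For anyone who wants to see the BCC-first dedup prioritization described above in concrete terms, here is a minimal sketch. It assumes you have already grouped copies by some dedup key (a Message-ID or a hash of normalized fields, depending on your platform). The `EmailCopy` record, its field names and `pick_masters` are hypothetical and do not reflect how any specific product works.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EmailCopy:
    doc_id: str
    dedup_key: str   # hypothetical: Message-ID or hash of normalized fields
    source: str      # e.g. "Sent Folder", "Mailbox Folder", "Journal Mailbox"
    bcc: str = ""    # extracted BCC addresses, empty if none survived


def pick_masters(copies):
    """For each dedup group, keep a BCC-bearing copy as master when one exists."""
    groups = defaultdict(list)
    for copy in copies:
        groups[copy.dedup_key].append(copy)

    result = {}
    for key, group in groups.items():
        # sorted() is stable, so copies with BCC data move to the front
        # while the original collection order is otherwise preserved.
        ranked = sorted(group, key=lambda c: not c.bcc.strip())
        result[key] = (ranked[0], ranked[1:])  # (master, duplicates)
    return result


if __name__ == "__main__":
    copies = [
        EmailCopy("DOC001", "msg-1", "Mailbox Folder"),
        EmailCopy("DOC002", "msg-1", "Sent Folder", bcc="c@example.com"),
        EmailCopy("DOC003", "msg-1", "Journal Mailbox", bcc="c@example.com"),
        EmailCopy("DOC004", "msg-2", "Mailbox Folder"),
    ]
    for key, (master, dupes) in pick_masters(copies).items():
        print(key, master.doc_id, [d.doc_id for d in dupes])
```

The duplicates are returned rather than thrown away, so their source and BCC values can still be carried into the load file.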