Decoding Encrypted Documents in Cross-Border e-Discovery

By Robert Wickstrom
January 31, 2014

On the battlefields of ancient Rome, Julius Caesar used a cipher that changed the order of the letters of the alphabet to secure messages he sent to his infantry leaders. Caesar shifted the letters in his messages by three, such that the letter A would become D, safely encoding them from his mostly illiterate foes. Any messages the enemy intercepted were likely presumed to be written in a strange foreign language.
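The shift described above can be sketched in a few lines of Python. This is only an illustration of the idea: the function names are hypothetical, and the sketch assumes the modern 26-letter alphabet rather than the Latin alphabet Caesar would have used.

```python
import string

def caesar_shift(text: str, shift: int = 3) -> str:
    """Shift each letter A-Z by `shift` positions, wrapping around at Z."""
    alphabet = string.ascii_uppercase
    offset = shift % 26
    shifted = alphabet[offset:] + alphabet[:offset]
    # Characters outside A-Z (spaces, punctuation) pass through unchanged.
    return text.upper().translate(str.maketrans(alphabet, shifted))

def caesar_unshift(text: str, shift: int = 3) -> str:
    """Reverse the shift to recover the original message."""
    return caesar_shift(text, -shift)

print(caesar_shift("ATTACK AT DAWN"))    # DWWDFN DW GDZQ
print(caesar_unshift("DWWDFN DW GDZQ"))  # ATTACK AT DAWN
```

With only 25 useful shifts to try, a cipher like this falls to trial and error almost immediately, which is why modern encryption relies on far larger key spaces.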

Since this first reported use of cryptography in 50 B.C., encryption techniques have become far more sophisticated and prevalent. It is therefore likely that any data you collect in discovery will contain files (if not entire computers, hard drives or mobile devices) that are encrypted, encoded or password-protected. And given the global nature of business, it is also likely that some of these encrypted documents will be written in foreign languages, which adds another layer of complexity to document processing and review.

As a result, organizations should develop a strategy to determine whether encryption will be an issue as early as possible in a case. It is also important at the outset to develop a process for handling these very complicated document sets when time is of the essence and when local privacy laws must be considered.

A Typical Cross-Border Discovery Conundrum

Consider the following scenario:

You are a discovery manager for a company that is facing a short discovery deadline in a regulatory matter before a federal government agency. You have just arrived back in the United States after collecting data from a foreign-based company with documents in several languages, including Mandarin and Cantonese, at the direction of outside counsel. You expect you will need specialized language skills to translate the foreign-language documents. But you did not expect most of these documents to land on an exception report (the list of files from which your service provider could not extract full data or metadata) when it attempted to process the documents for review. Typically, files listed on an exception report include documents that need optical character recognition (OCR), zero-byte files (files that contain no data), and structured files with tables in them, such as Microsoft Access databases. However, your exception list contains a fourth type of document that you did not expect: files that the company's document custodians have encrypted or protected with passwords. With the discovery deadline just days away, it seems impossible that you will be able to figure out how to decrypt these files so that their content can be reviewed, prepared, and produced in time.

What could the company and its counsel have done to prevent the surprise of encrypted data? And once this type of data is discovered, are there any efficient methods to process it? This article examines the best practices that will help you answer these questions.

How to Prevent the Surprise of Encryption

How can organizations and their outside counsel manage encrypted documents before seeing them on an exception report? Here is a suggested checklist of steps to undertake before beginning the data-collection process (adjust accordingly for each matter).

Ask questions

The first step during the pre-assessment stage is to ask custodians, business managers and IT personnel three critical questions:

  1. Do you use passwords to encrypt files or log on to any sites?
  2. Have you ever encrypted hard drives or seen encrypted hard drives (internal or external) at your company (e.g., “Is your computer or laptop encrypted?”)?
  3. Do you know whether your company uses encryption for passwords, e-mails or documents?

These questions should be included in all of your pre-collection conversations with anyone you speak to at the company. The answers tell outside counsel and IT personnel whether they should build time into the discovery schedule to decrypt the data. Planning for this obstacle in advance can avoid costly delays and help ensure compliance with tight discovery deadlines.

When asking these questions, it is also important to recognize the cultural norms of the country where you are collecting the data. In several countries, data encryption is a sensitive issue, and people living there will likely be less open to discussing the use of encryption or passwords. Work diplomatically with general counsel and management to communicate the rationale behind the questions and to make clear that they are not meant to be intrusive or personal. Your objective is not to obtain the decryption key or passwords; it is to assess the data landscape and the amount of work required to process that data.

Check the applicable law

Many countries, particularly in the European Union and the Asia-Pacific region, have enacted privacy laws that address the transport of private data outside of a jurisdiction.

Gather passwords

Check with managers and the legal department for a list of passwords accumulated on previous matters. Test these passwords against encrypted and password-protected data first.

Decide on your approach

Your approach to managing encrypted data will depend on both your budget and the importance of the documents. You have two choices. The first, and often less expensive, option is to assemble a collaborative team composed of outside counsel, forensics experts, business managers and IT representatives who can work together to create a strategy for decrypting the data prior to processing. Spending a few hours with the company's information security specialists can often save a significant amount of time. The second option, described in the next section, is to retain a specialist in forensic techniques to decode the documents. While effective, this process adds a layer of complexity, time and expense.

How to Manage Encrypted Data

If you discover encrypted data in a production, the next step is to determine your strategy to decrypt it. Because companies have many options when it comes to encrypting their data, including proprietary solutions, there is no silver bullet that makes this process a simple one.

Encryption algorithms protect data by ensuring there are so many possible combinations of “keys” to unlock the data that the chance of someone guessing the right password is infinitesimal. Therefore, the key to cracking the code is to test many permutations of passwords as quickly as possible. Several techniques are commonly employed to decode protected data.
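A rough sense of the numbers helps when budgeting time for decryption. The sketch below computes how many passwords exist for a given character set and length, and how long an exhaustive search would take at a given guessing rate. The function names and the rate used in the example are illustrative assumptions, not figures from any particular tool.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length

def worst_case_seconds(alphabet_size: int, length: int,
                       guesses_per_second: float) -> float:
    """Time needed to try every password once at a constant guessing rate."""
    return keyspace(alphabet_size, length) / guesses_per_second

# Assumed rate for illustration only: one billion guesses per second.
RATE = 1_000_000_000

# Eight lowercase letters (26 symbols per position).
print(keyspace(26, 8))                  # 208827064576
print(worst_case_seconds(26, 8, RATE))  # about 209 seconds

# Twelve characters drawn from the 95 printable ASCII symbols.
years = worst_case_seconds(95, 12, RATE) / (365 * 24 * 3600)
print(f"{years:.3g} years")
```

At the assumed rate, the eight-letter lowercase space is exhausted in roughly three and a half minutes, while the twelve-character space would take millions of years. That gap is why the techniques that follow concentrate on likely passwords instead of every possible one.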

Apply brute-force techniques

The brute-force attack is a common method used to decrypt data. With a brute-force strategy, a dictionary is used to generate millions (and sometimes billions or trillions) of candidate passwords. The algorithm then tries the passwords one after another in the hope of hitting the right one. Current technology has exceeded 300 billion password (hash) attempts per second. Most of these algorithms can also create permutations of the potential passwords; for instance, many use reverse dictionaries that spell words backwards. In addition, forensics experts often use dictionaries keyed to specific industries or foreign languages as a separate source of likely passwords.
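The dictionary approach can be sketched as follows. This is a minimal illustration, not how any particular forensic tool works: the function names and the suffix list are hypothetical, and a plain SHA-256 digest stands in for whatever password check a real encrypted file format uses (real formats typically rely on much slower key-derivation functions).

```python
import hashlib
from typing import Iterable, Iterator, Optional

def candidate_passwords(words: Iterable[str],
                        suffixes: Iterable[str] = ("", "1", "123", "!")) -> Iterator[str]:
    """Yield each word and its reverse, in several casings, with common suffixes."""
    suffixes = tuple(suffixes)
    seen = set()
    for word in words:
        for base in (word, word[::-1]):  # forward and reverse dictionary
            for variant in (base.lower(), base.capitalize(), base.upper()):
                for suffix in suffixes:
                    candidate = variant + suffix
                    if candidate not in seen:  # skip duplicates such as palindromes
                        seen.add(candidate)
                        yield candidate

def find_password(candidates: Iterable[str], target_hash: str) -> Optional[str]:
    """Return the first candidate whose SHA-256 digest matches, or None."""
    for candidate in candidates:
        if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hash:
            return candidate
    return None

# Hypothetical target: the digest of a reversed dictionary word plus a digit.
target = hashlib.sha256(b"nogard1").hexdigest()
print(find_password(candidate_passwords(["falcon", "dragon"]), target))  # nogard1
```

Swapping in an industry-specific or foreign-language word list only changes the `words` argument; the generation and testing loop stays the same.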

Create a biographical dictionary

Many custodians rely on personal passwords. A biographical dictionary gathers a custodian's personal information, such as places they have lived, names of family members, birthdays, anniversaries and names of pets, for inclusion in an algorithm. Forensic experts often gather this information from social media profiles.
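A biographical dictionary can be approximated by combining a custodian's personal details in different orders. The sketch below is an assumption about how such a list might be built, with a hypothetical function name and invented sample facts; actual forensic tools apply far richer rules.

```python
from itertools import permutations
from typing import Iterable, List

def biographical_candidates(facts: Iterable[str], max_parts: int = 2) -> List[str]:
    """Join up to `max_parts` distinct personal facts, in every order."""
    # De-duplicate and sort so the output order is reproducible.
    tokens = sorted({fact.strip() for fact in facts if fact.strip()})
    candidates = []
    for size in range(1, max_parts + 1):
        for combo in permutations(tokens, size):
            candidates.append("".join(combo))
    return candidates

# Invented sample facts: a pet's name, a birth year and a former home city.
words = biographical_candidates(["Rex", "2009", "Lisbon"])
print(len(words))          # 9: three single facts plus six ordered pairs
print("Rex2009" in words)  # True
```

The resulting list is small compared with a general dictionary, so it is usually worth testing first.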

A third technique reuses known passwords: the algorithm applies a list of previously identified passwords to the encrypted data. This technique can be particularly effective when the same custodians store data on multiple devices.

No matter which technique you select, remember that encrypted documents may show up unencrypted elsewhere in a collection. For example, a custodian may e-mail a document to herself or store a file in the cloud. Although the content of these duplicate files remains identical, encrypting a file changes its hash value, the unique identifying digital “fingerprint,” when the file is saved. Therefore, there is no easy way to search for unencrypted duplicates of encrypted documents in a collection. Failing to recognize these duplicates can raise the opposing party's suspicion about the soundness of your data collection process at production time if you choose to identify all of the encrypted files as non-producible.
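The fingerprint problem is easy to demonstrate. In the sketch below, a toy XOR routine stands in for real encryption (it is not secure and is used only so the example runs on its own), and SHA-256 is assumed as the hash; the function names are hypothetical.

```python
import hashlib
from itertools import cycle

def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the file bytes, used as the de-duplication hash."""
    return hashlib.sha256(data).hexdigest()

def toy_xor(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for encryption: XOR each byte with a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

plain = b"Quarterly forecast, draft 3"
encrypted = toy_xor(plain, b"secret")

# Same underlying content, different fingerprints: hash-based
# de-duplication will not pair the encrypted copy with the plain one.
print(fingerprint(plain) == fingerprint(encrypted))  # False

# After decryption the bytes, and therefore the fingerprint, match again.
decrypted = toy_xor(encrypted, b"secret")
print(fingerprint(plain) == fingerprint(decrypted))  # True
```

In other words, duplicates can be matched by hash only after the encrypted copy has been decrypted, and only if decryption restores the original bytes exactly, which is not guaranteed for every file format.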

From Ciphers to Computers

From the first century B.C. to modern times, savvy leaders have been encrypting information to protect secrets. Quick on their heels, decoders have worked diligently through the years to demystify messages. Today, technology-savvy service providers and information technology managers carry on this tradition under legal, but nonetheless challenging, circumstances.

Successfully managing encrypted data depends on the processes and technology you use. With the right know-how and technology, it is possible to properly collect and decrypt data and proceed with its review. The key to achieving these goals on a reasonable timetable is communication: get the forensic experts, collection team, and information technology staff on the same page early.


Robert Wickstrom is vice president of client development at Consilio. He can be reached at [email protected].

