e-Document Conversion & Native Document Review

By Kristin M. Nimsger and Michele C.S. Lange
December 01, 2003

Today, in the blink of an eye, documents, images, and communications are transmitted across the country and across the globe via a web of networked computers. A recent study conducted by the University of California, Berkeley found that in 2002, people around the world created enough new information to fill 500,000 U.S. Libraries of Congress. The study also found a 30% increase in stored information since 1999, the last time the same study was conducted. The increase in new data per person works out to the equivalent of a stack of books 30 feet high (www.cnn.com/2003/TECH/ptech/10/29/information.study.reut/index.html). All of these electronic transactions create “footprints” on hard drives, backup tapes, and other media sources that courts have held are discoverable in civil litigation. The process of collecting, searching, and producing these e-data trails is called electronic discovery.

One of the emerging debates in the legal community on the topic of e-discovery is whether electronic documents and e-mail should be converted to a “uniform” format (such as a .tiff) or whether they should be kept in their native format for document review and production. This decision, which often occurs at the beginning of an e-discovery project, can affect almost every aspect of the e-discovery process going forward – preservation, metadata, searchability and cost. For example, if electronic data is not captured and accessed properly, the content of the document or its valuable metadata (the behind-the-scenes data about the data) can be altered or destroyed.

Given that e-discovery is still rather new to some litigation attorneys, some misunderstandings about the technology behind converted file review and native file review are circulating. This article will clarify and answer some of the questions that have arisen surrounding the review and production of data in a standard file format or in its native format in electronic document discovery.

What Are the e-Discovery File Format Options?

As best practices emerge for electronic discovery projects, attorneys and litigation support professionals are finding that keeping electronic files in electronic form for review and production can save time and money over old-fashioned paper review. Also, a wealth of important and discoverable information is available only when electronic documents are kept in electronic form. The days when U-Haul trucks delivered thousands of boxes of paper documents for review by hundreds of temporary attorneys are starting to go the way of the dinosaur. Today, review teams sit at rows of computer terminals reviewing, annotating and searching documents with the click of a mouse.

Electronic document review can generally occur in three ways: 1) using a local database (like Summation or Concordance), 2) looking at a collection of “loose” electronic files (.tiffs, .pdfs, or “native” files) on a CD-Rom or DVD, or 3) via an online document review repository – a Web-based tool into which the data files have been loaded, usually after they have been converted to a standard file format for viewing, categorization, and searching.

Whether the documents and e-mail are reviewed “natively” or in converted form in an online repository, the first step in the e-discovery process is always data collection. The media containing the potentially responsive data is gathered and the files in their original form are accessed. If counsel is conducting a “native” file review, the files are copied to another type of media (usually CD-Rom or DVD) for review. If counsel is using an online repository, the text of each document is extracted using high-speed processing tools, converted to a standard format (.tiff or .pdf), and placed in a database for online searching and review. There are also some new “hybrid” products that permit online review of native files, as long as the user has all of the native applications loaded locally on the computer where they are conducting the review.

Native data refers to documents still in the original file format in which they were created (i.e., the format of the specific software application used to create each individual document). For example, a native Microsoft Word file is a file in .doc format. This Microsoft Word file can only be opened, viewed, or modified in Microsoft Word or in a compatible program that can read Microsoft Word files (such as another word processing program). When attorneys review files in their native formats, the files are typically copied from the hard drive of the person who created them to a CD-Rom or DVD so they can be reviewed by the attorneys in the case. In that situation, the native document has not been processed or converted to a standard file type such as .tiff or .pdf in order to facilitate viewing, review or production.

Some people believe that reviewing native files is somehow more holistic, that it offers some “special” information, or that it is more cost effective than reviewing documents converted to .tiff or .pdf, often because the files are reviewed exactly as they were created and no data conversion is needed. Others believe that conversion of files to a uniform format is the best choice because it allows for all of the advantages of high-speed processing technology, and there is no need for “native” applications on every computer used in the review. In short, many believe that document conversion is the fastest and least expensive method for narrowing down thousands or millions of electronic documents for review and ultimately for finding and producing responsive documents. The reality is that there are advantages and disadvantages to both native file review and converted file format review.

Examining Native Review: Strengths and Weaknesses

There are definitely pitfalls associated with reviewing files in their native format. First, native data must be viewed using the software application that was used to create the document, including the proper version of the software. When conducting a document review with more than one person (or hundreds of people, as is often the case), the correct applications must be purchased and loaded on every machine used for document review. This means that the firm needs to pay for and install dozens of applications (e.g., MS Word, MS Outlook, MS Excel, WordPerfect, etc.) and all of the corresponding versions (97, 2000, XP, 2003, v3.0, v4.0, etc.) simply for the review to get underway.

Once the document review is started, other concerns arise. First, with native data, reviewers cannot search across the universe of documents without the assistance of a database of extracted text for each document. This means that each document must be scanned and its text extracted using optical character recognition (OCR) or similar technology. The extracted text is then placed in a database for searching. Without this text extraction process, the limitations of native data searching are especially problematic when one wants to search the text of an e-mail's attachments, since most software packages do not provide this functionality. In addition, a reviewer cannot search the metadata of native data, and some metadata, such as the bcc recipients of an e-mail, is not visible in the native application.
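To make the idea of a searchable database of extracted text concrete, here is a minimal sketch in Python. It assumes the text has already been extracted from each e-mail and attachment; the document IDs, the sample text, and the function names are all hypothetical, and a commercial review tool would be far more elaborate.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document IDs whose extracted text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search_all(index, *terms):
    """Return the IDs of documents that contain every search term."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches) if matches else set()

# Hypothetical extracted text; the attachment is indexed as its own entry.
extracted_text = {
    "EMAIL-001": "Please see the attached forecast before the merger call.",
    "EMAIL-001-ATT1": "Quarterly forecast: revenue shortfall expected.",
    "MEMO-002": "Agenda for the merger call on Friday.",
}

index = build_index(extracted_text)
merger_call_hits = search_all(index, "merger", "call")
shortfall_hits = search_all(index, "shortfall")
```

Because the attachment's text is indexed alongside the e-mail itself, a search for a word that appears only in the attachment still finds it, which is the capability the paragraph above says most native applications lack.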

Even more important than the inconvenience of native file searching is the potential for altering the native files. For example, word processing programs typically store “last modified” and “last accessed” dates in each document's metadata. The simple act of opening the file for review, even when no changes are made, will likely change the document's last accessed date, compromising the authenticity of the document.
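One commonly used way to detect this kind of alteration, which the article itself does not describe, is to record a fingerprint of each file before review and compare it afterward. The sketch below is an illustration under stated assumptions: the function names and the sample file are hypothetical, and the fingerprint covers only the file's content, size and file-system modification time, not the metadata stored inside a word processing document.

```python
import hashlib
import os
import tempfile

def fingerprint(path):
    """Record a SHA-256 hash of the content plus size and modification time."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    info = os.stat(path)
    return {
        "sha256": digest.hexdigest(),
        "size": info.st_size,
        "modified_ns": info.st_mtime_ns,
    }

def unchanged(path, baseline):
    """Return True if the file still matches the fingerprint taken earlier."""
    return fingerprint(path) == baseline

# Demonstration with a throwaway file standing in for a collected document.
work_dir = tempfile.mkdtemp()
sample_path = os.path.join(work_dir, "memo.txt")
with open(sample_path, "wb") as handle:
    handle.write(b"Original memo text.")

baseline = fingerprint(sample_path)
intact_before_edit = unchanged(sample_path, baseline)

with open(sample_path, "ab") as handle:
    handle.write(b" An edit made during review.")
intact_after_edit = unchanged(sample_path, baseline)
```

Reading a file in order to hash it can itself update the last accessed time on some systems, so a check like this is best run against a working copy rather than the original media.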

Lastly, because native file applications were not built for e-document review, some of the standard functionality built into an online repository, such as creating redactions and annotations, is not available. Bates number overlays also cannot be placed on native files. Further, in many native programs, documents cannot easily be sorted and categorized for responsiveness and privilege.

On the other side of the debate are the benefits associated with native data review, one of the main ones being cost savings. The producing party does not have to pay for conversion of the data to a common file format. In addition, a native document appears exactly as it appeared to the custodian who created and maintained the document. Reviewers can see all of the application's hidden features, such as spreadsheet formulas, tracked changes and links between documents, which might be lost when converting to a standard file format.

Examining Online Review: Strengths and Weaknesses

There are also advantages and disadvantages when files are converted and imported into an online repository for review. One of the concerns associated with converted file review is cost, given that most of the time an electronic evidence expert using proprietary conversion technology is needed to complete the work. In deciding which is the most cost-effective for the case at hand, compare the total cost of the native review to the total cost quoted by the electronic evidence expert, including the data load and hosting charges associated with the online repository effort. Sometimes, the costs associated with purchasing each native software application, on a computer-by-computer basis, can outweigh the costs associated with the conversion and online review process.
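The comparison described above is simple arithmetic, sketched here in Python. Every figure and parameter name is a hypothetical placeholder chosen for illustration; real license prices, conversion rates and hosting fees would come from vendor quotes for the case at hand.

```python
def native_review_cost(reviewers, licenses_per_seat, avg_license_cost, setup_cost_per_seat):
    """Total cost of licensing and installing native applications on every review machine."""
    per_seat = licenses_per_seat * avg_license_cost + setup_cost_per_seat
    return reviewers * per_seat

def converted_review_cost(gigabytes, conversion_per_gb, load_fee, hosting_per_month, months):
    """Total quoted cost of conversion, data load and hosting in an online repository."""
    return gigabytes * conversion_per_gb + load_fee + hosting_per_month * months

# Hypothetical inputs, not drawn from any real quote.
native_total = native_review_cost(
    reviewers=40, licenses_per_seat=6, avg_license_cost=300, setup_cost_per_seat=150
)
converted_total = converted_review_cost(
    gigabytes=20, conversion_per_gb=2000, load_fee=5000, hosting_per_month=3000, months=6
)
cheaper_option = "converted" if converted_total < native_total else "native"
```

With these made-up inputs the native review totals 78,000 and the converted review 63,000, but a smaller review team or a larger data set could easily reverse the result, which is why the totals need to be compared case by case.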

Next, if conversion is chosen, care should be taken to ensure that the entity performing the conversion preserves the important information that native applications expose, such as spreadsheet formulas, tracked changes and metadata, and that it optimizes the conversion of the documents (for example, by expanding hidden columns within Microsoft Excel) to capture all of the pertinent information. Loss of this information is often a limitation when data is converted and reviewed using an online repository.

Despite these challenges, there are several advantages in choosing file conversion and online repository review. First, the security and spoliation problems associated with native file review are eliminated if the data is converted to a common file format. All of the metadata is captured in the conversion process and displayed in a sortable and searchable table within the repository tool. Also, the online repository tool handles all of the security issues, ensuring that documents are not altered or compromised in any way during the review.
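As a rough illustration of what a sortable and searchable metadata table makes possible, the Python sketch below sorts and filters a handful of records. The field names, document IDs and values are hypothetical and do not reflect the layout of any particular repository tool.

```python
from datetime import datetime

# Hypothetical metadata captured during conversion, one record per document.
metadata_table = [
    {"doc_id": "EMAIL-001", "author": "jsmith",
     "sent": datetime(2003, 3, 14, 9, 30), "bcc": ["counsel@example.com"]},
    {"doc_id": "EMAIL-002", "author": "alee",
     "sent": datetime(2003, 1, 6, 16, 5), "bcc": []},
    {"doc_id": "EMAIL-003", "author": "jsmith",
     "sent": datetime(2003, 2, 20, 11, 45), "bcc": []},
]

def sort_by(table, field):
    """Return the records ordered by the given metadata field."""
    return sorted(table, key=lambda row: row[field])

def where(table, predicate):
    """Return only the records for which the predicate is true."""
    return [row for row in table if predicate(row)]

chronological = [row["doc_id"] for row in sort_by(metadata_table, "sent")]
blind_copied = [row["doc_id"] for row in where(metadata_table, lambda row: row["bcc"])]
from_jsmith = [row["doc_id"] for row in where(metadata_table, lambda row: row["author"] == "jsmith")]
```

Once fields such as bcc recipients are stored as ordinary columns, a reviewer can filter on them directly, even though the native e-mail application may never have displayed them.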

In addition, the searching constraints associated with native file review do not arise if the data is converted to a common file format. In fact, most online repositories offer robust Boolean and cutting edge concept searching, allowing reviewers to quickly target responsive or “hot” documents. Another benefit of using an online repository is the ability to create automated document logs (such as privilege logs) and run other reports to monitor the progress of the review. Lastly, robust online repository tools offer full-scale redaction and annotation capabilities, allowing counsel to review a document, while at the same time create notes and redactions on the document and build an entry in the privilege log.

Conclusion

In sum, the explosion of electronic evidence is forcing attorneys from all practice areas to become well-versed in technological processes for managing, reviewing and producing electronic documents and e-mail. The issue of reviewing and producing electronic documents in native versus converted form is still developing. In most cases, the limitations associated with native file review (potential for spoliation, searching limitations and inability to redact) are driving the choice of review method toward data conversion and online repository review. However, attorneys with cases involving electronic documents and e-mail must continue to be attentive to changes in e-discovery tools and technology in order to best represent their clients.



Kristin M. Nimsger and Michele C.S. Lange, www.krollontrack.com
