Increasing Speed and Confidence in Second Request Responses with New Technologies

By David J. Laing
September 29, 2011

Responding to Hart-Scott-Rodino Act Requests for Additional Information and Documentary Materials (more commonly known as “Second Requests”) presents substantial challenges in assembling a comprehensive and complete production of requested information and documents from company archives. The schedule is always limited, and the results must always be defensible against government challenge that the Second Request response is inadequate. Moving Second Request document productions forward rapidly, without sacrificing quality, can determine the success of the transaction.

Recently, while helping a client complete its Second Request response, Baker & McKenzie deployed predictive coding technology. Predictive coding, or document prioritization, is a process by which, using direction from as few as one attorney reviewing documents, software is able to apply that direction across an entire corpus of documents, coding a large body of documents at a fraction of the time and cost of individual document review. Baker deployed this technology to leverage the knowledge of its legal team and to decrease the time required to select documents for production in response to a U.S. Department of Justice (DOJ) Second Request. Results of this work not only helped the client complete its transaction on schedule, but also provided a model for future work on similar projects.

The Challenge

In response to a Second Request from DOJ, the client collected approximately 650 gigabytes of electronically stored information (ESI), estimated to be more than 3 million documents that contained potentially responsive documents and information. Unlike some litigation matters, neither the DOJ nor the client believed that significant information would be obtained through forensic analysis of deleted files, file fragments and slack space; the key objective was a comprehensive review of potentially responsive, readily available active data.

Baker retained Epiq Systems to process and host this ESI. In addition to conventional filters for culling out system and binary files, Epiq helped cull information in other, more nuanced, ways. Epiq worked with the Baker legal team to identify and remove “junk” e-mail messages from domains like Ebay.com and espn.com that could be attacked as a group. Epiq's consultants found additional patterns of “junk identification” to further reduce the volume of documents requiring review. Even with these techniques, Baker was left with more than two million documents that required substantive review with an eight-week deadline.
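The domain-based culling described above can be pictured with a short sketch. This is only an illustration, not Epiq's actual tooling: the function names, the message layout (a dict with a "from" key) and the blocklist are assumptions, and the only domains taken from the article are ebay.com and espn.com.

```python
from email.utils import parseaddr

# Hypothetical blocklist; the article names only these two domains as examples.
JUNK_DOMAINS = {"ebay.com", "espn.com"}


def sender_domain(from_header: str) -> str:
    """Return the lower-cased domain of a From header, or '' if none is found."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower() if "@" in address else ""


def is_junk(from_header: str, junk_domains=JUNK_DOMAINS) -> bool:
    """True when the sender's domain is a listed domain or one of its subdomains."""
    domain = sender_domain(from_header)
    return any(domain == d or domain.endswith("." + d) for d in junk_domains)


def cull(messages, junk_domains=JUNK_DOMAINS):
    """Split messages (dicts with a 'from' key) into (kept, culled) lists."""
    kept, culled = [], []
    for msg in messages:
        (culled if is_junk(msg["from"], junk_domains) else kept).append(msg)
    return kept, culled
```

Culled messages are returned rather than discarded, so that the group can still be spot-checked before it is excluded from review.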

In coordination with counsel representing the other party in the transaction, Baker, representing the purchaser and the much larger of the two companies, began document review using traditional methods to identify documents of greater potential relevance. A review team comprising 60 lawyers focused first on documents and e-mail messages harvested from specific corporate employees and used a variety of keyword searches to lasso documents of initial interest. Epiq also used near-duplicate identification and e-mail threading solutions to group substantially similar and related documents, so that relevance determinations could be applied to larger chains of documents as appropriate. The near-duplication and threading technology employed is a product of Equivio, an e-discovery applications provider.

Deploying Predictive Coding to Speed Review

Even with the best efforts of legal teams in the proposed transaction, it became clear that it would not be possible to complete review of the two disparate document collections within the schedule. To help complete the project within its original timeframe, Epiq suggested deploying its Equivio-based predictive relevance application, IQ Review. Because Epiq was already hosting the documents, this technology could be quickly applied to all remaining documents without sacrificing the prior work product. In addition, the ongoing review could still continue, even as batches of documents were assembled through IQ Review.

Though classification technology has been available in other industries since the 1990s, predictive coding has only recently become both affordable and accessible in the e-discovery space. Acceptance of predictive coding for use in litigation has grown tremendously in the past year. Empirical testing, such as the National Institute of Standards and Technology's TREC Legal Track study, has demonstrated that these tools have matched if not exceeded traditional all-human subjective document review efforts to consistently identify relevant documents, while excluding irrelevant materials from review. In addition, federal district court judges have been noting the limitations of keyword searching and traditional linear document review for some time, and are becoming increasingly active about asking litigants whether they are taking advantage of these new tools. (See, e.g., Disability Rights Council of Greater Washington v. Washington Metropolitan Transit Authority, 242 F.R.D. 139 (D.D.C. 2007) (“I bring to the parties' attention recent scholarship that argues that concept searching, as opposed to keyword searching, is more efficient and more likely to produce the most comprehensive results.”))

In this case, with 750,000 documents remaining and a very tight schedule, IQ Review offered a compelling alternative to adding yet more human document reviewers (who would not have had prior experience with this matter) and authorizing yet more overtime. Implementation of the predictive coding technology was straightforward: a senior attorney with thorough knowledge of the nuances of the case reviewed small batches of sample documents, rating them as responsive or not responsive. After each batch, the Equivio>Relevance engine behind IQ Review compared the expert's classifications with its own predictions, while constantly tuning its ability to assess document relevance for the case. When it found that it could learn nothing more from the document collection, the system terminated the training process, and applied its analysis to the entire document collection. In this case, approximately 40 sample batches of documents were required to achieve analytical stability, which took approximately 10 hours of attorney time. This relatively small time investment provided valuable insight into the collection.
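The batch-by-batch training cycle can be sketched in a few lines. This is not Equivio's algorithm, which the article does not describe. It assumes a simple stopping rule (agreement between the model and the expert stays within a small tolerance over the last few batches), and the toy KeywordModel, the window and tolerance values, and all names are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence


def train_until_stable(
    batches: Iterable[Sequence[str]],
    expert_label: Callable[[str], bool],
    predict: Callable[[str], bool],
    update: Callable[[Sequence[str], Sequence[bool]], None],
    window: int = 3,
    tolerance: float = 0.02,
) -> List[float]:
    """Feed expert-labelled batches to a model until agreement levels off.

    Returns the per-batch agreement rates observed before stopping.
    """
    agreement: List[float] = []
    for batch in batches:
        if not batch:
            continue
        labels = [expert_label(doc) for doc in batch]
        guesses = [predict(doc) for doc in batch]  # predict before learning from this batch
        matches = sum(g == l for g, l in zip(guesses, labels))
        agreement.append(matches / len(batch))
        update(batch, labels)  # then learn from the expert's calls
        recent = agreement[-window:]
        # Assumed stability rule: the last `window` rates differ by at most `tolerance`.
        if len(recent) == window and max(recent) - min(recent) <= tolerance:
            break
    return agreement


class KeywordModel:
    """Toy stand-in for a classifier: remembers words seen only in responsive documents."""

    def __init__(self) -> None:
        self.responsive_words: set = set()
        self.other_words: set = set()

    def predict(self, doc: str) -> bool:
        return any(word in self.responsive_words for word in doc.lower().split())

    def update(self, docs, labels) -> None:
        for doc, label in zip(docs, labels):
            target = self.responsive_words if label else self.other_words
            target.update(doc.lower().split())
        self.responsive_words -= self.other_words


model = KeywordModel()
sample_batches = [["merger pricing plan", "lunch menu"]] * 6
history = train_until_stable(
    sample_batches,
    expert_label=lambda doc: "merger" in doc,  # stands in for the senior attorney
    predict=model.predict,
    update=model.update,
)
```

A rule like this only detects that agreement has stopped changing. It does not by itself show that the model is accurate, which is why separate quality-control checks remain necessary.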

Results of Predictive Coding

Of the remaining 750,000 documents, only a small percentage (about 20%) received a high relevance score from the predictive coding engine. The majority of the documents, about 60% of the total remaining volume, received a very low score, suggesting that they were highly unlikely to be relevant. Documents between the very high and very low scoring clusters (approximately 20% of the remaining documents) were less likely to be substantively responsive to DOJ's information requests, but some of them would likely be relevant based on a technical reading of DOJ's information requests. These results were consistent with those that the 60-attorney review team had obtained on the standard document-by-document review.

Based on this analysis, the review team moved forward on three separate tracks. Reviewing documents with the highest Equivio>Relevance scores quickly proved them to have near-universal relevance to the DOJ requests. Every document in that top grouping was included in the production to DOJ. Documents in the median cluster that had substantially lower Equivio>Relevance scores were individually reviewed, though only relatively few documents were ultimately selected for production from that batch of material. For the lowest ranked documents, two separate methods were used to probe for potentially relevant materials. First, a variety of keyword searches were run against this subset. Second, every 500th document in this subset was individually reviewed by a member of the review team. These two processes provided a quality-control verification of the predictive coding technology's results. Neither approach identified any responsive documents for production, and these materials were set aside.
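The three tracks and the every-500th-document check can be sketched as follows. The score cutoffs are hypothetical (the article does not state where the clusters were divided), as are the function and track names. On the article's approximate figures, 60% of 750,000 is about 450,000 low-scoring documents, so an every-500th sample would come to roughly 900 documents.

```python
def triage(scores, high=0.8, low=0.2):
    """Split document ids into three tracks by relevance score.

    `scores` maps a document id to a score between 0 and 1.
    The `high` and `low` cutoffs are illustrative assumptions.
    """
    tracks = {"produce": [], "review": [], "set_aside": []}
    for doc_id, score in scores.items():
        if score >= high:
            tracks["produce"].append(doc_id)
        elif score <= low:
            tracks["set_aside"].append(doc_id)
        else:
            tracks["review"].append(doc_id)
    return tracks


def every_nth(items, n=500):
    """Systematic sample: the nth, 2nth, 3nth ... items of a list."""
    return items[n - 1::n]


# Size of the quality-control sample on the article's approximate figures.
low_scoring = round(750_000 * 0.60)
qc_sample_size = len(every_nth(list(range(low_scoring))))
```

A fixed-interval sample is simple to explain and to audit. If the low-scoring documents were ordered in some way that repeats at the same interval, a random sample of the same size would be the safer choice.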

Based on the predictive coding results, a large number of potentially relevant documents were defensibly excluded from time-consuming individual review. This permitted the team to focus on those documents that had a much greater likelihood of actual relevance and enabled the completion of the review on time and with substantially less effort than had been spent reviewing documents without the predictive coding ranking. Using predictive coding, review of the final 35% of the document collection took only about 10% of the total project time. The final 35% of the project incurred approximately 5% of the total project costs. Even with initial system training and substantial quality control throughout the review, the teams estimated that the review could have been completed in less than half the time required by standard document review had predictive coding been applied from the outset of the project.
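The percentages above can be converted into a rough per-document comparison. The sketch assumes the reported shares are exact, that the "final 35% of the project" refers to the same 35% of documents, and that the remaining time and cost were spent entirely on the other 65%. The function name is hypothetical.

```python
def per_document_ratio(share_of_docs: float, share_of_resource: float) -> float:
    """Resource used per document in the final tranche, relative to the rest."""
    final = share_of_resource / share_of_docs
    rest = (1 - share_of_resource) / (1 - share_of_docs)
    return final / rest


time_ratio = per_document_ratio(0.35, 0.10)  # time per document, final 35% vs. first 65%
cost_ratio = per_document_ratio(0.35, 0.05)  # cost per document, final 35% vs. first 65%
```

Under those assumptions the ratios come to roughly 0.21 and 0.10. That is, each document in the final tranche took about one-fifth of the time, and about one-tenth of the cost, of a document in the conventionally reviewed portion.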

Post-project review also determined that predictive coding had provided noticeably greater consistency in document classification. While quality control auditing of the “traditional review” portion of the document review found significant variability between the relevance determinations made by the 60 attorneys on the review team, documents organized through IQ Review were much more likely to be grouped consistently.

Conclusion

The final measure of a company's Second Request response is how it is received by the inquiring government entity. Stakes were high, as this transaction relied upon predictive coding technology to cull voluminous ESI that otherwise could not have been meaningfully reviewed in the time permitted to prepare responses. Based on its review of the materials provided, the DOJ raised no issues as to the sufficiency of the document production from either party, and it sought no additional evidentiary materials. Even more importantly, after receiving comprehensive information about the proposed transaction, including much more than the Second Request materials, the DOJ made no material objections, and the transaction successfully closed as planned. While it is possible that the DOJ would have reached the same conclusion even if it had received a less comprehensive submission from the parties, the use of predictive coding to identify materials responsive to the government's Second Request greatly increased the parties' confidence in the quality of their submission and also substantially reduced the overall cost of this component of the transaction.

Neither the DOJ nor the Federal Trade Commission has expressly accepted or endorsed the use of predictive coding for Second Request document responses. The agencies have asserted their ability to approve keyword search terms for the identification, and limitation, of documents extracted from a company's ESI systems. Once extracted, the identification of responsive documents is a privileged process that the agencies do not have the right to approve, whether that process is individual document review, keyword searching or predictive technology. As more courts recognize that “concept searching, as opposed to keyword searching, is more efficient and more likely to produce the most comprehensive results,” the antitrust agencies should also recognize the increased accuracy and comprehensiveness provided by predictive coding technologies.


David J. Laing is a Principal in Baker & McKenzie LLP. He works in all areas of antitrust law, including antitrust and other regulations affecting mergers and acquisitions. Laing previously was a Trial Attorney in the U.S. Department of Justice, Antitrust Division, and was a Special Assistant U.S. Attorney.
