Technology-Assisted Review: One Size Doesn't Fit All

By Hope Swancy-Haslam
October 31, 2012

As data volumes increase year after year, counsel are focused on managing two key issues inherent in litigation: the cost and the time it takes to complete a large-volume document review. This article describes how leveraging technology to accelerate review, known as Technology-Assisted Review (TAR), is an effective tool for managing these issues. It then outlines the two key approaches to substantially accelerating review, the artificial intelligence-based and language-based methodologies, and discusses their relative benefits. Finally, the article recommends best practices, drawn from recent case law, for implementing each approach, and explains how to decide when to use each one.

Technology-Assisted Review Overview

Accelerating document review in litigation is a hot topic for a number of reasons, most notably because corporate-generated electronically stored information (ESI) is growing at a rapid pace while budgets remain flat. To make matters worse, the time constraints on litigation, discovery and document review are tightening as dockets grow more crowded and less flexible. All of these factors combine to make the cost and pace of review a very real concern in litigation.

Historically, organizations have reviewed every document in a collection in order to minimize the perceived risk of producing privileged information or missing relevant data. But volumes are increasing such that this “linear review” is no longer a financially feasible approach.

Technology-assisted review has surfaced as a solution to these problems because of its ability to significantly decrease the time and expense of determining the potential relevance of possibly millions of documents in a collection, a process that can take up to 75% of the e-discovery budget. It often involves the interplay of humans and computers, and may use one or more technological approaches that can include keyword search, Boolean querying, artificial intelligence, clustering, relevance ranking and/or sampling.

Despite the clear benefits of TAR, apprehension persisted until recently. Important factors, such as the exact savings it can deliver, how the technology works and, importantly, how competing offerings differ, remain perplexing for counsel as they determine the best approach to take. That apprehension is now waning, however, thanks to recent opinions encouraging the use of TAR, and it is thus important for today's counsel to understand the technology and how to leverage the right alternative for the right problem.

Two Key Approaches

Today, two general TAR approaches are emerging: one that leverages artificial intelligence to identify potentially relevant data in a document collection, and another that relies on a human's understanding of language to do the same. Both deliver significant savings in time and cost, but each has specific instances in which its use is ideal.

For artificial intelligence-based TAR, two elements are most often in play: the need to arrive at quick decisions early in the litigation (assessing the case early to decide whether to settle or litigate), and having enough time available to read on the order of 10,000 documents to train the system with a “seed set.” A language-based TAR approach is most appealing when transparency and insight into coding decisions are of paramount concern, when the ability to audit reviewers in real time is important, and when an organization wants to incorporate the approach as a regular business practice.

Importantly, recent court opinions support both approaches, most notably Judge Andrew Peck's order in Da Silva Moore v. Publicis Groupe & MSL Group, No. 11 Civ. 1279 (ALC) (AJP) (S.D.N.Y. Feb. 24, 2012), and Kleen Products v. Packaging Corporation of America, Case No. 10 C 5711 (N.D. Ill. April 8, 2011).

In Da Silva, Judge Peck specifically holds that:

[Technology]-assisted review is an acceptable way to search for relevant ESI in appropriate cases.

Because an artificial intelligence-based approach was at issue in that case, this statement gives counsel clear comfort in considering such an approach to expedite document review and minimize its cost.

In Kleen, a case in which the use of a language-based analytics workflow in document review was litigated, Judge Nan Nolan held for the producing party for a number of reasons, but specifically because its approach had been embraced by the court system for years. She relied in particular on Principle 6 of the Sedona Best Practices, Recommendations and Principles for Addressing Electronic Document Production in justifying her decision. Principle 6 directs that:

Responding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information. [Emphasis added.]

With this set of decisions in play, the runway for TAR is clear. Now, we must determine which approach is most appropriate for each case.

Choosing the Right Method

Generally, the makeup of your case and your data set will influence which TAR approach to take. Specific factors to consider include, but are not limited to:

  • The estimated budget for the case;
  • The total amount in controversy;
  • The time allowed for producing responsive documents;
  • The volume of potentially relevant data identified for document review; and
  • The need for transparency in support of your selection.

Regardless of the approach selected, particular attention must be given to The Sedona Conference Cooperation Proclamation before the approach is implemented. Notably, both Da Silva and Kleen reference the Proclamation as a key basis for their decisions. The Da Silva opinion provides:

Of course, the best approach to the use of computer-assisted coding (Technology-Assisted Review) is to follow the [Sedona Cooperation Proclamation] model. Advise opposing counsel that you plan to use computer-assisted coding and seek agreement; if you cannot, consider whether to abandon predictive coding for that case or go to the court for advance approval.

Da Silva Moore, No. 11 Civ. 1279, Slip Op. at 5 (Feb. 24, 2012).

Without a showing that such an agreement is in place, defending your TAR protocol against a challenge will likely be much more difficult.

Best Practices for Both Alternatives

Looking first at an artificial intelligence-based approach, it is important to document the following at the planning stage:

  • The parties' agreement;
  • The relative amount of ESI to be reviewed;
  • The superiority of an artificial intelligence-based review to the available alternatives;
  • The need for cost effectiveness and proportionality under Rule 26(b)(2)(C); and
  • The transparency of the process.

Once an agreement has been reached between the parties on this approach, the producing party should be able to address the following questions to support the results:

  • What was done to implement the agreed-upon process?
  • Why has that process produced a defensible result?
  • Were the documents used to train the system shared with opposing counsel in advance?
  • Can a showing be made that sufficient quality control testing was done to validate the results?

Da Silva Moore, Slip Op. at 22.

Keep in mind that there can be a “blind spot” in this process: the crafting of the seed set. A few years ago, the common practice was to review as few as 500 documents in order to train the system on what to look for within the collection. In short, the system would say, “show me some documents (or parts of documents) you are looking for and I'll find others like them.” With the volume of electronically stored information in litigation today comprising terabytes of data, a best practice now is to build a seed set of approximately 10,000 documents. This better ensures that all semantic patterns, which fuel the artificial intelligence by establishing associations between terms that occur in similar contexts, are captured, helping protect you from under-inclusive results (missing critical data) and over-inclusive results (pushing irrelevant data into later rounds of review, which increases downstream time and cost). This task is important enough to the entire process that a senior-level attorney should select the seed documents, so plan accordingly at the outset.
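
To make the mechanics concrete, the following is a minimal, illustrative sketch of how a seed set can train a statistical relevance model. It assumes Python with scikit-learn, and the documents and labels are invented placeholders; commercial TAR engines use their own proprietary methods, so treat this only as a conceptual picture of how attorney coding decisions propagate to the unreviewed collection.

```python
# Illustrative only: a toy relevance model trained on an attorney-coded seed
# set. Library choice (scikit-learn) and the documents below are assumptions
# for the sketch, not a vendor's actual implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Seed set: text and relevance calls from senior-attorney review (placeholders).
seed_docs = [
    "Memo regarding the proposed pricing agreement with the distributor.",
    "Reminder: the cafeteria menu changes next week.",
    "Draft term sheet covering rebates and volume discounts.",
    "Office closed Monday for the holiday.",
]
seed_labels = [1, 0, 1, 0]  # 1 = relevant, 0 = not relevant

# TF-IDF features capture which terms occur in similar contexts; the
# classifier learns weights from the attorney's coding decisions.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
model.fit(seed_docs, seed_labels)

# Score the unreviewed collection and rank it by predicted relevance, so the
# documents most likely to matter are surfaced first.
collection = [
    "Please review the attached pricing memo before the quarterly call.",
    "The parking garage will be repainted this weekend.",
]
scores = model.predict_proba(collection)[:, 1]
for score, doc in sorted(zip(scores, collection), reverse=True):
    print(f"{score:.2f}  {doc}")
```

The ranked scores are what drive which documents advance to review, which is why the breadth and quality of the seed set matter so much.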

Contrast this with the language-based approach, which relies on human intelligence rather than artificial intelligence to drive the decision-making process. Again, this approach is often selected for cases where there are greater transparency requirements, often driven by the significance of the case to the producing party.

Overall, the focus is on ensuring that the relevant and potentially relevant documents are considered first. Meet with your consultant to discuss and memorialize the key issues in the case, and let the consultant craft sample queries for document prioritization prior to review. Assign a senior attorney to apply these queries against the language in the data set and build synonym associations with related words. Based on these synonym associations, the workflow assigns documents to the following categories and advances (or suppresses) them accordingly (a simplified sketch of this triage follows the list below).

  • “Highly relevant” documents:
    • 10% of the data set.
    • Immediately advanced to second-pass review.
  • “Might be relevant” documents:
    • 50% of the data set.
    • Reviewed in first-pass review.
  • “Not relevant” documents:
    • 40% of the data set.
    • Suppressed.
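
As a rough illustration of that triage logic, here is a minimal sketch assuming attorney-drafted issue queries expanded with synonyms; the terms, synonym lists and tiering thresholds are invented for the example and would come from case counsel and the consultant in practice.

```python
# Illustrative triage sketch for a language-based workflow. Issue queries,
# synonym lists and thresholds are assumptions made for this example.
ISSUE_SYNONYMS = {
    "pricing": {"price", "pricing", "quote", "rate", "discount"},
    "agreement": {"agreement", "contract", "deal", "arrangement"},
}

def tier(document: str) -> str:
    """Assign a document to a review tier based on which issues it touches."""
    words = set(document.lower().split())
    hits = sum(1 for synonyms in ISSUE_SYNONYMS.values() if words & synonyms)
    if hits == len(ISSUE_SYNONYMS):
        return "highly relevant"    # advance straight to second-pass review
    if hits > 0:
        return "might be relevant"  # route to eyes-on first-pass review
    return "not relevant"           # suppress, subject to QC sampling

print(tier("Draft pricing agreement attached for your review"))
print(tier("Can you send over the latest rate card?"))
print(tier("Lunch is in the break room today"))
```

The point of the sketch is the routing, not the matching: full hits skip ahead, partial hits get eyes-on review, and misses are suppressed pending quality-control sampling.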

The documents marked “might be relevant” should be run through an eyes-on (first-pass) review in which the potential relevance of each document is weighed. Consider leveraging technology that allows reviewers to highlight the specific language within each document that, in their judgment, made the document relevant. This allows you to audit reviewer decisions and adjust a reviewer's understanding of the matter in real time, as well as “bulk tag” other documents containing similar language, reducing the remaining document collection by up to 50%. Further, make sure you sample the documents previously marked “not relevant” to establish measurable certainty that no other potentially relevant documents were left behind. With the right methodology and testing, you can achieve a level of assurance of up to 99.9% that all responsive documents have been identified.
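
The statistics behind that sampling step are straightforward. Below is a minimal sketch, assuming simple random sampling of the suppressed set with a zero-defect acceptance criterion; the 99.9% confidence target, 1% miss-rate threshold and document IDs are illustrative assumptions, not a prescribed protocol.

```python
# Illustrative QC-sampling math for the suppressed "not relevant" set.
import math
import random

def sample_size(confidence: float, max_miss_rate: float) -> int:
    """Smallest n such that finding zero relevant documents in a random sample
    of size n supports the claim, at the given confidence level, that the true
    proportion of missed relevant documents is below max_miss_rate."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_miss_rate))

n = sample_size(confidence=0.999, max_miss_rate=0.01)
print(n)  # 688 documents to sample

# Draw the QC sample from the suppressed population (IDs are placeholders).
suppressed_ids = list(range(400_000))
qc_sample = random.sample(suppressed_ids, n)
```

Notably, the required sample size depends on the confidence target and tolerated miss rate rather than on the size of the suppressed population, which is why statistical sampling scales well as collections grow.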

Finally, a clear and unique benefit of the language-based approach is the reusability of the work product. Particularly for those in highly regulated or litigious industries, make sure you don't approach each new matter as if it were your first. Save your work product from previous cases and build on it to drive even greater savings in future cases.

Regardless of which approach you choose, remember that implementing review acceleration technology and managing a case from beginning to end can be a difficult process, and it may require resources you do not have on staff. Consider retaining a technology and legal workflow expert to help you choose the right approach and to supplement your current on-staff strengths. This will improve the chances that the workflow is designed to meet the specific needs of your case and optimize the results.

Conclusion

As we have learned from the opinions coming out of the Da Silva and Kleen matters, technology-assisted review is fast becoming a standard in document review. Of course, not all TAR approaches are appropriate for all cases: there is no one-size-fits-all solution. Be sure to select the right methodology for each unique problem, and take proactive steps to ensure that you achieve optimal results from your selected approach.


Hope Swancy-Haslam is Director of Analytics Market Development for RenewData. She has been delivering technology solutions to the legal market since 1992, including roles as Director of Electronic Discovery Services for a regional consulting firm based in Dallas, TX, as well as similar positions with Merrill Corporation and Engenium Corporation. She can be reached at [email protected].
