Recent advances in technology-assisted review (what I call “TAR 2.0”) include the ability to deal with low richness, rolling collections, and flexible inputs, in addition to vast improvements in speed. These improvements now allow TAR to be used effectively in many more discovery workflows than its traditional “TAR 1.0” use in classifying large numbers of documents for production.
To understand why these enhanced capabilities are useful, it helps to first break down the kinds of tasks we face in document review. Broadly speaking, they fall into three categories:
Classification
This is the most common form of document review in which documents are sorted into buckets such as responsive or non-responsive so we can do something different with each class of document. The most common example here is a review for production.
Protection
This is a higher level of review in which the purpose is to protect certain types of information from disclosure. The most common example is privilege review, but this also encompasses trade secrets and other forms of confidential, protected, or even embarrassing information, such as personally identifiable information (PII) or confidential supervisory information (CSI).
Knowledge Generation
The goal here is learning what stories the documents can tell us and discovering information that could prove useful to our case. A common example of this is searching and reviewing documents received in a production from an opposing party, or searching a collection for documents related to specific issues or deposition witnesses.
Many readers are probably already quite familiar with these types of tasks, but I want to be explicit and discuss them in detail because each of the three has distinctly different recall and precision targets, which in turn have important implications for designing workflows and integrating TAR.
Metrics
Let's quickly review those two crucial metrics for measuring the effectiveness and defensibility of your discovery processes: “recall” and “precision.” Recall is a measure of completeness, the percentage of relevant documents actually retrieved. Precision measures purity, the percentage of retrieved documents that are relevant.
The higher the percentage of each, the better you've done. If you achieve 100% recall, then you have retrieved all the relevant documents. If all the documents you retrieve are relevant and have no extra junk mixed in, you've achieved 100% precision. But recall and precision are not friends. Typically, a technique that increases one will decrease the other.
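To make the arithmetic concrete, here is a minimal sketch of how the two metrics are computed from basic retrieval counts. The numbers are purely hypothetical and are not drawn from any real matter; they simply show how the same process can score well on one metric and worse on the other.

```python
# Recall and precision from basic retrieval counts (hypothetical numbers).
relevant_retrieved = 800     # relevant documents the process actually found
relevant_missed = 200        # relevant documents the process failed to find
irrelevant_retrieved = 400   # junk swept in along with the relevant documents

# Recall: share of all relevant documents that were retrieved.
recall = relevant_retrieved / (relevant_retrieved + relevant_missed)          # 800 / 1000 = 0.80

# Precision: share of everything retrieved that is actually relevant.
precision = relevant_retrieved / (relevant_retrieved + irrelevant_retrieved)  # 800 / 1200 ≈ 0.67

print(f"recall = {recall:.0%}, precision = {precision:.0%}")  # recall = 80%, precision = 67%
```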
This engineering tradeoff between recall and precision is why it helps to be explicit and think carefully about what we're trying to accomplish. Because the three categories of document review have different recall and precision targets, we must choose and tune our technologies, including TAR, with these specific goals in mind so that we maximize effectiveness and minimize cost and risk. Let me explain in more detail.
Classification Tasks
Start with classification: the sorting of documents into buckets. We typically classify so that we can do different things with different subpopulations, such as review, discard, or produce.
Under the Federal Rules of Civil Procedure, and as emphasized by The Sedona Conference and any number of court opinions, e-discovery is limited by principles of reasonableness and proportionality. As Magistrate Judge Andrew J. Peck wrote in the seminal case, Da Silva Moore v. Publicis Groupe, 2012 U.S. Dist. LEXIS 23350 (S.D.N.Y. Feb. 24, 2012):
The goal is for the review method to result in higher recall and higher precision than another review method, at a cost proportionate to the ‘value’ of the case.
As Judge Peck suggests, when we're talking document production, the goal is to get better results, not perfect results. Given this, you want to achieve reasonably high percentages of recall and precision, but with cost and effort that is proportionate to the case.
Thus, a goal of 80% recall, a common TAR target, could well be reasonable when reviewing for responsive documents, especially when current research suggests that the “gold standard” of complete eyes-on review by attorneys can't do any better than that at many times the cost.
Precision must also be reasonable, but requesting parties are usually more interested in making sure they get as many responsive documents as possible. So recall usually gets more attention here.
Protection Tasks
By contrast, when your task is to protect certain types of confidential information (most commonly privilege, but it could be trade secrets, confidential supervisory information, or anything else where the bell can't be unrung), you need to achieve 100% recall. Period. Nothing can fall through the cracks. This tends to be problematic in practice, as the goal is absolute perfection and the real world seldom obliges.
To approximate this perfection in practice, we usually need to use every tool in our toolkit to identify the documents that need to be protected (not just TAR but also keyword searching and human review) and use them effectively against each other.
The reason for this is simple: Different review methods make different kinds of mistakes. Human reviewers tend to make random mistakes. TAR systems tend to make very systematic errors, getting entire classifications of documents right or wrong. By combining different techniques into our workflows, one serves as a check against the others.
This is an important point about TAR for data protection tasks, and one I want to reemphasize. The best way to maximize recall is to stack techniques, not to replace them. Because TAR doesn't make the same class of errors as search terms and human review, it makes an excellent addition to privilege and other data protection workflows, provided the technology can deal with low prevalence and be efficiently deployed.
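To see why stacking helps, consider a rough back-of-the-envelope sketch. It assumes each method's misses are independent of the others', which real workflows only approximate, and the recall figures below are hypothetical rather than drawn from any study:

```python
# Illustrative only: recall achieved when several review methods are stacked,
# i.e., a document is flagged for protection if ANY method flags it.
# Assumes the methods miss documents independently (an approximation).

def stacked_recall(recalls):
    miss_rate = 1.0
    for r in recalls:
        miss_rate *= (1.0 - r)   # chance that every method misses a given document
    return 1.0 - miss_rate

# Hypothetical recalls: TAR 85%, search terms 70%, human review 75%.
print(f"{stacked_recall([0.85, 0.70, 0.75]):.3f}")  # -> 0.989, versus 0.85 for the best method alone
```

The point of the sketch is not the particular numbers but the structure: because the methods fail in different ways, the set of documents they all miss is far smaller than the set any one of them misses.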
Precision, on the other hand, is somewhat less important when your task is to protect documents. It doesn't need to be perfect, but because these tasks typically demand many attorney hours, they're usually the most expensive part of review, and including unnecessary junk gets expensive quickly. You still want to achieve a fairly high level of precision (particularly to avoid having to log documents unnecessarily if you are maintaining a privilege log), but recall is still the key metric here.
Knowledge Generation Tasks
The final task we described above is where we get the name “discovery” in the first place. What stories do these documents tell? What stories can my opponents tell with these documents? What facts and knowledge can we learn from them? This is the discovery task that is often the most Google-like. For knowledge generation, we don't really care about recall. We don't want all the documents about a topic; we just want the best documents about a topic, the ones that will end up in front of deponents or be used at trial.
Precision is therefore the most important metric here. You don't want to waste your time going through junk, or even duplicative and less relevant documents. This is where TAR can also help, prioritizing the document population by issue and concentrating the most interesting documents at the top of the list so that attorneys can quickly learn what they need to litigate the case.
One nitpicky detail about TAR for issue coding and knowledge generation should be mentioned, however. TAR algorithms rank documents according to their likelihood of getting a thumbs-up or a thumbs-down from a human reviewer. They do not rank documents based on how interesting they are. For example, in a review for responsiveness, some documents could be very easy to predict as being responsive, but not very interesting. On the other hand, some documents could be extremely interesting, but harder to predict because they are so unusual.
On the gripping hand, however, the more interesting documents tend to cluster near the top of the ranking. Interesting documents sort higher this way because they tend to contain stronger terms and concepts as well as more of them. TAR's ability to concentrate the interesting documents near the top of a ranked list thus makes it a useful addition to knowledge-generation workflows.
Conclusion
With this framework for thinking about, developing, and evaluating different discovery workflows, you are now better prepared to get into the specifics of how TAR can best be used for the particular tasks at hand.
In the end, the critical factor in your success will be how effectively you use all the tools and resources you have at your disposal, and TAR 2.0 is a powerful new addition to your toolbox.