Legal Tech: The 2019 EDRM TAR Guidelines — Recognizing the Evolving Role of the Subject Matter Expert

By Erin Baksa
May 01, 2019

After reading the new Technology Assisted Review (TAR) Guidelines from EDRM, it is clear that the evolution of the underlying technology in TAR solutions is reshaping the role of the subject matter expert (SME). While the Guidelines maintain the role of the SME — typically an experienced (and expensive) attorney most familiar with the project's subject matter — in ensuring reviewer accuracy and assisting in training the model, they also acknowledge the emergence of new technologies which can reduce the burden on the SME.

Newer active learning solutions allow continuous training of the model through a prioritized review. This spares the SME the multiple training and QC rounds associated with TAR 1.0 solutions and leaves more time for targeted training. TAR 1.0 solutions can take longer to train the model, whereas training of an active learning model begins once a smaller threshold number of documents has been reviewed.

Training sets, traditionally a TAR 1.0 feature, are offered with some active learning solutions. These allow the SME to elevate training through an isolated review of conceptually rich key documents, while the review team focuses on the prioritized review queue. The training set can be either a completely randomized sample across the corpus or a seed set supplemented with a randomized sample. The latter approach aims to minimize bias while still injecting richness into the sample.
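As an illustration only, the second option (a seed set supplemented with a randomized sample) might be sketched as follows. The function name `build_training_set`, the document IDs, and the sample size are hypothetical and do not reflect any particular platform's implementation.

```python
import random


def build_training_set(corpus_ids, seed_ids, sample_size, rng_seed=0):
    """Return the seed documents plus a random sample from the rest of the corpus."""
    rng = random.Random(rng_seed)  # fixed seed so the draw is repeatable
    seed = list(dict.fromkeys(seed_ids))  # de-duplicate while keeping order
    seed_lookup = set(seed)
    remaining = [doc_id for doc_id in corpus_ids if doc_id not in seed_lookup]
    k = min(sample_size, len(remaining))
    return seed + rng.sample(remaining, k)


# Hypothetical corpus of 1,000 documents and two SME-selected key documents.
corpus = [f"DOC-{i:04d}" for i in range(1, 1001)]
key_docs = ["DOC-0007", "DOC-0420"]

training_set = build_training_set(corpus, key_docs, sample_size=50)
print(len(training_set))  # 52
```

Passing an empty seed list yields the first option, a purely randomized sample across the corpus.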

The Guidelines acknowledge that there are differing views on the best method of selecting training sets. The different approaches result from varying levels of concern over the bias introduced into the training set by relying on “human judgment” or “differing preferences by human reviewers” to select the documents. The Guidelines instruct that any approach to selecting training data will produce an effective predictive model if it is used to produce a sufficiently broad training set. “Thus, differing views over selection of training data are less about whether an effective predictive model can be produced, than about how much work it will take to do so.”

Newer TAR solutions alleviate the burden of training in other ways. In some platforms, multiple models can run concurrently. This allows a reviewer training for relevance to simultaneously train for privilege or specific issues, thereby cutting back on costly re-review efforts.

Active learning solutions can also more easily address the challenge of supplemental collections. With earlier (TAR 1.0) solutions, when new data sets introduced new document features or concepts to the corpus, the model would need additional training in order to properly understand and categorize these new document types. Due to the static nature of the predictive coding index, each addition of this type would require the training process to be started anew. That meant rebuilding the index and repeating the human review process, which can include coding a seed set and conducting the numerous rounds of training and QC review needed to reach stability.

With an active learning solution, since the model is continuously learning and improving its predictions, it can leverage its existing training to incorporate the new collection. This avoids the need to “start from scratch.”
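The idea can be sketched with a deliberately simple toy model. Everything below is an assumption made for illustration: the `TokenModel` class, its word-count scoring, the `review_batch` helper, and the sample documents are hypothetical and are not how any commercial TAR product works. The point is only that the model's earlier training carries over when a supplemental collection is added to the review pool.

```python
from collections import Counter


class TokenModel:
    """Toy relevance model built from per-token counts of coded documents."""

    def __init__(self):
        self.relevant = Counter()
        self.not_relevant = Counter()

    def update(self, text, is_relevant):
        target = self.relevant if is_relevant else self.not_relevant
        target.update(set(text.lower().split()))

    def score(self, text):
        tokens = set(text.lower().split())
        if not tokens:
            return 0.5
        # Smoothed share of "relevant" codings per token, averaged over tokens.
        return sum(
            (self.relevant[t] + 1) / (self.relevant[t] + self.not_relevant[t] + 2)
            for t in tokens
        ) / len(tokens)


def review_batch(model, pool, coder, batch_size):
    """Serve the highest-scoring unreviewed documents, then learn from the coding."""
    ranked = sorted(pool, key=lambda doc_id: model.score(pool[doc_id]), reverse=True)
    batch = ranked[:batch_size]
    for doc_id in batch:
        model.update(pool.pop(doc_id), coder(doc_id))
    return batch


# Hypothetical original collection and reviewer decisions.
pool = {
    "A1": "merger pricing memo",
    "A2": "lunch menu friday",
    "A3": "pricing agreement draft",
    "A4": "parking notice",
}
truth = {"A1": True, "A2": False, "A3": True, "A4": False, "B1": True, "B2": False}

model = TokenModel()
review_batch(model, pool, truth.get, batch_size=4)

# A supplemental collection arrives: it is added to the pool and the
# existing model is kept rather than rebuilt.
pool.update({"B1": "pricing schedule rebate", "B2": "parking lunch rota"})
first_served = review_batch(model, pool, truth.get, batch_size=1)
print(first_served)  # ['B1']
```

Because the model already associates "pricing" with relevant documents, the new document containing that term is prioritized without any retraining round.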

With the time saved on model training through active learning, the SME can lend more of their expertise to QC review. In active learning solutions, differences between human coding decisions and model predictions are typically served up in two separate conflicts queues. These queues can be batched out or sampled for SME review. Where the documents in the project consist of user-created content and represent multiple concepts, the data set is considered to have a high conceptual richness. This may lead to a higher percentage of documents with features that the predictive coding model does not understand, which in turn can lead to disparate confidence levels and document populations with low coverage, posing a challenge to training.
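A minimal sketch of how the two conflicts queues described above might be assembled is shown below. It assumes one queue holds documents a reviewer coded relevant that the model scored low, and the other holds documents coded not relevant that the model scored high. The function name, the 0.8 and 0.2 thresholds, and the sample data are hypothetical.

```python
def build_conflict_queues(coded_docs, high=0.8, low=0.2):
    """Split human/model disagreements into two queues.

    coded_docs: iterable of (doc_id, human_relevant, model_score) tuples,
    where model_score is a predicted relevance between 0 and 1.
    """
    coded_relevant_scored_low = []
    coded_irrelevant_scored_high = []
    for doc_id, human_relevant, score in coded_docs:
        if human_relevant and score <= low:
            coded_relevant_scored_low.append(doc_id)
        elif not human_relevant and score >= high:
            coded_irrelevant_scored_high.append(doc_id)
    return coded_relevant_scored_low, coded_irrelevant_scored_high


# Hypothetical coding decisions paired with model scores.
coded = [
    ("DOC-001", True, 0.93),
    ("DOC-002", True, 0.12),
    ("DOC-003", False, 0.88),
    ("DOC-004", False, 0.05),
    ("DOC-005", True, 0.55),
]

missed_by_model, flagged_by_model = build_conflict_queues(coded)
print(missed_by_model)   # ['DOC-002']
print(flagged_by_model)  # ['DOC-003']
```

Either list could then be batched out or sampled for SME review.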

The model's understanding of these documents and resulting prediction scores can be improved by training the system on more documents from lower coverage sets. To address this problem, some of today's active learning solutions have coverage queues and visualizations which eliminate the need for complex saved searches to review these sets. The SME can, therefore, easily sample documents from these sets to improve predictions for the greater review team.

With earlier TAR technologies, the SME might have been heavily involved with training the model throughout the life of the project. The newer features of today's active learning solutions can help to alleviate that burden and free up time for other priorities. By lowering the barrier to implementation in both time and cost, active learning has become a more attractive option for meeting the proportionality and reasonableness requirements of review, both for the end client and the SME.

*****

Erin Baksa is a Senior Business Development Manager at Everlaw. Prior to Everlaw, she worked in ediscovery consulting as a Senior Manager for the Forensic Technology Services team at A&M Asia in Hong Kong. Previous consulting firms include Stroz Friedberg and DTI. Erin is a licensed attorney and has worked in the litigation industry for over 10 years.

 
