Marketing Tech: Data Science for the Law

By Andrew Duchon
November 02, 2017

“Data Science” and “Artificial Intelligence” are terms being thrown around in every field, including legal. But what are they? Why are they generating so much excitement these days? There are many definitions out there, but Data Science is really just statistics of the real world, including business and customer data.

The question is how do you get the right statistics? Often, fitting a line to your data is all you need to do, but what if something more sophisticated is required?
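As a minimal sketch of what "fitting a line" means, the Python snippet below computes an ordinary least-squares line through a handful of points. The data (RFPs submitted per quarter versus new matters won) and the function name `fit_line` are made up for illustration; they are not figures from any real firm.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: RFPs submitted per quarter vs. new matters won.
rfps = [2, 4, 6, 8]
wins = [1, 2, 3, 4]

slope, intercept = fit_line(rfps, wins)

# Use the fitted line to estimate wins if 10 RFPs were submitted.
predicted = slope * 10 + intercept
```

With real data the points will not fall exactly on a line, and the fitted slope is only as trustworthy as the data behind it.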

Machine Learning

Machine learning is one way to get the right statistics. It is a suite of techniques. Some are supervised: given input X (a bunch of colored pixels), the system learns to call it Y (a cat). Others are unsupervised: given a bunch of Xs, the system clusters them into groups, say two, which we might later label cats and dogs.
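To make the supervised/unsupervised distinction concrete, here is a small sketch in plain Python. It uses two deliberately simple stand-ins: a nearest-centroid classifier for the supervised case and a two-cluster k-means loop for the unsupervised case. The feature values (weight in kg, ear length in cm) and all function names are hypothetical, and real systems working on pixels are far more elaborate.

```python
def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Supervised: learn one centroid per label from labeled examples.
def train(labeled):
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the features."""
    return min(model, key=lambda label: dist2(model[label], features))


# Unsupervised: no labels, just split the points into two groups.
def two_means(points, iterations=10):
    centers = [points[0], points[-1]]  # naive initialization (an assumption)
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearest = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            groups[nearest].append(p)
        # Recompute each center; keep the old one if its group is empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Hypothetical data: (weight in kg, ear length in cm).
labeled = [
    ((4.0, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 10.0), "dog"),
    ((25.0, 12.0), "dog"),
]

model = train(labeled)
guess = predict(model, (6.0, 7.0))

unlabeled = [features for features, _ in labeled]
groups = two_means(unlabeled)
```

The supervised model can name its answer ("cat") because it was given names to learn from. The clustering step only returns two unnamed groups, which a person would label afterwards.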

One type of machine learning receiving a lot of news coverage these days is neural networks, or deep learning, which is (very) loosely based on the structures of neurons in your brain. Perhaps you've seen pictures of the human brain with the different parts labeled. These are essentially layers of neurons that provide more and more sophisticated analyses.

Recently, techniques have been discovered to build artificial neural networks of the same complexity. These deep neural networks have been trained to not just differentiate cats from dogs, but even to “dream” and create art, music and paintings.

Other Tools

Data science has other tools in its toolbox. Network Analysis is a relatively distinct set of techniques that are used to get the right statistics when your data is a network, like friends on Facebook, or website links, or supply chains. Natural Language Processing can leverage the structure of language (subjects before predicates in English, usually) to get the right statistics of language, though recent work in deep learning indicates that even this structure can be learned with enough data.
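One of the simplest network statistics is degree centrality: how many direct connections each node has, relative to the maximum possible. The sketch below computes it for a tiny, invented referral network; the firm and client names are placeholders, and real network analysis typically uses richer measures than this one.

```python
from collections import defaultdict

# Hypothetical referral network: each pair is an undirected link.
edges = [
    ("Firm A", "Client X"),
    ("Firm A", "Client Y"),
    ("Firm A", "Firm B"),
    ("Firm B", "Client Y"),
]

# Build an adjacency map: node -> set of directly linked nodes.
neighbors = defaultdict(set)
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

# Degree centrality: fraction of the other nodes each node is linked to.
n = len(neighbors)
centrality = {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

most_connected = max(centrality, key=centrality.get)
```

Here `most_connected` is the node linked to the largest share of the network, which is one rough way of asking who sits at the center of a set of relationships.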

Artificial Intelligence

There are many definitions of Artificial Intelligence (AI), but I like to think that AI refers not just to knowing the right statistics, but acting on them in real-time interactions with the real world, including humans, and working within those constraints.

Robots and self-driving cars are subject to gravity, momentum and traffic laws. Chatbots are (or should be) constrained by politeness. These are constraints that an AI needs to understand up front, rather than having to learn them, in order to be active in the real world. Software systems in which the user can interactively train the system, like some new ones in the legal field, have a kind of AI veneer on top of the machine learning. Finally, beneath all these techniques is a substrate of hardware engineering, software, databases and big data techniques that has made all of this possible and not frustratingly slow.

Data Science and You

So, what can Data Science do for you as legal marketers and business development professionals? Data Science answers questions. The types of questions fall into a few major groups:

  1. Classification: What animal is in this picture? How will this judge decide this case?
  2. Similarity and Clustering: Which document has a provision most similar to this one? How can we group our clients for targeted marketing?
  3. Generation: What caption would best describe this scene? Given the basic terms of a sale, how does the legal sales agreement read?
  4. Planning and Optimization: What are some of the key developing risks affecting companies in XYZ industry? Which question should I ask next to most quickly end this testimony?
  5. Regression: Given the news and the market, is this company likely to be facing increased legal fees? Given the RFPs we've submitted, what can we expect for revenue next year?

If you can characterize your problem as asking one of these questions, then a data science solution may be created for you.
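To illustrate one of these question types, the similarity question ("Which document has a provision most similar to this one?") can be sketched with word counts and cosine similarity. This is a deliberately basic bag-of-words approach, not the deep learning methods discussed elsewhere in this article, and the provisions, document names and function names are all invented for the example.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words vector: lowercase word -> count."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical provisions from three documents.
provisions = {
    "doc1": "The seller shall indemnify the buyer against all losses.",
    "doc2": "This agreement is governed by the laws of New York.",
    "doc3": "Payment is due within thirty days of the invoice date.",
}
query = "The seller will indemnify the buyer for any losses."

query_vec = vectorize(query)
scores = {
    name: cosine(query_vec, vectorize(text))
    for name, text in provisions.items()
}

# The highest-scoring document holds the most similar provision.
best_match = max(scores, key=scores.get)
```

Counting shared words is crude: it treats "shall" and "will" as unrelated and gives common words like "the" too much weight. That weakness is part of why more sophisticated text models are attractive.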

More and more, software companies are focusing these solutions on the legal industry. While eDiscovery software has been around for years, machine learning techniques are now being applied to, e.g., contracts, both for discovery and generation, to jurisdiction selection (which judge is most likely to favor my client?), and to anomaly detection in narratives.

Most of these solutions, and most of what lawyers and chief knowledge officers address, is text. Lots and lots of text. Identifying the right text, generating the right text, summarizing the text, exposing issues with the text, understanding the implications of the text, etc. These are all problems that lawyers and legal professionals (CI experts, business analysts) must constantly address.

The good news is that this is a golden age for data science: There are really no more impediments to developing solutions to these issues. You can get all the data you want (through cloud-based services — which you also have to pay for). Open-source software is available to implement anything; and though you may spend a lot of time understanding it, free online training is also there to teach yourself how to use it.

Of note is open-source software that has come out in the last couple of years that makes it much easier to apply deep neural networks to text. This reduces the need for hard-to-find linguists, who have to work intimately with the data creating rules, and allows developers to focus directly on the problem of translating user need into products addressing those needs.

Of course, understanding those needs takes a lot of time, as well as understanding how the available data may be analyzed to address those needs. Then it boils down to creativity and effort.

Conclusion

Building deep neural networks is still much more of an art than a science, so creativity and intuition are required. A lot of the real effort though, where data scientists like myself spend most of our time, is just addressing the data itself: understanding it, “cleaning” it, and moving it around in scalable ways. This is the non-“sexy” part of the job that no one talks about, but one I strangely find interesting.

In any case, you should expect to see many new products based on these capabilities over the next few years, including products from my company, Manzama. One example is Manzama Signals™, which will be available in early 2018 (beta release for clients only this month) and is designed to help marketing and business development professionals understand corporate news much more quickly and thoroughly through classification and clustering. With this foundation, other techniques from data science will be able to help these professionals answer myriad questions we're just beginning to explore.

*****
Andrew Duchon, Ph.D., Director, Data Science at Manzama, directs the areas of business intelligence, computational linguistics, network analysis and machine learning. He may be reached at [email protected].
