Software for document review has existed for years, but it typically only helps to store and organise contracts. Enter NLP-based software, which has raised the bar for what can be accomplished. Even though a firm stands to lose 5% to 40% of the value of a given deal to inefficiency, contracting remains an activity that only some companies do efficiently.
Moreover, legal text is challenging to comprehend: it is verbose and dense, and expert-annotated datasets are scarce. The Atticus Project, a non-profit organisation, has introduced the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset.
Legal NLP landscape
MAUD is based on the American Bar Association’s 2021 Public Target Deal Points Study and contains over 39,000 examples and over 47,000 annotations manually labelled by legal experts.
Previously, in 2021, the non-profit also released the Contract Understanding Atticus Dataset (CUAD), annotated by lawyers. The extensive dataset, estimated to have cost over $2 million to produce, comprises more than 13,000 labels across 510 commercial legal contracts.
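Both corpora are distributed openly. Below is a minimal sketch of loading CUAD with the Hugging Face `datasets` library; the hub identifier `cuad` and the SQuAD-style field names are assumptions about how the corpus is published, so check the Atticus Project's release page for the canonical source.

```python
# A minimal sketch: loading CUAD for extractive QA experiments.
# Assumes the corpus is published on the Hugging Face hub as "cuad"
# in SQuAD format -- verify against the Atticus Project's own release.
from datasets import load_dataset

cuad = load_dataset("cuad", split="train")

# Each example pairs a clause-level question with a contract excerpt.
example = cuad[0]
print(example["question"])
print(example["context"][:500])  # first 500 characters of the contract text
print(example["answers"])        # expert-annotated answer spans
```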

Legal expert systems have been a hot topic of discussion since the 1970s. Early approaches used the presence of key terms and headings to guide information extraction, and many offerings likely still rely on some proportion of rule-based technology; not surprisingly, though, nearly all recent entrants into the space use more sophisticated machine learning techniques.
Goa-based Contractzy (formerly known as ‘The Legal Capsule’) was founded by Gautami Raiker in 2018. The contract lifecycle management (CLM) platform was the only woman-founded Indian startup selected for Microsoft’s Emerge X Programme ‘Highway to 100 Unicorns’.
SpotDraft, founded a year earlier in 2017, started leveraging AI to automate and streamline the lengthy and complex contract lifecycle. Speaking to AIM, Madhav Bhagat, co-founder and CTO of SpotDraft, said many legal datasets have been released, alongside publicly available (but unannotated) contract data from sources like SEC EDGAR and India’s MCA. SpotDraft used these datasets to create ‘legal pre-trained transformer models’ that understand legal concepts better than standard off-the-shelf models such as BERT-large and other transformers trained on web-crawl data.
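SpotDraft's actual pipeline is not public, but the general technique behind such models is domain-adaptive pre-training: continuing a masked-language-modelling objective on legal text. The sketch below shows one way to do this with the Hugging Face `transformers` library; the corpus file `contracts.txt` is hypothetical.

```python
# A minimal sketch of domain-adaptive pre-training: continuing BERT's
# masked-language-modelling objective on a legal corpus. This is the
# general approach, not SpotDraft's actual (non-public) pipeline.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# contracts.txt: one contract clause per line -- a hypothetical file.
corpus = load_dataset("text", data_files={"train": "contracts.txt"})["train"]
corpus = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="legal-bert", num_train_epochs=1),
    train_dataset=corpus,
    # Randomly masks 15% of tokens so the model learns legal vocabulary in context.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```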
More recently, such datasets have become useful for prompt creation for few-shot reasoning with LLMs like GPT-3. Since these models are prohibitively expensive to train from scratch, SpotDraft uses the datasets to fine-tune them or simply to create more relevant prompts. Both fine-tuning and better prompting can deliver performance boosts of 20-30% over a standard model, Bhagat added.
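To illustrate the prompting side of this, the sketch below assembles a few-shot prompt from annotated clause examples. The clause/label pairs are invented for illustration; in practice they would be drawn from a corpus like CUAD or MAUD, and the resulting string would be sent to a completion-style LLM such as GPT-3.

```python
# A minimal sketch of few-shot prompt creation from an annotated legal
# dataset. The clause/label pairs are invented for illustration.
EXAMPLES = [
    ("This Agreement shall be governed by the laws of the State of Delaware.",
     "Governing Law"),
    ("Either party may terminate this Agreement upon thirty (30) days' written notice.",
     "Termination for Convenience"),
]

def build_prompt(new_clause: str) -> str:
    """Format labelled examples, then append the unlabelled clause."""
    shots = "\n\n".join(
        f"Clause: {clause}\nClause type: {label}" for clause, label in EXAMPLES
    )
    return f"{shots}\n\nClause: {new_clause}\nClause type:"

prompt = build_prompt(
    "The Company shall indemnify the Buyer against all losses arising hereunder."
)
print(prompt)  # this string would be sent to a completion model such as GPT-3
```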
Today, startups like Lawgeex provide a service to review contracts, in some cases more accurately than humans. The firm emphasises its ability to compare contracts against predefined company policies. Other firms like Klarity, Clearlaw and LexCheck have grabbed the opportunity, developing AI systems that automatically ingest proposed contracts, analyse them in full using NLP, and determine which portions are acceptable and which are problematic.
Is NLP smart enough for law?
Law firms aim to hammer out agreements using accurate, efficient, pre-programmed parameters for contract review. Unfortunately, the legal sector is too abstract, and the consequences of error too severe, for the widespread adoption of AI. The problems in AI-based contracting need considerable deliberation to eliminate the pitfalls and make the collaboration between legal contracting and artificial intelligence more efficient.
On the surface, it might appear that granting artificial intelligence a limited legal personality would untangle these issues. However, the liability conundrum remains: even if the artificial intelligence technology causes an error, accountability will likely fall on the programmers.
Commenting on the accuracy of these models, Bhagat said, “All datasets come with inherent biases, as these datasets do not accurately capture Indian sensibilities, including naming conventions like “son of”, “resident of” etc. Further, since they are human annotated and some parts of the law are open to interpretation, there can be disagreements on certain things mentioned in the dataset if another lawyer were to review it. At times these biases can become issues as the models trained on these datasets will also learn these biases and thus answer accordingly.”
Moreover, smaller firms may lack the financial strength to adopt the new technology. A solo practitioner can purchase an AI assistant for less than what a firm will spend on software to manage the same tasks, and then there is the cost of tech support after the initial setup. Hence, law firms that can afford AI can outperform the others financially.
Named entity recognition (NER), which many NLP models rely on, may also be insufficient for legal work. Lengthy legal documents, especially court proceedings, may not always refer to an entity by the same name, making it harder for these models to highlight the relevant information.
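A small sketch of this failure mode, using spaCy's off-the-shelf English pipeline (the sample sentence is invented): the same party appears first as a named organisation and then as a defined term, and a general-purpose NER model typically only tags the former.

```python
# A minimal sketch of the alias problem in legal NER, using spaCy's
# general-purpose English pipeline (install with:
#   pip install spacy && python -m spacy download en_core_web_sm).
# The sample text is invented for illustration.
import spacy

nlp = spacy.load("en_core_web_sm")

text = (
    'Acme Corporation ("the Company") entered into this Agreement. '
    "The Company shall deliver the goods within thirty days."
)

for ent in nlp(text).ents:
    print(ent.text, "->", ent.label_)

# Typical output tags "Acme Corporation" as ORG, while later references
# to "the Company" -- the same legal entity -- go unlabelled, so
# downstream extraction misses obligations attached to the alias.
```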
Multistep questions also remain a challenge for today’s NLP models, yet they are common in law. Similarly, many legal issues are too nuanced for black-or-white, “if this, then that” reasoning. What counts as a legal error changes with the application of abstract concepts, which AI has a hard time with.
In conclusion, and highlighting areas that can be further developed, Bhagat said, “Generally, to get better results we would want to see more explanations behind certain answers given as part of the dataset so that the model can be fed those and trained to give explanations of its own. This can help solve the explainability and interpretability problem that exists in the NLP domain, especially when dealing with black-box models.”