Just last week, I wrote about a first-of-its-kind benchmarking study of legal AI tools, and now comes another, very different study that is also a first of its kind.
This time, in a randomized controlled trial, researchers have found that the latest generation of AI technologies can significantly enhance both the quality and efficiency of legal work, potentially transforming how lawyers approach complex tasks.
While prior research has already shown the speed-related improvements in legal work, the authors say, this study is the first to provide empirical evidence “that AI tools can consistently and significantly enhance the quality of human lawyers’ work across various realistic legal assignments.”
The study, AI-Powered Lawyering: AI Reasoning Models, Retrieval Augmented Generation, and the Future of Legal Practice, was conducted by researchers from the University of Minnesota and University of Michigan law schools.
Specifically, the research focused on two emerging and distinct AI technologies: reasoning models — in this case, OpenAI’s o1-preview — and tools that use Retrieval Augmented Generation (RAG) — in this case, Vincent AI from vLex. The study notes that Vincent AI also provides assistance in the form of automated prompting.
Reasoning models, the authors explain, “are explicitly designed to use additional computational resources at the point of use, planning responses before generating them — much like a human taking longer to think and outline thoughts before answering a complex question.”
By contrast, RAG integrates with legal source materials in order to ground its outputs in authoritative sources, thereby minimizing hallucinations and enabling users to verify the AI’s output against the source.
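For readers curious about the mechanics, the retrieve-then-generate pattern behind RAG can be sketched in a few lines of Python. This is a toy illustration with made-up sources and a simple word-overlap retriever, not Vincent AI's actual implementation:

```python
# Toy sketch of Retrieval Augmented Generation (RAG):
# retrieve the most relevant source, then ground the prompt in it
# so the model's answer can be checked against authoritative text.
# The sources and scoring below are illustrative only.

def retrieve(query: str, documents: dict[str, str]) -> tuple[str, str]:
    """Return the (citation, text) pair sharing the most words with the query."""
    q_words = set(query.lower().split())
    return max(documents.items(),
               key=lambda item: len(q_words & set(item[1].lower().split())))

def build_grounded_prompt(query: str, documents: dict[str, str]) -> str:
    """Assemble a prompt that quotes and cites the retrieved source."""
    citation, text = retrieve(query, documents)
    return (f"Answer using only the source below, and cite it.\n"
            f"Source [{citation}]: {text}\n"
            f"Question: {query}")

sources = {
    "Smith v. Jones (2021)": "A non-disclosure agreement requires consideration to be enforceable.",
    "UCC 2-207": "Additional terms in an acceptance become part of the contract between merchants.",
}
prompt = build_grounded_prompt(
    "Is consideration required for a non-disclosure agreement?", sources)
print(prompt)
```

Because the retrieved passage and its citation travel with the question, a user can verify the generated answer against the quoted source — the hallucination-reducing property the study attributes to RAG-based tools.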
The results showed that both AI tools substantially improved the quality of legal work in four out of six assignments tested, while also delivering significant and consistent efficiency gains across nearly all tasks.
“This marks a significant departure from earlier studies, which generally reported limited quality gains in realistic lawyering tasks,” note the researchers, led by law professor Daniel Schwarcz at the University of Minnesota Law School and Sam Manning of the Centre for the Governance of AI, who conducted the experiment with 127 law students from the two law schools.
Quality and Efficiency Enhanced
Participants completed six realistic legal assignments under three conditions: without AI assistance, with o1-preview, and with Vincent AI. The results showed that o1-preview yielded statistically significant quality improvements of approximately 10% to 28% across four assignments, while Vincent AI provided improvements ranging from 8% to 15%.
The study found that these quality improvements were concentrated in litigation-oriented tasks rather than transactional work. Both AI tools significantly enhanced the clarity, organization, and professionalism of submitted work across multiple assignments.
Notably, o1-preview also improved the analytical depth of participants’ work in three of the six assignments, demonstrating advantages in addressing complex legal reasoning that previous AI models could not match.
“Our most significant finding is that access to both o1-preview and Vincent AI led to statistically significant and meaningful improvements in overall quality of work across four of the six assignments tested — with o1-preview producing larger and more statistically significant gains than Vincent AI,” the researchers report.
Those quality improvements were primarily reflected in enhanced clarity, organization and professionalism of the submitted work, the authors say.
As noted, however, the quality improvements did not extend to the one transactional task the study tested, which involved drafting a non-disclosure agreement.
By contrast, the study found mixed evidence that either tool improved accuracy. The one exception, the authors say, is that o1-preview improved accuracy when the assigned task required participants to focus their analysis on a single document provided as part of the assignment (in this case, a complaint).
In terms of efficiency, both AI tools reduced completion time across five of the six assignments — by 14-37% for Vincent AI and by 12-28% for o1-preview. This translated to significant productivity gains, with Vincent AI boosting productivity by approximately 38% to 115% and o1-preview increasing productivity by between 34% and 140%, with particularly strong effects in complex tasks like drafting persuasive letters and analyzing complaints.
Differences Between the Tools
The study also revealed interesting differences between the two AI technologies. While assignments completed with Vincent AI contained fewer hallucinations (3 total) than those produced using o1-preview (11 total), o1-preview showed stronger capabilities in enhancing the depth and rigor of legal analysis.
More significantly, the study found that o1-preview led to stronger and more widespread improvements in the quality of legal work than did Vincent AI.
In addition to enhancing clarity, organization, and professionalism, they found, o1-preview produced statistically significant and substantial improvements in the quality of the legal analysis contained in three of the six assignments.
“This finding suggests that, when it comes to their potential to improve legal work, AI reasoning models represent a difference in kind — not just degree — relative to earlier LLMs like GPT-4,” they conclude.
Automated Prompting
The study also considered the use of automated prompting, and specifically its use by Vincent AI. As the authors explain, Vincent AI supplies pre-crafted prompts based on factors such as the documents the user uploads and the tasks they would like to complete.
(As I explained in this recent article about Vincent AI’s latest updates, vLex refers to these automated prompts as “workflows.”)
“AI-enabled legal tech tools are increasingly automating the prompting process to help users ask more effective questions and improve AI-generated responses,” the study observes.
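The idea is simple enough to sketch: the tool maintains pre-crafted prompt templates keyed to common tasks and fills them in from the user's selections and uploaded document. The template wording below is hypothetical, not vLex's actual workflow prompts:

```python
# Illustrative sketch of automated prompting: the tool selects a
# pre-crafted template for the user's chosen task and fills it in
# with the uploaded document. Templates here are invented examples.

TEMPLATES = {
    "summarize": ("Summarize the key claims and requested relief "
                  "in this {doc_type}:\n{doc_text}"),
    "analyze": ("Identify the strongest and weakest arguments in this "
                "{doc_type}, citing specific language:\n{doc_text}"),
}

def build_prompt(task: str, doc_type: str, doc_text: str) -> str:
    """Fill the pre-crafted template for the task, sparing the user
    from writing an effective prompt from scratch."""
    return TEMPLATES[task].format(doc_type=doc_type, doc_text=doc_text)

prompt = build_prompt("summarize", "complaint",
                      "Plaintiff alleges breach of contract...")
print(prompt)
```

The user never sees or writes the prompt itself — they pick a task and upload a document, and the tool supplies the carefully worded instructions.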
However, the authors avoid offering any definitive conclusion as to the extent to which automated prompting tools improve outcomes. In fact, they suggest that a reason for skepticism about these tools is the rapid pace of advancement in foundation models.
“LLMs are becoming increasingly adept at detecting context, which raises questions about the added value of certain prompting techniques,” they say.
‘Multiplicative Benefits’
That said, the authors emphasize that the two AI technologies enhance legal work in distinct yet complementary ways, suggesting that future integration of reasoning models with RAG capabilities could yield even greater benefits than those observed in their study.
“The implications of our separate findings for Vincent AI and o1-preview are each independently significant,” the authors write. “Viewed together, however, they are even more noteworthy.
“That is because each AI system appears to enhance legal work through distinct mechanisms, which can be and already are being combined with one another in updated legal technology tools. Although this integration may result in additive benefits, it may also produce multiplicative benefits.”
The authors conclude:
“Ultimately, our findings suggest that legal AI may be moving toward an inflection point. AI tools are increasingly capable of handling two key aspects of legal work: (1) information retrieval and (2) reasoning. The convergence of these capabilities points to a future in which AI — by enhancing both the efficiency and the quality of the work that attorneys can produce — becomes more than just a helpful accessory. It becomes an integral part of the profession.”
The full study provides detailed analysis of the two tools’ performance across different types of legal tasks and participant skill levels, and offers valuable insights for anyone interested in keeping up with the rapidly evolving legal AI landscape.
In addition to Schwarcz and Manning, the other authors of the study were: Patrick Barry, clinical assistant professor and director of digital academic initiatives at the University of Michigan Law School; David R. Cleveland, clinical professor of law and director of legal research and writing at the University of Minnesota Law School; J.J. Prescott, Henry King Ransom professor of law at the University of Michigan Law School; and Beverly Rich, practice innovation counsel at Ogletree Deakins.