Counsel and Code

AI Reads Documents. It Doesn't Read the Room.

Last week I asked Claude to review a stack of materials for a new matter. The file was a mess—dozens of PDFs in inconsistent formats, some scanned, some text, some in image-only versions. A few thousand pages total. The kind of document dump that used to eat three days of an associate’s life.

Claude produced a structured case summary in about twelve minutes. The summary was, by every objective measure, competent. It identified the parties. It mapped the timeline. It flagged the legal issues. It even drew sensible inferences about which arguments were strongest on the available evidence.

I read it carefully. I noted three or four small errors, which I corrected. I felt good about the workflow.

Then I had my first call with the client.

Fifteen minutes into the conversation, I realized the AI’s entire analysis was built on a premise the client had already abandoned.

What the documents didn’t say

The file contained a complete set of materials from the prior lawyer: a full statement of claim, a supporting evidence index, draft witness statements, and an opinion letter. These documents were detailed and well-organized. They presented a coherent legal theory of the case.

Claude, faced with this file, did what any good document reviewer would do: it took the most fully-developed legal theory in the materials and used it as the organizing frame. The summary it produced was essentially an evaluation of whether the prior lawyer’s theory was supported by the evidence. It cross-referenced the witness statements against the pleadings, identified factual gaps, suggested supplementary evidence. The work was competent. The frame was wrong.

The client had fired the prior lawyer six weeks earlier. The reason: the prior lawyer’s theory of the case had been informally rejected by the court at a procedural hearing, and the client no longer believed it would succeed at trial. They were looking for a different theory. That was the whole reason they had come to me.

This crucial fact—the reason the client was sitting in my office at all—did not exist in any document. It existed only in the client’s head, in their experience of the prior hearing, in their assessment of the prior lawyer’s competence.

Claude could not access this information, because Claude can only read what’s written down. And the most important fact about the case wasn’t written down anywhere.

The verification trap

I want to name this failure mode, because I think it’s the most important and least discussed problem with AI document review.

I call it the verification trap. It works like this:

  1. A file contains documents at different levels of “authority”—some are casual notes, some are formal pleadings, some are expert opinions.
  2. AI tools, by their training, tend to defer to the most formal-looking, most fully-developed document as the organizing frame.
  3. The AI then produces analysis that verifies whether the rest of the materials support that frame.
  4. The output looks like rigorous analysis. It is, in fact, sophisticated confirmation bias.

This is a different failure mode from hallucination. Hallucination is the AI inventing a fact that isn’t there. The verification trap is the AI correctly identifying every fact in the file, but organizing them around the wrong question. The output is internally consistent. It just answers a question the client isn’t asking.

And the worst part: it’s almost invisible. If I hadn’t talked to the client before writing my opinion based on Claude’s summary, I would have produced beautiful analysis of a problem that no longer existed. The client would have read it, recognized the framing as the prior lawyer’s framing, and concluded that I was no better than the lawyer they just fired.

Why this happens (and why it won’t be fixed soon)

The verification trap is not a bug. It is a deep feature of how language models work.

A language model trained on legal materials has internalized a particular form of legal reasoning: given a set of pleadings, evaluate whether the evidence supports them. This is the most common legal task in the training data, because it is the most common form of legal writing. The model has seen ten thousand examples of “here is a claim; here is whether the evidence supports the claim.”

What the model has not seen many examples of is “here is a claim; here are the reasons the claim is wrong as a strategic matter, and here is what a different lawyer might do instead.” This kind of reasoning happens in lawyers’ heads, in private conversations with clients, in strategic memos that rarely get written down. The model can’t learn what it hasn’t seen.

So when you hand the model a file containing a fully-developed legal theory, the model treats that theory as the starting point. It cannot easily ask “what if we abandoned this entirely?” It doesn’t have the strategic frame to do so. The training data taught it to verify, not to reframe.

This is a limitation of the technology, but it’s also a limitation of the workflow we’re using. We are asking the AI to do something it wasn’t designed to do: not just analyze documents, but identify which document is worth analyzing.

What I do now

After this incident, I changed my document review workflow. The change is small but the effect is large.

I never ask the AI to review documents without first telling it what the client wants.

The prompt now starts with something like: “The client has asked me to evaluate whether the current legal theory can succeed at trial. They have indicated they may need to abandon this theory and develop a new one. Review the attached materials with the following two questions in mind: (1) what facts in the file support the existing theory, and (2) what facts in the file would support a different theory if we needed one.”

This kind of prompt changes everything. The AI is no longer organizing around the most developed document in the file. It is organizing around the client’s actual question. The output is dramatically more useful.

The lesson generalizes. Whenever I use AI to review materials, the prompt must include what the client’s situation actually is—not just what’s in the materials. The AI can’t ask the client. I have to tell it.
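
One way to make this habit hard to skip is to build the prompt from a small template, so the client's situation always comes before the review questions. What follows is a minimal sketch, not a description of any particular tool: the function name `build_review_prompt` and its parameters are hypothetical, and the wording simply mirrors the two-question prompt quoted above.

```python
def build_review_prompt(client_situation: str, questions: list[str]) -> str:
    """Return a review prompt that states the client's situation first.

    Hypothetical helper: it only assembles text and does not call any model.
    """
    if not client_situation.strip():
        raise ValueError("State the client's situation before asking for a review.")
    if not questions:
        raise ValueError("Give the model at least one question to organize around.")

    # Number the questions the same way the prose prompt does: (1), (2), ...
    numbered = "\n".join(f"({i}) {q}" for i, q in enumerate(questions, start=1))
    return (
        f"{client_situation.strip()}\n\n"
        "Review the attached materials with the following questions in mind:\n"
        f"{numbered}"
    )


# Example usage with the situation and questions from this matter.
prompt = build_review_prompt(
    "The client has asked me to evaluate whether the current legal theory can "
    "succeed at trial. They have indicated they may need to abandon this theory "
    "and develop a new one.",
    [
        "what facts in the file support the existing theory",
        "what facts in the file would support a different theory if we needed one",
    ],
)
```

The two guards are the point of the sketch: the helper refuses to produce a prompt at all unless the client's situation and at least one question have been written down.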

A protocol for AI document review

Based on this and similar incidents, here’s the protocol I now follow for any AI-assisted document review on a substantive matter:

Before the AI sees the documents, I spend five minutes writing a briefing. The briefing answers three questions:

I include this briefing as the first thing in the prompt, before any document is attached. I tell the AI explicitly: “Read the briefing first. Use it to interpret the materials, not the other way around.”

After the AI produces output, I check whether the output is responsive to the briefing, not to the materials. If the output is competent analysis of the wrong question, I know I haven’t given the AI enough context.

This sounds obvious when written out. It is not obvious in practice. The natural tendency, when using AI for document review, is to skip the briefing and just dump the materials. The AI produces something that looks good. You proceed. You discover the problem later, often much later, when you realize you’ve been working on the wrong version of the case.
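
The briefing-first ordering can also be enforced mechanically rather than left to discipline. The sketch below is one possible arrangement under stated assumptions: `Document` and `assemble_review_input` are hypothetical names, the briefing is treated as free text written by the lawyer, and the section labels ("BRIEFING", "MATERIALS") are my own choice rather than anything a model requires.

```python
from dataclasses import dataclass

# Instruction quoted from the protocol above.
INSTRUCTION = (
    "Read the briefing first. Use it to interpret the materials, "
    "not the other way around."
)


@dataclass
class Document:
    """Hypothetical container for one item in the file."""
    name: str
    text: str


def assemble_review_input(briefing: str, documents: list[Document]) -> str:
    """Place the briefing and instruction ahead of every document.

    Raises ValueError when the briefing is empty, so materials cannot be
    "just dumped" without context.
    """
    if not briefing.strip():
        raise ValueError("Write the briefing before attaching any documents.")

    parts = ["BRIEFING", briefing.strip(), "", INSTRUCTION, "", "MATERIALS"]
    for doc in documents:
        parts.append(f"--- {doc.name} ---")
        parts.append(doc.text)
    return "\n".join(parts)


# Example usage with placeholder document text.
review_input = assemble_review_input(
    "The client no longer believes the prior theory will succeed at trial "
    "and is looking for a different one.",
    [
        Document("statement_of_claim.pdf", "(extracted text)"),
        Document("opinion_letter.pdf", "(extracted text)"),
    ],
)
```

The check after the AI responds, whether the output answers the briefing rather than the materials, still has to be done by a person. This only guarantees that the briefing exists and comes first.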

The bigger lesson

The verification trap is not unique to document review. It applies to any task where AI is asked to organize complex material without knowing what the material is for.

In contract review: the AI evaluates clauses against standard practice, but doesn’t know which clauses are negotiable for this client.

In research: the AI summarizes case law on a question, but doesn’t know what the lawyer is actually trying to argue.

In drafting: the AI produces well-structured documents, but doesn’t know the audience, the strategic context, or the tone you’ve established with the recipient.

In every case, the failure is the same: the AI is doing rigorous work on a partial picture. The information the AI is missing isn’t technical—it’s contextual. It’s the kind of information that doesn’t get written down because everyone in the room already knows it.

The lawyers who use AI well are the ones who have learned to translate this implicit context into explicit prompts. The lawyers who use AI poorly are the ones who treat the AI as if it could read the room. It can’t. It can only read documents.

And the documents, as I was reminded last week, are almost never the whole story.


Part of an ongoing series on the limits of AI in legal practice. Related: why arguing with AI produces better results than trusting it.

If you’ve had a “verification trap” moment in your own practice, email [email protected]. The clearest examples make their way into future articles.

