Writing PRDs with AI: Moving from Content Generation to Technical Audit

I have a leather notebook that I carry everywhere. If you look closely at the spine, you can see where the thread passes through the hide. Those seams are where the structure lives. When the stitching is tight, the notebook survives years of being tossed into bags. When the stitching is loose, the pages eventually drift. Product requirements documents, or PRDs, are the seams of our software. They are the artifacts that hold the intent of the designer and the execution of the engineer together.

Writing PRDs with AI has become a popular shortcut. Most people use it to fill a blank page, asking a chatbot to describe a login screen or a checkout flow. But there is a hidden friction in this approach. When we use AI to generate the draft, we often gloss over the small things. We stop noticing the gaps because the prose looks so professional. This post is a case study of how we moved away from using AI as a writer and started using it as a structural auditor. We stopped asking it to build the house and started asking it to find the cracks in the foundation.

The problem

Our team was moving fast, or so we thought. We were using Grammarly to polish our language and basic LLMs to outline our features. On the surface, our PRDs looked perfect. They had the right headers, the right tone, and a clear sense of legibility. But once these documents reached the engineering team, the flow state disappeared.

Engineers would start building and then hit a wall. A requirement would say "users can reset their password," but it would fail to specify what happens if the user has an active session on three different devices. Another requirement would describe a data filter but ignore the latency implications for a Supabase database with a million rows. These are logic gaps. They are the missing pieces of a mental model that a human designer often forgets to document because they seem obvious in the moment.

We were seeing a 14 percent hallucination rate in our technical requirements. This does not mean the AI was making up facts. It means the AI was producing requirements that were technically impossible or logically inconsistent with the rest of the system. We were shipping documents that looked like high-quality artifacts but functioned like brittle prototypes. We realized that the real value of writing PRDs with AI was not in the generation of words, but in the stress testing of logic.

What we tried first

We started with the most common approach. We created a template in a shared doc and used a prompt to fill it. We would feed the AI a few bullet points about a feature and ask it to write a full functional specification.

For example, we were building a new dashboard for a project management tool. We told the AI: "Write a PRD for a dashboard that shows task progress, team workload, and upcoming deadlines." The output was beautiful. It described a modular layout. It listed user stories. It even suggested some nice-to-have features like dark mode.

We felt productive. We were producing five PRDs a week instead of two. We thought we had found a way to bypass the slow, methodical work of thinking through every state. We were prioritizing speed over clarity, and we did not realize it yet. We were treating the PRD as a piece of content rather than a technical blueprint. This was our first mistake in the process of writing PRDs with AI.

What broke

Everything broke during Sprint 42. We handed off a PRD for a complex permissions system to Devin, our autonomous AI software engineer. We expected Devin to take the requirements and build the backend logic smoothly. Instead, the system stalled.

Devin identified a contradiction in our requirements that the human team had missed. The PRD stated that "Project Leads can assign roles to any user," but a later section said "Role assignments must be approved by an Admin." In our organization, a Project Lead is not always an Admin. The AI was trying to follow two conflicting rules at the same time.
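The contradiction above can be made concrete with a small sketch. The `Rule` encoding below (an actor, an action, and an optional approving role) is a hypothetical representation chosen for illustration. It is not how Devin or our PRDs actually store requirements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    # Hypothetical encoding of one PRD requirement.
    actor: str                                    # role the rule applies to
    action: str                                   # what the role may do
    requires_approval_from: Optional[str] = None  # role that must sign off, if any


def find_conflicts(rules):
    """Pair each unconditional grant with any rule that gates the same
    action behind approval from a different role."""
    conflicts = []
    for grant in rules:
        if grant.requires_approval_from is not None:
            continue  # only unconditional grants can be undercut
        for gate in rules:
            same_action = gate.action == grant.action
            other_approver = gate.requires_approval_from not in (None, grant.actor)
            if same_action and other_approver:
                conflicts.append((grant, gate))
    return conflicts


rules = [
    Rule(actor="Project Lead", action="assign_role"),
    Rule(actor="any", action="assign_role", requires_approval_from="Admin"),
]
conflicts = find_conflicts(rules)
```

With the two rules from our permissions PRD, `conflicts` holds a single pair: the Project Lead grant and the Admin approval gate. If the first rule named an Admin instead, the list would be empty.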

When we looked back at our other AI-generated PRDs, we found similar issues. The AI-generated text often made no allowance for edge cases. It did not account for what happens when a network request fails mid-transaction or how the system should behave when two users edit the same field simultaneously. Our attempt to validate a SaaS idea using AI had led us to create a mountain of technical debt because we were not auditing the output.

We also noticed a security risk. In our haste to use public models, we were accidentally feeding proprietary roadmap data into systems without proper sanitization. We were crossing a dangerous seam between internal strategy and public data sets. You can read more about our initial failures in this post-mortem of our failed automation.

The fix

We decided to flip the script. We stopped asking the AI to write the first draft. Instead, we wrote the draft ourselves, focusing on the core intent. Then, we used AI as a "Red Team" auditor. We developed a framework where the AI's only job was to find logic contradictions and missing edge cases.

We set up an automated workflow using n8n. When a PRD reached a "Review Ready" status in our system, n8n would pull the text and send it to a custom model trained on our internal technical standards. This model was specifically instructed to look for three things: logic collisions, missing error states, and unstated assumptions.
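As a rough sketch of the gating logic described above, the snippet below selects documents by status and prepares the audit instructions. It does not call n8n or any model API. The dictionary fields (`title`, `status`, `body`) and both function names are assumptions made for illustration.

```python
# The three things the auditing model was told to look for.
AUDIT_FOCUS = ("logic collisions", "missing error states", "unstated assumptions")


def select_for_audit(prds):
    """Keep only the PRDs that have reached the "Review Ready" status."""
    return [prd for prd in prds if prd.get("status") == "Review Ready"]


def audit_instructions(prd):
    """Build the instruction text that would accompany one PRD."""
    title = prd["title"]
    checks = "\n".join(f"- {item}" for item in AUDIT_FOCUS)
    return f"Audit the PRD '{title}' for:\n{checks}\n\n{prd['body']}"


# Hypothetical documents for illustration.
prds = [
    {"title": "Permissions", "status": "Review Ready", "body": "Project Leads can assign roles."},
    {"title": "Dashboard", "status": "Draft", "body": "Shows task progress."},
]
ready = select_for_audit(prds)
```

Here only the "Permissions" document would be queued, since the "Dashboard" draft has not reached "Review Ready".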

We used this heuristic to evaluate every requirement:

Internal Consistency: Does requirement A contradict requirement B?
State Coverage: What happens if the data is null, deleted, or duplicated?
Technical Constraint: Does this requirement violate our existing Supabase schema or API rate limits?
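The State Coverage check lends itself to a mechanical expansion: every entity named in a requirement gets paired with every problem state. The sketch below assumes a hypothetical helper, `state_coverage_questions`, and hand-picked entity names. It only generates the questions a reviewer or model would then have to answer.

```python
from itertools import product

# The three problem states named in the State Coverage heuristic.
STATES = ("null", "deleted", "duplicated")


def state_coverage_questions(entities):
    """Pair every entity with every problem state as a review question."""
    return [
        f"What happens if {entity} is {state}?"
        for entity, state in product(entities, STATES)
    ]


# Hypothetical entities pulled from a task-assignment requirement.
questions = state_coverage_questions(["the assignee", "the due date"])
```

Two entities and three states yield six questions, which is a small enough list to walk through by hand for a single requirement.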

Here is a sample of the audit prompt we used in our workflow:

{
  "task": "Technical Audit",
  "input_document": "PRD_Draft_v2.md",
  "audit_focus": [
    "Identify contradictory user permissions",
    "List 5 specific edge cases for the data sync flow",
    "Check for missing loading and error states in the UI section"
  ],
  "output_format": "Table of risks and recommendations"
}
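A prompt like this is easier to keep consistent when it is generated rather than hand-edited. The following is a minimal sketch using only Python's standard `json` module. The function name `build_audit_prompt` and the rule that at least one focus item is required are assumptions, not part of our actual workflow.

```python
import json

# Keys that appear in the sample audit prompt.
REQUIRED_KEYS = {"task", "input_document", "audit_focus", "output_format"}


def build_audit_prompt(document_name, focus_items):
    """Serialize an audit prompt with the same shape as the sample."""
    focus = list(focus_items)
    if not focus:
        # Assumed rule: an audit with nothing to check is a mistake.
        raise ValueError("audit_focus must list at least one check")
    payload = {
        "task": "Technical Audit",
        "input_document": document_name,
        "audit_focus": focus,
        "output_format": "Table of risks and recommendations",
    }
    return json.dumps(payload, indent=2)


prompt = build_audit_prompt(
    "PRD_Draft_v2.md",
    ["Identify contradictory user permissions"],
)
```

Generating the prompt this way keeps the fixed fields identical across PRDs, so only the document name and the focus list vary from one audit to the next.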

This shift changed our mental model. The AI was no longer a writer. It was a technical peer reviewer. We were using the AI to find the friction before the engineers did. We also implemented a strict security protocol. We used local processing for sensitive roadmap data, ensuring that our proprietary logic never left our controlled environment. This is in keeping with the risk-management practices encouraged by the NIST AI Risk Management Framework.

Results

Changing our approach to writing PRDs with AI yielded immediate, measurable results. We tracked the number of "clarification pings" engineers sent to product managers during the first week of a sprint.

Metric	Manual + AI Drafting	Human Draft + AI Audit
Clarification Pings per Sprint	24	9
Logic Gaps Found in Dev	12%	2%
Time to Engineering Ready	6 days	2 days
Sprint Velocity Increase	0%	18%

The data showed that while the initial drafting phase took slightly longer for the PMs, the total time to ship a feature decreased. The "seam" between product and engineering became much tighter. Engineers felt more confident because the artifacts they were receiving were actually legible and actionable.
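As a quick check on the table, the relative changes can be worked out directly. The helper name `percent_reduction` is illustrative, and the inputs are the figures reported in the table.

```python
def percent_reduction(before, after):
    """Relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# Clarification pings per sprint: 24 -> 9
pings_drop = percent_reduction(24, 9)

# Time to engineering ready, in days: 6 -> 2
days_drop = percent_reduction(6, 2)

# Logic gaps found in dev are already percentages (12% -> 2%),
# so the plain difference is in percentage points.
logic_gap_points = 12 - 2
```

That works out to a 62.5 percent reduction in clarification pings, roughly a 66.7 percent reduction in time to engineering ready, and a drop of 10 percentage points in logic gaps found during development.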

We also found that the AI was better at identifying edge cases than our senior designers. In one instance, the AI pointed out that a new "delete account" feature did not specify what should happen to shared files owned by that user. That single catch saved us at least two days of rework. This is a core part of the Human-in-the-Loop Protocol we now follow for all product discovery work.

What we would do differently

If we were starting over, we would focus on custom model training much earlier. General models are good, but they do not understand the quirks of our internal component library or our database architecture. We are now working on fine-tuning a model on our past five hundred Jira tickets and their corresponding PRDs. This will allow the AI to understand not just general logic, but the specific ways our systems tend to break.

We would also spend more time on the legibility of the audit reports. Initially, the AI would give us a list of fifty tiny issues. It was overwhelming. We had to learn how to prompt the AI to categorize risks into "Critical," "Warning," and "Nitpick." This helped the product managers focus on the most important structural changes first.
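Once findings carry one of those three labels, ordering the report is straightforward. The sketch below assumes each finding is a dictionary with hypothetical `severity` and `note` fields. The actual report format in our workflow may differ.

```python
# Lower number means the finding is shown earlier in the report.
SEVERITY_ORDER = {"Critical": 0, "Warning": 1, "Nitpick": 2}


def triage(findings):
    """Sort findings so Critical items come first. `sorted` is stable,
    so findings with the same severity keep their original order."""
    return sorted(findings, key=lambda finding: SEVERITY_ORDER[finding["severity"]])


# Hypothetical audit findings for illustration.
findings = [
    {"severity": "Nitpick", "note": "Inconsistent button label casing"},
    {"severity": "Critical", "note": "Role assignment rules contradict each other"},
    {"severity": "Warning", "note": "No error state for failed sync"},
]
report = triage(findings)
```

A fifty-item list sorted this way puts the handful of structural problems at the top, so a product manager can stop reading once the Critical and Warning items are resolved.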

Writing PRDs with AI is not about escaping the work of thinking. It is about enhancing it. By using these tools as auditors rather than authors, we can ensure that our software is built on a foundation of clear logic rather than just a collection of professional-sounding words. We are moving toward a future where the PRD is a living, verified artifact that serves both the human and the machine.

To learn more about the psychological side of design and how objects communicate their use, I highly recommend reading about Don Norman's concept of affordance. Understanding how humans perceive tools is the first step in building better ones with AI.
