Charging for AI-assisted work: Why your billable hours are a liability

If you are still billing by the hour while using LLMs, you are subsidizing your clients with your own margins. Here is how to price for output instead.

Marcus Chen
May 4, 2026

82 percent. That is the average drop in billable hours I saw across three boutique agencies that integrated LLMs into their workflows last year without changing their pricing model. They became more than five times faster at delivery, but their revenue plummeted because they were still selling time. They optimized themselves out of a business. If you are charging for AI-assisted work by the hour, you are penalizing your own efficiency. You are replacing a $150 per hour human rate with a $0.03 API call and letting the client keep the difference. This is a fast way to kill your unit economics.

Why this list

I have spent the last decade obsessed with ratios. LTV to CAC, MRR growth, and the cost of goods sold. Most advice on AI focuses on how to write better prompts. That is useless if you do not know how to capture the value those prompts create. This list exists because the traditional service model is breaking. When a tool like Devin can handle a software ticket in ten minutes that used to take a junior engineer four hours, the math of the billable hour collapses. You need a pricing strategy that accounts for the compute cost, the capital expenditure of building the workflow, and the massive increase in delivery speed. This list covers the only four models that actually protect your margins in an automated world.

[Image: Comparison of old time-based billing and new AI-driven compute power]

1. The Outcome-Based Unit Model

Stop selling hours and start selling units of value. If you are building an SEO engine, you do not bill for the hours spent researching keywords. You bill for the published, optimized article. This shifts the delivery risk from the client to you, and the efficiency gains along with it. When I looked at the AI for SEO Content at Scale data, the unit cost of a high-quality post dropped significantly when we moved to a deterministic schema. At $200 per post instead of $100 per hour, the margin grew as the internal time spent per post dropped from three hours to thirty minutes: the effective rate rose from about $67 to $400 per hour, where hourly billing would have cut revenue per post from $300 to $50.
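To make the per-post arithmetic concrete, here is a minimal sketch in Python. It uses only the $100 per hour and $200 per post figures from the paragraph above. The function names are illustrative, not part of any real billing tool.

```python
# Figures taken from the text; the function names are illustrative.
HOURLY_RATE = 100.0      # dollars per hour under time-based billing
PRICE_PER_POST = 200.0   # flat dollars per published post


def revenue_per_post(hours_per_post: float, model: str) -> float:
    """Revenue earned for one post under the given billing model."""
    if model == "hourly":
        return hours_per_post * HOURLY_RATE
    if model == "per_unit":
        return PRICE_PER_POST
    raise ValueError(f"unknown billing model: {model}")


def effective_hourly_rate(hours_per_post: float, model: str) -> float:
    """Revenue per internal hour actually spent on the post."""
    return revenue_per_post(hours_per_post, model) / hours_per_post


if __name__ == "__main__":
    for hours in (3.0, 0.5):
        for model in ("hourly", "per_unit"):
            print(
                f"{hours:>4} h/post  {model:<8}  "
                f"revenue ${revenue_per_post(hours, model):7.2f}  "
                f"effective ${effective_hourly_rate(hours, model):7.2f}/h"
            )
```

With these numbers, the flat price only beats hourly billing once a post takes less than two hours. At three hours per post, hourly billing still earns more ($300 versus $200). At thirty minutes, the flat price earns $200 where hourly billing earns $50.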

Consider a translation firm. Traditionally, they charge per word. With an OpenAI API integration, they can process 100,000 words in seconds. If they stay on a per-word model, they maintain their top-line revenue while their COGS (Cost of Goods Sold) drops to nearly zero. The moment they switch to an hourly model to be 'fair' to the client, they lose. The client does not care if a human or a machine did the work as long as the quality meets the spec. Your pricing should reflect the market value of the result, not the electricity used to generate it.

2. The API Pass-Through with a Management Fee

Some clients are sophisticated enough to know you are using AI. They want to see the numbers. In these cases, the best way to handle charging for AI-assisted work is a transparent pass-through. You bill the client at cost for the raw compute from providers like Groq or OpenAI, then add a flat management fee or a percentage markup for the proprietary 'wrapper' or workflow you have built.

This is particularly effective for high-volume data processing. If you are using Otter.ai to transcribe and analyze 500 hours of research calls, the client pays the subscription cost plus your fee for the synthesis. This protects you from fluctuating token costs. If OpenAI raises prices or introduces a new tier, your margin remains insulated because the client is covering the underlying infrastructure. It turns your service into a SaaS-plus-service hybrid.
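A pass-through invoice is simple enough to sketch in a few lines of Python. The dollar amounts below are hypothetical placeholders, not real provider prices, and `pass_through_invoice` is an illustrative name rather than an existing library function.

```python
def pass_through_invoice(
    compute_cost: float,
    flat_fee: float = 0.0,
    markup_pct: float = 0.0,
) -> dict:
    """Build invoice line items: compute billed at cost, plus your fee.

    compute_cost: what the provider charged you (passed through unchanged)
    flat_fee:     fixed management fee in dollars
    markup_pct:   optional percentage markup on the compute cost
    """
    markup = compute_cost * markup_pct / 100
    management_fee = flat_fee + markup
    return {
        "compute_at_cost": round(compute_cost, 2),
        "management_fee": round(management_fee, 2),
        "total": round(compute_cost + management_fee, 2),
    }


if __name__ == "__main__":
    # Hypothetical month: $400 of compute, then a provider price increase to $600.
    before = pass_through_invoice(400.0, flat_fee=2000.0)
    after = pass_through_invoice(600.0, flat_fee=2000.0)
    print(before)
    print(after)
```

With a flat fee, your earnings stay at the same dollar amount whether the provider bill is $400 or $600, since the client absorbs the change. A percentage markup behaves differently: your fee rises and falls with the compute bill, so falling token prices would shrink it.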

| Model | Risk Profile | Margin Potential | Best For |
| --- | --- | --- | --- |
| Hourly | High (Efficiency kills revenue) | Low | Legacy consulting |
| Per Unit | Medium (Quality must be high) | High | Content, Code, Design |
| Pass-Through | Low (Client pays compute) | Medium | Data processing, High-volume |
| Tiered Value | Low | Very High | Strategy, High-impact outcomes |

3. The Efficiency Premium Tier

Speed is a feature that people pay for. If a standard research report takes two weeks, charge $5,000. If you can deliver it in 24 hours using Notion AI and a custom scraper, charge $8,000. This is the 'Efficiency Premium'. You are not charging for the AI. You are charging for the time the client saves.
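One way to sanity-check a rush price is to work out what the client pays per day saved. The sketch below uses the $5,000 and $8,000 figures from the paragraph above and assumes "two weeks" means 14 days and "24 hours" means 1 day. The function name is illustrative.

```python
def efficiency_premium(
    standard_price: float,
    rush_price: float,
    standard_days: float,
    rush_days: float,
) -> dict:
    """Describe the premium a client pays for faster delivery."""
    premium = rush_price - standard_price
    days_saved = standard_days - rush_days
    return {
        "premium": premium,
        "premium_pct": premium * 100 / standard_price,
        "days_saved": days_saved,
        "cost_per_day_saved": round(premium / days_saved, 2),
    }


if __name__ == "__main__":
    # $5,000 in 14 days versus $8,000 in 1 day, as in the text.
    print(efficiency_premium(5000.0, 8000.0, 14, 1))
```

Here the premium is $3,000, or 60 percent, for 13 days saved. That comes to roughly $231 per day saved, which is the number to weigh against what a day of delay costs the client.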

I have seen this work best in creative fields. A branding agency used to take a month for mood boards and initial concepts. By building a workflow that uses local LLMs for rapid iteration, they cut that to three days. They did not lower their price. They actually increased it by 20 percent and marketed it as a 'Sprint' package. Their CAC stayed the same, but their payback period on the talent they hired to build the AI tools was less than three months. They realized that the client is not buying a designer's time. They are buying a faster path to a product launch. For more on how to build these types of high-speed workflows, see our guide on Writing a newsletter with AI: Building a Curation Engine.

[Image: Expert using AI tools to accelerate professional workflow]

4. The Hybrid Retainer with an AI Credit Floor

This is the most stable model for long-term retention. You charge a monthly retainer that covers a baseline of human expertise plus an allowance of AI-generated output, which can be unlimited when the marginal cost per output is negligible. For example, a legal tech firm might charge $3,000 a month for 10 hours of senior partner review and unlimited AI-assisted document drafting. This creates a floor for your MRR while allowing the 'unlimited' nature of the AI to act as the primary hook for the client.

From a unit economics perspective, this is a winner. The marginal cost of one additional document is a fraction of a cent when running Llama 3 models on Groq. Your fixed costs are the human experts. As your AI workflows improve, the '10 hours of review' becomes more efficient because the AI produces fewer errors. Your human staff can then handle more clients, increasing your revenue per employee without increasing burnout. This is how you scale a service business like a software company. You want your retention curve to look like a flat line, not a slide, and providing 'unlimited' AI utility is a powerful way to lock people in.
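The retainer math can be sketched as follows. Only the $3,000 retainer and the 10 review hours come from the example above. The $150 internal cost per review hour, the $0.005 cost per document, and the 160 available hours per month are hypothetical assumptions chosen for illustration.

```python
RETAINER = 3000.0  # monthly retainer from the example in the text


def client_margin(
    review_hours: float,
    cost_per_review_hour: float,
    docs_drafted: int,
    cost_per_doc: float,
) -> float:
    """Gross margin on one retainer client for one month."""
    cost = review_hours * cost_per_review_hour + docs_drafted * cost_per_doc
    return (RETAINER - cost) / RETAINER


def clients_per_reviewer(available_hours: float, review_hours_per_client: float) -> int:
    """How many retainer clients one reviewer can carry in a month."""
    return int(available_hours // review_hours_per_client)


if __name__ == "__main__":
    # Hypothetical inputs: $150 per review hour, half a cent per drafted document.
    for docs in (500, 1000):
        print(f"{docs} docs -> margin {client_margin(10, 150.0, docs, 0.005):.2%}")
    # Fewer review hours per client means more clients per reviewer.
    for hours in (10, 8):
        print(f"{hours} h/client -> {clients_per_reviewer(160, hours)} clients")
```

Under these assumptions, doubling the document volume moves the margin by less than a tenth of a percentage point, which is why 'unlimited' drafting is affordable. The human review hours are what move the numbers: cutting review from 10 to 8 hours per client lifts one reviewer's capacity from 16 to 20 clients.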

What to try first

Do not overcomplicate this. If you are currently billing hourly, do not switch your entire client base overnight. That is a recipe for a churn disaster. Instead, take your most repetitive task and turn it into a 'Productized Service' with a fixed price.

  1. Identify one deliverable where AI now handles at least 60 percent of the heavy lifting.
  2. Calculate your total cost for that deliverable, including API credits and human QA time.
  3. Set a flat price based on what the work historically cost the client. If it used to take 10 hours at $150, the price is $1,500.
  4. Deliver it in 2 hours using your AI stack.

If your internal cost per hour is unchanged, your margin just jumped from a standard 30 percent to over 80 percent. That is the only way to survive the next two years. If you are looking for primary data on how search and content are shifting, check out the Gartner search volume predictions or the latest OpenAI rate limits to understand your scaling bottlenecks. The tools are getting cheaper and faster. If your pricing stays stagnant, your business is a ticking time bomb.
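The four steps above can be checked with a short Python sketch. The $150 rate, the 10 and 2 hours, and the 30 percent baseline margin come from the text. The $105 internal cost per hour is an assumption back-solved from that 30 percent margin, and the $20 of API credits is a hypothetical placeholder.

```python
def flat_price(historical_hours: float, hourly_rate: float) -> float:
    """Step 3: price the deliverable at what it used to cost the client."""
    return historical_hours * hourly_rate


def margin(
    price: float,
    hours: float,
    internal_cost_per_hour: float,
    api_cost: float = 0.0,
) -> float:
    """Step 2: gross margin after human time and API credits."""
    total_cost = hours * internal_cost_per_hour + api_cost
    return (price - total_cost) / price


if __name__ == "__main__":
    price = flat_price(10, 150.0)  # $1,500, as in the text
    # Assumption: $105 internal cost per hour, implied by a 30 percent margin at $150.
    before = margin(price, hours=10, internal_cost_per_hour=105.0)
    # Assumption: $20 of API credits for the AI-assisted version.
    after = margin(price, hours=2, internal_cost_per_hour=105.0, api_cost=20.0)
    print(f"price ${price:,.0f}  margin before {before:.0%}  after {after:.0%}")
```

Under these assumptions the margin moves from 30 percent to roughly 85 percent. Substitute your own internal hourly cost and API spend to see where your deliverable lands.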