TL;DR

A recent Google whitepaper argues that in AI-assisted software development, the model itself accounts for only about 10% of a system’s behavior. The rest comes from harness design, verification, and context engineering, which the paper treats as the key to successful AI integration.

A new whitepaper from Google, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, argues that the most significant shift in software engineering is from writing code to expressing intent, and that results depend less on the model than on harness design and verification. This challenges the common assumption that larger models are the primary driver of AI performance.

The paper reports that as of early 2026, 85% of professional developers use AI coding agents regularly, 51% use them daily, and approximately 41% of all new code is AI-generated. Despite these figures, the authors argue that the model, often touted as the key factor, accounts for only about 10% of the system’s behavior.

The core insight is that roughly 90% of an AI agent’s behavior comes from the harness: the prompts, tools, rules, and context policies wrapped around the model. The paper’s headline example is a move from outside the Top 30 to the Top 5 on Terminal Bench 2.0 achieved by changing only the harness, with the same model underneath.

The paper emphasizes that failures in AI agents are usually due to configuration issues, missing tools, or vague rules, rather than the model itself. This shifts strategic focus toward system design, context engineering, and verification.

At a glance
When: published May 2026
The development: A Google whitepaper highlights that the core of effective AI-driven software development is not the size of the AI model but the harness and verification systems surrounding it.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
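
As a rough illustration of "tests verify the deterministic; evals verify the rest," here is a minimal Python sketch of a gate that requires both. Nothing in it comes from the whitepaper: `run_gate`, `GateResult`, the grader signature, and the 0.8 threshold are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    tests_passed: bool
    eval_score: float
    approved: bool


def run_gate(
    candidate: Callable[[str], str],
    test_cases: list[tuple[str, str]],
    eval_prompts: list[str],
    grader: Callable[[str, str], float],
    min_eval_score: float = 0.8,  # hypothetical threshold
) -> GateResult:
    # Tests: deterministic behavior, checked against exact expected output.
    tests_passed = all(candidate(inp) == expected for inp, expected in test_cases)
    # Evals: fuzzy behavior, scored between 0 and 1 by a grader and averaged.
    scores = [grader(prompt, candidate(prompt)) for prompt in eval_prompts]
    eval_score = sum(scores) / len(scores) if scores else 0.0
    # Both checks must hold; with no evals at all the gate stays closed.
    approved = tests_passed and eval_score >= min_eval_score
    return GateResult(tests_passed, eval_score, approved)


def shout(text: str) -> str:
    # Toy "agent output" used only to exercise the gate.
    return text.upper()


def length_grader(prompt: str, output: str) -> float:
    # Toy grader: full marks when the output keeps the prompt's length.
    return 1.0 if len(output) == len(prompt) else 0.0


result = run_gate(shout, [("ok", "OK")], ["hello", "world"], length_grader)
```

In a CI setting, `approved` would be the value that decides whether a change can merge; a real grader would more likely be a rubric or a model-based judge than a length check.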
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability)
The ~90% is your surface area, not the provider’s
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
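
One way to picture "Agent = Model + Harness" is as two separate objects, so the harness can change while the model stays fixed. The sketch below is only an illustration under that assumption; `Harness`, `Agent`, `build_context`, and `echo_model` are hypothetical names, and the stand-in model simply reports how much context it was given.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical model interface: prompt text in, completion text out.
Model = Callable[[str], str]


@dataclass
class Harness:
    system_prompt: str
    rules: list[str] = field(default_factory=list)
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def build_context(self, task: str) -> str:
        # Assemble everything the model sees besides its own weights.
        lines = [self.system_prompt]
        lines += [f"RULE: {rule}" for rule in self.rules]
        lines += [f"TOOL: {name}" for name in sorted(self.tools)]
        lines.append(f"TASK: {task}")
        return "\n".join(lines)


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        return self.model(self.harness.build_context(task))


def echo_model(prompt: str) -> str:
    # Stand-in for a real model call; it only counts the context lines.
    return f"saw {len(prompt.splitlines())} context lines"


# Same model, two harnesses: only the scaffolding differs.
bare = Agent(echo_model, Harness("You are a coding agent."))
tuned = Agent(
    echo_model,
    Harness(
        "You are a coding agent.",
        rules=["Run the test suite before proposing a patch."],
        tools={"run_tests": lambda cmd: "ok"},
    ),
)

print(bare.run("fix the failing build"))   # saw 2 context lines
print(tuned.run("fix the failing build"))  # saw 4 context lines
```

Keeping the model behind a plain callable is also what makes it cheap to swap providers later, which fits the paper's point that the harness is the part a team actually owns.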
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing (cheap models for the easy work).
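
The "intelligent model routing" lever could look something like the sketch below: a small rule that sends easy work to a cheap model and reserves the expensive one for harder tasks. The tier names, prices, keywords, and file-count cutoff are all invented for illustration and are not taken from the paper or from any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, not a real quote


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 1.00)

# Hypothetical signals that a task needs the stronger model.
HARD_KEYWORDS = ("refactor", "migration", "concurrency", "security")


def route(task: str, files_touched: int) -> ModelTier:
    # Route on simple, observable signals: task wording and change size.
    is_hard = files_touched > 3 or any(k in task.lower() for k in HARD_KEYWORDS)
    return STRONG if is_hard else CHEAP


def estimated_cost(task: str, files_touched: int, tokens: int) -> float:
    tier = route(task, files_touched)
    return tier.cost_per_1k_tokens * tokens / 1000


easy = route("rename a variable", files_touched=1)
hard = route("Security fix in the auth flow", files_touched=1)
```

A production router would probably use richer signals, such as past failure rates per task type, but the cost logic stays the same: the fewer tokens that reach the expensive tier, the lower the per-feature OpEx.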
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Why Focusing on Harness and Verification Changes AI Strategy

This development matters because it redefines where organizations should invest for AI integration. Instead of chasing larger models, companies should prioritize building robust harnesses, context management, and verification processes. Done well, this can make AI systems more secure and reliable while lowering operational costs.

By understanding that the model is only 10% of the system, engineering teams can better allocate efforts toward system architecture and process improvements, which are more controllable and durable over time. This shift also impacts how AI tools are evaluated and adopted across industries.

Background on AI System Design and the SDLC Shift

Historically, AI development focused heavily on increasing model size and complexity, with larger models believed to deliver better results. However, recent trends show that the effectiveness of AI agents depends more on how they are integrated into workflows. The paper references experiments where tweaking prompts and system scaffolding outperformed simply upgrading models, challenging the traditional emphasis on model size.

This aligns with broader industry observations that system design, context management, and verification are critical to successful AI deployment, especially as AI becomes embedded in everyday software development processes.

“The biggest shift in software engineering isn’t a new language or framework; it’s moving from writing code to expressing intent and trusting machines to interpret it.”

— Addy Osmani

Unclear Aspects of Implementing the New SDLC Approach

It remains unclear how quickly organizations will adopt this shift in focus from models to harness and verification. The long-term impact on AI development costs, security, and performance metrics is still being studied. Additionally, the specific best practices for system design and context engineering are still evolving, with ongoing experimentation needed to establish standards.

Next Steps for AI System Optimization and Industry Adoption

Organizations are expected to begin re-evaluating their AI development strategies, emphasizing harness design, context management, and verification tools. Future research will likely focus on developing standardized frameworks for system configuration and testing, as well as tools to simplify context engineering. Monitoring industry case studies will reveal how this approach impacts costs, security, and reliability over time.

Key Questions

Why is the model only 10% of the system’s behavior?

According to the whitepaper, an agent’s behavior is shaped mostly by the harness around the model (prompts, tools, rules, and context policies) and by how its outputs are verified. The model itself accounts for only about 10% of the result.

How does this shift affect AI development costs?

Focusing on harness and verification can reduce ongoing operational costs by minimizing token waste, improving security, and decreasing maintenance, despite higher initial investment in system design.
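
The trade-off between upfront investment and per-feature cost can be made concrete with a small break-even calculation. The sketch below is a simplified linear cost model, not something the paper provides, and every number in it is hypothetical; the 3× per-feature ratio is simply the low end of the 3–10× range cited above.

```python
import math
from typing import Optional


def cumulative_cost(upfront: float, per_feature: float, features: int) -> float:
    # Simplified linear model: one-off investment plus a flat cost per feature.
    return upfront + per_feature * features


def break_even_features(
    vibe_upfront: float,
    vibe_per_feature: float,
    agentic_upfront: float,
    agentic_per_feature: float,
) -> Optional[int]:
    # Smallest feature count at which the agentic approach costs no more
    # than vibe coding; None if it never saves money per feature.
    saving = vibe_per_feature - agentic_per_feature
    if saving <= 0:
        return None
    extra_upfront = agentic_upfront - vibe_upfront
    return max(0, math.ceil(extra_upfront / saving))


# Hypothetical units: vibe coding costs 3x more per feature (30 vs 10),
# while the agentic setup needs 200 units of specs, evals, and context first.
n = break_even_features(0, 30, 200, 10)
print(n)                           # 10
print(cumulative_cost(0, 30, n))   # 300
print(cumulative_cost(200, 10, n)) # 300
```

Under these made-up numbers the upfront work pays for itself after ten features; at the 10× end of the range the break-even point would arrive much sooner.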

What practical steps should organizations take now?

Organizations should prioritize building robust harnesses, developing verification protocols, and improving context engineering to optimize AI performance and cost-efficiency.

Will larger models become obsolete?

Not necessarily, but the emphasis will shift from model size to system integration and control. Larger models may still be useful, but their impact will be maximized when combined with well-designed harnesses.

Source: ThorstenMeyerAI.com
