If coding assistants don’t deliver as much productivity as one might expect, it’s because they cannibalize the phase during which specification gaps are most often detected.
Admittedly, the finding comes from an American startup that has made this its business. But the reasoning it lays out is worth examining, particularly for the indicators it draws on.
Three stats to frame the issue
Last week, this startup – Bicameral AI – published a “manifesto,” titled “Code assistants solve the wrong problem.” It opened its argument with three statistics.
The first statistic is sourced from Index.dev (a tech recruiting platform). Bicameral AI presents it as: teams that used AI completed 21% more tasks, but the overall delivery across the organization did not improve.
Index.dev has in fact drawn this figure from the report “The AI Productivity Paradox,” which Faros AI (a development platform) published in the summer of 2025. It appears that Bicameral AI has offered an incomplete summary: the 21% rate applies to teams making extensive use of AI, and delivery is measured specifically on the DORA metrics.
Second statistic: experienced developers who used coding assistants were 19% slower, even though they believed they were faster. Hard to verify: the article that reports it, published in January 2025, is no longer accessible. Its author, METR, is a nonprofit that studies AI’s societal impact; its founder formerly worked at DeepMind and OpenAI.
Third statistic: 48% of code generated by AI contains vulnerabilities. It is supposed to come from Apiiro (an AppSec specialist) and date from 2024. However, the link provided by Bicameral AI points to a September 2025 post that does not directly give this figure. Apiiro has nonetheless regularly given close estimates (“more than 40%” in April 2025, “up to 50%” in August 2025…).
When AI ends up wasting the time it had saved
The manifesto continues with a reference to a Reddit thread with nearly a thousand comments. A senior developer shares his team’s positive experience with AI.
Bicameral AI highlights one comment: the hardest part isn’t writing the code, but managing the edge cases that arise during implementation. The startup builds on this: coding assistants don’t surface specification gaps; they conceal them. As a result, more time is spent on code review… while, with AI in the mix, managers expect more output.
In this context, the share of developers who believe managers don’t grasp their pain points rises markedly: 63% in 2025, up from 49% in 2024.
These figures come from Atlassian’s latest State of DevEx survey. Bicameral AI cites it to claim that coding assistants save developers nearly 10 hours per week. But the increase in inefficiencies elsewhere in the development cycle almost offsets that gain.
Here again, the startup takes a shortcut. Atlassian actually states that GenAI tools in general save at least 10 hours per week for 68% of developers. But at the same time, 50% lose more than 10 hours on tasks other than coding*.
This through-line leads Bicameral AI to an observation: the gap between business intent and implementation is created during product meetings. The startup cites a survey it conducted among developers. In this context, the majority (63%) reported discovering unexpected constraints after committing to implementation.
Specification gaps, spotted mainly during implementation
The comments on the manifesto and the additional survey responses prompted the publication, this week, of a second article. Bicameral AI confirms that the rate of developers uncovering technical constraints at the implementation stage remains high (50%).
The startup mentions another figure: 70% say these constraints need to be known beyond their own team, by people who do not interact regularly with the codebase. Yet communicating them, they say, is difficult. Documentation practices don’t help: 52% of survey respondents pass technical constraints along by copy-pasting into Slack, and 25% mention them verbally, with no written trace. More broadly, 35% of these communications produce no persistent artifact.
Bottom line: in practice, the conflict between the product specs and the reality of engineering only becomes apparent during the implementation phase. And, when AI monopolizes this phase, the discovery work must be pushed upstream to the planning phase, lest it slide into the code-review phase, where it becomes all the harder to manage.
Salvation in prompts? The chicken-and-egg paradox
Bicameral begins from the premise that coding assistants are “accommodating”: they can ask for clarifications, but they typically do not suggest exploring other options.
“Just tell the AI to challenge you,” went the gist of the responses to the startup. It counters with a chicken-and-egg argument: to prompt correctly, you must already know how technical and product constraints might clash. And that knowledge, as things stand, mainly surfaces at the implementation stage…
Counterintuitive as it may seem, the upstream treatment of the problem could rely on LLMs, which would examine how a given specification might impact existing code structures. On that basis, one could imagine displaying this engineering context in real time during a meeting. An option some survey participants themselves suggested.
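As an illustration of what such an upstream check might look like, here is a minimal sketch (all names hypothetical; Bicameral AI’s actual tooling is not public): a helper that assembles a review prompt from a product spec and a list of known codebase constraints, which one could then send to any LLM before implementation begins.

```python
# Hypothetical sketch: build a "challenge this spec" prompt from a product
# specification and known technical constraints, so an LLM can be asked to
# flag conflicts BEFORE anyone commits to implementation.

def build_spec_review_prompt(spec: str, constraints: list[str]) -> str:
    """Assemble a prompt asking an LLM to surface spec/constraint conflicts."""
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    return (
        "You are reviewing a product specification against known "
        "engineering constraints. List every point where the spec "
        "conflicts with, or is silent about, a constraint.\n\n"
        f"Specification:\n{spec}\n\n"
        f"Known constraints:\n{constraint_lines}\n"
    )

prompt = build_spec_review_prompt(
    "Users can export their full history as CSV at any time.",
    [
        "History table is sharded; cross-shard scans are rate-limited.",
        "Exports above 10 MB must go through the async job queue.",
    ],
)
print(prompt)
```

The point of the sketch is the workflow, not the plumbing: the engineering constraints are gathered once, then replayed against each new spec in a product meeting, rather than rediscovered mid-implementation.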
* In 2024, IDC estimated that coding accounted for only 16% of developers’ time, compared with 14% for writing specifications and tests, 13% for security, 12% for application monitoring, etc.