Cursor's latest business-model shift did not come without friction.
The company offers an AI-powered code editor. In mid-June, it announced an expansion of its individual plans with an Ultra subscription priced at $200/month. At the same time, it said the Pro plan ($20/month) would become "more generous," moving to a model described as… "unlimited with rate limits."
By late June, the second act followed: the announcement was updated with a few clarifications. In broad terms:
- By default, the Pro plan becomes subject to compute limits rather than request limits;
- All users receive at least $20 per month in inference credits, at API prices;
- Access to Auto mode becomes unlimited;
- It is possible to stay on the old Pro plan and its 500-requests/month quota;
- The credit pool does not apply to autocomplete, whose use remains unlimited.
A few days later, Cursor went back a third time, in response to user pushback. On the official forum and elsewhere, many users criticized a lack of transparency on several points, among them the definition of "rate limits" and the inability to see how much of the compute quota they had consumed.
The per-request model, less tenable with the latest LLMs
Talking about rate limits wasn't intuitive for what is in fact a monthly credit pool, Cursor acknowledges. For all individual plans, this pool is worth at least the price of the subscription. On the Pro plan, it is therefore at least $20 (versus $200 on Cursor Ultra and $60 on Cursor Pro+, introduced June 24).
Each request made outside Auto mode (automatic model selection) draws down this $20 pool. The amount charged is based on the number of tokens used, converted at Cursor's API pricing… with a 20% surcharge.
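The accounting described above can be sketched as follows. The per-million-token prices below are illustrative placeholders, not Cursor's actual rates, and the function is an assumed reconstruction of the mechanism, not Cursor's implementation:

```python
SURCHARGE = 1.20  # tokens converted at API pricing, plus a 20% surcharge

# Illustrative (input, output) prices per million tokens, in USD.
# These are placeholder figures for the sketch, not real price lists.
API_PRICES = {
    "claude-sonnet-4": (3.00, 15.00),
    "gpt-4.1": (2.00, 8.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Amount in USD deducted from the monthly credit pool for one request."""
    in_price, out_price = API_PRICES[model]
    api_cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return api_cost * SURCHARGE

# A request using 10,000 input and 2,000 output tokens:
# (10000*3.00 + 2000*15.00) / 1e6 = $0.06 at API prices, $0.072 after surcharge.
print(request_cost("claude-sonnet-4", 10_000, 2_000))
```

At that illustrative rate, a $20 pool would cover a few hundred such requests, which is consistent with the order of magnitude of the indicative quotas Cursor later published.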
Once this pool is exhausted, one enters a “grace period.” Cursor then grants additional requests on a best-effort basis, “as available,” roughly every 5 to 24 hours.
Once both the pool and the grace period are exhausted, three options remain: stay in Auto mode, switch to pay-as-you-go at API pricing, or move to a higher tier.
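The sequence described above can be summarized as a small state function. This is an assumed model of the logic for illustration, not Cursor's actual code:

```python
def plan_status(spent: float, pool: float, grace_exhausted: bool) -> str:
    """Which stage a subscriber is in under the new billing model."""
    if spent < pool:
        return "included credits"   # drawing on the monthly pool
    if not grace_exhausted:
        return "grace period"       # best-effort extra requests, "as available"
    # Both limits exceeded: the three remaining options.
    return "Auto mode, pay-as-you-go, or higher tier"

print(plan_status(12.50, 20.0, False))  # included credits
print(plan_status(21.00, 20.0, False))  # grace period
print(plan_status(25.00, 20.0, True))   # Auto mode, pay-as-you-go, or higher tier
```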
The previous system was, in itself, more readable: each model consumed a certain number of requests from the 500 allocated per month, and beyond that, usage-based billing kicked in. But this approach has become less tenable, according to Cursor, because the latest models, with their reasoning capabilities, tend to consume far more tokens per request.
Under pressure, the company finally published some indicative quotas. Based on the median number of tokens per request, the Pro plan provides roughly 225 requests on Claude Sonnet 4 (down from 250 previously), 550 on Gemini, and 650 on GPT-4.1. This "covers the usage of most subscribers to this plan."
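As a back-of-the-envelope check, dividing the $20 Pro pool by each indicative quota gives the implied median cost per request for each model:

```python
POOL = 20.0  # USD, Pro plan monthly credit pool

# Indicative quotas published by Cursor (median token usage per request)
quotas = {"Claude Sonnet 4": 225, "Gemini": 550, "GPT-4.1": 650}

for model, n in quotas.items():
    print(f"{model}: ~${POOL / n:.3f} per median request")
# Claude Sonnet 4: ~$0.089, Gemini: ~$0.036, GPT-4.1: ~$0.031
```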
Cursor also pledged to refund the "unexpected overages" the new formula may have generated between June 16 and July 4. It updated its documentation as well, notably the page on these rate limits. It now distinguishes "burst" limits, which correspond to the included pools ($20 on Pro, $60 on Pro+, $200 on Ultra), from "local" limits, which govern the "grace period" and vary with the model, the length of conversations, and the size of attached files.