GitHub Copilot switches to usage-based billing, exposing the "biggest lie" of the AI industry

Original Title: AI’s Economics Don’t Make Sense

Original Author: Ed Zitron, Where’s Your Ed At
Original Compilation: Deep Tide TechFlow

Deep Tide Guide: Microsoft has finally reached its limit, and GitHub Copilot is switching from a monthly fee to token-based billing. This isn’t a product upgrade but a collective bankruptcy of the entire AI industry subsidy scheme—OpenAI, Anthropic, etc., hide real costs behind monthly fees, making users burn through $8-13 worth of computing power for every $1 spent, training a generation of usage habits that are fundamentally unsustainable. When prices return to reality, you’ll realize those “revolutionary” AI tools might just be expensive toys.

I just wrote an article about how OpenAI is killing Oracle; today’s piece uses some of that material.

It’s one of the best articles I’ve ever written, and I’m very proud of it.

Subscribing to the paid version is great value, and it's what lets me publish these in-depth, large-scale research pieces for free every week.

Yesterday morning, GitHub Copilot users received confirmation of a report I covered a week ago—all GitHub Copilot plans will switch to usage-based billing on June 1, 2026.

Microsoft will no longer give users a fixed number of “requests,” but will charge based on the actual model costs incurred by users. Microsoft claims this is “…an important step toward a sustainable, reliable Copilot business and all user experiences.” How much users can use now depends on how many tokens their subscription can buy (for example, the $19/month plan can use tokens worth $19).

Translation: We can no longer subsidize GitHub Copilot users, or else Amy Hood (Microsoft CFO) will start hitting people with baseball bats.
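To make the new model concrete, here's a back-of-envelope sketch of what a fixed token allowance buys. The per-million-token prices and per-request token counts below are placeholder assumptions of mine, not GitHub's actual rates:

```python
ALLOWANCE = 19.00                    # $/month of token credit on the $19 plan
PRICE_IN, PRICE_OUT = 3.00, 15.00    # assumed $ per 1M input/output tokens

# Assume a typical chat-style request reads 20k tokens of context and writes 2k.
per_request = 20_000 / 1e6 * PRICE_IN + 2_000 / 1e6 * PRICE_OUT   # ~$0.09
print(f"~{ALLOWANCE / per_request:.0f} requests/month")           # ~211 light requests
# A single long agentic session, re-reading a large codebase every round,
# can consume the entire month's allowance on its own.
```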

This announcement itself is an interesting preview, showing how these price changes will be packaged:

Copilot is no longer the product from a year ago. It has evolved from an editor assistant into an intelligent platform capable of running long, multi-step coding sessions, using the latest models to iterate across entire codebases. The use of intelligent agents is becoming the default mode, which brings significantly higher computational and reasoning demands.

Now, a quick chat question and a few hours of autonomous coding sessions might cost the user the same. GitHub has been absorbing the rising inference costs behind this usage, but the premium request-based model is no longer sustainable. Usage-based billing solves this problem. It better aligns pricing with actual usage, helps us maintain long-term service reliability, and reduces the need to restrict heavy users.

See, it’s not Microsoft subsidizing nearly two million people’s computing power; it’s AI becoming so powerful and complex that it’s essentially a different product!

Although Copilot may “not be the product from a year ago,” the underlying economic mismatch remains: Microsoft has allowed users to burn through tokens exceeding their subscription fee for three years. According to a Wall Street Journal report from October 2023:

Individual users pay $10 per month for this AI assistant. Data from an insider reveals that in the first few months of this year, the company’s average loss per user exceeded $20 monthly, with some users costing the company $80 per month.

Naturally, GitHub Copilot users are now protesting, saying the product is “dead” and “completely ruined.”

I predicted this day two years ago in "The Subprime AI Crisis."

That day has finally arrived because every AI service you use is subsidizing compute power, and every service is losing money because of it:

When you pay for AI startups' services (and that, of course, includes OpenAI and Anthropic), you pay a monthly fee: Anthropic's Claude at $20, $100, or $200 per month, Perplexity at $20 or $200, or OpenAI at $8, $20, or $200 per month.

In some enterprise scenarios, you instead get "quotas" of work units, like Lovable giving users "100 monthly units" on its $25 subscription, plus $25 of cloud hosting (until the end of Q1 2026), with unused quota rolling over from month to month.

When you use these services, the companies either pay AI labs per million tokens, or (for Anthropic and OpenAI) pay cloud providers for GPU rental to run models. A token is roughly 3/4 of a word.
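In code, that billing model is a one-liner; prices vary by model, and the words-to-tokens conversion uses the rough 3/4 rule above:

```python
def token_cost(input_tokens: int, output_tokens: int,
               price_in: float, price_out: float) -> float:
    """Dollar cost of one call at $-per-million-token rates."""
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out

def words_to_tokens(words: int) -> int:
    return round(words / 0.75)   # a token is roughly 3/4 of a word

# e.g. feeding in 30,000 words and getting 1,500 back, at an assumed $3/$15 rate:
print(token_cost(words_to_tokens(30_000), words_to_tokens(1_500), 3.0, 15.0))  # ~$0.15
```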

As a user, you don’t feel the token consumption—just the input and output process. AI labs hide service costs behind “tokens,” “messages,” or percentage-based 5-hour rate limits, so you don’t really know how much it costs.

On the backend, AI startups are burning money like crazy. Until recently, Anthropic allowed you to burn up to $8 worth of compute for every $1 of subscription fee. OpenAI also permits this, though measuring the exact amount is difficult.

AI startups and cloud giants assumed they could attract enough users through subsidized losses, getting users hooked so they wouldn't leave when prices rose. I believe they also assumed token costs would fall over time. What actually happened: although some models' prices have dropped, newer "reasoning" models burn far more tokens, so the effective cost of inference has, if anything, risen over time.

Both assumptions are wrong because the monthly subscription model is fundamentally incompatible with any service connected to large language models.

The core economic model of generative AI has collapsed

Think of it this way: when Uber (and no, generative AI is nothing like Uber, as I'll get to) raises ride prices, the underlying economic logic doesn't change; what's presented to passengers and drivers stays the same: users pay per trip, drivers earn per trip.

Drivers still pay for fuel, insurance, local permits, and vehicle financing—costs not subsidized by Uber. Uber’s huge losses come from subsidies, endless marketing, and doomed R&D efforts like autonomous vehicles.

Generative AI subscriptions are completely different from Uber

To illustrate the scale of AI pricing mismatch, imagine a parallel history where Uber has a very different business model.

Generative AI subscriptions are like Uber charging $20 a month for 100 rides within 100 miles, with gasoline at $150 per gallon and Uber paying for the fuel, because some people insist oil will someday be too cheap to meter.

Uber would eventually start charging monthly fees just for ride eligibility, then bill for fuel on top. Suddenly, users who had been paying $20 for 100 rides would face a $26 fuel fee on a 10-mile trip that only burned $10 of gas. Naturally, they'd be annoyed.

Though exaggerated, this is a pretty accurate metaphor for what’s happening in the generative AI industry, especially with GitHub Copilot.

Previously, Copilot’s pricing allowed for 300 premium requests per month and “unlimited chat requests” using models like GPT-5 mini.

Each request is (Microsoft says) "any interaction where you ask Copilot to do something," and more expensive models consume more of them; Claude Opus 4.6, for example, burns three premium requests per interaction. When you run out of premium requests, Copilot lets you freely use the cheaper models for the rest of the month.

It wasn't always like this. Until May 2025, Microsoft offered unlimited model use, yet users still ran into restrictions, and they were angry.

Like every AI company, Microsoft has deceived customers with unsustainable services because selling monthly subscription-based LLM services simply doesn’t work.

If you want to estimate how much token-based services might cost, a user in the GitHub Copilot subforum found that a single premium request consumes about $11 worth of tokens, because a “request” involves using 60,000 tokens in context, several tools, and a bunch of internal “rounds” (what the model does) to generate output.
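For intuition on how a single "request" balloons like that, here's a rough reconstruction: each internal round re-reads the growing context and emits output. The 60,000-token context is from the forum post; the round count, tool-result sizes, and $5/$25 per-million prices are my assumptions:

```python
CONTEXT = 60_000          # starting context, per the forum post
ROUNDS = 20               # internal agent iterations (assumed)
TOOL_TOKENS = 3_000       # tool results appended to context each round (assumed)
OUT_TOKENS = 2_000        # tokens generated each round (assumed)
PRICE_IN, PRICE_OUT = 5.00, 25.00   # $ per 1M tokens, frontier-model territory

cost, ctx = 0.0, CONTEXT
for _ in range(ROUNDS):
    # Every round pays to re-read the whole context, then adds to it.
    cost += ctx / 1e6 * PRICE_IN + OUT_TOKENS / 1e6 * PRICE_OUT
    ctx += TOOL_TOKENS + OUT_TOKENS
print(f"${cost:.2f}")     # ~$11.75 with these assumptions
```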

Then there's the unreliability of large language models, which can hallucinate. A premium request that stalls out and spits back half-baked code is frustrating when it's bundled into a subscription; when you're paying for every token yourself, the same failure is far less forgivable.

Users have also been trained to use these products in ways that make no sense under token-based billing. Many probably don't even realize how many "tokens" they burn, or what a specific task costs, which varies depending on the model used.

This is entirely different from Uber—anyone claiming otherwise is just trying to defend bad behavior. Uber might raise prices, but it doesn’t need to dramatically change its fundamental economic logic; users don’t have to completely change how they use the product because Uber suddenly starts charging per gallon.

Monthly AI subscriptions are part of an AI subsidy scam—deliberately separating generative AI from its actual costs

There has never been, and will never be, a financially viable way to offer LLM-driven services other than charging each user for their actual token consumption. By deceiving users, these companies have created products with illusory benefits and questionable ROI.

This has been obvious for years.

From an economic perspective, monthly subscriptions only make sense when costs are relatively static. Gyms sell memberships because they roughly know equipment wear, class operating costs, and utility expenses over a period.

Google Workspace customers—at least before AI—pay for document access or storage, plus ongoing costs for Google Docs and other services. Digital storage costs are relatively low (and unlike LLMs, Google Workspace doesn’t demand high compute), so a heavy Google Drive user doesn’t erode their subscription profit.

But these services deliberately hide token counts or how much a specific activity costs, so users don’t really understand what rate limits mean. Sudden changes in rate limits make customers desperately try to figure out how much they can actually do.

It's a malicious, manipulative, and deceptive business practice, designed solely to let Anthropic, OpenAI, and the other AI companies expand their user bases: most AI users perceive real or imagined benefits while burning $8 to $13.50 worth of tokens for every $1 of subscription fee.

The only goal of this deliberate deception: ensure most people never understand the true costs of generative AI.

When The Atlantic wrote a passionate article about Claude Code being Anthropic's "ChatGPT moment," it was based on a $20 monthly subscription, not on the underlying token costs Anthropic incurs. That, in turn, led the author to forgive the model's "small errors" and the way it "gets stuck on more complex programming tasks."

If the author paid for her actual token consumption, and each stall cost $15 in tokens, I doubt she’d be so forgiving of these failures.

But that’s part of the scam.

Very, very importantly, the mainstream media writers who cover AI often don't understand what these services actually cost. Almost every mainstream article about ChatGPT or Claude Code is written by someone with little idea of what each individual task might cost the user.

Remember: most generative AI services are experimental products, functioning differently from any other modern software or hardware. People can’t just walk up to ChatGPT or Claude and start demanding it work.

I mean, you can, but if your prompts are off, or you don’t understand how it works, or it makes mistakes in what you feed it, or if it simply messes up, it will spit out stuff you don’t like, which in turn means you have to prompt it again. LLMs are inherently unpredictable.

You can't guarantee that an LLM will perform a specific action or produce a realistic result. You can't know what a particular task, even one you've done many times before, might cost, or when the model might go haywire and delete something, or simply refuse to do the work while claiming it did.

If users had to pay real rates, I think many would abandon the product immediately because exploring what an LLM can do can easily burn $5 in tokens.

Side note: in fact, you can burn a lot of money without ever getting the results you want, because LLMs are not true artificial intelligence! Someone without a real understanding of their limitations can easily burn $30, $50, or even $100 trying to persuade an LLM to do something it claims it can do.

There's a term for this: sycophancy. LLMs are often designed to affirm users, even when they're spouting dangerous nonsense, which extends to questions like "can you do this technically or economically impossible thing?" Of course! That's why the industry works so hard to hide these costs; it's damn extortion!

I believe most AI subscriptions will inevitably shift to token-based billing, especially now that Anthropic and OpenAI are doing so for enterprise clients.

Can ordinary companies afford to switch to token-based billing? Anthropic estimates that users spend $13-30 daily on Claude Code (over $7,000 annually), and large organizations spend hundreds of thousands or millions of dollars per year.

As I discussed last week, Uber’s CTO said in a meeting that it spent its entire 2026 AI budget in just a few months. Goldman Sachs recommends some companies spend up to 10% of employee salaries on AI tokens, potentially rising to 100% in the coming quarters.

This is a direct result of training every AI user to maximize usage while hiding the true costs. Every large company requiring every employee to “use AI as much as possible” is doing so while largely ignoring or completely disconnecting from their actual token consumption. As companies are forced to pay real costs, I’m not sure how they can justify any investment in this technology economically.

Of course, you might say “engineers deliver code faster” or similar nonsense—I get it. But how much faster? How much money did you make or save? If you spend 10% of your labor costs on AI tokens, does that extra expense pay off elsewhere?

I’m not sure it does. I doubt any company investing heavily in tokens has seen a return on investment, which is why every study on AI ROI finds little evidence of it.

Most of the time, those who rave about the possibilities of generative AI are experiencing it without bearing the real costs.

Every Twitter lunatic endlessly bragging about their entire engineering team's crazy use of Claude Code is doing it on a $125/month Teams subscription, the counterpart to Anthropic's $100/month consumer plan. Every LinkedIn "expert" claiming they do "hours of work in minutes" is doing it on Perplexity's Max plan, at most $200 per month.

In reality, that $125/month Teams subscription for 10 people probably racks up $5,000 to $10,000 in API call costs each month, or even more.
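The subsidy multiple implied by those numbers is easy to check (the $5,000 to $10,000 figures are the estimates from the paragraph above):

```python
seats, seat_price = 10, 125
subscription = seats * seat_price          # $1,250/month collected
for api_cost in (5_000, 10_000):           # estimated real monthly compute burned
    print(f"${api_cost:,} -> {api_cost / subscription:.0f}x the subscription")
# 4x to 8x: every $1 paid in consumes $4-8 of compute
```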

Last week, Anthropic’s growth chief Amol Avasare said their Max plan is designed for heavy chat use, not for what people do with Claude Code and Cowork, and explicitly mentioned they’re “considering different options to continue providing a quality experience,” meaning “we will adjust prices at some point.”

I’m not sure people realize how expensive these tokens are, especially for coding projects involving large codebases and frequent calls to coding and infrastructure tools. Can someone afford $350, $400, or $500 a month? Can they handle spending over that in a single month? What if their budget is exceeded? Or they simply can’t afford the cost to finish their work?

Here's a more practical example: until early April, Anthropic's own Claude Code developer documentation (archived) said of Claude Code users: "average cost is $6 per developer per day, with 90% of users staying below $12 daily." As of this week, it now reads:

Claude Code charges per API token. Subscription plans (Pro, Max, Team, Enterprise) are listed at claude.com/pricing. The cost per developer varies greatly depending on model choice, codebase size, and usage patterns (like running multiple instances or automation). In enterprise deployments, the average cost is about $13 per active developer per day, or $150–$250 per month, with 90% of users staying below $30 per active day.

To estimate your team’s expenses, start with a small pilot group, set benchmarks with the tracking tools below, then scale up.

Assuming an average of 21 workdays per month, the average cost per Claude Code user is about $273 per month, or $3,276 annually. At $30 per workday, that’s $630 per month, or $7,560 per year.
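The arithmetic, for anyone who wants to swap in their own assumptions:

```python
WORKDAYS_PER_MONTH = 21

for per_day in (13, 30):   # Anthropic's average and 90th-percentile figures
    monthly = per_day * WORKDAYS_PER_MONTH
    print(f"${per_day}/day -> ${monthly}/month, ${monthly * 12:,}/year")
# $13/day -> $273/month, $3,276/year
# $30/day -> $630/month, $7,560/year
```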

These numbers are staggering, and even more so if you use any of Anthropic's newer models, where $30 a day won't get you far. Claude Opus 4.7 costs $5 per million input tokens and $25 per million output tokens, and a million tokens is roughly 50,000 lines of code. If you're using the so-called state-of-the-art models, you'll easily run through at least a million tokens, and if you're unsure which model to use for a given task, that number skyrockets.

Let’s play with this $30 figure again.

For a 10-person dev team, that’s $75,600 a year, and that’s just working days.

If you increase to an average of $50 per day over three months, it jumps to $88,200.

Add one more month at over $100 per day, and it’s $102,900 annually.

Spending $300 a day means a 10-person team spends $756,000 per year.
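Here are those four scenarios in code, so you can plug in your own team size and rates (same 21-workday month as above):

```python
def annual_cost(team_size: int, schedule: list[tuple[int, float]]) -> float:
    """schedule is a list of (months, dollars_per_dev_per_day) summing to 12 months."""
    return sum(months * 21 * rate * team_size for months, rate in schedule)

print(annual_cost(10, [(12, 30)]))                    # 75,600
print(annual_cost(10, [(9, 30), (3, 50)]))            # 88,200
print(annual_cost(10, [(8, 30), (3, 50), (1, 100)]))  # 102,900
print(annual_cost(10, [(12, 300)]))                   # 756,000
```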

While this might be feasible for well-funded startups or companies like Meta with deep reserves, any cost-conscious business would find it hard to justify spending five or six figures on a “productivity boost” that nobody can really measure.

Now, I see three main categories of companies:

Large organizations like Spotify or Uber, with AI-obsessed CEOs, allowing unlimited budgets. I’d also include well-funded startups in this group.

Small startups using subsidized “Teams” subscriptions.

Individual users paying monthly for Claude or other AI subscriptions.

Large organizations can still claim they’re burning millions of dollars in AI tokens for software engineers, citing the “benefit” that “top engineers” don’t write any code—an extremely dubious claim.

Just one bad earnings call can change this narrative. At some point, investors—even those who have been hyping the AI bubble—will start questioning the rising R&D costs (AI token consumption is often hidden here) when revenue growth stalls.

This will likely lead to layoffs to cut costs, like Meta, and eventually a pullback when asked, “Are these tools really helping us do better and faster work?”

I also believe that within six months, startups burning 10% or more of their human costs on AI tokens will find it hard to convince investors that such spending is necessary.

Once everyone switches to token-based billing, I doubt we’ll see as much hype around generative AI anymore.

The economics of AI data centers and compute power are fundamentally irrational

People talk about AI data centers in ways that are completely detached from reality. I think most don’t realize how absurd the entire era has become.

Building AI data centers is expensive, operating costs are high, but actual revenue is minimal

According to TD Cowen's Jerome Darling, critical IT (GPUs and related hardware) costs about $30 million per megawatt, with the data center capacity itself costing another $14 million per megawatt. A data center seems to take one to three years to build, assuming power is available.

Of the 114GW of data centers planned through the end of 2028, only 15.2GW is under construction in some form or fashion. "Under construction" might just mean "a hole in the ground." It does not mean, nor should it, that the capacity will be available soon.

Let’s start simple: whenever you think “100MW,” think “$4.4 billion,” mostly for NVIDIA GPUs.
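As a sanity check, that rule of thumb is just the two per-megawatt figures above added together:

```python
IT_PER_MW = 30e6   # critical IT (GPUs and related hardware), per megawatt
DC_PER_MW = 14e6   # the data center capacity itself, per megawatt

def capex(megawatts: float) -> float:
    """Upfront cost of an AI data center at TD Cowen's per-MW rates."""
    return megawatts * (IT_PER_MW + DC_PER_MW)

print(f"${capex(100) / 1e9:.1f}B")   # 100MW -> $4.4B
```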

So, each AI data center initially loses hundreds of millions of dollars. Even with a six-year depreciation schedule, it takes years to break even… and with NVIDIA’s annual upgrade cycle, once you’ve signed your first client contract, those GPUs are unlikely to make that much money.

It’s also unclear whether the customer base for AI compute exists outside of OpenAI and Anthropic, which together demand about 50% of the data center capacity. If either can’t pay, it creates a systemic weakness.

Regardless, it’s unclear what ongoing rates these data centers charge. Spot prices might be around $4.50 per hour per B200 GPU, but long-term contracts are usually much cheaper. One founder (per The Information) said they pay about $3.70 per hour per GPU for a one-year commitment.

It’s important to distinguish between spot costs—randomly spinning up GPUs on someone else’s servers—and contracted capacity, which accounts for most of the capital expenditure. Most data centers are built with a few large clients in mind, who can negotiate cheaper blended rates.

Thus, many data centers charge far less than $3.70 per hour because they bill per megawatt (or kilowatt).

This is where economics start to break down.

The collapse of the 100MW data center economics—$2.55 per hour, 16% gross margin at 100% occupancy, unprofitable due to debt

Those are the upfront costs of a 100MW data center. A 100MW facility might only have about 85MW of billable critical IT capacity. Based on discussions with sources familiar with hyperscaler billing, the expectation is about $12.5 million per megawatt in annual revenue, or roughly $1.06 billion a year across those 85MW.

Now, I should clarify: most data center companies don’t actually build these themselves but leave it to “co-location” providers like Applied Digital, also called “hosting partners.” For example, CoreWeave pays Applied Digital for hosting in North Dakota. CoreWeave manages all GPUs and tech inside the data center.

To illustrate the economic mismatch, I’ll use a hypothetical example: a data center renting capacity to a theoretical AI compute company.

This data center's GPUs are likely NVIDIA Blackwell chips, most plausibly B200s, which retail around $45,000 each (call it $56,250 per GPU all-in). With an 85MW IT load, total capital expenditure comes to about $36.78 million per megawatt, or roughly $3.13 billion in total IT capex, of which about $2.67 billion is GPUs.

Suppose this data center is in Ellendale, North Dakota, where industrial electricity runs about 6.31 cents per kWh, or roughly $55.4 million annually. Based on my discussions, ongoing costs such as maintenance, staffing, and parts replacement come to about 12% of revenue, roughly $128 million a year, bringing total running costs to about $183.4 million.

Wait, sorry. You also have to pay hosting fees on the critical IT load, which Brightlio estimates at about $180–$200 per kilowatt per month, depending on deployment scale and location. I've seen figures as low as around $130, which is what I used here: about $133 million annually. That brings us to roughly $316.4 million.

No, that’s still less than $1.06 billion, so we’re okay, right?

Wrong! You have $3.13 billion in IT equipment to depreciate, which on a six-year schedule means about $521 million per year. That brings total annual costs to about $837 million, leaving roughly $168 million in annual profit, or about a 16.7% gross margin…

…if you maintain 100% occupancy! You see, it might take a month or two to install these GPUs and get clients onboard, during which your revenue is zero, but costs—hosting, power, operations—keep piling up, even at much lower rates (I modeled 10% power and 15% hosting/operational costs). That means a daily loss of about $3.27 million.

For this example, assume it takes an extra month to get operational, costing about $102 million, which is gone and can’t be recovered, bringing annual total costs including depreciation to about $939.4 million, or a 6.6% gross margin.

Wait, damn, you didn't buy these GPUs with debt, right? You did? How bad is it? Oh my: you got a six-year asset-backed loan at 80% loan-to-value, meaning you borrowed $2.8 billion at 6% interest.

Your bank, in its eternal generosity, cut you a deal: a 12-month grace period where you pay interest only, about $168 million, putting your first-year total costs (excluding the delayed month, to be fair) at about $1.01 billion… against $1.06 billion in revenue.

That’s a 5.19% gross margin, and you haven’t even started paying down principal. When that happens, you’ll pay $54.1 million monthly in debt, totaling about $649 million annually over five years, or roughly $1.48 billion, with a gross margin of about -40%.

I must clarify, this assumes 100% utilization, and tenants pay on time every time.
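Here's a minimal sketch of the whole walkthrough in one place, using the figures above. It won't land on exactly the margins quoted, since some costs (insurance, networking, and so on) aren't itemized here, but the shape of the problem is the same:

```python
MW_BILLABLE = 85                         # billable critical IT, of 100MW gross
revenue = MW_BILLABLE * 12.5e6           # $12.5M/MW/year -> ~$1.06B

power = 100_000 * 8760 * 0.0631          # 100MW at 6.31 cents/kWh ~ $55M/yr
opex = 0.12 * revenue                    # maintenance/staffing at 12% of revenue
hosting = 130 * MW_BILLABLE * 1000 * 12  # $130/kW/month on the 85MW IT load

depreciation = 3.13e9 / 6                # $3.13B of IT over six years ~ $521M/yr
costs = power + opex + hosting + depreciation
print(f"margin at 100% occupancy: {(revenue - costs) / revenue:.1%}")

# Implied per-GPU rate: ~$2.67B of GPUs at ~$56,250 each -> ~47,500 GPUs
gpus = 2.67e9 / 56_250
print(f"${revenue / (gpus * 8760):.2f}/GPU-hour")    # ~$2.55, as in the heading

# Debt: $2.8B at 6%. Year one is interest-only, then a five-year amortization.
interest_only = 0.06 * 2.8e9                         # ~$168M
r, n = 0.06 / 12, 60
monthly = 2.8e9 * r / (1 - (1 + r) ** -n)            # ~$54.1M/month
print(f"with amortized debt: {(revenue - costs - monthly * 12) / revenue:.1%}")  # ~ -40%
```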

Stargate Abilene is a disaster—$2.94 per GPU per hour, $10 billion annual revenue, years behind schedule, with only one tenant losing billions annually

Let's talk about what should be the most economically feasible project in data center history: a huge campus built for the world's biggest AI company by Oracle, a near-hyperscaler with decades of experience selling expensive databases and business software to corporations and governments.

Haha, just kidding. This place is a nightmare.

Stargate Abilene: eight buildings, 1.2GW of capacity, about 824MW of critical IT, announced in July 2024. As of April 27, 2026, only two buildings are operational and generating revenue, and the third still has almost no IT equipment in it. I estimate the total cost of Stargate Abilene at about $52.8 billion.

Based on my own reporting, Oracle expects about $10 billion in annual revenue from Stargate Abilene, and I estimate about $75 billion in annual revenue from the full 7.1GW of capacity being built for a single client: OpenAI. As I also reported, Oracle estimated in 2024 that Abilene would cost at least $2.14 billion annually in hosting and electricity paid to land developer Crusoe.

I should also add that Oracle appears to be paying all the construction costs for Abilene.

Based on my calculations and reporting, I estimate that once fully operational, Abilene's gross margin will be roughly 37.47%.
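As a rough cross-check, here's what that margin implies against the revenue and hosting figures above; the cost residual is inferred, not itemized:

```python
REVENUE = 10e9       # Oracle's expected annual revenue from Abilene
MARGIN = 0.3747      # the gross-margin estimate above

implied_costs = REVENUE * (1 - MARGIN)   # ~$6.25B/year
crusoe = 2.14e9                          # hosting + electricity paid to Crusoe
print(f"implied annual costs: ${implied_costs / 1e9:.2f}B "
      f"(at least ${crusoe / 1e9:.2f}B of it going to Crusoe)")
print(f"gross profit: ${(REVENUE - implied_costs) / 1e9:.2f}B")
# ~$3.75B/year, in the neighborhood of the ~$3.85B profit figure cited below
```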

I must clarify, 37.47% gross margin might be too high because I don’t know Oracle’s actual insurance or personnel costs precisely, only estimates based on documents I’ve seen.

I should also clarify that Oracle is betting the entire damn future on projects like Stargate Abilene, incurring billions of dollars of upfront costs, and even if OpenAI pays every bill on time, it will take years to turn a profit.

Unfortunately, I can’t tell how much of Abilene is financed by debt. I only know that Oracle raised about $18 billion through bonds in September 2025, with maturities from 7 to 40 years, and recently reported negative cash flow of $24.7 billion in a quarter.

What I do know is that Oracle signed a 15-year lease with developer Crusoe, and its future heavily depends on OpenAI’s continued payments, which in turn depend on Oracle completing the Stargate Abilene project.

I also need to clarify that $3.85 billion in annual profit is only possible if OpenAI pays on time, and everything proceeds as planned with Abilene.

If OpenAI cannot raise $852 billion in revenue, financing, and debt within four years, the Stargate data center project will sink Oracle

Unfortunately, the opposite is happening:

According to DatacenterDynamics, the first 200MW of power was supposed to be operational by "2025." The story kept shifting: a deployment that was supposed to start in early 2025 "has the potential to reach 1GW in 2025," with the full 1.2GW of capacity and 64,000 GPUs deployed by mid-2026. As of September 30, 2025, "two buildings are online."

By December 12, 2025, Oracle co-CEO Clay Magouyrk was saying, "Abilene is on schedule, with over 96,000 NVIDIA Grace Blackwell GB200 GPUs delivered," which is roughly the count for two buildings.

Four months later, on April 22, 2026, Oracle tweeted, "…at Abilene, 200MW is already in operation, and delivery of the eight-building campus is still on schedule." It's unclear whether this refers to 200MW of critical IT capacity or to Abilene's total available power. Either way, it's enough for only two buildings, meaning Oracle is definitely not "on schedule."

This is a huge problem. OpenAI can only pay for compute capacity that actually exists, and only 206MW of critical IT is generating revenue. The third building will need at least another month (if not a quarter) to become operational.

However, there’s an even bigger, more fundamental issue with the entire Stargate project—none of this makes sense unless OpenAI’s absurd, cartoonish forecasts come true.

As I discussed on Friday:

Let me repeat these numbers: the 7.1GW Stargate data center under construction will generate about $75 billion in annual revenue, against total costs exceeding $340 billion. Oracle's free cash flow is negative $24.7 billion, and its other business lines are stagnating, making its money-losing, low-margin cloud business the only growth engine.

To truly pay for its compute contracts (including those with Amazon, Microsoft, CoreWeave, Google, Cerebras, and Oracle), OpenAI must raise or earn within four years…
