
DeepSeek Made Its 75% AI Price Cut Permanent. The Floor Just Dropped


This isn't a sale. It's a new standard.

On or around May 23, 2026, China's DeepSeek turned a promotional price cut into permanent pricing for its flagship model. The 75% reduction was supposed to expire. It didn't. And that matters to you as a CEO even if you never run a single workload on DeepSeek.

According to InfoWorld, the permanent cut on DeepSeek's V4-Pro model drops output token costs to roughly $0.87 per million tokens, down from $3.48. That's not a rounding error. It's a repricing of what the market will accept as a reasonable number on an artificial intelligence (AI) vendor invoice.

What Actually Changed

The cut went permanent across the V4-Pro application programming interface (API). Output tokens: $3.48 dropped to $0.87 per million. Cache-miss input: $1.74 dropped to $0.435. Cached input moved from roughly $0.0145 down to $0.0036. All cuts held at roughly 75%.
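A quick check of the arithmetic behind "roughly 75%" can be sketched in a few lines of Python. The prices are the ones quoted above; the tier labels and the function name percent_cut are illustrative choices, not anything from DeepSeek's API.

```python
# Before/after prices in USD per million tokens, as quoted in this article.
# The dictionary keys are illustrative labels, not official tier names.
PRICES = {
    "output": (3.48, 0.87),
    "cache_miss_input": (1.74, 0.435),
    "cached_input": (0.0145, 0.0036),
}


def percent_cut(old: float, new: float) -> float:
    """Return the reduction from old to new as a percentage of old."""
    return (old - new) / old * 100


# Round to one decimal place so the three tiers are easy to compare.
cuts = {tier: round(percent_cut(old, new), 1) for tier, (old, new) in PRICES.items()}

for tier, cut in cuts.items():
    print(f"{tier}: {cut}% lower")
```

The output and cache-miss tiers come out at 75%, and the cached-input tier lands slightly above that because the quoted cached prices are rounded.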

What makes this a strategic event rather than a product update is timing and signal. DeepSeek set an expiration date on the promotional rate, then removed it. That's a deliberate act of market anchoring. They're telling every buyer of enterprise AI what a commodity-grade tier should cost at scale.

And it's working. According to InfoWorld, analysts expect the leading Western providers, including OpenAI, Anthropic, and Google, to respond not with matching cuts but with expanded cache discounts, batch API pricing, and enterprise-tier concessions. That response pattern is exactly what you want going into renewal negotiations.

Key Facts

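  • By one comparison cited by InfoWorld, for roughly 100 million output tokens per month, DeepSeek V4-Pro comes in near $348 versus roughly $2,500 for Anthropic and roughly $3,000 for OpenAI's comparable tier. Note that $348 is what 100 million output tokens cost at the pre-cut $3.48 rate; at the permanent $0.87 rate, the output-token charge alone works out to about $87.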
  • DeepSeek is backed by a large state-supported funding round through China's "Big Fund," with a reported targeted valuation between $10 billion and $45 billion.
  • Western vendors are not expected to match DeepSeek's rates directly. Analysts expect the response to come through cache discounts, batch pricing, and committed-use rate reductions.

Why a Price You Won't Pay Still Reprices Your Contracts

Figure: Monthly AI cost comparison. GPT-5.5: $3,000; Claude: $2,500; DeepSeek: $348.

You don't need to use DeepSeek for the permanent cut to affect your costs. Price floors shift negotiating power across the entire market.

Think about what happens when you walk into a SaaS renewal knowing there's a comparable alternative priced 7x lower. You don't need to switch. You need to reference it. Your incumbent vendor knows what the floor is. They'll move on cache discounts or batch rates before they lose a multi-year contract.

That's the real impact here. The new floor gives every buyer of enterprise AI a credible anchor at renewal. Before this cut, "the market rate" meant whatever GPT-4o or Claude Opus was charging. Now the market rate has a public, permanent, documented lower bound. Use it.

The complication is that not all AI spend is fungible: some workloads can move to a cheaper tier and some can't. Your AI vendor evaluation framework should already account for that distinction, but the DeepSeek situation makes it urgent. See the token consumption piece on how enterprise AI bills keep rising even as token prices fall: falling unit prices don't automatically mean lower invoices if your workloads are scaling up.

The Catch: Sovereignty and Integration

Here's the part that gets lost in the headline number.

DeepSeek is a Chinese company operating under Chinese jurisdiction. When your systems send prompts, documents, embeddings, and logs to a Chinese-hosted API, you're crossing legal regimes with very different rules on data handling, government access, and intellectual property. For most regulated industries, including finance, healthcare, and legal services, that's not a vendor decision. It's a compliance decision, and the answer is usually no.

There's a path around this. DeepSeek V4-Pro's underlying model is open-weight, meaning a chief information officer (CIO) with the right infrastructure can self-host it inside their own perimeter and capture the cost advantage without the data crossing any border. But self-hosting a frontier-grade model requires significant infrastructure investment and ongoing engineering work. It's not a zero-cost option.

The practical picture for most enterprises right now looks like this: you probably won't run DeepSeek on sensitive workloads. But you can absolutely use its pricing as leverage. And for genuinely low-sensitivity workloads, open-weight models running locally are increasingly viable. Read about why open-weight models are getting cheaper and how that changes the build-vs-buy calculus.

The Price-Floor Leverage Test

This is a three-question framework for turning the new floor into action before your next renewal.

Question 1: What is your current blended cost per million tokens, and how far above the floor is it?

Pull your last 90 days of AI invoices. Calculate a blended rate across all vendors and models. Compare it to the new DeepSeek floor. If you're running at $3 to $5 per million tokens on commodity tasks, the gap is substantial and negotiable. If you're at $15 to $20 because you're running heavy reasoning models on complex tasks, the floor matters less for those workloads.
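The blended-rate calculation can be sketched as below. Everything here is an assumption for illustration: the InvoiceLine structure, the vendor names, and the spend and token figures are hypothetical, and only the $0.87 floor comes from the pricing quoted in this article.

```python
from dataclasses import dataclass

# DeepSeek V4-Pro output price per million tokens, as quoted in this article.
FLOOR_PER_MILLION = 0.87


@dataclass
class InvoiceLine:
    """One vendor's totals for the review period (hypothetical structure)."""
    vendor: str
    spend_usd: float  # total billed over the period
    tokens: int       # total tokens billed over the period


def blended_rate_per_million(lines: list[InvoiceLine]) -> float:
    """Total spend divided by total tokens, scaled to USD per million tokens."""
    total_spend = sum(line.spend_usd for line in lines)
    total_tokens = sum(line.tokens for line in lines)
    if total_tokens == 0:
        raise ValueError("no tokens billed in the period")
    return total_spend / total_tokens * 1_000_000


# Hypothetical 90-day totals for two vendors.
invoices = [
    InvoiceLine("vendor_a", 12_000.0, 3_000_000_000),
    InvoiceLine("vendor_b", 6_000.0, 1_000_000_000),
]

rate = blended_rate_per_million(invoices)
multiple = rate / FLOOR_PER_MILLION

print(f"Blended rate: ${rate:.2f} per million tokens")
print(f"Multiple of the floor: {multiple:.1f}x")
```

One caveat on the comparison: a blended rate mixes input and output tokens, while the $0.87 floor is an output-token price, so treat the multiple as a rough gauge rather than a like-for-like figure.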

Question 2: Which workloads are genuinely sensitive versus commodity?

Not everything needs Claude Opus or GPT-5.5. Internal summarization, data extraction from structured documents, draft generation for internal communications, and search-augmentation over internal knowledge bases may all be candidates for commodity-priced inference. The workloads that require premium models are usually those touching customer data, regulated output, or high-stakes decisions. Draw that line explicitly before your next renewal.

Question 3: What specific concession will you ask for, and when?

Don't walk into a renewal without a named target. Cache discount percentages, batch API pricing for background workloads, and committed-use rates in exchange for longer contract terms are all on the table right now. Vendors know the floor exists. They're prepared to offer concessions to avoid losing accounts. But they'll wait for you to ask.

The DeepSeek cut accelerates a pricing war that was already underway. The Big Four AI stack procurement pattern from May 2026 shows how leading enterprises are starting to separate premium and commodity tiers deliberately. The pattern is spreading.

What to Do Before Your Next Renewal

Start with a workload audit. Get a clear list of every AI-powered process in your organization, who owns it, what model it runs on, and roughly what it costs per month. Most CEOs don't have this list. That's the first gap to close.

Then identify which items on that list require premium vendor guarantees and which are commodity-eligible. For the commodity-eligible workloads, calculate what switching or renegotiating would save annually. That number becomes your leverage figure.
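As a rough sketch, the leverage figure is the annualized saving on commodity-eligible spend at whatever discount you plan to ask for. The Workload fields, the example workloads, their costs, and the 40% target are all hypothetical placeholders for your own audit data.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One row of the workload audit (hypothetical structure)."""
    name: str
    monthly_cost_usd: float
    needs_premium: bool  # True if it requires premium vendor guarantees


def leverage_figure(workloads: list[Workload], target_discount: float) -> float:
    """Annual saving if commodity-eligible spend fell by target_discount (0 to 1)."""
    if not 0 <= target_discount <= 1:
        raise ValueError("target_discount must be between 0 and 1")
    eligible_monthly = sum(
        w.monthly_cost_usd for w in workloads if not w.needs_premium
    )
    return eligible_monthly * target_discount * 12


# Hypothetical audit results.
audit = [
    Workload("internal_summarization", 4_000.0, needs_premium=False),
    Workload("document_extraction", 2_500.0, needs_premium=False),
    Workload("customer_support_agent", 9_000.0, needs_premium=True),
]

# Assumed negotiating target: 40% off commodity-eligible spend.
annual_leverage = leverage_figure(audit, target_discount=0.40)
print(f"Leverage figure: ${annual_leverage:,.0f} per year")
```

Premium workloads are left out of the total on purpose, since the floor gives you little leverage on spend that can't move to a commodity tier.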

Take it into renewal. Use the floor as the anchor. Reference the gap between your current rate and what the market now accepts as the commodity benchmark. Ask for cache discounts, batch pricing, or both. Get something in writing before the next 12-month term locks.

Finally, watch what OpenAI and Anthropic announce in Q3. Based on InfoWorld's reporting, expanded batch and cache pricing tiers are coming. If you're approaching renewal in the next 90 days, it may be worth requesting a short-term extension to see what those tiers look like. The vendor calculus around Anthropic's positioning is shifting in real time.

The premium-model question is worth a separate read too: the Anthropic Opus 4 model decision framework walks through when a $15-per-million-token model is genuinely justified versus when it's a default habit.

The floor dropped. That's your opening. Use it.

Frequently Asked Questions

Does DeepSeek's price cut mean I should switch my enterprise AI stack to DeepSeek?

Probably not entirely, and maybe not at all. The procurement decision depends on data sovereignty requirements, compliance obligations, integration depth with your current stack, and workload sensitivity. For most regulated enterprises, the data risk of using a Chinese-hosted API is disqualifying for sensitive workloads. The more immediate value is using the floor as negotiating leverage with your existing vendors.

Can my company self-host DeepSeek to get the cost advantage without the data risk?

Yes, in principle. DeepSeek V4-Pro is an open-weight model, which means a technical team can deploy it inside your own infrastructure. The practical requirements include significant compute resources (A100- or H100-class GPUs at scale), engineering capacity to maintain the deployment, and integration work with your existing systems. For large enterprises with mature AI infrastructure, this is viable. For most mid-market organizations, it requires a dedicated project and budget allocation before the savings materialize.

If I'm not using DeepSeek, how do I actually use this price cut in a vendor negotiation?

Reference the floor directly. In your renewal conversation, note that commodity AI inference is now publicly available at roughly $0.87 per million output tokens on a permanent basis. Ask your vendor what they're prepared to offer on cache discounts, batch processing, or committed-use pricing to close the gap on workloads that don't require their premium capabilities. Most vendors have unpublished discount tiers they'll surface when asked.

Learn More