=== HEADLINE ===
DeepSeek V4-Pro Just Tied Claude Opus On SWE-Bench

=== STORY_URL ===
https://jaysoncraig.ca/sandbox/faces/deepseek-v4-pro

=== TWITTER_THREAD ===
1/ A Chinese AI lab just shipped an open weights model that ties Claude Opus 4.6 on the toughest coding benchmark in the industry. MIT license. One seventh the price. The coding frontier just became a commodity.

2/ DeepSeek V4-Pro shipped April 24, 2026. 1.6 trillion total parameters, 49 billion active, 1 million token context. SWE-bench Verified: 80.6%. Claude Opus 4.6: 80.8%. The gap is now a rounding error.

3/ The pricing is the part that should keep every closed API CFO awake. Claude charges $5 / $25 per million input/output tokens. V4-Pro charges $1.74 / $3.48. A seventh of the output price, roughly a third of the input price. And you can self host the weights for free.

4/ It is not just SWE-bench. LiveCodeBench: V4-Pro 93.5% vs Claude 88.8%. Codeforces: V4-Pro 3206 vs GPT-5.4 3168. Terminal-Bench 2.0: V4-Pro 67.9% vs Claude 65.4%. Three coding evals. V4-Pro leads on all three.

5/ Where the closed labs still win: Humanity's Last Exam, SimpleQA, HMMT 2026. World knowledge and the hardest reasoning belong to Gemini and the closed frontier. But coding is the use case paying the bills right now.

6/ The geopolitical tell: V4-Pro was validated on Huawei Ascend NPUs alongside Nvidia GPUs. SMIC jumped 10% on the Hong Kong open within hours. The inference layer that US export controls were designed to bottleneck just got routed around.

7/ This is the floor. Open weights at frontier. From here the only direction is down on price and up on capability. Full breakdown: [INSERT YT URL]

=== LINKEDIN_POST ===
DeepSeek V4-Pro hit 80.6% on SWE-bench. Claude Opus 4.6 hit 80.8%. The model is open weights, MIT licensed, one seventh the price.

Released April 24, 2026. 1.6 trillion total parameters with 49 billion active per token. 1 million token context window. Downloadable from Hugging Face on day one and validated for inference on both Nvidia and Huawei Ascend silicon.

The benchmark sweep is more striking than the headline. On LiveCodeBench, V4-Pro hit 93.5% against Claude's 88.8% and Gemini 3.1 Pro's 91.7%. On Codeforces, it pulled a 3206 rating, ahead of GPT-5.4 at 3168. On Terminal-Bench 2.0, 67.9% against Claude's 65.4%. V4-Pro leads on three of the four headline coding evals and ties on the fourth.

The pricing rewrites the unit economics. Claude Opus 4.6 charges $5 per million input tokens and $25 per million output. V4-Pro charges $1.74 input and $3.48 output. That is a seventh of the output price and roughly a third of the input price. Self hosting on rented capacity drives it lower still.

The closed labs still hold the lead in pure reasoning, world knowledge, and the hardest math. That gap is real but narrow, and it is not the part of the model that drives revenue at Cursor, Replit, or GitHub Copilot. Coding is the use case paying the bills right now, and parity on coding is the part that DOES NOT GET WALKED BACK.

What this means: capability parity on benchmarks is no longer the moat for the closed labs. The defensible ground is now red team maturity, production reliability, vendor support, and integration depth. Expect price compression at the top of the closed API stack within the quarter.

Watch the full breakdown: [INSERT YT URL]

Source: DeepSeek — https://api-docs.deepseek.com/news/news260424

=== NEWSLETTER ===
Subject: DeepSeek V4-Pro just tied Claude Opus on coding

On Friday, April 24, a Chinese AI lab quietly published the model that ends a year-long chase.

DeepSeek V4-Pro hit 80.6% on SWE-bench Verified. Anthropic's Claude Opus 4.6 sits at 80.8%. The gap on the benchmark the entire coding agent industry treats as the frontier is now two tenths of a single percentage point.

Why this matters more than the average benchmark headline. The model is open weights, MIT licensed, downloadable from Hugging Face the same day. You can self host it. You can fine tune it. You can run it inside an air gapped network. None of those are options with Claude. And the API pricing lands at $1.74 / $3.48 per million input/output tokens against Anthropic's $5 / $25, a seventh of the output price and roughly a third of the input price.
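If you want to sanity check what that does to a monthly bill, here is a minimal sketch in Python using the list prices above. The 700M input / 300M output token mix is an illustrative assumption, not a measured agent trace; plug in your own traffic.

# Per-million-token list prices quoted in this issue.
PRICES = {
    "claude-opus-4.6": {"input": 5.00, "output": 25.00},
    "deepseek-v4-pro": {"input": 1.74, "output": 3.48},
}

def api_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Dollar cost for a workload measured in millions of tokens."""
    p = PRICES[model]
    return input_mtok * p["input"] + output_mtok * p["output"]

# ASSUMPTION: a hypothetical coding-agent month of 700M input / 300M output tokens.
for model in PRICES:
    print(f"{model}: ${api_cost(model, 700, 300):,.2f}")
# claude-opus-4.6: $11,000.00  (700 x 5.00 + 300 x 25.00)
# deepseek-v4-pro: $2,262.00   (700 x 1.74 + 300 x 3.48)

On that mix the ratio is about 4.9x. Output-heavy agent traffic pushes it toward the full seven; input-heavy retrieval pulls it toward three. Either way the invoice shrinks.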
The benchmark sweep is even more lopsided once you look past SWE-bench. V4-Pro leads Claude on LiveCodeBench (93.5 vs 88.8), beats GPT-5.4 on the Codeforces algorithmic competition rating (3206 vs 3168), and edges Claude on Terminal-Bench 2.0 (67.9 vs 65.4). The closed labs still hold the lead on world knowledge and the hardest reasoning, but the use case driving the entire AI economy right now is coding, and on coding the gap is gone.

The market noticed. SMIC, the Chinese chipmaker that fabs the Huawei Ascend silicon DeepSeek validated for inference, jumped 10% on the Hong Kong open within hours. Two closed Chinese model labs fell more than 9%. Closed lab pricing committees are meeting on Monday.

Watch: [INSERT YT URL]

— Jane Sterling

=== SHORT_SCRIPT ===
A Chinese AI lab just tied Claude Opus on the hardest coding benchmark in the industry. Then they released the weights for FREE. And priced the API at one seventh of what Anthropic charges.

Here is the number. Anthropic's Claude Opus 4.6 sits at 80.8 percent on SWE-bench Verified. That is the test that asks a model to actually fix real GitHub issues end to end. It is the benchmark every coding agent company has been chasing for a year.

DeepSeek V4-Pro just hit 80.6 percent. Two tenths of one percentage point. And you can download the weights from Hugging Face right now, MIT licensed, run it on your own hardware, fine tune it however you want.

V4-Pro also leads on LiveCodeBench, on Codeforces, on Terminal-Bench 2.0. Three out of four headline coding evals. The closed labs still win on world knowledge and the hardest reasoning. But coding is the use case paying the bills right now, and on coding the gap is GONE.

This is the floor. Open weights at frontier. From here the only direction is down on price and up on capability. Stay sharp.

=== HASHTAGS_TWITTER ===
#DeepSeek #SWEBench #OpenWeights

=== HASHTAGS_LINKEDIN ===
#AI #OpenSourceAI #DeepSeek #CodingAgents #Anthropic