=== TAG === AI Models === HEADLINE === DeepSeek V4-Pro Just Tied Claude Opus On SWE-Bench === META_DESC === DeepSeek V4-Pro hit 80.6% on SWE-bench Verified, within 0.2 points of Claude Opus 4.6, but it costs $1.74/$3.48 per million tokens against Claude's $5/$25, and the weights are MIT licensed. The coding frontier just became a commodity. === DATE === April 26, 2026 === AUTHOR === Jane Sterling === READ_TIME === 9-minute read === HERO_IMG === img/content.png === SCRIPT_LABEL === Video Script (9 min, clean transcript for captioning) === SCRIPT === A Chinese AI lab just shipped a model that matches Claude Opus 4.6 on the toughest coding benchmark in the industry. They released the weights for free under MIT license. They priced the API at one seventh of what Anthropic charges. And they did it on a Friday afternoon while most of Silicon Valley was already heading into the weekend. DeepSeek V4-Pro went live on April 24th, 2026. The official announcement landed on the DeepSeek developer news feed at the same hour Hugging Face turned on the model card. The Pro model carries 1.6 trillion total parameters with 49 billion active per token in a mixture of experts arrangement. The Flash sibling carries 284 billion total with 13 billion active. Both are MIT licensed, both ship with a 1 million token context window as the default, and both can be downloaded immediately. If you have followed the coding agent race for the last twelve months, that 1.6 trillion number should make you sit up. Anthropic's Claude Opus 4.6 had been the unchallenged king of agentic coding work since the start of the year. It scored 80.8 percent on SWE-bench Verified, the Princeton benchmark that asks a model to actually fix real GitHub issues end to end. Every closed lab in the world has been chasing that number. DeepSeek V4-Pro hit 80.6 percent. That is two tenths of a single percentage point behind a model you cannot self host, you cannot fine tune, and you cannot run inside an air gapped network. With V4-Pro, you can do all three. This is not a benchmark where the gap closes by accident. SWE-bench Verified is the eval that broke open the agentic coding category in the first place. When Cognition demoed Devin, the headline number was a SWE-bench score. When Anthropic launched Claude 3.5 Sonnet for coding, the headline was SWE-bench. It is the test the entire industry has agreed measures the thing that actually matters, which is whether you can hand the model a real bug and walk away. For an open weights model to pull within 0.2 points of the closed leader is a structural event. It is the moment the coding frontier became a commodity. And that brings us to the price tag, because the price tag is the part that should keep every closed API CFO awake tonight. Claude Opus 4.6 charges 5 dollars per million input tokens and 25 dollars per million output tokens. DeepSeek V4-Pro charges 1 dollar 74 cents per million input on cache miss, 14 and a half cents per million on cache hit, and 3 dollars 48 cents per million output. That is roughly one seventh the marginal cost of Claude for the same coding work, and if you do not want to pay even that, you can host it yourself. If you are running a coding agent at scale, the math just changed. That is the headline. Now let us look at what is actually inside. The benchmark sweep DeepSeek published with the release reads like a competitive teardown of the closed frontier. On LiveCodeBench, V4-Pro hit 93.5 percent, ahead of Claude at 88.8 and Gemini 3.1 Pro at 91.7.
On Codeforces, the algorithmic competition rating, V4-Pro pulled a 3206, ahead of GPT-5.4 at 3168 and Gemini at 3052. On Terminal-Bench 2.0, the eval that measures how well an agent can drive a real Linux shell and recover from its own mistakes, V4-Pro scored 67.9 percent against Claude's 65.4. Three coding evals. V4-Pro leads on all three. The story is not all one sided. On Humanity's Last Exam, the ferociously hard reasoning benchmark, V4-Pro hit 37.7 percent against Claude's 40, GPT-5.4 at 39.8, and Gemini at 44.4. On SimpleQA Verified, the world knowledge eval, V4-Pro scored 57.9 against Gemini at 75.6. On HMMT 2026, the math competition benchmark, V4-Pro hit 95.2 against Claude at 96.2 and GPT-5.4 at 97.7. So V4-Pro is the new coding frontier. It is not the new everything frontier. World knowledge and the hardest reasoning still belong to Gemini and the closed labs, by margins that matter. But coding is the use case that pays the bills right now. Cursor, Replit, GitHub Copilot, every YC backed coding agent startup, Anthropic's own Claude Code product, all of them depend on a frontier coding model and all of them have been paying frontier prices to get one. V4-Pro just made that economic moat much shallower. How did DeepSeek pull this off? The technical details matter. V4-Pro was trained on 33 trillion tokens. It uses a mixture of FP8 and FP4 precision, which is far more aggressive than the BF16 most US labs train in. DeepSeek says memory consumption is 9.5 to 13.7 times lower than V3.2, which is the kind of efficiency claim that, if it holds up under independent reproduction, is going to force a rethink across the industry. They also validated the model on Huawei Ascend NPUs alongside Nvidia GPUs. That last detail is the geopolitical tell. DeepSeek is openly building a chip supply chain that does not need TSMC or Nvidia for inference. The market noticed within hours. Shares of SMIC, the Chinese chipmaker that fabs the Ascend silicon, jumped 10 percent on the Hong Kong open. Shares of MiniMax and Knowledge Atlas, two Chinese closed model competitors, fell more than 9 percent. The pricing pressure does not just hit Anthropic and OpenAI. It hits every closed model lab in the world, and the closed Chinese labs are getting hit first because they cannot retreat behind a regulatory moat. Michael Kratsios, the US science advisor, came out the same day with a statement calling Chinese AI development industrial scale copying of American work, saying there is nothing innovative about systematically extracting and copying. China's foreign ministry called the accusation groundless and a smear. Both statements were issued within 24 hours of the V4-Pro weights going public. The political response is now baked in. So what does it mean when an open weights model ties the closed coding frontier at one seventh the price? It means three things, and they all happen at once. First, the coding agent business model just became a margin trap for anyone running closed weights. If you are Cursor charging twenty dollars a month per developer seat, the frontier model under the hood used to be your moat. Now your competitor can self host V4-Pro on rented Ascend or H100 capacity and undercut you on the unit economics while delivering a coding model that ties yours on the eval that matters. Cursor's CEO has been aggressive about model agnosticism precisely because his team saw this coming. Anthropic, by contrast, has built its enterprise pitch on the premise that Opus is uniquely good at coding.
That premise just got a lot harder to defend. Second, the export control conversation gets more complicated, not simpler. The US government has spent two years tightening the rules around shipping advanced GPUs to China on the theory that compute scarcity would slow down the Chinese frontier. V4-Pro's training run almost certainly involved smuggled or stockpiled H100 capacity, but the model itself runs efficiently on Ascend silicon for inference. That means the chokepoint the export controls were designed to create sat at the inference layer, and that is exactly the layer DeepSeek just routed around. Expect the US response to escalate, possibly toward open weights distribution itself, which is a fight nobody in the open source community wants to have. Third, and this is the one that takes longer to play out, the safety story for frontier AI just got messier. Anthropic, OpenAI, and Google have spent years arguing that their closed models can be more rigorously evaluated, red teamed, and gated than open ones. That argument depends on the closed models actually being meaningfully better at the things that matter. V4-Pro is a 1.6 trillion parameter model, MIT licensed, that you can fine tune on whatever dataset you want, including ones the original lab would refuse. The capability gap that justified gating just narrowed. The next time a frontier capability emerges, the open weights version is going to be days or weeks behind, not months. There are still real reasons to be skeptical. Early reviewers said benchmarks do not tell the full story, and they do not. V4-Pro is weaker on world knowledge and the hardest reasoning. Production reliability under sustained agent workloads is unproven. Anthropic has years of red team data and incident response infrastructure that DeepSeek simply does not have. Choosing V4-Pro for a regulated workload today is not the same as choosing Claude. But for the median coding task, which is the use case driving the entire AI economy right now, the gap is gone. The coding frontier is open, MIT licensed, and one seventh the price. That is the part that does not get walked back. If you are a developer, you will be testing V4-Pro inside your editor by next week. If you are a closed model lab, your pricing committee is meeting on Monday. And if you are wondering how the next twelve months of the AI race look, this is the floor. Open weights at frontier. From here, the only direction is down on price and up on capability. I am Jane Sterling. This was Sterling Intelligence. === SCRIPT_HTML === === ANNOTATED_LABEL === Annotated Script (with b-roll & cut cues) === ANNOTATED_HTML === [TALKING HEAD — hook]
A Chinese AI lab just shipped a model that matches Claude Opus 4.6 on the toughest coding benchmark in the industry. They released the weights for free under MIT license. They priced the API at one seventh of what Anthropic charges. And they did it on a Friday afternoon while most of Silicon Valley was already heading into the weekend.
[CUT] [VOICEOVER — scene 1] [B-ROLL: company-logo:DeepSeek]DeepSeek V4-Pro went live on April 24th, 2026. The official announcement landed on the DeepSeek developer news feed at the same hour Hugging Face turned on the model card.
[B-ROLL: screen-capture:DeepSeek api-docs news page] [STAT CARD: "1.6T total params · 49B active"] [STAT CARD: "284B total · 13B active (V4-Flash)"] [STAT CARD: "1M token context, MIT license"]The Pro model carries 1.6 trillion total parameters with 49 billion active per token in a mixture of experts arrangement. The Flash sibling carries 284 billion total with 13 billion active. Both are MIT licensed, both ship with a 1 million token context window as the default, and both can be downloaded immediately.
[B-ROLL: ai-abstract] [STAT CARD: "Claude Opus 4.6: 80.8% SWE-bench Verified"] [STAT CARD: "DeepSeek V4-Pro: 80.6% SWE-bench Verified"]If you have followed the coding agent race for the last twelve months, that 1.6 trillion number should make you sit up. Anthropic's Claude Opus 4.6 had been the unchallenged king of agentic coding work since the start of the year. It scored 80.8 percent on SWE-bench Verified, the Princeton benchmark that asks a model to actually fix real GitHub issues end to end. Every closed lab in the world has been chasing that number. DeepSeek V4-Pro hit 80.6 percent. That is two tenths of a single percentage point behind a model you cannot self host, you cannot fine tune, and you cannot run inside an air gapped network. With V4-Pro, you can do all three.
[B-ROLL: code-terminal] [B-ROLL: stills:Devin demo screen]This is not a benchmark where the gap closes by accident. SWE-bench Verified is the eval that broke open the agentic coding category in the first place. When Cognition demoed Devin, the headline number was a SWE-bench score. When Anthropic launched Claude 3.5 Sonnet for coding, the headline was SWE-bench. It is the test the entire industry has agreed measures the thing that actually matters, which is whether you can hand the model a real bug and walk away. For an open weights model to pull within 0.2 points of the closed leader is a structural event. It is the moment the coding frontier became a commodity.
[STAT CARD: "Claude Opus 4.6: $5 / $25 per M tokens"] [STAT CARD: "V4-Pro: $1.74 / $3.48 per M tokens"] [STAT CARD: "$0.145 per M input on cache hit"] [STAT CARD: "~1/7th the marginal cost of Claude"]And that brings us to the price tag, because the price tag is the part that should keep every closed API CFO awake tonight. Claude Opus 4.6 charges 5 dollars per million input tokens and 25 dollars per million output tokens. DeepSeek V4-Pro charges 1 dollar 74 cents per million input on cache miss, 14 and a half cents per million on cache hit, and 3 dollars 48 cents per million output. That is roughly one seventh the marginal cost of Claude for the same coding work, and you can host it yourself if you do not even want to pay that. If you are running a coding agent at scale, the math just changed.
[/VOICEOVER] [TALKING HEAD — transition]That is the headline. Now let us look at what is actually inside.
[CUT] [VOICEOVER — scene 2] [B-ROLL: finance-charts] [STAT CARD: "LiveCodeBench: V4-Pro 93.5% · Claude 88.8% · Gemini 91.7%"] [STAT CARD: "Codeforces: V4-Pro 3206 · GPT-5.4 3168 · Gemini 3052"] [STAT CARD: "Terminal-Bench 2.0: V4-Pro 67.9% · Claude 65.4%"]The benchmark sweep DeepSeek published with the release reads like a competitive teardown of the closed frontier. On LiveCodeBench, V4-Pro hit 93.5 percent, ahead of Claude at 88.8 and Gemini 3.1 Pro at 91.7. On Codeforces, the algorithmic competition rating, V4-Pro pulled a 3206, ahead of GPT-5.4 at 3168 and Gemini at 3052. On Terminal-Bench 2.0, the eval that measures how well an agent can drive a real Linux shell and recover from its own mistakes, V4-Pro scored 67.9 percent against Claude's 65.4. Three coding evals. V4-Pro leads on all three.
[B-ROLL: data-center] [STAT CARD: "HLE: V4-Pro 37.7% · Claude 40% · GPT-5.4 39.8% · Gemini 44.4%"] [STAT CARD: "SimpleQA Verified: V4-Pro 57.9% · Gemini 75.6%"] [STAT CARD: "HMMT 2026: V4-Pro 95.2% · Claude 96.2% · GPT-5.4 97.7%"]The story is not all one sided. On Humanity's Last Exam, the ferociously hard reasoning benchmark, V4-Pro hit 37.7 percent against Claude's 40, GPT-5.4 at 39.8, and Gemini at 44.4. On SimpleQA Verified, the world knowledge eval, V4-Pro scored 57.9 against Gemini at 75.6. On HMMT 2026, the math competition benchmark, V4-Pro hit 95.2 against Claude at 96.2 and GPT-5.4 at 97.7. So V4-Pro is the new coding frontier. It is not the new everything frontier. World knowledge and the hardest reasoning still belong to Gemini and the closed labs, by margins that matter.
[B-ROLL: company-logo:Cursor] [B-ROLL: company-logo:GitHub Copilot]But coding is the use case that pays the bills right now. Cursor, Replit, GitHub Copilot, every YC backed coding agent startup, Anthropic's own Claude Code product, all of them depend on a frontier coding model and all of them have been paying frontier prices to get one. V4-Pro just made that economic moat much shallower.
[B-ROLL: code-terminal] [STAT CARD: "33T training tokens"] [STAT CARD: "Mixed FP8 / FP4 precision"] [STAT CARD: "9.5x to 13.7x less memory than V3.2"]How did DeepSeek pull this off. The technical details matter. V4-Pro was trained on 33 trillion tokens. It uses a mixture of FP8 and FP4 precision, which is far more aggressive than the FP16 most US labs train at. DeepSeek says memory consumption is 9.5 to 13.7 times lower than V3.2, which is the kind of efficiency claim that, if it holds up under independent reproduction, is going to force a rethink across the industry. They also validated the model on Huawei Ascend NPUs alongside Nvidia GPUs. That last detail is the geopolitical tell. DeepSeek is openly building a chip supply chain that does not need TSMC or Nvidia for inference.
[B-ROLL: finance-charts] [STAT CARD: "SMIC +10% on Hong Kong open"] [STAT CARD: "MiniMax / Knowledge Atlas −9%"]The market noticed within hours. Shares of SMIC, the Chinese chipmaker that fabs the Ascend silicon, jumped 10 percent on the Hong Kong open. Shares of MiniMax and Knowledge Atlas, two Chinese closed model competitors, fell more than 9 percent. The pricing pressure does not just hit Anthropic and OpenAI. It hits every closed model lab in the world, and the closed Chinese labs are getting hit first because they cannot retreat behind a regulatory moat.
[B-ROLL: news-studio] [B-ROLL: stills:Michael Kratsios White House]Michael Kratsios, the US science advisor, came out the same day with a statement calling Chinese AI development industrial scale copying of American work, saying there is nothing innovative about systematically extracting and copying. China's foreign ministry called the accusation groundless and a smear. Both statements were issued within 24 hours of the V4-Pro weights going public. The political response is now baked in.
[/VOICEOVER] [TALKING HEAD — transition]So what does it mean when an open weights model ties the closed coding frontier at one seventh the price?
It means three things, and they all happen at once.
[CUT] [VOICEOVER — scene 3] [B-ROLL: company-logo:Cursor] [B-ROLL: company-logo:Anthropic]First, the coding agent business model just became a margin trap for anyone running closed weights. If you are Cursor charging twenty dollars a month per developer seat, the frontier model under the hood used to be your moat. Now your competitor can self host V4-Pro on rented Ascend or H100 capacity and undercut you on the unit economics while delivering a coding model that ties yours on the eval that matters. Cursor's CEO has been aggressive about model agnosticism precisely because his team saw this coming. Anthropic, by contrast, has built its enterprise pitch on the premise that Opus is uniquely good at coding. That premise just got a lot harder to defend.
[B-ROLL: courtroom] [B-ROLL: stills:Commerce Department export controls]Second, the export control conversation gets more complicated, not simpler. The US government has spent two years tightening the rules around shipping advanced GPUs to China on the theory that compute scarcity would slow down the Chinese frontier. V4-Pro's training run almost certainly involved smuggled or stockpiled H100 capacity, but the model itself runs efficiently on Ascend silicon for inference. That means the chokepoint the export controls were designed to create sat at the inference layer, and that is exactly the layer DeepSeek just routed around. Expect the US response to escalate, possibly toward open weights distribution itself, which is a fight nobody in the open source community wants to have.
[B-ROLL: ai-abstract] [B-ROLL: code-terminal]Third, and this is the one that takes longer to play out, the safety story for frontier AI just got messier. Anthropic, OpenAI, and Google have spent years arguing that their closed models can be more rigorously evaluated, red teamed, and gated than open ones. That argument depends on the closed models actually being meaningfully better at the things that matter. V4-Pro is a 1.6 trillion parameter model, MIT licensed, that you can fine tune on whatever dataset you want, including ones the original lab would refuse. The capability gap that justified gating just narrowed. The next time a frontier capability emerges, the open weights version is going to be days or weeks behind, not months.
[B-ROLL: transition]There are still real reasons to be skeptical. Early reviewers said benchmarks do not tell the full story, and they do not. V4-Pro is weaker on world knowledge and the hardest reasoning. Production reliability under sustained agent workloads is unproven. Anthropic has years of red team data and incident response infrastructure that DeepSeek simply does not have. Choosing V4-Pro for a regulated workload today is not the same as choosing Claude.
But for the median coding task, which is the use case driving the entire AI economy right now, the gap is gone. The coding frontier is open, MIT licensed, and one seventh the price. That is the part that does not get walked back.
[/VOICEOVER] [TALKING HEAD — sign-off]If you are a developer, you will be testing V4-Pro inside your editor by next week. If you are a closed model lab, your pricing committee is meeting on Monday. And if you are wondering how the next twelve months of the AI race look, this is the floor. Open weights at frontier. From here, the only direction is down on price and up on capability. I am Jane Sterling. This was Sterling Intelligence.
=== ARTICLE_HTML ===On April 24, 2026, DeepSeek released V4-Pro and V4-Flash under the MIT License. The headline number is the one Anthropic does not want printed: 80.6 percent on SWE-bench Verified, two tenths of a point behind Claude Opus 4.6, at roughly one seventh the marginal cost.
That gap, on the benchmark the entire coding agent industry has agreed defines the frontier, is now small enough to be a rounding error. The model is open weights, downloadable from Hugging Face on day one, and validated for inference on both Nvidia and Huawei Ascend silicon.
The implication is structural. The closed frontier coding model was the highest margin product in the AI stack. It is not anymore.
V4-Pro carries 1.6 trillion total parameters with 49 billion active per token. V4-Flash carries 284 billion total with 13 billion active. Both use a mixture of experts architecture, both ship with a 1 million token context window by default, and both were released simultaneously under the MIT License through DeepSeek's API, Hugging Face, and a public web service.
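Those mixture of experts numbers are easier to feel with a little arithmetic. The sketch below uses only the published parameter counts plus a generic two-FLOPs-per-active-parameter rule of thumb for the forward pass; none of it comes from DeepSeek's technical report.

```python
# Back-of-envelope MoE arithmetic from the published parameter counts.
# The 2-FLOPs-per-active-parameter forward-pass estimate is a generic
# rule of thumb, not a figure DeepSeek has published.

def active_fraction(total_params: float, active_params: float) -> float:
    """Share of weights touched per token in a mixture-of-experts model."""
    return active_params / total_params

def forward_gflops_per_token(active_params: float) -> float:
    """Rough forward-pass compute per token: ~2 FLOPs per active parameter."""
    return 2 * active_params / 1e9

models = {
    "V4-Pro":   {"total": 1.6e12, "active": 49e9},
    "V4-Flash": {"total": 284e9,  "active": 13e9},
}

for name, m in models.items():
    print(f"{name}: {active_fraction(m['total'], m['active']):.1%} of weights "
          f"active per token, ~{forward_gflops_per_token(m['active']):.0f} "
          f"GFLOPs per token")
# V4-Pro: 3.1% of weights active per token, ~98 GFLOPs per token
# V4-Flash: 4.6% of weights active per token, ~26 GFLOPs per token
```

That roughly 3 percent active share is the economics of the model in one number: per-token compute scales with the 49 billion active parameters, while the memory bill scales with the full 1.6 trillion.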
The training run consumed 33 trillion tokens. DeepSeek's own technical report describes a mixed FP8 and FP4 precision scheme and claims memory consumption 9.5 to 13.7 times lower than V3.2, the previous generation.
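The precision scheme is where most of that memory headroom comes from. Here is a minimal sketch of the raw weight-storage floor at each precision, using standard bytes-per-parameter figures; real serving memory adds KV cache, activations, and runtime overhead on top, so treat these as lower bounds, not footprints.

```python
# Weight-storage floors for a 1.6T-parameter model at common precisions.
# Bytes-per-parameter values are standard; everything else a server needs
# (KV cache, activations, framework overhead) sits on top of these numbers.

BYTES_PER_PARAM = {"BF16/FP16": 2.0, "FP8": 1.0, "FP4": 0.5}
TOTAL_PARAMS = 1.6e12

for precision, bytes_pp in BYTES_PER_PARAM.items():
    terabytes = TOTAL_PARAMS * bytes_pp / 1e12
    print(f"{precision}: ~{terabytes:.1f} TB just for the weights")
# BF16/FP16: ~3.2 TB
# FP8: ~1.6 TB
# FP4: ~0.8 TB
```

Note that DeepSeek's 9.5 to 13.7 times figure is a claim about total serving memory against V3.2, not just weight storage, so it cannot be checked from this arithmetic alone; that is exactly the kind of number independent reproductions will need to confirm.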
On SWE-bench Verified, V4-Pro scored 80.6 percent against Claude Opus 4.6's 80.8 percent. V4-Flash scored 79.0 percent. On LiveCodeBench, V4-Pro hit 93.5 percent, ahead of Claude at 88.8 and Gemini 3.1 Pro at 91.7. On Codeforces, V4-Pro pulled a 3206 rating, ahead of GPT-5.4 at 3168 and Gemini at 3052. On Terminal-Bench 2.0, V4-Pro scored 67.9 percent against Claude's 65.4.
Across the four headline coding evals, V4-Pro leads on three and ties on the fourth. For an open weights model, that is unprecedented.
On Humanity's Last Exam, the hardest published reasoning benchmark, V4-Pro scored 37.7 percent against Claude's 40, GPT-5.4 at 39.8, and Gemini 3.1 Pro at 44.4. On SimpleQA Verified, the world knowledge eval, V4-Pro hit 57.9 against Gemini's 75.6. On HMMT 2026, the math competition benchmark, V4-Pro scored 95.2 against Claude at 96.2 and GPT-5.4 at 97.7.
The closed labs still hold the lead in pure reasoning, world knowledge, and the hardest math. Those gaps are real, and on world knowledge the gap is wide, but none of them sit in the capabilities that drive revenue at Cursor, Replit, or GitHub Copilot.
V4-Pro is priced at $1.74 per million input tokens on cache miss, $0.145 per million on cache hit, and $3.48 per million output tokens. V4-Flash is $0.028 per million input on cache hit and $0.28 per million output. Claude Opus 4.6 is priced at $5 per million input and $25 per million output. GPT-5.5 is $5 input and $30 output.
For coding workloads at scale, V4-Pro's marginal cost lands at roughly one seventh of Claude Opus 4.6 and one ninth of GPT-5.5. Self hosting on rented capacity drives the marginal cost lower still.
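A worked example makes the ratio concrete. The prices below are the rate-card numbers quoted above; the workload shape, 500 million input and 100 million output tokens a month with a 60 percent prompt-cache hit rate, is invented for illustration, and Claude is modeled without caching to keep the sketch simple.

```python
# Illustrative monthly bill for a hypothetical coding-agent workload.
# Rate-card prices come from the article; the token volumes and the
# 60% cache-hit rate are made-up inputs, not measurements.

def monthly_cost(input_mtok: float, output_mtok: float,
                 in_price: float, out_price: float,
                 cache_hit_rate: float = 0.0,
                 cache_hit_price: float | None = None) -> float:
    """Dollars per month for the given volumes in millions of tokens."""
    if cache_hit_price is None:
        cache_hit_price = in_price
    input_cost = input_mtok * ((1 - cache_hit_rate) * in_price
                               + cache_hit_rate * cache_hit_price)
    return input_cost + output_mtok * out_price

IN_M, OUT_M = 500, 100  # hypothetical monthly volume, millions of tokens

v4_pro = monthly_cost(IN_M, OUT_M, 1.74, 3.48,
                      cache_hit_rate=0.60, cache_hit_price=0.145)
claude = monthly_cost(IN_M, OUT_M, 5.00, 25.00)  # caching not modeled

print(f"V4-Pro: ${v4_pro:,.0f}/month, Claude Opus 4.6: ${claude:,.0f}/month, "
      f"ratio {claude / v4_pro:.1f}x")
# V4-Pro: $740/month, Claude Opus 4.6: $5,000/month, ratio 6.8x
```

On this invented mix the ratio lands just under seven to one, consistent with the rough one-seventh figure; shifting the mix toward output tokens or higher cache hit rates pushes it wider, since those are where the price gaps are largest.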
Within hours of the release, SMIC shares jumped 10 percent on the Hong Kong open. MiniMax and Knowledge Atlas, two Chinese closed model competitors, fell more than 9 percent. Huawei announced full Ascend processor support for the model the same day. Western coding agent companies have not yet posted formal responses, but observers expect repricing within weeks.
Michael Kratsios, the White House science advisor, described Chinese AI development as industrial scale copying of American work, saying there is nothing innovative about systematically extracting and copying. China's foreign ministry called the allegations groundless and a smear against the achievements of China's AI industry.
The release also signals a closer integration between DeepSeek and Huawei's Ascend silicon roadmap. DeepSeek validated V4-Pro for inference on Ascend NPUs alongside Nvidia GPUs, sidestepping the chokepoint US export controls were designed to create.
The two labs that built their enterprise pitch on coding superiority face the most direct exposure. Anthropic in particular has positioned Claude Opus and Claude Code around the premise of a unique coding edge. That premise is now defensible only on closed source advantages: red team maturity, production reliability, vendor support, and integration depth. Capability parity on benchmarks is no longer the moat.
Expect price compression at the top of the closed API stack within the quarter, plus accelerated investment in coding agent products that monetize beyond raw token cost.
Three signals will determine how durable this moment is. First, independent reproductions of the SWE-bench Verified score over the next two weeks. Second, whether US export control policy expands to cover open weights distribution itself, an escalation that would touch Hugging Face directly. Third, whether the next Anthropic or OpenAI model release reclaims a clear lead on the coding benchmarks or whether the gap holds.
=== YOUTUBE_DESC === A Chinese open source AI just tied Claude Opus 4.6 on the toughest coding benchmark in the industry. And it costs one seventh as much. DeepSeek V4-Pro launched April 24, 2026 under MIT License. 1.6 trillion total parameters, 49 billion active, 1 million token context. SWE-bench Verified: 80.6 percent against Claude's 80.8. LiveCodeBench: 93.5 against Claude's 88.8. Codeforces: 3206 against GPT-5.4 at 3168. Terminal-Bench 2.0: 67.9 against Claude's 65.4. Three coding evals where V4-Pro leads, one where it ties. The coding frontier just became a commodity.

In this episode of Sterling Intelligence we walk through what shipped, what the benchmarks actually mean, the pricing math against Claude Opus 4.6 and GPT-5.5, the Huawei Ascend hardware angle, and the three structural shifts this triggers across coding agents, US export controls, and the safety story for frontier AI.

Subscribe for AI news without the hype. New episodes every week. No filler.

⏱ Chapters
00:00 Cold open: an open model just tied Claude on coding
00:38 V4-Pro and V4-Flash specs, MIT License, 1M context
01:32 Why SWE-bench Verified is the benchmark that matters
02:24 Pricing: V4-Pro vs Claude Opus 4.6 vs GPT-5.5
03:18 Coding sweep: LiveCodeBench, Codeforces, Terminal-Bench 2.0
04:25 Where the closed labs still win: HLE, SimpleQA, HMMT
05:10 The training: 33T tokens, FP8/FP4, Huawei Ascend
06:05 Market reaction: SMIC up 10%, MiniMax down 9%
06:48 Kratsios vs Beijing: industrial scale copying or smear
07:25 Implication 1: coding agent margins collapse
08:05 Implication 2: export controls just got harder
08:45 Implication 3: the safety story for frontier AI
09:20 Caveats and the median coding task verdict
09:48 Sign off

🔗 Sources in the article companion at sterling.ai/faces/deepseek-v4-pro

#DeepSeek #V4Pro #OpenSource #ClaudeOpus #SWEBench #AICoding #MITLicense #Huawei #ExportControls #FrontierAI #ChinaAI #CodingAgents #Cursor #Anthropic #OpenAI === TITLES_HTML ===Expression. Quietly alarmed, slight forward lean, mouth closed. Like she just saw the numbers and is letting you in on the implication.
Head position. Squared to camera, chin level, slight tilt left of frame to leave headline real estate on the right.
Wardrobe. Standard dark blazer, charcoal collared shirt, no jewelry that catches light.
Eye direction. Direct to camera. No alternate take.
Lighting. Key light upper-left at ~4800K. Subtle teal-green rim light from behind-right at 30% to echo the DeepSeek brand without naming it.
Scene setup. Near-black charcoal studio. Faint code-glyph pattern on the right shoulder side at 10% opacity. Single accent: a small SWE-bench-style chart silhouette in the upper-right at 15% opacity.
Title option 1.
Position. Right two thirds of frame, two stacked lines.
Font. Heavy condensed sans, all caps, "80.6%" in oversized weight.
Color scheme. "80.6%" in #ffffff with a 6px charcoal stroke. "OPEN WEIGHTS" in DeepSeek teal #0c8d72. Subtle gold underline beneath.
Accent detail. Tiny "SWE-Bench" tag in 28pt under the percentage in #c8c8c8.
Title option 2.
Position. Centered top third over Jane's shoulder.
Font. Heavy display serif, all caps, single line.
Color scheme. "CLAUDE" in Anthropic-orange #d97757, "TIED" in white with a thin red strikethrough.
Accent detail. Small "by an open model" subline in 22pt italic, #c8c8c8.
Title option 3.
Position. Right two thirds, large numerals stacked over a divider.
Font. Mono numerals for credibility, sans subline.
Color scheme. "$3.48" in DeepSeek teal #0c8d72, "$25" in muted Anthropic-orange #d97757 with diagonal strike.
Accent detail. Small subline "per million output tokens" in 22pt #c8c8c8.