=== HEADLINE_SHORT ===
Claude Just Got Tied. By Open Source.

=== HOOK_SCRIPT ===
A Chinese AI lab just tied Claude Opus on the hardest coding benchmark in the industry.

=== SHORT_SCRIPT ===
A Chinese AI lab just tied Claude Opus on the hardest coding benchmark in the industry. Then they released the weights for FREE. And priced the API at one seventh of what Anthropic charges.

Here is the number. Anthropic's Claude Opus 4.6 sits at 80.8 percent on SWE-bench Verified. That is the test that asks a model to actually fix real GitHub issues end to end. It is the benchmark every coding agent company has been chasing for a year. DeepSeek V4-Pro just hit 80.6 percent. Two tenths of one percentage point. And you can download the weights from Hugging Face right now, MIT licensed, run it on your own hardware, fine-tune it however you want.

V4-Pro also leads on LiveCodeBench, on Codeforces, on Terminal-Bench 2.0. Three out of four headline coding evals. The closed labs still win on world knowledge and the hardest reasoning. But coding is the use case paying the bills right now, and on coding the gap is GONE.

This is the floor. Open weights at frontier. From here the only direction is down on price and up on capability. Stay sharp.

=== ANNOTATED_HTML_SHORT ===
[TIGHT CLOSE-UP — Jane]
A Chinese AI lab just tied Claude Opus on the hardest coding benchmark in the industry.
[ON-SCREEN TEXT: "80.6% SWE-Bench"]
Then they released the weights for FREE. And priced the API at one seventh of what Anthropic charges.
[ZOOM IN]
Here is the number. Anthropic's Claude Opus 4.6 sits at 80.8 percent on SWE-bench Verified. That is the test that asks a model to actually fix real GitHub issues end to end. It is the benchmark every coding agent company has been chasing for a year. DeepSeek V4-Pro just hit 80.6 percent. Two tenths of one percentage point. And you can download the weights from Hugging Face right now, MIT licensed, run it on your own hardware, fine-tune it however you want.
[ON-SCREEN TEXT: "$1.74 / $3.48 per M tokens"]
V4-Pro also leads on LiveCodeBench, on Codeforces, on Terminal-Bench 2.0. Three out of four headline coding evals. The closed labs still win on world knowledge and the hardest reasoning. But coding is the use case paying the bills right now, and on coding the gap is GONE.
[CUT]
This is the floor. Open weights at frontier. From here the only direction is down on price and up on capability. Stay sharp.
=== THUMBNAIL_HTML_SHORT ===
Framing. Tight head-and-shoulders, eyes in the upper third, mouth at center. Face fills ~60% of frame width.
Expression. Quietly alarmed, slight forward lean, mouth closed. Like she just saw the numbers and is letting you in on the implication.
Eye direction. Direct to camera, locked. No alternate take.
Background. Near-black charcoal, simple. Faint teal-green gradient (#0c8d72 at 12% opacity) bleeds in from the upper-right corner. No environmental detail — the vertical viewport is too narrow for it.
Lighting. Key light from upper-left at ~4800K. Soft fill on the right at 25%. Subtle teal rim light from behind-right at 30% to echo the DeepSeek brand without naming it.
Headline text.
Position. Top third, single line, oversized.
Font. Heavy condensed sans, all caps, max display weight.
Color. "TIED" in #ffffff with a 6px charcoal stroke. "CLAUDE" in DeepSeek teal #0c8d72. Subtle gold underline beneath the word "CLAUDE" only.
Accent detail. Tiny "80.6% SWE-Bench" tag in 28pt under the headline in #c8c8c8.
Price overlay.
Position. Top third, mono numerals stacked over a thin divider.
Font. Mono numerals for credibility, sans subline.
Color. "$3.48" in DeepSeek teal #0c8d72, "$25" in muted Anthropic-orange #d97757 with a diagonal red strike.
Accent detail. Small subline "per M output tokens" in 22pt #c8c8c8.