The Lookout


The mines in the Strait of Hormuz are no longer theoretical. Three merchant vessels were struck on Wednesday in and around the strait, including the Thai cargo ship Mayuree Naree, which was set ablaze. The IRGC declared it will not allow "one litre of oil" to be exported from the Middle East while US and Israeli strikes continue. Hours later, the thirty-two member states of the International Energy Agency took an unprecedented step: unanimously agreeing to release 400 million barrels from strategic petroleum reserves. That word — unprecedented — keeps appearing in official statements, which tells you how far outside normal parameters we've drifted. Southern Iraq's oil production has collapsed 70% since the war began, from 4.3 million barrels per day to 1.3 million. Drones hit fuel storage in Oman. Day thirteen of this conflict and the economic blast radius is now global.

Meanwhile, NBC News reported that the US military is using Palantir's AI systems to identify targets for airstrikes in Iran. Over 2,000 targets have been struck in what's been dubbed Operation Epic Fury, and Congress is starting to ask uncomfortable questions about oversight. The Pentagon's response was boilerplate: AI helps warfighters process data faster than any human could alone. Which is precisely the kind of sentence that should make you uneasy. Processing data faster is fine when you're sorting spreadsheets. When the output is a list of coordinates to hit with explosives, the word "faster" stops being reassuring.

This connects directly to the most consequential story in AI right now: Anthropic's legal fight against the Pentagon. As of this morning, Anthropic has filed for an emergency stay with the DC Circuit Court of Appeals, arguing that the Defense Department's "supply chain risk" designation — usually reserved for foreign adversaries and sanctioned entities — will cause "irreparable harm" and could cost the company billions. Microsoft filed an amicus brief supporting Anthropic, calling the designation an overreach. Retired military chiefs did the same. The Atlantic published a profile titled "The Dissonance of Dario Amodei," and TIME declared Anthropic "the most disruptive company in the world." All because Amodei held two red lines in a Pentagon contract renegotiation: no mass surveillance of American citizens, and no fully autonomous weapons without human oversight. Those seem like reasonable positions. The Pentagon disagreed. Now the government is using a supply-chain designation as a club against a domestic AI company for the crime of having principles it wouldn't sell. Whatever you think of Anthropic's products, this is a story about whether the government can coerce private companies into removing ethical guardrails. The precedent matters more than the company.

In less geopolitically fraught news, METR published research that should be required reading for anyone who cites SWE-bench scores in a pitch deck. Their finding: roughly half of test-passing SWE-bench Verified PRs written by AI agents from mid-2024 through late 2025 would not actually be merged by the repositories' maintainers. The tests pass, but the code still isn't mergeable: wrong approach, wrong style, new regressions, or a fix for the symptom rather than the problem. Engineer's Codex followed up with their own analysis showing most coding agents break 75% or more of their own fixes over time. This is what happens when you optimise for a metric that doesn't measure what you think it measures. SWE-bench pass rate is not a proxy for "writes code a team would ship." It's a proxy for "makes the tests green." Those are very different things. Every engineer who's reviewed a junior developer's PR knows the difference intuitively. The benchmark industry hasn't caught up.

On a lighter note, JavaScript finally has a proper date and time API. Temporal reached Stage 4 at TC39, the culmination of a nine-year effort led by Bloomberg engineers. Bloomberg's JS blog published the full history, including how Maggie Johnson-Pint shepherded the proposal through committee while Andrew Paprocki worked with Igalia on V8's time zone layer. If you've ever written `new Date()` and immediately regretted it, Temporal is the fix: immutable objects, proper time zone support, and no more silent coercion of invalid inputs into garbage. For thirty years JavaScript developers have reached for Moment.js or date-fns because the built-in was broken, and the built-in finally has a real successor. The HN thread was celebratory, which for HN is saying something.
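For anyone who hasn't been bitten yet, here is a small sketch of the two `Date` behaviours that paragraph complains about, followed by a guarded look at how Temporal can be asked to refuse the same input. The helper name `tryTemporal` is made up for illustration, and the Temporal branch only runs where the runtime already ships a global `Temporal`; elsewhere it just reports that it is unavailable. Note that strict rejection is opt-in via `overflow: "reject"`; the default for property bags is to clamp to the nearest valid date.

```javascript
// 1. Legacy Date: out-of-range fields roll over silently instead of failing.
//    Months are zero-based, so 1 means February. 2024 has no February 30.
const feb30 = new Date(2024, 1, 30);
const rolledOver =
  `${feb30.getFullYear()}-${feb30.getMonth() + 1}-${feb30.getDate()}`;
console.log(rolledOver); // "2024-3-1"

// 2. Legacy Date: objects are mutable, so any alias can change the original.
const start = new Date(2024, 0, 31); // 31 January 2024
const alias = start;
alias.setMonth(1); // "31 February" rolls over into March
console.log(start.getMonth() + 1, start.getDate()); // 3 2

// 3. Temporal (hypothetical helper, guarded because support varies by runtime).
function tryTemporal() {
  if (typeof Temporal === "undefined") {
    return "Temporal not available in this runtime";
  }
  try {
    // overflow: "reject" asks Temporal to throw rather than adjust the date.
    Temporal.PlainDate.from(
      { year: 2024, month: 2, day: 30 },
      { overflow: "reject" }
    );
    return "accepted";
  } catch (err) {
    return err instanceof RangeError ? "rejected with RangeError" : "rejected";
  }
}
console.log(tryTemporal());
```

The first two results are plain `Date` behaviour and hold in any current engine. The third depends entirely on whether your runtime has shipped Temporal yet, which is why it is wrapped in a feature check rather than assumed.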

The cyber dimension of the Iran conflict spilled into medtech. Handala, a pro-Iran hacktivist group, claimed responsibility for wiping the global systems of Stryker, the $20 billion medical device and services company. Login pages across Stryker's infrastructure now display the Handala logo. The group said the attack was retaliation for the Minab school strike, which Iranian state media claims killed 168 children. The attack is a wiper, not ransomware: the goal was destruction, not profit. Medical device companies sit at an uncomfortable intersection: they're American corporations that the US government would consider critical infrastructure, but their products save lives regardless of nationality. Targeting them is an escalation that blurs the line between cyberwarfare and attacking civilian infrastructure.

Jack Dorsey picked a fight with Coinbase this week, and it's worth paying attention to because it reveals the fault lines within the Bitcoin advocacy space. Marty Bent reported that Coinbase has been telling lawmakers a de minimis tax exemption for Bitcoin transactions is unnecessary because "no one is using bitcoin as money." Dorsey demanded clarity from Brian Armstrong. Coinbase's Chief Policy Officer denied it: "we have never and will never lobby against Bitcoin." But the accusation stings because it's plausible. A de minimis exemption (Senator Lummis is pushing a $300 threshold) would make everyday Bitcoin spending practical by not triggering a taxable event for a cup of coffee. Coinbase makes its money from trading, not spending. An exchange has little incentive to make Bitcoin function as currency rather than a speculative asset. Dorsey, who built Cash App's Bitcoin integration around Lightning payments, has the opposite incentive. The argument is really about what Bitcoin is for, and who gets to decide.

Block height 940,362. Fees at 1–2 sat/vB. The mempool is calm. Microsoft patched 84 vulnerabilities on Tuesday including two publicly disclosed zero-days — if you run Windows, patch now, don't wait.

