For the past several weeks, Anthropic’s Mythos has been regarded as the benchmark for AI‑driven cybersecurity. That advantage may already be eroding. A new Wall Street Journal report cites security researchers who say Chinese AI firm Z.ai’s GLM‑5.2 now matches Mythos in uncovering software security bugs, even though it still falls behind Anthropic and OpenAI on broader reasoning tasks.
GLM‑5.2 is narrowing the gap in a critical domain
According to the article, researchers observed that GLM‑5.2 performs on par with Mythos when it comes to spotting software vulnerabilities – a capability that is becoming ever more vital as companies scramble to patch flaws before attackers can exploit them. The model is also open‑source, allowing anyone to download, modify, and run it on local hardware without depending on a cloud service. This flexibility is appealing to enterprises, but it also raises the specter of cybercriminals adapting the tool for offensive use.
The report stresses that this does not imply China has overtaken the United States in AI overall. GLM‑5.2 still trails Anthropic and OpenAI on many general‑purpose benchmarks. However, in cybersecurity, where modest gains can have outsized real‑world impact, the performance disparity has shrunk dramatically. Benchmark data quoted by the Journal shows GLM‑5.2 even outperformed Claude Opus 4.8 in certain security tests, and researchers note that additional prompting can push it to Mythos‑level bug‑finding performance.
The larger narrative isn’t about a single winner; it’s about how quickly the gap is closing
This development comes at an awkward moment for the U.S. AI sector. While firms such as Anthropic and OpenAI have recently restricted access to their most advanced frontier models over national‑security concerns, Chinese labs have been moving in the opposite direction, releasing increasingly capable open‑weight models that anyone can download and run.
The debate has already played out in public. A few days ago, Elon Musk predicted Chinese AI labs could catch up to Anthropic’s flagship Fable 5 by the first quarter of 2027, at least on benchmark performance. Tang Jie, founder of Zhipu AI (the company now branded as Z.ai), quickly responded, “won’t take that long.” Musk later clarified that while China might match Anthropic on benchmarks, achieving the same level of “true usefulness” would be a far tougher hurdle, crediting Anthropic’s emphasis on practical intelligence.
Now, the Wall Street Journal’s latest story gives Tang’s optimism more credibility. Rather than focusing on coding benchmarks, it suggests GLM‑5.2 is already on par with Anthropic’s Mythos at identifying security vulnerabilities – arguably one of today’s most valuable real‑world AI applications. This doesn’t instantly crown China as the leader in frontier AI, but it underscores a growing reality: the United States no longer holds a comfortable lead in the AI race.

On benchmarks, yes, but as measured by true usefulness even Q1 would be very impressive. Anthropic has rightly focused on maximizing useful intelligence, which does not show up in benchmarks, but definitely shows up in revenue.
— Elon Musk (@elonmusk) June 18, 2026
