OpenAI Releases GPT-5.2-Codex to Bolster AI Coding Against Google Challenge

The new release represents OpenAI's most advanced agentic coding model yet, optimized for professional software engineering and defensive cybersecurity applications.

OpenAI launched GPT-5.2-Codex on Thursday, December 18, barely a week after unveiling its GPT-5.2 model series, as the company races to defend its position against Google's Gemini models. The release combines enhanced long-horizon task handling with significantly stronger cybersecurity capabilities, which bring both opportunities and dual-use risks.


The model is available immediately across all Codex surfaces for paid ChatGPT users, with API access planned in the coming weeks. OpenAI said GPT-5.2-Codex excels at large-scale code changes such as refactors and migrations, performs better in Windows environments, and introduces improved context compaction for sustained coding sessions.

Cybersecurity capabilities showed marked advancement. While GPT-5.2-Codex did not reach "High" capability under OpenAI's Preparedness Framework, the company said it expects models to cross that threshold soon. CEO Sam Altman announced plans for an invite-only trusted access pilot for vetted security professionals and organizations focused on defensive cybersecurity work.

OpenAI said deployment balances accessibility with safety as capability growth accelerates. The company has implemented specialized safety training for harmful tasks, agent sandboxing, and configurable network access at the product level.

State-of-the-Art Coding Performance Extends Earlier Gains

GPT-5.2-Codex builds on the strengths of GPT-5.2, unveiled December 11, when OpenAI claimed the model achieved "state-of-the-art agentic coding performance" and positioned GPT-5.2 Thinking as its best vision model. That earlier release scored 55.6% on SWE-Bench Pro, and OpenAI described it as its first model to reach or exceed human expert performance on professional tasks.

The new Codex variant pushes those metrics higher. GPT-5.2-Codex achieved 56.4% accuracy on SWE-Bench Pro, exceeding GPT-5.2's 55.6% and GPT-5.1's 50.8%. On Terminal-Bench 2.0, which tests AI agents in realistic terminal environments across tasks including code compilation and server setup, GPT-5.2-Codex scored 64.0% compared with GPT-5.2's 62.2% and GPT-5.1's 58.1%.
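To put those figures side by side, the short Python sketch below computes the percentage-point gap between GPT-5.2-Codex and each earlier model. The scores are copied from the paragraph above as reported by OpenAI; the `scores` table and the `point_gains` helper are illustrative names introduced here, not part of any OpenAI tool.

```python
# Benchmark scores (percent) as quoted in the article.
# These are OpenAI's reported figures, not independent measurements.
scores = {
    "SWE-Bench Pro": {"GPT-5.1": 50.8, "GPT-5.2": 55.6, "GPT-5.2-Codex": 56.4},
    "Terminal-Bench 2.0": {"GPT-5.1": 58.1, "GPT-5.2": 62.2, "GPT-5.2-Codex": 64.0},
}


def point_gains(results, newest="GPT-5.2-Codex"):
    """Return the percentage-point gain of `newest` over every other model."""
    top = results[newest]
    # Round to one decimal place to match the precision of the quoted scores.
    return {model: round(top - score, 1)
            for model, score in results.items() if model != newest}


for benchmark, results in scores.items():
    print(benchmark, point_gains(results))
```

On these numbers, the step from GPT-5.2 to GPT-5.2-Codex is under two percentage points on both benchmarks, while the step from GPT-5.1 is between five and six.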

The model demonstrates improved long-context understanding, reliable tool calling, and native compaction, making it more effective at complex tasks like large refactors and code migrations over extended sessions. Stronger vision performance enables more accurate interpretation of screenshots, technical diagrams, and UI surfaces during coding sessions.

Developer platforms including Windsurf, Cognition, Warp, and JetBrains reported state-of-the-art agentic coding performance following GPT-5.2's December 11 release. Jeff Wang, CEO of Windsurf, said GPT-5.2 "represents the biggest leap for GPT models in agentic coding" and enabled collapsing fragile multi-agent systems into single mega-agents with over 20 tools.

Cybersecurity Capabilities Jump as React Vulnerability Surfaces

GPT-5.2-Codex delivers OpenAI's strongest cybersecurity capabilities to date, with the company tracking sharp jumps in performance starting with GPT-5-Codex and accelerating through subsequent releases. On OpenAI's Professional Capture-the-Flag evaluation, which measures advanced multi-step challenges requiring professional-level cybersecurity skills, performance rose substantially with each iteration.

Real-world impact emerged before the new model's release. On December 11, the React team published three previously unknown vulnerabilities that security researcher Andrew MacPherson had discovered using GPT-5.1-Codex-Max with Codex CLI. MacPherson was using the model to reproduce an earlier vulnerability known as React2Shell when it surfaced unexpected behaviors that led to the new discoveries.

The researcher guided Codex through standard defensive workflows, including setting up test environments, reasoning through attack surfaces, and fuzzing with malformed inputs. OpenAI said the incident demonstrates how advanced AI systems can accelerate defensive security work, while also highlighting the risk that the same capabilities that help defenders can be misused by malicious actors.

Altman posted on X: "Last week, a security researcher using our previous model found and disclosed a vulnerability in React that could lead to source code exposure. I believe these models will be a net win for cybersecurity, but we are in the 'real impact phase' as they improve."

Trusted Access Program Targets Defensive Security

OpenAI is developing an invite-only trusted access pilot to enable qualifying security professionals and organizations to use frontier AI cyber capabilities for defensive work. The program aims to remove restrictions that security teams encounter when emulating threat actors, analyzing malware for remediation, or stress testing critical infrastructure.

Initial participants will include vetted security professionals with track records of responsible vulnerability disclosure and organizations with clear professional cybersecurity use cases. Qualifying participants will receive access to OpenAI's most capable models for defensive applications to enable legitimate dual-use work.

Altman posted on X: "We are beginning to explore trusted-access programs for defensive cybersecurity work." He also said: "Codex is getting extremely good and will rapidly improve. If you want to help make it 100x better next year, the team is hiring. Crazy adventure guaranteed; success probable."

The company said its deployment approach accounts for future capability growth: it expects upcoming models to continue on the current trajectory toward "High" cybersecurity capability as measured by its Preparedness Framework. GPT-5.2-Codex includes additional model-level and product-level safeguards, detailed in an updated system card.

OpenAI said a gradual rollout, paired with safeguards and close collaboration with the security community, aims to maximize defensive impact while reducing misuse risk. The company plans to use what it learns from this release to inform expanded access over time as software and cyber capabilities advance.
