OpenAI Launches Codex Security Tool in Research Preview

OpenAI has introduced Codex Security, a tool that scans a project's architecture and builds a custom threat model. Using this model, the agent targets potential security weaknesses in applications.

OpenAI Develops Bidirectional Audio Model

OpenAI is developing a bidirectional audio model that continuously processes sound in the background and can instantly recognize user interjections, adapting its responses on the fly. This technology enables natural...

Microsoft Releases Multimodal Phi-4 Reasoning-Vision Model

Microsoft has launched a multimodal version of its Phi-4 model, called Phi-4-reasoning-vision-15B, built on the SigLIP-2 encoder and Phi-4’s logical architecture. The model features a mixed inference mechanism that adapts...

Eon Systems Demonstrates Full Brain Emulation Controlling a Simulated Body

Eon Systems has unveiled what could be the first complete brain emulation system controlling a body. They created a full digital model of a fruit fly brain, consisting of about...

China Builds Accelerator-Driven Nuclear Reactor to Burn Waste and Generate Power

China is developing the world’s first megawatt-scale accelerator-driven system (ADS), a reactor in Guangdong province designed to burn nuclear waste while producing energy. The reactor uses protons accelerated to about 80% of...

Berkeley Study Finds AI Increases Employee Workload Instead of Reducing It

Researchers from Berkeley conducted an eight-month study inside a tech company, observing how employees actually use AI at work. Contrary to expectations that AI would save time and reduce workload,...

Alibaba Tongyi Lab Open Sources GUI-Owl-1.5 and Mobile-Agent-v3.5

Alibaba Tongyi Lab has open-sourced its GUI-Owl-1.5 and Mobile-Agent-v3.5 model families, designed to autonomously interact with desktop, mobile, and browser interfaces. Built on the Qwen3-VL foundation, these models come in...

Google Research Teaches LLMs to Reason Like Bayesians

Google Research has developed a method to train large language models (LLMs) to reason more rationally by imitating Bayesian models. Instead of only generating text, these models learn to update...

Cortical Labs Human Neurons Play Doom Faster Than GPT-4

Australian company Cortical Labs has successfully connected lab-grown human neurons to a biocomputer and taught them to play the classic game Doom. These neurons, derived from adult donors’ skin and...

YouTube Accelerates LLM Recommendation Validation by 948x with New STATIC Framework

YouTube and Google DeepMind have released a new framework called STATIC that speeds up recommendation validation in large language models (LLMs) by a factor of 948. The breakthrough solves a common problem where...

OpenAI Releases GPT-5.3 Instant with Improved Accuracy and Communication

OpenAI has launched GPT-5.3 Instant, a major update to its most widely used model, focusing on enhanced communication quality. The model now declines safe requests less often and avoids overly...

Google Releases Gemini 3.1 Flash-Lite: Ultra-Fast and Cost-Efficient AI Model

Google has introduced Gemini 3.1 Flash-Lite, the fastest and most affordable model in the Gemini 3 series. Priced at just $0.25 per million input tokens and $1.50 per million...

Microsoft Research and Salesforce Reveal Dialogue Reduces LLM Reliability

Microsoft Research and Salesforce have highlighted a rarely discussed issue: dialogue significantly lowers the reliability of large language models (LLMs). Testing 15 top models, including GPT-4.1, Gemini 2.5 Pro, and...

Chinese Robotaxi Firms Suspend Dubai Services Amid Regional Tensions

Chinese autonomous driving firms Baidu’s Apollo Go and WeRide have halted robotaxi operations in Dubai following Iran’s missile strikes that heightened regional tensions. While WeRide continues services in Abu Dhabi...

Sakana AI Introduces Text-to-LoRA and Doc-to-LoRA for Faster LLM Customization

Sakana AI has unveiled two new research advancements, Text-to-LoRA and Doc-to-LoRA, which significantly simplify and speed up the customization of large language models (LLMs). These methods allow models to instantly...

OpenAI to Launch Smart Speaker with Camera in 2027

OpenAI plans to release a smart speaker with a built-in camera and facial recognition capabilities in February 2027. The device, priced between $200 and $300, will analyze the surroundings and...

OpenAI Freezes Stargate Project Amid Challenges

OpenAI has halted its ambitious Stargate project, initially planned in partnership with SoftBank and Oracle. The suspension is due to internal corporate disagreements, a shortage of engineering talent, and investor...

Microsoft Sovereign Cloud Adds Governance, Productivity, and Support for Large AI Models That Run Securely While Fully Disconnected

Microsoft has expanded its Sovereign Cloud offerings to enhance governance, productivity, and support for large AI models that can run securely even when fully disconnected from the cloud. The updates...

Claude Opus 4.6 Linked to $1.8M Moonwell Exploit

A recent exploit in the DeFi lending protocol Moonwell led to a loss of $1.78 million due to a smart contract vulnerability. The issue arose from an incorrect price setting...

Anthropic Proposes Persona Selection Model to Explain AI Assistant Behavior

Anthropic’s alignment team has introduced the Persona Selection Model (PSM) to explain why AI assistants behave like distinct personalities rather than mere algorithms. The model suggests that during pretraining, language...

AI Plugins Become New Vector for Attacks

A recent surge in attacks via AI plugins has raised significant security concerns. Over 1,100 malicious skills were found on the OpenClaw marketplace, with one attacker uploading 677 packages disguised...

Anthropic Launches Claude Code Security

Anthropic has launched Claude Code Security, a new tool that scans codebases and suggests patches for detected security issues. Currently available in limited preview for Enterprise and Team clients, repository...

SkillsBench Research Shows Real Impact of Skills on LLM Agents

SkillsBench, a new benchmark and research project, tested the impact of Skills on large language model (LLM) agents across 84 tasks in 11 domains with 7 model configurations including Claude,...

VulnLLM-R-7B: New Reasoning Model for Code Security

A new reasoning language model, VulnLLM-R-7B, has been released for code security, designed to detect vulnerabilities like a pentester. Unlike traditional models that search for suspicious patterns, VulnLLM-R-7B analyzes data...

Microsoft Says Bug Causes Copilot to Summarize Confidential Emails

Microsoft has acknowledged a bug in its Microsoft 365 Copilot AI assistant that caused it to summarize confidential emails since late January, bypassing data loss prevention (DLP) policies. The issue...

Google Releases Gemini 3.1 Pro with Advanced Abstract Reasoning

Google has officially launched Gemini 3.1 Pro, showcasing a significant leap in AI intelligence with a 77.1% score on the challenging ARC-AGI-2 abstract reasoning test—nearly double the previous version’s result....

Anthropic Measures AI Agent Autonomy in Real-World Use

Anthropic has released an analysis of millions of interactions with its AI agent, Claude Code, revealing how agent autonomy evolves in practical settings. The study shows autonomous task durations nearly...

Strand-Rust-Coder-14B: Specialized AI Model for Rust Code Generation

The new AI model Strand-Rust-Coder-14B is specifically trained to generate Rust code with the expertise of an experienced developer. Unlike general coding assistants, this model focuses on idiomatic Rust, safe...

LLM Accuracy Significantly Improves by Repeating Prompt Twice

A recent study has revealed that simply repeating the same prompt twice can dramatically boost the accuracy of large language models (LLMs). In one test involving searching for an element...

Context Graphs, One Month In

A month after publishing their perspective on context graphs, Ashu Garg and Jaya Gupta have seen the concept become a major topic in AI. Context graphs serve as institutional memory,...
