#prompt injection
27 stories tagged prompt injection.

Twelve Controls That Actually Matter Once AI Ships to Production
Visibility into AI applications is a starting point, not a security posture. Here is what ongoing monitoring and defense of production AI systems looks like in practice.

AI Red Teaming Grew Up. The Job Description Is Still Being Written.
The tools broke when LLMs arrived. Now the discipline is rebuilding itself in real time — and the threat model includes teenagers with too much free time.

OpenAI's Lockdown Mode Admits the Problem It Can't Quite Fix
The new containment feature reduces AI-enabled data exfiltration — it doesn't stop it. Experts are divided on whether enterprises should even trust a vendor to police itself.

OpenAI Ships ChatGPT 'Lockdown Mode' to Blunt Prompt-Injection Data Theft
The opt-in setting strips connectors and browsing tools that attackers have used to siphon data from logged-in sessions.

One GitHub Issue Was Enough to Pwn Repos Running Claude Code Action
A bug in Anthropic's Claude Code GitHub Action turned issue triage into arbitrary code execution — including, briefly, against the action's own repo.

A Single Notification Could Hijack Gemini on Android
Researchers showed how a poisoned WhatsApp, Slack or SMS alert could weaponize Google's voice assistant — no malicious app required.

Someone Finally Tested 100 AI Agents for Security. Here's the Framework They Used.
A new evaluation methodology ranks AI agents by vulnerability, blast radius, and defensive posture. The results are a useful corrective to vendor claims.

ChatGPhish: When ChatGPT's Markdown Renderer Becomes a Phishing Vector
Permiso researchers show how implicit trust in Markdown links and images inside ChatGPT responses turns the assistant into a credible delivery surface for prompt injection and credential theft.

You Can't Audit What You Can't See: The Agent Governance Hole Nobody Wants to Talk About
Enterprises are shipping AI agents into production without inventories, without trace pipelines, and without a coherent answer to a basic question: what is this thing actually doing?

Anthropic Wires Claude Into 28 Enterprise Security Platforms
The AI company is pushing deeper into corporate security stacks, connecting Claude to vendors including CrowdStrike, Okta, and Zscaler.

Microsoft Open-Sources Rampart and Clarity to Embed AI Agent Safety Into Dev Pipelines
Two new tools shift AI red-teaming left, targeting prompt injection and privilege escalation before code ships.

Treat the Model Like a Threat: Why AI Agent Security Needs a Systems Overhaul
A paper from researchers at Google and two US universities argues that prompt-level defences and alignment tuning are structurally inadequate for securing autonomous AI agents — and that enterprises should start treating the model itself as an untrusted component.