AI Primer
release

OpenAI opens ChatGPT Lockdown Mode to all plans and limits outbound data exfiltration

OpenAI expanded Lockdown Mode from organizations to personal and self-serve Business accounts, adding an opt-in setting that limits outbound network requests. The feature is meant to block the final exfiltration step in prompt-injection attacks, though malicious instructions can still affect responses.


TL;DR

  • cryps1s' rollout post says OpenAI has expanded ChatGPT Lockdown Mode from organizations to all logged-in users on all plans, and btibor91's changelog summary says the setting is opt-in.
  • cryps1s and Simon Willison's write-up both describe the core control the same way: Lockdown Mode limits outbound network requests so a prompt injection cannot easily complete the data exfiltration step.
  • According to btibor91's release-notes thread, the broader June 4 update also doubled memory capacity for Plus and Pro in the US and started a UK ads rollout for Free and Go users.
  • Simon Willison's note adds the key caveat: Lockdown Mode does not stop prompt injections from appearing in cached web content or uploaded files; it only narrows the channels that could send data back out.

OpenAI's Lockdown Mode help page is unusually direct about the tradeoff: the feature reduces exfiltration risk by cutting ChatGPT off from parts of the web and other external services. You can also trace the rollout across the June 4 release notes, OpenAI's earlier February announcement, and Simon Willison's write-up, which frames it as an attempt to break the exfiltration leg of the prompt-injection problem.

Lockdown Mode

The June 4 rollout turned Lockdown Mode from an organization feature into a general ChatGPT setting. cryps1s' post says OpenAI had shipped it for organizations a few months earlier, and the release notes say it is now rolling out to logged-in users across Free, Go, Plus, and Pro, plus self-serve ChatGPT Business accounts.

OpenAI's February announcement framed it as an advanced setting for higher-risk users. The public version keeps that framing: cryps1s says it is not meant for everyone and comes with utility tradeoffs.

Disabled surfaces

OpenAI Help: Lockdown Mode

OpenAI first teased this in February, but now it's live and "rolling out to eligible personal accounts, including Free, Go, Plus, and Pro, and self-serve ChatGPT Business accounts":

"Lockdown Mode is designed to help prevent the final stage of data exfiltration from a prompt injection attack by limiting outbound network requests that could transfer sensitive data to an attacker. Lockdown Mode does not prevent prompt injections from appearing in the content ChatGPT processes. For example, a prompt injection could appear in cached web content or in an uploaded file, and could still affect the behavior or accuracy of a response."

This looks really good to me. The Lethal Trifecta occurs when an LLM system has access to all three of access to private data, exposure to untrusted content and a way to steal data and transmit it back to the attacker. The only way to solve the trifecta is to cut off one of the three legs, and by far the easiest leg to restrict without making your LLM systems far less useful is the exfiltration vectors to steal data.

It looks to me like lockdown mode directly attacks that leg, using mechanisms that are deterministic and, crucially, are not evaluated by AI systems that themselves can be subverted by sufficiently devious attacks.

The existence of lockdown mode does however imply that ChatGPT, in its default settings, does not provide robust protection against sufficiently determined data exfiltration attacks!

OpenAI describes Lockdown Mode as a way to limit access to the web and external services. According to reporting that quotes the help page, the restrictions include:

  • limits on outbound network requests that could transfer sensitive data, according to cryps1s
  • reduced access to the web and external services more broadly, according to btibor91's summary
  • Deep Research and Agent Mode being disabled, according to Engadget's report
  • curtailed file downloads and image fetching from the internet, while manual uploads still work, according to Engadget's report

That makes the feature less like a generic safety toggle and more like a hard reduction in ChatGPT's ability to touch outside systems.
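To make that "hard reduction" concrete, here is a minimal sketch of what a deterministic outbound-request gate could look like. It is an illustration only, not OpenAI's implementation: the `ALLOWED_HOSTS` set, the `is_request_allowed` function, and the example hostnames are all hypothetical. The point it shows is that the decision is a plain rule check on the destination, with no model in the loop that an injected instruction could talk its way past.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist. Which destinations ChatGPT actually permits in
# Lockdown Mode is not described in the reporting above.
ALLOWED_HOSTS = frozenset({"cache.example.internal"})


def is_request_allowed(url: str, lockdown: bool) -> bool:
    """Return True if an outbound request to `url` may proceed.

    In this toy model, no gate is applied outside lockdown. That is a
    simplification and says nothing about ChatGPT's default protections.
    """
    if not lockdown:
        return True
    parts = urlsplit(url)
    # `hostname` ignores any userinfo before "@", so a URL such as
    # https://trusted@attacker.example/ is judged by its real host.
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in ALLOWED_HOSTS


# A classic exfiltration pattern: sensitive data smuggled into an image URL.
leak_url = "https://attacker.example/pixel.png?d=secret"
print(is_request_allowed(leak_url, lockdown=True))
print(is_request_allowed("https://cache.example.internal/page", lockdown=True))
```

Because the check looks only at the destination, it does not matter what the model was persuaded to put in the query string; the request to an unlisted host is refused either way.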

Remaining risk

The most important sentence in the help page is the limitation Simon Willison surfaced: Lockdown Mode does not prevent malicious instructions from showing up in the content ChatGPT reads. A prompt injection can still live in cached web content or in an uploaded file, and it can still distort the model's behavior or the accuracy of its response.

Willison argues that the feature targets the last step in the attack chain, not the whole chain. His post ties that to the "Lethal Trifecta" model, where the easiest leg to cut is the outbound exfiltration channel rather than the model's exposure to untrusted content itself.
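The trifecta argument can be written down as a small predicate. The sketch below is a toy model for illustration: the `Capabilities` class and both function names are hypothetical, and the boolean flags greatly simplify what a real system exposes. It shows why removing the exfiltration leg breaks the trifecta while leaving the model's exposure to injected instructions untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Capabilities:
    private_data: bool          # the system can read sensitive data
    untrusted_content: bool     # the system ingests attacker-controllable text
    exfiltration_channel: bool  # the system can send data to an outside party


def lethal_trifecta(c: Capabilities) -> bool:
    """All three legs must be present for the data-theft chain to complete."""
    return c.private_data and c.untrusted_content and c.exfiltration_channel


def injection_can_affect_output(c: Capabilities) -> bool:
    """Injected instructions only need untrusted content to reach the model."""
    return c.untrusted_content


# Toy assumption: a default configuration with all three legs present.
default = Capabilities(private_data=True, untrusted_content=True,
                       exfiltration_channel=True)
# Model Lockdown Mode as removing only the exfiltration leg.
locked = replace(default, exfiltration_channel=False)

print(lethal_trifecta(default), lethal_trifecta(locked))
print(injection_can_affect_output(locked))
```

In this model the locked configuration no longer satisfies the trifecta, yet `injection_can_affect_output` is still true, which mirrors the help page's caveat about response behavior and accuracy.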
