AI Primer

Photo AI adds Grok 4 sidebar control for photo packs and remix actions

Levelsio added a Cursor-style sidebar to Photo AI that uses Grok 4 to take photos, run packs, and remix uploads by operating the web app’s own controls. The demo stored chat history in localStorage and used on-screen context to translate requests into UI actions.

TL;DR

  • Pieter Levels added a Cursor-style sidebar to Photo AI, and levelsio's demo shows the chat taking photos, running photo packs, and remixing uploads by driving the app's own controls.
  • According to levelsio's build notes, the chat history lives in localStorage, the assistant uses Grok 4 for responses, and JavaScript turns tool calls into UI actions across prompts, packs, and capture buttons.
  • levelsio's demo also claims the sidebar can use on-screen context, so requests like combining a blue jacket from one image with red hair from another can map to whatever is visible in the interface.
  • Dan Shipper's split-screen post and the Proof homepage point to the same product shape: a persistent agent on one side and the working app on the other.

You can watch levelsio's full demo issue commands inside Photo AI, skim the Photo AI homepage to see the app's existing photo packs and video tools, and compare that setup with Proof, the writing tool Dan Shipper linked as a codex-native example of the same left-agent, right-app layout.

Photo AI already sells itself as an "AI Photographer" that can generate photos, mocap videos, product videos, and themed packs from an AI model of you, according to the official homepage. What changed here is the control surface: instead of clicking through those flows manually, levelsio's post shows a chat sidebar executing them inside the live product.

The demo names the actions pretty plainly:

  • take a photo
  • run a photo pack
  • remix uploaded content
  • control anything already exposed in the interface

That makes the sidebar feel less like customer support chat and more like an operator layer sitting on top of the app.

Screen awareness

The most interesting detail in levelsio's implementation thread is not Grok 4 by itself but the wiring. Levels says the conversation history is stored in localStorage, Grok 4 returns responses plus tool choices, and front-end JavaScript then manipulates page elements such as prompt textareas, the photo-pack section, and the take-photo button.
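
The wiring described above can be sketched roughly as follows. This is a minimal illustration, not Photo AI's code: the tool names, the `UiControls` interface, the `HISTORY_KEY` value, and the `MemoryStore` stand-in are all hypothetical, and the thread does not say what the real tool schema looks like.

```typescript
// Hypothetical sketch: none of these names come from Photo AI.

// The subset of the Web Storage API used here; window.localStorage fits this shape.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Assumed tool calls the model might return alongside its text reply.
type ToolCall =
  | { name: "set_prompt"; args: { text: string } }
  | { name: "run_pack"; args: { pack: string } }
  | { name: "take_photo" };

// Assumed wrapper around the page's existing controls.
interface UiControls {
  setPrompt(text: string): void;
  openPack(pack: string): void;
  clickTakePhoto(): void;
}

const HISTORY_KEY = "chat_history"; // assumed key name

function loadHistory(store: KeyValueStore): ChatMessage[] {
  const raw = store.getItem(HISTORY_KEY);
  return raw ? (JSON.parse(raw) as ChatMessage[]) : [];
}

// Appends one message and writes the whole history back to storage.
function appendMessage(store: KeyValueStore, message: ChatMessage): ChatMessage[] {
  const history = [...loadHistory(store), message];
  store.setItem(HISTORY_KEY, JSON.stringify(history));
  return history;
}

// Maps a tool call from the model onto an action on the app's own controls.
function dispatchToolCall(call: ToolCall, ui: UiControls): void {
  switch (call.name) {
    case "set_prompt":
      ui.setPrompt(call.args.text);
      break;
    case "run_pack":
      ui.openPack(call.args.pack);
      break;
    case "take_photo":
      ui.clickTakePhoto();
      break;
  }
}

// In-memory stand-in so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    const value = this.data[key];
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}
```

In a browser, `window.localStorage` could be passed wherever a `KeyValueStore` is expected, and a `UiControls` implementation would wrap whatever selectors the page actually uses for its textareas and buttons. Those selectors are not given in the thread.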

Levels also says the agent "knows what's on your screen," which is how it can handle visual instructions like mixing the blue jacket from one photo with the red hair from another. Danny Limanseta's reply immediately asked the obvious follow-up, whether the site is taking periodic screenshots, but the thread does not answer that question.
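
Since the thread leaves the mechanism open, the following is only one conceivable approach, shown for contrast with periodic screenshots: describing the visible items as text and sending that description along with the user's request. It is an assumption, not a description of what Photo AI does, and `VisibleItem` and `buildScreenContext` are hypothetical names.

```typescript
// Hypothetical sketch of text-based screen context; not confirmed by the thread.

interface VisibleItem {
  id: string;
  kind: "photo" | "pack" | "button";
  label: string;    // assumed short description, e.g. a caption or alt text
  visible: boolean; // whether the item is currently shown in the interface
}

// Builds a plain-text summary of what is on screen for inclusion in a prompt.
function buildScreenContext(items: VisibleItem[]): string {
  const lines = items
    .filter((item) => item.visible)
    .map((item) => `[${item.kind}#${item.id}] ${item.label}`);
  return lines.length > 0
    ? "On screen:\n" + lines.join("\n")
    : "On screen: nothing";
}
```

With a summary like this in the prompt, a request such as "combine the blue jacket with the red hair" could be resolved to specific item ids, which a tool call could then reference.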

Agent-first layouts

Dan Shipper framed the broader pattern as "agent running continuously on the left, application that you + the agent use on the right" in his post. His follow-up link to Proof shows the writing version: an agent pane that can review, nudge, and act beside a live document.

The Proof homepage describes the same mechanics in product terms:

  • create a document and get a shareable link
  • paste that link into Claude Code, Codex, ChatGPT, or another agent
  • let agents suggest edits and leave comments
  • track who wrote each character

Shipper's follow-up matters because it turns the layout from a one-off screenshot into a product claim. Levels is applying that same interface grammar to photo generation instead of writing.

Indie maker opening

Marc Kohlbrugge argued that this kind of agent-first product is where indie makers have an opening, partly because incumbents will move slowly and partly because the interface creates new markets instead of just automating old buttons.

That lines up with the company context. Photo AI's own FAQ, under "Who created Photo AI?", says it was built and is still run independently by Pieter Levels, without investors, as a solo business. The sidebar demo looks like exactly the kind of fast product mutation that structure makes possible.
