Three builder threads shared reusable layers around model APIs: per-user usage gateways, audits for Gemini-enabled GCP keys, and config-driven routing that swaps providers without app rewrites. Wrapping rate limits, key scope, and model choice in one layer helps teams ship multi-user apps without scattering provider logic.

The interesting bit is how similar these wrappers look even when they solve different messes. You can route traffic through a Rust proxy, scan live GCP projects for Gemini exposure, and compare that to Google's own restriction guidance, which explicitly says API keys should be locked to specific clients and APIs.
The linked repo describes the project as a lightweight Rust proxy for AI APIs. Its pitch is simple: move rate limiting, usage tracking, latency monitoring, and estimated cost into one infrastructure layer instead of scattering provider calls across the app.
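The core of that gateway idea fits in surprisingly little code. Here is a minimal sketch, in Rust since that is the proxy's language, of per-user rate limiting plus usage counting in one layer. The `UserGateway` type and its fixed-window scheme are illustrative assumptions, not the linked repo's actual implementation.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical sketch: one layer that owns both the per-user rate
// limit and the lifetime usage count, so app code never touches either.
struct UserGateway {
    limit: u32,                             // max requests per window
    window: Duration,                       // window length
    state: HashMap<String, (Instant, u32)>, // user -> (window start, count in window)
    usage: HashMap<String, u64>,            // user -> lifetime request count
}

impl UserGateway {
    fn new(limit: u32, window: Duration) -> Self {
        Self { limit, window, state: HashMap::new(), usage: HashMap::new() }
    }

    /// Returns true if this request fits under the user's quota,
    /// recording it in the usage tally when it does.
    fn allow(&mut self, user: &str) -> bool {
        let now = Instant::now();
        let entry = self.state.entry(user.to_string()).or_insert((now, 0));
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0); // window expired: start a fresh one
        }
        if entry.1 < self.limit {
            entry.1 += 1;
            *self.usage.entry(user.to_string()).or_insert(0) += 1;
            true
        } else {
            false
        }
    }
}
```

A real proxy would also track latency and estimated cost per call, but the shape is the same: every provider request passes through `allow` first, so quota policy changes never touch product code.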
That overlaps with OpenAI's own usage dashboard, but the dashboard is org-owned. OpenAI's help docs say user breakdowns live inside the platform account, while the gateway post is about app-level per-user control.
The keyguard repo splits the problem into three commands: scan for source files and git history, audit for live GCP projects, and ci for GitHub Actions secrets. According to the original thread, audit checks Cloud Resource Manager, Service Usage, and API Keys APIs, then flags unrestricted keys and keys that explicitly allow Gemini.
That design follows Google's own baseline guidance. Google Cloud's restriction docs say keys should be restricted by both client and allowed APIs, and the authentication docs add that an unrestricted key can call any API that accepts keys until those restrictions are applied.
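The flagging step of such an audit reduces to a small decision over each key's restriction metadata. A hedged sketch of that logic, with hypothetical `ApiKey` and `Finding` types rather than the keyguard repo's code; `generativelanguage.googleapis.com` is the real service name for the Gemini API:

```rust
// Illustrative audit verdicts: a key with no API restrictions, a key
// that explicitly allows the Gemini API, or a key that looks fine.
#[derive(Debug, PartialEq)]
enum Finding {
    Unrestricted,
    AllowsGemini,
    Ok,
}

// Hypothetical model of one API key's restriction state.
// None = no API restrictions at all (the key can call any accepting API).
struct ApiKey {
    allowed_services: Option<Vec<String>>,
}

fn audit_key(key: &ApiKey) -> Finding {
    match &key.allowed_services {
        None => Finding::Unrestricted,
        Some(services)
            if services.iter().any(|s| s == "generativelanguage.googleapis.com") =>
        {
            Finding::AllowsGemini
        }
        Some(_) => Finding::Ok,
    }
}
```

The real audit command would first enumerate projects and keys via the Cloud Resource Manager and API Keys APIs; this sketch only shows the classification applied to each key once fetched.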
The architecture post broke the wrapper into four layers, the first being a callAI() interface.

That reads like the indie-builder version of a model router. LiteLLM's fallback docs describe ordered fallbacks, plus separate fallback chains for content-policy and context-window failures, which is a more formal version of the same idea.
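The routing-with-fallbacks idea can be sketched without any provider SDK at all. Below, a `Router` walks an ordered default chain and, on a typed failure, switches once to the chain registered for that failure kind, loosely modeled on LiteLLM's separate content-policy and context-window fallback chains. All names, and the injected `call` closure standing in for a real provider request, are assumptions for illustration.

```rust
use std::collections::HashMap;

// Failure kinds that get their own fallback chains, as in LiteLLM's docs.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Failure {
    ContentPolicy,
    ContextWindow,
    Generic,
}

// Hypothetical config-driven router: a default provider order plus
// optional specialized chains keyed by failure kind.
struct Router {
    default_chain: Vec<String>,
    per_failure: HashMap<Failure, Vec<String>>,
}

impl Router {
    /// Try providers in order; on a failure with a registered chain,
    /// switch to that chain once and keep going. `call` stands in for
    /// the actual provider request.
    fn call_ai<F>(&self, prompt: &str, mut call: F) -> Option<String>
    where
        F: FnMut(&str, &str) -> Result<String, Failure>,
    {
        let mut chain = self.default_chain.clone();
        let mut switched = false;
        let mut i = 0;
        while i < chain.len() {
            match call(&chain[i], prompt) {
                Ok(out) => return Some(out),
                Err(kind) => {
                    if !switched {
                        if let Some(fb) = self.per_failure.get(&kind) {
                            chain = fb.clone(); // jump to the specialized chain
                            i = 0;
                            switched = true;
                            continue;
                        }
                    }
                    i += 1;
                }
            }
        }
        None
    }
}
```

The single-switch guard keeps a failing specialized chain from restarting itself forever; swapping providers then means editing `default_chain` and `per_failure` in config, not touching call sites.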
The useful reveal is not a new model. It is that three separate builder threads landed on the same pattern: keep provider logic behind one boundary, then swap policies there instead of touching product code everywhere.