Tiiny claims its pocket-sized local AI server can run open models up to 120B and expose an OpenAI-compatible local API without token fees. Privacy-sensitive teams should validate throughput and model quality before deploying always-on local agents.

Tiiny's core claim is a pocket-size device that acts as a personal inference server for open-source AI models. In the launch thread, Paul Couvert says it can run models "up to 120B," stay "100% local and private," and serve workloads that normally sit behind a hosted API.
The linked Kickstarter page, as summarized in the project post, adds the implementation detail engineers will care about: Tiiny is presented as an OpenAI-compatible local API endpoint with "one-click deployment" and "no token fees." That positions it less like a standalone app and more like a small edge box that could slot into existing agent or chat stacks with minimal client-side changes.
The practical demos center on replacing hosted subscriptions with local inference. According to the demo summary, the box can run a local chat interface, support coding workflows, generate landing pages, and drive browser agents for scraping, form filling, and social posting. The same summary says it can also handle text-to-speech and text-to-image models, widening the pitch beyond a single LLM endpoint.
What the evidence does not establish is the performance envelope. Neither the thread nor the demo summary specifies tokens per second, quantization, concurrent request handling, power draw, or which 120B models were actually tested. For engineering teams, that leaves Tiiny as an interesting edge-serving claim with API-compatibility appeal, but without the benchmark detail needed to compare it against a Mac Studio, a local GPU box, or a managed inference endpoint.
You can run local AI models up to 120B without a $10,000 Mac Studio This phone-sized device is your own server for open-source models. 100% local and private. And you can use it to: - Power an agent like OpenClaw 24/7 - Completely replace a chatbot - Literally anything that Show more
Access to Tiiny Kickstarter page → kickstarter.com/projects/tiiny… I’ve been using it for weeks now and have gone from spending hundreds of dollars on APIs to literally zero... All while no longer giving any data to third-party servers. You own your intelligence!
Also available on YT if you prefer to watch there! youtu.be/Ew41f0B28T8