H Company launches Holo 3.1 with local computer use and 79.3% AndroidWorld
H Company released Holo 3.1, a local computer-use VLM family with function calling and an AndroidWorld score that rises to 79.3% on the 35B model. The update pushes computer-use agents toward local and mobile deployment instead of cloud-only runtimes.

TL;DR
- H Company paired hcompany_ai's launch post with an official technical blog post; together they position Holo 3.1 as a local-first computer-use family, not just a cloud browser agent.
- On mobile control, hcompany_ai's AndroidWorld claim says the 35B model moved from 67% to 79.3%, while the smaller 4B and 9B variants moved from 58% to 71%; the official blog post rounds that smaller-model figure to 72%.
- The main product change in the LocalLLaMA post and the 35B model card is scope: Holo 3.1 adds mobile support, native function calling, and quantized checkpoints for local deployment.
- According to the Quickstart docs, Holo 3.1 also becomes H Company's default 35B API target on June 15, when holo3-35b-a3b is deprecated in favor of holo3-1-35b-a3b.
You can jump from hcompany_ai's link roundup straight to the technical writeup, the Quickstart guide, and the API page. The oddest detail is how much of the release is about harness plumbing, not just model weights: the docs split Holo into an agent loop and a separate element-localization mode, while the launch post frames native function calling as the fix for third-party stack compatibility.
Mobile benchmarks
The headline number is mobile. In the official blog post, H Company says Holo 3.1 extends Holo 3 beyond browser and desktop control into Android environments, with AndroidWorld rising to 79.3% on the 35B-A3B model.
The smaller models are part of the story, too. hcompany_ai's benchmark post says the compact 4B and 9B models jumped from 58% to 71%, while the blog rounds that bucket to 72%, suggesting the company is treating the compact lineup as a practical local deployment tier rather than a stripped-down demo.
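The gap between "points gained" and "relative improvement" is easy to blur in benchmark headlines. As a rough check, the sketch below recomputes both from the launch-post figures quoted above; the `gains` helper is our own illustration, not anything H Company publishes.

```python
# AndroidWorld success rates (percent, before and after) as quoted in the launch post.
scores = {
    "35B": (67.0, 79.3),
    "4B/9B (compact)": (58.0, 71.0),
}

def gains(before, after):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    points = after - before
    relative = 100.0 * points / before
    return round(points, 1), round(relative, 1)

for name, (before, after) in scores.items():
    points, relative = gains(before, after)
    print(f"{name}: +{points} points ({relative}% relative)")
```

On these numbers the compact models gain slightly more than the 35B in both absolute and relative terms, which fits the reading of the small tier as a serious deployment target.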
Function calling
The launch post ties Holo 3.1 to three production axes: environment coverage, agent-framework compatibility, and deployment target. The new compatibility piece is native function calling, which the official blog post says now sits alongside the structured JSON outputs Holo 3 already supported.
H Company claims near-parity between function-calling and native execution across OSWorld plus its internal business-workflow suite, and says Holo 3.1 improves more than 25% over Holo 3 inside its Holotab harness. That makes this release feel as much like a harness update as a model refresh.
Quantized checkpoints
Holo 3.1 ships in four sizes, 0.8B, 4B, 9B, and 35B-A3B, with the model card describing the family as Apache 2.0 licensed and based on the Qwen 3.5 family. The new deployment wrinkle is quantization: FP8, NVFP4, and Q4 GGUF are now first-class release artifacts.
The technical blog post gives the local-performance numbers. NVFP4 delivers 1.41x the total token throughput of FP8 and 1.74x that of BF16 on DGX Spark, while H Company says agent-harness optimizations plus NVFP4 cut average step time from 6.8 seconds to 3.3 seconds. Q4 GGUF is the consumer-hardware path, aimed at Windows and Mac local agents.
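Two figures follow from those numbers without any new data: the implied FP8-over-BF16 throughput ratio and the step-time speedup. The sketch below works them out, assuming the two NVFP4 ratios were measured on the same workload and hardware, which the blog excerpt does not spell out.

```python
# Throughput ratios quoted for DGX Spark (NVFP4 relative to each baseline).
NVFP4_VS_FP8 = 1.41
NVFP4_VS_BF16 = 1.74

# Average agent step time in seconds, before and after harness optimizations plus NVFP4.
STEP_BEFORE_S = 6.8
STEP_AFTER_S = 3.3

# If both ratios share a workload, FP8 vs BF16 is their quotient (an inference, not a quoted figure).
fp8_vs_bf16 = NVFP4_VS_BF16 / NVFP4_VS_FP8

step_speedup = STEP_BEFORE_S / STEP_AFTER_S
step_time_reduction_pct = 100.0 * (STEP_BEFORE_S - STEP_AFTER_S) / STEP_BEFORE_S

print(f"Implied FP8 vs BF16 throughput: {fp8_vs_bf16:.2f}x")
print(f"Step-time speedup: {step_speedup:.2f}x ({step_time_reduction_pct:.1f}% less time per step)")
```

The step-time speedup (about 2x) is larger than the NVFP4 throughput gain over FP8 alone, consistent with H Company crediting harness optimizations for part of the improvement.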
API surface
The Quickstart docs expose two distinct use patterns:
- Agent loop: multi-turn conversation plus screenshots, returning either {note, thought, tool_call} or native tool_calls
- Element localization: single-turn screenshot plus target description, returning {x, y} coordinates in a 0 to 1000 range
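For the element-localization mode, a client has to map the 0 to 1000 coordinates back onto real screenshot pixels before clicking. The sketch below shows one plausible way to do that. It assumes the range is normalized independently per axis and that the reply arrives as JSON text with `x` and `y` keys; neither detail is confirmed here, and `parse_point` and `to_pixels` are hypothetical helper names, not part of H Company's SDK.

```python
import json

def parse_point(reply_text):
    """Parse a reply assumed to look like '{"x": 250, "y": 750}' into an (x, y) tuple."""
    data = json.loads(reply_text)
    return data["x"], data["y"]

def to_pixels(x, y, width, height, scale=1000):
    """Map model coordinates in [0, scale] onto a width x height screenshot.

    Assumes each axis is normalized on its own; results are clamped so that
    x == scale still lands on the last valid pixel.
    """
    if not (0 <= x <= scale and 0 <= y <= scale):
        raise ValueError(f"coordinates out of range: ({x}, {y})")
    px = min(round(x / scale * width), width - 1)
    py = min(round(y / scale * height), height - 1)
    return px, py

# Usage: a point at the center of the normalized grid on a 1920x1080 screenshot.
x, y = parse_point('{"x": 500, "y": 500}')
print(to_pixels(x, y, 1920, 1080))  # (960, 540)
```

Keeping the conversion in one place makes it easy to change if the documented convention turns out to differ, for example if coordinates are relative to a resized image rather than the original screenshot.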
The API page lists holo3-1-35b-a3b at $0.25 per 1M input tokens and $1.80 per 1M output tokens, with a 65,536-token context window, up to 5 images, and free-tier access capped at 10 RPM. The Quickstart guide also sets a migration date: Holo 3 35B is deprecated on June 15 in favor of Holo 3.1 35B.
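Those list prices translate into per-episode costs fairly directly. The sketch below estimates the bill and the minimum wall-clock time on the free tier for a multi-step agent run. The prices and rate limit come from the API page as quoted above; the per-step token counts are made-up placeholders, since actual screenshot token usage is not stated.

```python
# Listed pricing for holo3-1-35b-a3b (USD per 1M tokens) and the free-tier rate limit.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.80
FREE_TIER_RPM = 10

def episode_cost(steps, input_tokens_per_step, output_tokens_per_step):
    """Estimate USD cost of an agent episode with a fixed token budget per step."""
    input_cost = steps * input_tokens_per_step / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = steps * output_tokens_per_step / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost

def min_minutes_on_free_tier(requests):
    """Lower bound on wall-clock minutes if every request counts against 10 RPM."""
    return requests / FREE_TIER_RPM

# Hypothetical episode: 50 steps, 4,000 input tokens and 200 output tokens per step.
cost = episode_cost(steps=50, input_tokens_per_step=4_000, output_tokens_per_step=200)
print(f"Estimated cost: ${cost:.3f}")
print(f"Free-tier floor: {min_minutes_on_free_tier(50):.1f} minutes")
```

Under these placeholder numbers the rate limit, not the price, is the binding constraint for free-tier experimentation.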