AI Primer

Report: Meta delays Avocado to May after reasoning and coding tests trail Gemini 3.0

Reports say Meta pushed Avocado from March to at least May after internal reasoning, coding, and writing tests missed current frontier targets. Expect more delayed launches at the top end, and watch for products that route some features through competitor models.


TL;DR

  • Meta reportedly pushed its Avocado foundation model from a planned March release to at least May after internal tests showed it still lagged on reasoning, coding, and writing against current frontier systems, per the original report and a matching recap.
  • The same reporting says Avocado beat Meta’s earlier models and the older Gemini 2.5, but not Gemini 3.0, which turns this from a simple schedule slip into a frontier-gap problem for model quality, per the linked article and the thread summary.
  • One unusually concrete implication is product routing: reports say Meta discussed temporarily using Google models in its own products, and a leaked system prompt screenshot suggests Gemini has already appeared in some Meta AI search testing.
  • Meta’s roadmap does not stop with Avocado: the initial post says a larger follow-on model, codenamed Watermelon, is planned after Avocado, which raises the stakes for whether this delay is a one-model issue or a broader top-end training bottleneck.

What exactly slipped, and why does it matter?

The reported change is specific: Avocado moved from a March 2026 target to “at least May” because it fell short in internal tests for reasoning, coding, and writing, as described in the report summary. A separate recap says the model “outperformed Meta’s previous versions” and Google’s older Gemini 2.5, but still missed the bar set by Gemini 3.0 and other frontier systems, per that recap.

For engineers, the important signal is not just that a launch slipped, but where it slipped. The reported misses are in the capabilities that drive agent reliability and developer workflows: coding quality, multi-step reasoning, and general writing performance. According to the linked summary, Meta’s internal concern was not regression versus Llama-era baselines, but failure to match the latest models from Google, OpenAI, and Anthropic. That makes Avocado look less like a routine point upgrade and more like a model that may not yet clear the threshold for premium assistant, coding, and agent use cases.

Could Meta ship features on competitor models in the meantime?

The most operationally interesting claim is that Meta discussed licensing Google’s model technology temporarily, with the thread summary calling out possible use of Gemini inside Meta products. If accurate, that would be a notable shift for a company whose AI stack has been closely identified with in-house models.

There is also a product-side clue. TestingCatalog’s screenshot indicates Meta has been testing Gemini models to power search features in Meta AI, and the captured response includes the line “you are Gemini, a large language model built by Google.” That does not confirm broad deployment, but it does suggest model routing or experimentation was concrete enough to surface in a user-visible prompt path. Combined with the Avocado delay reporting, the practical takeaway is that frontier consumer AI products may increasingly mix first-party UX with third-party models when internal checkpoints slip.
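The routing pattern described above can be sketched in a few lines. This is a hypothetical illustration, not Meta’s actual implementation: all names (`ModelRoute`, `route_request`, the gate lambdas) are invented for the example. The idea is simply a priority-ordered list of model backends, where each backend has an availability gate, and requests fall through to the next provider when the preferred (in-house) checkpoint is gated off.

```python
# Hypothetical sketch of model routing with fallback. Names and structure are
# illustrative only; they do not reflect any real product's internals.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ModelRoute:
    name: str
    handler: Callable[[str], str]   # prompt -> response
    available: Callable[[], bool]   # health check / rollout gate


def route_request(prompt: str, routes: List[ModelRoute]) -> Tuple[str, str]:
    """Try routes in priority order; return (model_name, response)."""
    for route in routes:
        if route.available():
            return route.name, route.handler(prompt)
    raise RuntimeError("no model route available")


# Example: prefer the in-house model, fall back to an external provider.
# Here the in-house gate is closed, so the external route serves the request.
in_house = ModelRoute("in-house", lambda p: f"[in-house] {p}", lambda: False)
external = ModelRoute("external", lambda p: f"[external] {p}", lambda: True)

name, answer = route_request("summarize this page", [in_house, external])
```

The design choice worth noting is that the gate is evaluated per request, so an in-house checkpoint can be re-enabled for a fraction of traffic without changing call sites; that is the kind of mechanism a user-visible system prompt from a fallback provider would leak through.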

What this says about frontier model releases

The Avocado report points to a harder release environment at the top end. One recap says internal testing showed gains over Llama 4 and Gemini 2.5, yet those gains were still insufficient once the target moved to Gemini 3.0-class performance. That is a reminder that “better than last gen” no longer guarantees a shippable flagship when the comparison set keeps moving during training and eval cycles.

The roadmap detail in the first report also matters: Meta is still said to have a larger model, Watermelon, coming after Avo. For teams watching the ecosystem, that means delays may not only affect one release window; they can cascade into API timing, product launch sequencing, and whether a company chooses to bridge gaps with external models while its next internal checkpoint catches up.
