Granite Embedding R2

Text embeddings for enterprise retrieval.

IBM's Granite Embedding R2 is a language model family for generating text embeddings for retrieval, semantic search, and similarity use cases.

Pricing

Official site · May 2, 2026, 6:30 AM

Input / 1M tokens: $0.10

Output / 1M tokens: $0.10

IBM publishes a single embedding-model rate of USD 0.10 per million tokens; the pricing page does not break out separate input and output token prices for this model.

IBM's official watsonx.ai pricing page lists all embedding models at USD 0.10 per million tokens. IBM's model card for the Granite Embedding English Reranker r2 identifies it as a text-embedding/reranking model in the Granite Embeddings collection. No separate public r2-specific price line was found, so the general embedding-model rate is taken to apply.
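As a rough illustration of what the flat rate means in practice, the sketch below estimates embedding cost from a token count. It assumes the USD 0.10 per million tokens figure quoted above is the only charge; the function name and the sample workload are hypothetical and do not come from IBM's pricing page.

```python
from decimal import Decimal

# Flat rate quoted above: USD 0.10 per 1M tokens (assumed to be the only charge).
RATE_USD_PER_MILLION_TOKENS = Decimal("0.10")


def embedding_cost_usd(token_count: int) -> Decimal:
    """Estimate the cost in USD of embedding `token_count` tokens."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    return RATE_USD_PER_MILLION_TOKENS * Decimal(token_count) / Decimal(1_000_000)


# Hypothetical workload: 50,000 documents averaging 800 tokens each.
total_tokens = 50_000 * 800
estimated_cost = embedding_cost_usd(total_tokens)
print(f"{total_tokens} tokens -> USD {estimated_cost}")
```

Under these assumptions, the 40 million tokens in the sample workload come to USD 4. Decimal is used instead of float so that small per-token amounts do not pick up rounding noise.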


Model Intelligence

Benchmarkable

Model level: family

Recent stories

1 linked story

release · PRIMARY · 2026-05-01

IBM releases Granite Embedding R2 with 32,768-token context and +11.8 MMTEB retrieval gain

IBM released 97M and 311M multilingual Granite Embedding R2 models under Apache 2.0, replacing XLM-RoBERTa with ModernBERT and extending context length from 512 to 32,768 tokens. The 311M model posts a +11.8 gain on MMTEB retrieval and ships with ONNX, OpenVINO, vLLM, and GGUF support.
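To give a sense of what the jump from 512 to 32,768 tokens means for long documents, the sketch below counts how many embedding calls a document would need at each context length. It assumes simple non-overlapping windows and ignores special tokens and any overlap a real chunking strategy might use; the function name and the 20,000-token document are hypothetical.

```python
# Context lengths quoted above.
OLD_CONTEXT_TOKENS = 512
NEW_CONTEXT_TOKENS = 32_768


def windows_needed(doc_tokens: int, context_tokens: int) -> int:
    """Count the non-overlapping windows needed to cover `doc_tokens` tokens."""
    if doc_tokens < 0:
        raise ValueError("doc_tokens must be non-negative")
    if context_tokens <= 0:
        raise ValueError("context_tokens must be positive")
    # Integer ceiling division: avoids float rounding on large counts.
    return (doc_tokens + context_tokens - 1) // context_tokens


# Hypothetical document of 20,000 tokens.
doc_tokens = 20_000
old_windows = windows_needed(doc_tokens, OLD_CONTEXT_TOKENS)
new_windows = windows_needed(doc_tokens, NEW_CONTEXT_TOKENS)
print(f"512-token context: {old_windows} windows")
print(f"32,768-token context: {new_windows} window(s)")
```

In this simplified model, the sample document needs 40 windows at the old limit and fits in a single window at the new one. The new limit is 64 times the old one.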