AI Primer
TOPIC 29 stories

Search

Search systems, retrieval quality, and query handling.

RELEASE 8th May
Firecrawl adds Highlights to /scrape with 100x fewer tokens

Firecrawl added a Highlights mode to /scrape that returns the text, code, or tables matching a query instead of full-page payloads. The company benchmarked the feature on 10,000 URLs against Exa Highlights and positions it for lower-token agent retrieval.

RELEASE 1w ago
Perplexity adds Finance Search to Agent API with live data and FinSearchComp T1 lead

Perplexity added Finance Search to the Agent API with licensed real-time market data and cited web sources in one tool call. The company says it led FinSearchComp T1 on live-data accuracy and lowest cost per correct answer, so teams building finance agents should evaluate it against their current stack.

RELEASE 1w ago
Firecrawl adds Question format to /scrape with grounded answers and 100x fewer tokens

Firecrawl introduced a /scrape mode that answers a question directly from a URL instead of returning chunks for a separate retrieval loop. It targets docs and pricing pages, and teams should use it when they want grounded answers with lower token usage.

RELEASE 1w ago
TinyFish opens Search and Fetch for free with MCP, CLI, and <0.5 s p50

TinyFish opened its Search and Fetch features for free with generous rate limits across REST, MCP, CLI, and SDKs. The change gives agent builders cheaper web retrieval while returning structured search JSON or rendered markdown instead of raw HTML.

RELEASE 2w ago
GitHub Copilot adds semantic indexing to all workspaces and cross-repo search in VS Code

GitHub expanded semantic indexing beyond GitHub and Azure DevOps remotes, so Copilot can search across more workspace types and repositories inside VS Code. That improves agent context retrieval in local workflows, while the same release also adds chat-history recall and prompt-eval tooling.

RELEASE 2w ago
Google AI Studio adds multi-chat and web search to Build mode

Google AI Studio added multi-chat threads and web search grounding to Build mode, so Gemini coding sessions can branch while pulling live docs into the workspace. The feature improves in-browser prototyping loops, but it is currently scoped to AI Studio rather than the Gemini API itself.

NEWS 2w ago
Gemini adds Grounding with Exa for websites, docs, people, and company search

Gemini models can now use Grounding with Exa to search websites, technical docs, papers, people, and companies through Exa's index. That gives Gemini a new agent-style grounding path alongside Google's first-party search tooling.

RELEASE 3w ago
LightOn releases LateOn and DenseOn at 149M params with BEIR 57.22

LightOn open-sourced DenseOn and LateOn plus the training pipeline behind them, including 1.4 billion query-document pairs and decontaminated BEIR results. Teams can use the small open retrieval models and reproduced data mixtures instead of opaque closed-data baselines.

NEWS 3w ago
OpenRouter adds Firecrawl web search with full-page markdown grounding

OpenRouter added Firecrawl as a search provider, letting models ground responses in scraped full web pages instead of snippet-only search. The launch folds crawling into the existing plugin settings flow and includes a capped free plan on the Firecrawl side.

WORKFLOW 1mo ago
LongTracer opens local STS+NLI claim checks for RAG validation

LongTracer open-sourced local claim checks that combine semantic textual similarity (STS) with natural language inference (NLI). Separately, qi published a private search engine with a Claude Code plugin, and LM Studio users shared MCP search configs for Qwen. Use these stacks to ground retrieval and verify answers without a second judge model.

RELEASE 1mo ago
Chroma launches Context-1, a 20B search agent with Apache 2.0 weights

Chroma released Context-1, a 20B search agent it says pushes the speed-cost-accuracy frontier for agentic search, with open weights on Hugging Face. Benchmark it against your current search stack before wiring it into production.

RELEASE 1mo ago
Firecrawl launches /interact for natural-language browser actions

Firecrawl’s new /interact endpoint lets agents click, fill, scroll, and keep live browser sessions right after /scrape. It shortens the path from page extraction to web automation, but Playwright remains the better fit when you need deterministic full-session control.

NEWS 1mo ago
ChatGPT adds Library tab for reusable file uploads across conversations

ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.

RELEASE 1mo ago
Cursor adds Instant Grep: 13ms regex search across millions of files

Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
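
The n-gram and inverted-index part of that description can be illustrated with a toy trigram index. This is a minimal sketch of the general technique, not Cursor's implementation: the function names, the in-memory `files` dict, and the requirement that the caller supply a literal substring of the regex are all assumptions made for illustration, and the Bloom-filter layer is omitted.

```python
import re
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return every 3-character substring of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(files: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index: map each trigram to the file names that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, body in files.items():
        for gram in trigrams(body):
            index[gram].add(name)
    return index


def search(files: dict[str, str], index: dict[str, set[str]],
           literal: str, pattern: str) -> list[str]:
    """Narrow to files containing every trigram of `literal`, then run the regex.

    `literal` (hypothetical parameter) is a substring that any match of
    `pattern` must contain; deriving it from the regex automatically is
    out of scope for this sketch.
    """
    grams = trigrams(literal)
    if grams:
        # Only files that contain all trigrams can contain the literal.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Literal shorter than 3 characters: no filtering possible, scan all.
        candidates = set(files)
    regex = re.compile(pattern)
    return sorted(name for name in candidates if regex.search(files[name]))
```

The index only produces candidates; the regex still runs on each surviving file, so a trigram false positive costs time but never produces a wrong match.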

NEWS 1mo ago
Reason-ModernColBERT scores 87.59 on BrowseComp-Plus

LightOn’s late-interaction retriever paired with GPT-5 reached 87.59 accuracy on BrowseComp-Plus while using fewer search calls than larger baselines. It suggests deep-research quality may now hinge more on retrieval architecture than on swapping in ever larger LLMs.
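
For readers unfamiliar with late interaction, ColBERT-style retrievers score a document by matching each query token embedding to its best document token embedding and summing those maxima (MaxSim). The sketch below shows that scoring rule on hand-written toy vectors; the vectors and function names are illustrative assumptions, and nothing here reflects Reason-ModernColBERT's actual weights or pipeline.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim(query_tokens: list[list[float]], doc_tokens: list[list[float]]) -> float:
    """Sum, over query tokens, of the best cosine match among document tokens."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy example: two query tokens, two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0]]  # one token aligned with each query token
doc_b = [[1.0, 1.0]]              # a single "averaged" token
ranked = sorted(
    {"doc_a": doc_a, "doc_b": doc_b}.items(),
    key=lambda item: maxsim(query, item[1]),
    reverse=True,
)
```

Because every query token keeps its own vector, a document that matches each token precisely (`doc_a`) outscores one whose single vector is only a partial match for both (`doc_b`), which is the property single-vector embeddings lose.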

RELEASE 1mo ago
Gemini API adds one-call tool chaining and Maps grounding for Gemini 3

Google now lets Gemini chain built-in tools like Search, Maps, File Search, and URL Context with custom functions inside a single API call. This removes orchestration glue for agent builders and brings Maps grounding into AI Studio for faster prototyping.

RELEASE 1mo ago
Perplexity releases Comet on iOS with voice mode and agentic browsing

Perplexity released Comet for iPhone, bringing its AI-native browser, voice mode, and task-running assistant to mobile. Engineers tracking AI browser UX can now test how agentic browsing behaves as a default mobile browser rather than a desktop-only tool.

RELEASE 1mo ago
Parallel launches Tempo MPP billing for per-search agent payments

Parallel integrated with Tempo and the Machine Payments Protocol so agents can buy search, content extraction, and multi-hop research on demand without API keys or account setup. This gives agent stacks a concrete pattern for per-use tool billing instead of preprovisioned subscriptions.

RELEASE 1mo ago
Hugging Face Papers serves Markdown to agents and adds a paper-pages skill

Hugging Face now serves Markdown when agents fetch Papers pages and published a skill for searching papers plus linked models, datasets, and Spaces. Research agents can cut token waste and retrieve paper context in a format that is easier to parse and ground.

RELEASE 1mo ago
Ollama releases 0.18.1 with OpenClaw web search plugin and headless launch mode

Ollama 0.18.1 added OpenClaw web search and fetch plugins plus non-interactive launch flows for CI, scripts, and container jobs. Pair it with Pi and Nemotron 3 Nano 4B if you want unattended agent jobs on constrained hardware.

RELEASE 1mo ago
Perplexity Computer adds Android support and local Comet browser control

Perplexity expanded Computer to Android and added control of a local Comet browser session, including logged-in sites, from the agent. Try it if you want one agent workflow across mobile and browser surfaces without per-site connectors or custom MCP glue.

RELEASE 1mo ago
Hyperbrowser releases HyperSkill to turn live docs into SKILL.md trees

Hyperbrowser open-sourced HyperSkill, which reads live documentation and emits a structured SKILL.md file or graph an agent can navigate. Try it to replace hand-written tool instructions with generated skill trees you can drop into an agent project.

NEWS 2mo ago
Google adds Grounding with Google Maps to AI Studio UI for Gemini APIs

Google is adding Grounding with Google Maps to AI Studio’s UI, and a Google reply says the Maps grounding capability already exists in the API. If you build location-heavy Gemini apps, start designing around map lookups instead of stitching search and geocoding manually.

RELEASE 2mo ago
Keep adds an in-app feed reader for saved bookmarks

Keep added an in-app feed reader so saved links can be read directly inside its bookmark store, which is aimed at agent workflows. Use it to turn bookmarks, RSS feeds, and markdown exports into reusable context instead of scattered tabs.

RELEASE 2mo ago
Perplexity launches Computer on iOS with cross-device sync for long-running tasks

Perplexity brought Computer to iOS with cross-device sync so multi-step cloud tasks can keep running after you leave the screen. Try it if you want to start agent workflows from a phone instead of a desktop-only session.

RELEASE 2mo ago
Mixedbread releases Wholembed v3 and claims large gains on multimodal retrieval benchmarks

Mixedbread introduced Wholembed v3 as a retrieval model for text, image, video, audio, and multilingual search. Benchmark it on fine-grained retrieval tasks if single-vector embeddings have been collapsing in your pipeline.

RELEASE 2mo ago
Google releases Gemini Embedding 2 preview with one vector space for text, image, video, audio, and PDFs

Google launched Gemini Embedding 2 in preview, unifying multiple modalities and 100+ languages in one embedding space with flexible output dimensions. Try it to simplify cross-modal RAG and search pipelines, but compare it with late-interaction systems before committing.

RELEASE 2mo ago
Gemini Embedding 2 enters preview with 8,192-token multimodal vectors and 3,072-dim outputs

Google put Gemini Embedding 2 into public preview with one vector space for text, images, video, audio, and PDFs, plus a choice of 3,072-, 1,536-, or 768-dimensional outputs. Use it to replace multi-model retrieval pipelines with one API for RAG and cross-media search.

RELEASE 2mo ago
Hyperbrowser launches HyperPlex to run parallel browser agents across models

Hyperbrowser released HyperPlex, an open-source research agent that splits a goal into subtasks, runs browser workers in parallel, and returns cited reports. Teams building deep-research products can study the repo for orchestration, live browsing, and report synthesis patterns.
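
The split, fan-out, and cited-report pattern described above can be sketched with `asyncio`. This is a generic illustration under stated assumptions, not HyperPlex's code: `run_worker` is a hypothetical stand-in for a browser worker, the subtasks are passed in by hand rather than planned by a model, and the `example.com` URLs are placeholders.

```python
import asyncio


async def run_worker(subtask: str) -> dict:
    """Hypothetical browser worker; a real one would drive a live browser session."""
    await asyncio.sleep(0)  # stand-in for browsing latency
    slug = subtask.replace(" ", "-")
    return {
        "finding": f"notes on {subtask}",
        "source": f"https://example.com/{slug}",  # placeholder citation
    }


async def research(goal: str, subtasks: list[str]) -> str:
    """Run one worker per subtask concurrently and merge results into a cited report."""
    # gather() returns results in the same order as the subtasks were given.
    results = await asyncio.gather(*(run_worker(s) for s in subtasks))
    lines = [f"Report: {goal}"]
    for number, result in enumerate(results, start=1):
        lines.append(f"- {result['finding']} [{number}] {result['source']}")
    return "\n".join(lines)


report = asyncio.run(research("compare vector stores", ["pricing", "query latency"]))
```

Keeping each worker's source next to its finding is what lets the final report cite every claim, whatever order the workers finish in.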


© 2026 AI Primer. All rights reserved.