Search
Search systems, retrieval quality, and query handling.
Stories
Firecrawl added a Highlights mode to /scrape that returns matching text, code, or tables for a query instead of full-page payloads. The release matters because the company benchmarked the feature on 10,000 URLs against Exa Highlights and aims it at lower-token agent retrieval.
Perplexity added Finance Search to the Agent API with licensed real-time market data and cited web sources in one tool call. The company says it led FinSearchComp T1 on live-data accuracy and had the lowest cost per correct answer, so teams building finance agents should evaluate it against their current stack.
Firecrawl introduced a /scrape mode that answers a question directly from a URL instead of returning chunks for a separate retrieval loop. It targets docs and pricing pages, and teams should use it when they want grounded answers with lower token usage.
TinyFish opened its Search and Fetch features for free with generous rate limits across REST, MCP, CLI, and SDKs. The change gives agent builders cheaper web retrieval while returning structured search JSON or rendered markdown instead of raw HTML.
GitHub expanded semantic indexing beyond GitHub and Azure DevOps remotes, so Copilot can search across more workspace types and repositories inside VS Code. That improves agent context retrieval in local workflows, while the same release also adds chat-history recall and prompt-eval tooling.
Google AI Studio added multi-chat threads and web search grounding to Build mode, so Gemini coding sessions can branch while pulling live docs into the workspace. The feature improves in-browser prototyping loops, but it is currently scoped to AI Studio rather than the Gemini API itself.
Gemini models can now use Grounding with Exa to search websites, technical docs, papers, people, and companies through Exa's index. That gives Gemini a new agent-style grounding path alongside Google's first-party search tooling.
LightOn open-sourced DenseOn and LateOn plus the training pipeline behind them, including 1.4 billion query-document pairs and decontaminated BEIR results. Teams can use the small open retrieval models and reproduced data mixtures instead of opaque closed-data baselines.
OpenRouter added Firecrawl as a search provider, letting models ground responses in scraped full web pages instead of snippet-only search. The launch folds crawling into the existing plugin settings flow and includes a capped free plan on the Firecrawl side.
LongTracer open-sourced local STS+NLI claim checks, while qi published a private search engine with a Claude Code plugin and LM Studio users shared MCP search configs for Qwen. Use these stacks to ground retrieval and verify answers without a second judge model.
Chroma released Context-1, a 20B search agent it says pushes the speed-cost-accuracy frontier for agentic search, with open weights on Hugging Face. Benchmark it against your current search stack before wiring it into production.
Firecrawl’s new /interact endpoint lets agents click, fill, scroll, and keep live browser sessions right after /scrape. It shortens the path from page extraction to web automation, but Playwright remains the better fit when you need deterministic full-session control.
ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.
Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
LightOn’s late-interaction retriever paired with GPT-5 reached 87.59 accuracy on BrowseComp-Plus while using fewer search calls than larger baselines. It suggests deep-research quality may now hinge more on retrieval architecture than on swapping in ever larger LLMs.
Google now lets Gemini chain built-in tools like Search, Maps, File Search, and URL Context with custom functions inside a single API call. This removes orchestration glue for agent builders and brings Maps grounding into AI Studio for faster prototyping.
Perplexity released Comet for iPhone, bringing its AI-native browser, voice mode, and task-running assistant to mobile. Engineers tracking AI browser UX can now test how agentic browsing behaves as a default mobile browser rather than a desktop-only tool.
Parallel integrated with Tempo and the Machine Payments Protocol so agents can buy search, content extraction, and multi-hop research on demand without API keys or account setup. This gives agent stacks a concrete pattern for per-use tool billing instead of preprovisioned subscriptions.
Hugging Face now serves Markdown when agents fetch Papers pages and published a skill for searching papers plus linked models, datasets, and Spaces. Research agents can cut token waste and retrieve paper context in a format that is easier to parse and ground.
Ollama 0.18.1 added OpenClaw web search and fetch plugins plus non-interactive launch flows for CI, scripts, and container jobs. Pair it with Pi and Nemotron 3 Nano 4B if you want unattended agent jobs on constrained hardware.
Perplexity expanded Computer to Android and added control of a local Comet browser session, including logged-in sites, from the agent. Try it if you want one agent workflow across mobile and browser surfaces without per-site connectors or custom MCP glue.
Hyperbrowser open-sourced HyperSkill, which reads live documentation and emits a structured SKILL.md file or graph an agent can navigate. Try it to replace hand-written tool instructions with generated skill trees you can drop into an agent project.
Google is adding Grounding with Google Maps to AI Studio’s UI, and a Google reply says the Maps grounding capability already exists in the API. If you build location-heavy Gemini apps, start designing around map lookups instead of stitching search and geocoding manually.
Keep added an in-app feed reader so saved links can be read directly inside its bookmark store for agent workflows. Use it to turn bookmarks, RSS feeds, and markdown exports into reusable context instead of scattered tabs.
Perplexity brought Computer to iOS with cross-device sync so multi-step cloud tasks can keep running after you leave the screen. Try it if you want to start agent workflows from a phone instead of a desktop-only session.
Mixedbread introduced Wholembed v3 as a retrieval model for text, image, video, audio, and multilingual search. Benchmark it on fine-grained retrieval tasks if single-vector embeddings have been collapsing in your pipeline.
Google launched Gemini Embedding 2 in preview, unifying multiple modalities and 100+ languages in one embedding space with flexible output dimensions. Try it to simplify cross-modal RAG and search pipelines, but compare it with late-interaction systems before committing.
Google put Gemini Embedding 2 into public preview with one vector space for text, images, video, audio, and PDFs, plus 3072, 1536, and 768 output sizes. Use it to replace multi-model retrieval pipelines with one API for RAG and cross-media search.
Hyperbrowser released HyperPlex, an open-source research agent that splits a goal into subtasks, runs browser workers in parallel, and returns cited reports. Teams building deep-research products can study the repo for orchestration, live browsing, and report synthesis patterns.
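For readers curious about the Cursor Instant Grep story above, here is a minimal sketch of the general idea behind an n-gram inverted index: map each trigram to the files that contain it, intersect those sets to get candidate files, then run the real regex only on the candidates. This is an illustration under stated assumptions, not Cursor's implementation. The class name TrigramIndex, the required_literal parameter, and the sample files are hypothetical, and the sketch leaves out Bloom filters and automatic extraction of literals from the regex.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy inverted index: trigram -> set of file names containing it.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self.postings = defaultdict(set)
        self.files = {}

    def add(self, name, content):
        """Store the file and record every trigram it contains."""
        self.files[name] = content
        for gram in trigrams(content):
            self.postings[gram].add(name)

    def candidates(self, literal):
        """Files containing every trigram of literal (false positives possible)."""
        grams = trigrams(literal)
        if not grams:
            # A literal shorter than 3 characters cannot narrow the search.
            return set(self.files)
        sets = [self.postings.get(gram, set()) for gram in grams]
        return set.intersection(*sets)

    def search(self, pattern, required_literal):
        """Run the regex only on files that pass the trigram filter.

        Assumption: the caller supplies a literal that any match must contain.
        """
        regex = re.compile(pattern)
        return sorted(
            name
            for name in self.candidates(required_literal)
            if regex.search(self.files[name])
        )


index = TrigramIndex()
index.add("a.py", "def parse_query(q):\n    return q.strip()\n")
index.add("b.py", "def render(page):\n    return page\n")
index.add("c.py", "query_cache = {}\n")

# Only a.py and c.py contain all trigrams of "query", so b.py is never scanned.
print(index.search(r"def \w*query\w*\(", "query"))
```

The filter can return files that hold all the trigrams without holding the literal itself, which is why the regex still runs on each candidate as a final check.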