Multimodal, Inference Optimization, GPU Infrastructure

Moondream

Compact multimodal vision-language models

A multimodal AI model family from M87 Labs, Inc. focused on vision-language understanding and visual reasoning.

Model Intelligence

Benchmarkable

Model level

family

Recent stories

1 linked story

release PRIMARY 2026-05-01

Moondream releases Photon 1.2.0 with Apple Silicon, native Windows CUDA, and 23 ms B200 latency

Moondream shipped Photon 1.2.0, expanding its inference engine to Apple Silicon, Windows CUDA, Blackwell, and Jetson Thor. The company also outlined how custom Metal kernels and fused ops made local vision practical without MLX. The release broadens deployment options for edge and on-device vision workloads while keeping server-class latency on B200 systems.