Markov AI released Computer Use Large on Hugging Face with 48,478 screen recordings spanning about 12,300 hours across six professional apps. The CC-BY-4.0 corpus is intended for training and evaluating GUI agents on real software workflows.

Computer Use Large is a new Hugging Face dataset for desktop-agent work, built from 48,478 screen recordings of professional software use and released under CC-BY-4.0 (launch post). The Hugging Face listing positions it for “training & evaluating computer use agents,” not just passive video understanding, which matters because the source material is grounded in real GUI workflows rather than synthetic trajectories (dataset page).
The current coverage spans six applications: AutoCAD, Blender, Excel, Photoshop, Salesforce, and VS Code (app coverage). That gives the corpus a mix of office, creative, CAD, CRM, and coding environments, which is broader than single-app desktop datasets and more relevant for benchmarking cross-domain computer-use behavior.
The dataset page says the videos were sourced from YouTube tutorials and then processed to keep only screen-centric segments. Audio was stripped with ffmpeg, intros and outros were removed, frames were sampled every 10 seconds, and a vision-language model, Gemini Flash, classified whether each sampled frame showed genuine screen-recording content (processing details). Videos with less than 10 seconds of screen activity were discarded, and the metadata includes fields such as original and trimmed duration, upload date, screen-content percentage, and segment counts (metadata fields).
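The filtering steps described on the dataset page can be sketched roughly as follows. This is an illustrative reconstruction, not Markov AI's actual pipeline: the function names are hypothetical, the per-frame labels stand in for the Gemini Flash classifier's output, and the assumption that each sampled frame represents a full 10-second window is ours. The ffmpeg options used (`-an` to drop audio, the `fps` filter to sample frames) are standard, but the exact commands the authors ran are not given in the source.

```python
from dataclasses import dataclass
from typing import List, Tuple

SAMPLE_INTERVAL_S = 10  # dataset page: one frame sampled every 10 seconds
MIN_SCREEN_S = 10       # dataset page: videos with less screen activity are discarded


def strip_audio_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that copies the video stream and drops audio."""
    return ["ffmpeg", "-i", src, "-an", "-c:v", "copy", dst]


def sample_frames_cmd(src: str, out_pattern: str) -> List[str]:
    """Build an ffmpeg command that writes one frame per sampling interval."""
    return ["ffmpeg", "-i", src, "-vf", f"fps=1/{SAMPLE_INTERVAL_S}", out_pattern]


@dataclass
class VideoSummary:
    segments: List[Tuple[int, int]]  # (start_s, end_s) runs of screen content
    screen_seconds: int
    screen_pct: float
    keep: bool


def summarize(frame_is_screen: List[bool]) -> VideoSummary:
    """Turn per-frame classifier labels into segments and a keep/discard flag.

    Assumption: frame i covers the window [i * 10 s, (i + 1) * 10 s).
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, is_screen in enumerate(frame_is_screen):
        t = i * SAMPLE_INTERVAL_S
        if is_screen and start is None:
            start = t                    # a screen-content run begins
        elif not is_screen and start is not None:
            segments.append((start, t))  # the run ends at this frame
            start = None
    total = len(frame_is_screen) * SAMPLE_INTERVAL_S
    if start is not None:
        segments.append((start, total))  # run extends to the end of the video
    screen_seconds = sum(end - begin for begin, end in segments)
    screen_pct = 100.0 * screen_seconds / total if total else 0.0
    return VideoSummary(segments, screen_seconds, screen_pct,
                        keep=screen_seconds >= MIN_SCREEN_S)
```

Under these assumptions, a video whose sampled frames are labeled `[False, True, True, False, True]` yields two segments and 30 seconds of screen content out of 50, so it would be kept, while a video with no positive frames would be dropped.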
For engineers, the practical value is less about a new model release and more about data availability. A large, openly licensed corpus with per-video metadata and category splits can support pretraining, eval set construction, and comparisons across app domains using Hugging Face’s loading flow (load_dataset API). The supporting repost reinforces the scale claim, calling it the “world’s largest open-source dataset of computer-use recordings” and highlighting 10,000-plus hours across enterprise and productivity software (supporting repost).
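As one way to build an eval set from the per-video metadata, the sketch below holds out a fixed fraction of videos per application so that every app domain is represented. It works on plain dictionaries, which could come from iterating over a split returned by `load_dataset`; the `app` field name and the `split_by_app` helper are hypothetical, since the source does not give the exact schema or repository ID.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_by_app(rows: List[Row], eval_fraction: float = 0.1,
                 seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Hold out roughly eval_fraction of the videos from each app.

    Assumes each row has an "app" key (hypothetical field name).
    """
    by_app: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_app[str(row["app"])].append(row)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[Row] = []
    evals: List[Row] = []
    for app in sorted(by_app):
        items = list(by_app[app])
        rng.shuffle(items)
        # At least one held-out video per app, even for small categories.
        n_eval = max(1, round(len(items) * eval_fraction))
        evals.extend(items[:n_eval])
        train.extend(items[n_eval:])
    return train, evals
```

Splitting per app rather than over the whole corpus keeps smaller categories from vanishing from the eval set, which matters when comparing agent behavior across domains.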
World's largest open-source dataset of computer-use recordings just dropped on Huggingface, for training & evaluating computer use agents. 48,478 screen recording videos (~12,300 hours) of professional software being used. License - CC-BY-4.0