🤖 ML/AI

paddleocr-text-recognition

Name: paddleocr-text-recognition
Author: PaddlePaddle

by PaddlePaddle1 month ago83.1k

Use this skill whenever the user wants text extracted from images, photos, scans, screenshots, or scanned PDFs. Returns exact machine-readable strings with line-level text and optional bbox coordinates. Strong accuracy for CJK, small print, and handwritten text. Trigger terms: OCR, 文字识别, 图片转文字, 截图识字, 提取图中文字, 扫描识字, 识字, 纯文字, plain text extraction, 坐标, 检测框, bbox, bounding box, image to text, screenshot, photo scan, recognize text.

Install

npx skills add https://github.com/PaddlePaddle/PaddleOCR --skill paddleocr-text-recognition

Show step-by-step

1
Open your terminal
- Mac: Press ⌘ Space, type "Terminal", press Enter
- Windows: Press Win R, type "cmd", press Enter
2
Paste the command above and press Enter
Use the Copy command button, then paste in your terminal (Mac: ⌘V, Windows: Ctrl V).
3
Restart Claude Code
Close and reopen Claude Code, or start a new session, so it picks up the new skill.

Where it lives

~/.claude/skills/paddlepaddle--paddleocr--skills--paddleocr-text-recognition/
├── SKILL.md
└── ... (skill resource files)

View on GitHub

Comments

X@PaddlePaddle

"🤯 Wait… your OpenClaw still can’t work? Let’s fix that. The lobster is everywhere 🦞 — but running it ≠ making it useful. Now with PaddlePaddle AI Studio..."

Related skills

🤖 ML/AI

comfyui

Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection. Uses the official comfy-cli for lifecycle and direct REST/WebSocket API for execution.

by NousResearch · 1 month ago198k

🤖 ML/AI

hyperframes

Create HTML-based video compositions, animated title cards, social overlays, captioned talking-head videos, audio-reactive visuals, and shader transitions using HyperFrames. HTML is the source of truth for video. Use when the user wants a rendered MP4/WebM from an HTML composition, wants to animate text/logos/charts over media, needs captions synced to audio, wants TTS narration, or wants to convert a website into a video.

by NousResearch · 1 month ago198k

💻 Developer Tools

claude-api

Reference for the Claude API / Anthropic SDK — model ids, pricing, params, streaming, tool use, MCP, agents, caching, token counting, model migration. TRIGGER — read BEFORE opening the target file; don't skip because it "looks like a one-liner" — whenever: the prompt names Claude/Anthropic in any form (Claude, Anthropic, Fable, Opus, Sonnet, Haiku, `anthropic`, `@anthropic-ai`, `claude-*`, `us.anthropic.*`, `[1m]`); the user asks about an LLM (pricing/model choice/limits/caching) — never answer from memory; OR the task is LLM-shaped with provider unstated (agent/MCP/tool-definition/multi-agent/RAG/LLM-judge/computer-use; generate/summarize/extract/classify/rewrite/converse over NL; debugging refusals/cutoffs/streaming/tool-calls/tokens). SKIP only when another provider is being worked on (overrides all triggers): OpenAI/GPT/Gemini/Llama/Mistral/Cohere/Ollama named in the query; OR `grep -rE 'openai|langchain_openai|google.generativeai|genai|mistralai|cohere|ollama'` over the project hits (run this grep FIRST if no provider named — don't Read the file).

by anthropics · 1 month ago153.1k

💻 Developer Tools

New

ponytail

Forces the laziest solution that actually works, simplest, shortest, most minimal. Channels a senior dev who has seen everything: question whether the task needs to exist at all (YAGNI), reach for the standard library before custom code, native platform features before dependencies, one line before fifty. Supports intensity levels: lite, full (default), ultra. Use whenever the user says "ponytail", "be lazy", "lazy mode", "simplest solution", "minimal solution", "yagni", "do less", or "shortest path", and whenever they complain about over-engineering, bloat, boilerplate, or unnecessary dependencies.

by DietrichGebert · 3 days ago41.5k