VLM_Expert

by majiayu000

157 Favorites · 104 Upvotes · 0 Downvotes

Implements vision-based AI conversation capabilities: analyzing images, describing visual content, and supporting multimodal interaction.

Tags: vision · multimodal · VLM

Rating: 3.1
Installs: 0
Category: AI & LLM

Quick Review

The skill addresses a useful vision-language capability but lacks critical implementation details. The description mentions image analysis and multimodal interaction but provides minimal guidance on how a CLI agent would actually invoke or integrate this skill. The CLI example shows syntax but no clear explanation of parameters, expected outputs, error handling, or how to handle multiple images. Task knowledge is insufficient: no code, API endpoints, model specifications, or procedural steps are provided. Structure is acceptable given the brevity, but the skill would benefit from concrete implementation guidance. Novelty is moderate, since VLM capabilities do reduce token costs for vision tasks compared to manual description, though the skill itself doesn't demonstrate complexity.
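
To make the review's point concrete, here is a minimal sketch of the kind of invocation guidance the skill omits: one text prompt plus any number of images, with basic error handling. It assumes an OpenAI-compatible chat-completions endpoint; the function names, the gpt-4o-mini model string, and the script interface are all hypothetical, since the skill documents none of them.

```python
import base64
import sys
from pathlib import Path

from openai import OpenAI  # assumption: an OpenAI-compatible vision endpoint


def encode_image(path: str) -> str:
    """Read a local image file and return it as a base64 data URL."""
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:image/png;base64,{data}"


def describe_images(prompt: str, image_paths: list[str]) -> str:
    """Send one text prompt plus any number of images in a single message."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    content = [{"type": "text", "text": prompt}]
    for path in image_paths:
        # Multiple images are handled by appending one image part per file.
        content.append({"type": "image_url",
                        "image_url": {"url": encode_image(path)}})
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; the skill names no model
            messages=[{"role": "user", "content": content}],
        )
    except Exception as exc:
        # Surface API failures (auth, rate limits, bad images) to the caller.
        sys.exit(f"vision request failed: {exc}")
    return response.choices[0].message.content


if __name__ == "__main__":
    # Usage: python vlm_expert_sketch.py "<prompt>" img1.png [img2.png ...]
    print(describe_images(sys.argv[1], sys.argv[2:]))
```

A skill description at this level, naming the expected parameters, the per-image message format, and the failure modes, is roughly what the review finds missing.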

LLM Signals

Description coverage: 3
Task knowledge: 2
Structure: 5
Novelty: 4

GitHub Signals

49 · 7 · 1 · 1
Last commit: 0 days ago

Publisher

majiayu000 (Skill Author)


Related Skills

mcp-developer (Jeffallan) · 6.4
prompt-engineer (Jeffallan) · 7.0
fine-tuning-expert (Jeffallan) · 6.4
rag-architect (Jeffallan) · 7.0