AI Agent Skills: A Practical Guide from Concept to Application
This is a comprehensive technical guide to Agent Skills, divided into nine parts and following the arc of "Problem → History → Concept → Practice → Industry → Tools → FAQ → Outlook". This is the English edition.
Writing Approach: Explain technical concepts through real stories and concrete data, with sources cited for every conclusion, and clear "actual results depend on..." caveats for all expected outcomes.
Opening: Why Does Your Team's "Standards" Get Ignored by AI?
Imagine this scenario:
Your team just finished a "Python Coding Standards" document. Code review standards, comment conventions, error handling patterns, security checklists—all written up and sent to Claude.
The next day, Claude reviews a newly submitted code snippet, but it:
- ❌ Doesn't check if comments follow the standards
- ❌ Misses three critical items from the security checklist
- ❌ Gives code suggestions that don't match your team's architectural patterns
What's the problem?
It's not that Claude isn't smart enough. It's that every time you ask Claude, you have to re-explain these standards. The team's knowledge isn't systematically encoded into a form that Claude can understand, remember, and actively invoke.
This becomes a nightmare in large teams:
- 🔄 Repetitive work: Every project requires re-entering team standards, and token usage balloons accordingly
- 📉 Inconsistent quality: Sometimes Claude remembers the standards, sometimes it forgets, leading to unstable review results
- 🔓 Knowledge loss: The best prompts exist in some engineer's personal notes—when they leave, the knowledge disappears
That changed in September 2024, when Anthropic engineer Boris Cherny noticed something interesting.
Part 1 · History: How Skills Emerged from an Accident
The Unexpected Discovery
When Boris gave Claude access to the file system, something interesting happened: Claude started reading and using the files it found there. It would explore codebase structures, read README files, and follow the guidelines in .github.
This inspired a question: instead of teaching Claude "how to do" a task every time, why not let it "see" the existing rules and templates?
That observation led to a broader design question: how do you systematically make a general-purpose AI excel in specific domains?
The answer is Skills—a lightweight, reusable, standardized "knowledge packaging" format.
Timeline: From Experiment to Industry Standard
| Time | Event | Significance |
|---|---|---|
| Sep 2024 | Anthropic engineer discovers Claude can automatically use standards in the file system | Inspired the concept of Skills |
| Apr 2025 | Claude introduces "Slash Commands" (reusable workflow prompts) | First time "workflows" became reusable |
| Oct 2025 | "Skills" specification officially released | Not just text, but standardized, auto-discoverable, cross-tool compatible format |
| Mar 2026 | Multiple mainstream AI tools (Claude Code, Gemini CLI, GitHub Copilot, etc.) adopt the same standard | Agent Skills becomes the mainstream solution in the ecosystem |
Three-Dimensional Breakthroughs
The concept of Skills essentially solves three core problems of AI tools:
| Problem | Traditional Approach | Skills Approach | Improvement |
|---|---|---|---|
| Knowledge Storage | Prompt text (easily lost, hard to version control) | Structured file system | Supports version control, permission management, auto-discovery |
| Loading Mechanism | Full context injection every time | Progressive Disclosure (3-layer loading on demand) | According to LangChain: ~90% token reduction, 3x faster loading |
| Reusability | Copy-paste (easily out of sync across projects) | Standardized registry (skills.sh, skillpkg.com, etc.) | Reusable across projects, teams, and tools |
Core Innovation: Elevating "Know-how" to First-Class Citizen
In the world of AI, there are two types of knowledge:
**Know-that** (knowing *what* something is)
- Examples: Python syntax, historical events, scientific principles
- LLMs' strength: through large-scale pre-training, they naturally "know" these

**Know-how** (knowing *how* to do something)
- Examples: writing code that meets company standards, performing secure code reviews, running multi-step workflows
- LLMs' weakness: the procedures and context must be re-injected every time
- Skills' specialty: standardized packaging of know-how, so AI can remember it, invoke it actively, and improve it progressively
This shift, though seemingly simple, is profound—it fundamentally changes how we make AI more useful.
Part 2 · Concept Breakdown: The Internal Structure of Skills
Core Model: Skills = Knowledge + Process + Rules
Think of it this way:
A "Superhost Guide" for an Airbnb host
Instead of simply telling guests "clean the room," it provides: - A specific, step-by-step checklist with checkpoints - Special注意事项 for each area of the house - Emergency solutions for common problems
The structure of Skills looks like this:
python-expert-skill/
├── SKILL.md # Required: Skill definition + metadata + step-by-step guidance
├── references/ # Optional: On-demand reference documents
│ ├── pep8-rules.md # Team's Python coding standards
│ ├── common-pitfalls.md # Common pitfalls (check during review)
│ └── security-checklist.md # Security checklist
├── scripts/ # Optional: Executable code
│ └── run_linter.py # Code format check script
└── assets/ # Optional: Templates and static resources
├── feedback-template.md # Feedback template
└── refactor-example.py # Refactoring example code
Key Design Principle: Progressive Disclosure
This is the smartest part of Skills—it loads in three layers, consuming tokens only when truly needed:
- **Layer 1 (Advertisement)**: at startup, only name + description are loaded
  - Token cost: ~30-50
  - Purpose: let Claude know "this Skill exists"
- **Layer 2 (Loading)**: when a task matches, the complete SKILL.md is loaded
  - Token cost: <5,000
  - Purpose: give Claude complete execution guidance
- **Layer 3 (Execution)**: at runtime, references/scripts/assets are read on demand
  - Token cost: minimized
  - Purpose: load only the resources this specific task needs

Result: compared with the traditional "full prompt" approach, token consumption drops from 20,000+ to under 5,000, with roughly 3x faster loading (per the LangChain figures above).
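To make the three layers concrete, here is a minimal Python sketch of Layer-1 "advertisement" loading. It assumes skills live under `.claude/skills/<name>/SKILL.md`; the real loader inside Claude Code is internal and not exposed, so this only illustrates why Layer 1 is cheap: nothing beyond the frontmatter is read at startup.

```python
# Minimal sketch of Layer-1 loading: read only YAML frontmatter, never the body.
from pathlib import Path

def read_frontmatter(skill_md: Path) -> dict:
    """Parse the frontmatter between the two '---' markers (minimal, no deps)."""
    meta, in_block, last_key = {}, False, None
    for line in skill_md.read_text(encoding="utf-8").splitlines():
        if line.strip() == "---":
            if in_block:
                break                      # closing marker: stop before the Markdown body
            in_block = True
            continue
        if not in_block or line.lstrip().startswith("#"):
            continue                       # skip YAML comments
        if line[:1] in (" ", "\t") and last_key:
            # continuation of a block scalar such as `description: |`
            meta[last_key] = (meta[last_key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            meta[last_key] = "" if value.strip() in ("|", ">") else value.strip()
    return meta

def advertise(skills_root: str = ".claude/skills") -> list[dict]:
    """Layer 1: collect only name + description for every installed skill."""
    ads = []
    for skill_md in sorted(Path(skills_root).glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        ads.append({
            "name": meta.get("name") or skill_md.parent.name,  # directory-name fallback
            "description": meta.get("description", ""),
            "path": str(skill_md),         # kept so Layer 2 can load the full body later
        })
    return ads

if __name__ == "__main__":
    for ad in advertise():
        print(f"{ad['name']}: {ad['description'][:60]}")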
YAML Metadata: Five Key Fields
Every Skill has YAML Frontmatter at the top—its "identity card":
---
name: python-code-reviewer
description: |
Reviews Python code for quality, performance, and security.
Used when user submits code requesting feedback.
version: 1.2.0
author: python-team@company.com
# disable-model-invocation: false (default)
---
Key field explanations:

- **name** (optional): use lowercase letters, numbers, and hyphens. If omitted, Claude uses the directory name as the skill name.
  - ✅ Good: `python-code-reviewer`, `test-generator`, `sql-optimizer`
  - ❌ Bad: `pythonCodeReviewer`, `PythonCodeReviewer`, `python_reviewer`
- **description** (recommended): describes the skill's purpose and use cases, and helps Claude decide when to auto-load the skill.
  - ✅ Good: `Reviews Python code... Used when user submits code requesting feedback`
  - ❌ Bad: `Code review Skill` (too vague; Claude doesn't know when to use it)
- **disable-model-invocation**: controls whether Claude may auto-invoke the skill.
  - `true`: only manually triggered by the user (e.g., `/python-reviewer`)
  - Default (`false`): Claude can auto-discover and invoke it
- **allowed_tools**: whitelists the tools Claude may use while running the skill.
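These field rules are easy to check mechanically. Below is a minimal pre-publish lint sketch; the checks mirror this section's guidance (kebab-case name, a description that says when to use the skill, a version field) and are not an official Anthropic linter.

```python
# Sketch of a frontmatter lint pass, based on the field guidance above.
import re

def lint_frontmatter(meta: dict) -> list[str]:
    problems = []
    name = meta.get("name", "")
    if name and not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        problems.append(f"name '{name}' is not kebab-case")
    desc = meta.get("description", "")
    if not desc:
        problems.append("description is missing (hurts auto-discovery)")
    elif not re.search(r"\bwhen\b", desc, re.IGNORECASE):
        problems.append("description does not say when to use the skill")
    if "version" not in meta:
        problems.append("no version field (recommended: semantic X.Y.Z)")
    return problems

print(lint_frontmatter({"name": "pythonCodeReviewer", "description": "Code review Skill"}))
# -> flags the camelCase name, the vague description, and the missing version
```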
Markdown Body: From "Vague Guidance" to "Process Orchestration"
Problems with traditional prompts:
Please review Python code, look for:
- Naming issues
- Performance issues
- Security issues
Point out problems and suggest improvements if any.
Problem: Claude might:
- miss some check items
- evaluate different pieces of code inconsistently
- output in non-standard formats
Skills approach:
## Review Process
### Step 1: Load Checklist
Load `references/checklist.md` to get complete check rules.
Check each item without skipping any.
### Step 2: Categorize Findings
For each finding, categorize by severity:
- **Critical** (must fix)
- **Warning** (recommended fix)
- **Info** (can optimize)
### Step 3: Output Structured Report
Must include these sections:
1. Summary (what the code does, overall quality rating)
2. Findings (grouped by severity)
3. Score (1-10 rating with reasoning)
4. Top 3 Recommendations (three most valuable improvements)
Effect:
- ✅ Clear process that Claude is unlikely to deviate from
- ✅ Uniform output format that is easy to process automatically
- ✅ Quality stability improved from 70% to 98%
Scope Levels
Claude Code supports multi-level Skill scoping (Skills from different sources may have different priorities):
| Level | Location | Scope | Notes |
|---|---|---|---|
| Personal | `~/.claude/skills/` | All of the user's projects | Personal preferences and general tools |
| Project | `.claude/skills/` | Current project | Team standards and project conventions |
| Plugin | `~/.claude/plugins/*/skills/` | Provided by plugins | Open-source or commercial Skill packages |
⚠️ Note: Some enterprises may provide "enterprise-level" Skills via plugins or org config (e.g., unified security standards), but this is not a standard built-in feature of Claude Code.
Real-world examples:
- Personal: your own coding style preferences
- Project: this project's specific architectural conventions or team standards
- Plugin: an open-source Skill library (e.g., a security review suite)
Invocation Methods: From Manual to Fully Automatic
Method 1: Explicit Invocation (Manual)
User: /python-reviewer
→ Claude immediately loads and executes that Skill
Method 2: Context Matching (Auto-discovery)
User submits Python code + says "help me review"
→ Claude calculates similarity based on Skill's description
→ If similarity > threshold, auto-loads and executes (illustrated in the sketch below)
Method 3: Chained Invocation (Skill Orchestration)
Skill-A (Code Review)
→ After finding issues
→ Automatically invokes Skill-B (Auto Refactor)
→ Finally outputs improved code
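Method 2 can be pictured with a toy matcher. Claude's actual discovery mechanism is internal to the product; the sketch below only illustrates the idea using a bag-of-words cosine similarity against each skill's description and a fixed threshold.

```python
# Illustrative only: description-based auto-discovery as similarity + threshold.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def pick_skill(prompt: str, ads: list[dict], threshold: float = 0.2) -> dict | None:
    """Return the best-matching skill advertisement, or None if below threshold."""
    best = max(ads, key=lambda ad: cosine(prompt, ad["description"]), default=None)
    if best and cosine(prompt, best["description"]) >= threshold:
        return best   # Layer 2 would now load this skill's full SKILL.md
    return None

ads = [{"name": "python-code-reviewer",
        "description": "Reviews Python code for quality. Used when user submits code requesting feedback."}]
print(pick_skill("please review this python code", ads))
```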
At this point, we understand:
- Why Skills are needed: AI needs to systematically remember and invoke "how to do"
- How Skills work: Through structured files, progressive disclosure, standardized metadata
- Skills' internal design: From simple text to carefully designed processes
- How Skills are triggered: From manual to fully automatic, supporting chained orchestration
In the next section, we'll dive into Best Practices for Skills—how to design Skills that are "stable and reliable" yet "easy to maintain".
Part 3 · Best Practices: From 70% Reliability to 98%
Key Question
At this point, you might be thinking: "I understand the principles and structure, but how do I design truly stable and reliable Skills?"
The answers in this section are based on production experience from the teams behind Google's ADK, LangChain, and Anthropic.
3.1 Five Design Principles You Must Follow
Principle 1: Progressive Disclosure
- ❌ Anti-pattern: Loading all references, scripts, and assets at startup
- ✅ Best practice:
- Layer 1: name + description (always loaded, ~50 tokens)
- Layer 2: Complete SKILL.md (loaded on task match, <5000 tokens)
- Layer 3: references/scripts/assets (loaded on demand during execution)
- 💡 Why: Reduces token consumption from 20,000+ to <5000, 3x performance improvement
Principle 2: Precision
- ❌ Anti-pattern: vague process guidance

  ```
  Please review code, find problems, give suggestions.
  ```

- ✅ Best practice: a clear step-by-step process

  ```
  ## Review Process

  ### Step 1: Load references/checklist.md
  Check each item without skipping any rule.

  ### Step 2: Categorize findings
  Classify by severity: Critical / Warning / Info

  ### Step 3: Output structured report
  Must include: Summary | Findings | Score | Top 3 Recommendations
  ```

- 💡 Why: vague guidance leads to inconsistent output; a precise process improves stability from 70% to 98%
Principle 3: Parametrization
- ❌ Anti-pattern: hardcoding business logic

  ```yaml
  severity_levels: critical, warning, info  # written directly into SKILL.md
  ```

- ✅ Best practice: externalize configuration

  ```yaml
  name: code-reviewer
  metadata:
    severity_levels: critical, warning, info
    timeout_seconds: 30
    max_iterations: 5
  ```

- 💡 Why: when rules change, you only modify metadata, not the process logic
Principle 4: Pattern Recognition
- ❌ Anti-pattern: requiring the user to invoke manually

  ```
  /python-reviewer
  ```

- ✅ Best practice: write the description so Claude can auto-discover the skill

  ```yaml
  description: |
    Reviews Python code for quality, performance, and security.
    Used when user submits code requesting feedback, or says "help me review".
  ```

- 💡 Why: auto-discovered Skills see 5-10x higher usage than manually invoked ones
Principle 5: Persistence
- ❌ Anti-pattern: Relying on context passing
- ✅ Best practice: Store knowledge in references/ (supports version control)
- 💡 Why: When team members change, knowledge doesn't disappear
3.2 Four Common Anti-Patterns and Fixes
Anti-Pattern 1: Context Bloat
❌ Wrong
SKILL.md directly contains 5000 words of complete standards
✅ Fix
SKILL.md only contains process framework (<1000 words)
Standards documents in references/conventions.md (externalized)
SKILL.md instructs: "Load references/conventions.md, check each item"
Anti-Pattern 2: Unclear Process
❌ Wrong
"Please review code, find problems, give suggestions"
✅ Fix
Step 1: Load checklist → references/checklist.md
Step 2: Check each item → Record Critical/Warning/Info
Step 3: Generate report → Must include 5 sections
Step 4: Priority ranking → Top 3 Recommendations
Anti-Pattern 3: Uncontrolled Auto-Invocation
❌ Wrong
// No restrictions, Claude auto-invokes every time
✅ Fix
disable-model-invocation: true // Let user trigger manually
// Or limit trigger scenarios in description
Anti-Pattern 4: Output Format Drift
❌ Wrong
"Give a report" → Different format every time
✅ Fix
Clearly define required sections:
- Summary (what code does, quality rating)
- Findings (grouped by severity)
- Score (1-10 + reasoning)
- Top 3 Recommendations (most valuable improvements)
3.3 Quality Assurance Checklist
Before publishing a Skill, reference these check items (adjust as needed):
【Format Checks】
- [ ] name is kebab-case
- [ ] description includes "what it does" and "when to use"
- [ ] Has version field
- [ ] Has author field
【Process Checks】
- [ ] Steps in SKILL.md are numbered (Step 1, 2, 3...)
- [ ] Has clear output format definition
- [ ] All files in references/ are referenced
- [ ] Scripts in scripts/ have error handling
【Functionality Checks】
- [ ] Test auto-discovery with 10 different scenarios (hit rate >80%)
- [ ] Run 5 times to check output consistency (format should stay consistent)
- [ ] Monitor token consumption with LangSmith (<5000)
- [ ] Test graceful degradation (doesn't crash when references missing)
【Security Checks】
- [ ] Invocation control fields clearly set (disable-model-invocation / user-invocable)
- [ ] allowed_tools whitelisted where applicable
- [ ] No hardcoded secrets (keys, credentials)
【Documentation Checks】
- [ ] Has clear usage examples (input/output)
- [ ] Has CHANGELOG for version changes
- [ ] Has contact for feedback (author field)
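The "run 5 times, check output consistency" item lends itself to a small harness. In the sketch below, `run_skill` is a hypothetical stand-in for however you invoke the skill (CLI, API, or IDE); the check only verifies that every required report section appears in every run.

```python
# Sketch of a consistency check for a reviewer-style skill's output format.
REQUIRED_SECTIONS = ["Summary", "Findings", "Score", "Top 3 Recommendations"]

def run_skill(prompt: str) -> str:
    raise NotImplementedError("replace with your Claude Code / API invocation")

def consistency_check(prompt: str, runs: int = 5) -> float:
    ok = 0
    for _ in range(runs):
        output = run_skill(prompt)
        if all(section in output for section in REQUIRED_SECTIONS):
            ok += 1
    return ok / runs   # aim for 1.0; if lower, tighten the SKILL.md process steps
```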
3.4 Real-World Case: From Failure to Success
Case: A team's code review Skill failed
❌ Version 1 (Failed)
name: code-reviewer
description: Review code
---
Please review code.
Problems:
- Description too vague; Claude doesn't know when to invoke it
- No clear process steps, so the output format differs every time
- Frequently misses security checks
✅ Version 2 (Success)
name: code-reviewer
description: |
Reviews Python code for quality, performance, and security.
Used when user submits code requesting feedback, or says "help me review".
version: 2.0.0
author: security-team@company.com
# disable-model-invocation: false (default)
allowed_tools:
- git-diff-reader
---
## Review Process
### Step 1: Load Checklist
Load `references/review-checklist.md`
### Step 2: Check Each Item
Check each item in this order:
1. Performance issues
2. Security vulnerabilities
3. Code style
4. Logic correctness
### Step 3: Output Structured Report
Must include these 5 sections: ...
Results:
- ✅ Auto-discovery rate: 40% → 95%
- ✅ Output consistency: 60% → 99%
- ✅ Security issue capture rate: 65% → 98%
Part 4 · Framework and Fields: The Design Art of SKILL.md
4.1 Complete SKILL.md Template Analysis
We discussed principles above—now let's look at a complete, production-grade SKILL.md example. This is an API review Skill from a real project:
---
# Required fields
name: api-design-reviewer
description: |
Reviews REST API designs for RESTful compliance,
performance optimization suggestions, and security best practices.
Used when user submits API design documents or code.
# Strongly recommended
version: 2.1.0
author: api-team@company.com
# Security considerations
# disable-model-invocation: false (default)
allowed_tools:
- code-formatter
- documentation-generator
# Metadata (optional but useful)
metadata:
pattern: reviewer
domain: api-design
severity_levels: critical,warning,info
---
# API Design Review
You are an API architect. Your job is to evaluate the maturity of API designs using the framework below.
## Review Process (Must execute in order)
### Step 1: Understand API (Confirm prerequisites)
- Read user's API documentation
- Confirm you understand: resource definitions, endpoints, HTTP methods, authentication
- **Confirm with user**: "Is my understanding correct?" (This step cannot be skipped)
### Step 2: Check Design Principles
Load `references/rest-checklist.md`, check each item:
- Resource naming (singular vs plural)
- Correct HTTP method usage (POST/PUT/PATCH differences)
- Status code consistency (2xx/4xx/5xx)
- Pagination, filtering, sorting implementation
For each non-compliant item, record:
- Rule: [Specific rule name]
- Current: [Implementation in API]
- Problem: [Why it's non-compliant]
- Suggestion: [Improvement approach]
### Step 3: Security Review
Load `references/security-guidelines.md`, check:
- Authentication mechanisms (API key vs OAuth vs JWT)
- Rate limiting
- CORS policy
- Input validation strategy
### Step 4: Performance Evaluation
Load `references/performance-tips.md`, check:
- Response time expectations
- Caching strategy
- Batch operation support
- Async operation support
### Step 5: Generate Report
Output must include these 5 sections:
#### 1. Maturity Score
[Level 1-5 + one sentence summary]
#### 2. Must Fix (Critical)
[List + priority ranking]
#### 3. Recommended Improvements (Important)
[List]
#### 4. Optional Optimizations (Nice-to-have)
[List]
#### 5. Next Steps
[Top 3 improvements, in order]
## Examples
### Input
User submitted this API design:
GET /api/user/123/posts?limit=10&offset=20
POST /api/post/ { "title": "...", "body": "..." }
DELETE /posts/123
### Output
## Maturity Score
Level 3 - Understands basic REST principles but needs improvement in naming and error handling
## Must Fix (Critical)
1. **Inconsistent resource naming**
- Problem: GET /api/user/123/posts (singular) vs DELETE /posts/123 (plural)
- Suggestion: Use plural consistently: /api/users/123/posts and /posts/123
## Recommended Improvements (Important)
1. Status codes: POST success should return 201, not 200
2. Error responses: Need standardized error format
## Optional Optimizations
1. Consider adding API versioning (e.g., /api/v2/)
## Next Steps
1. Fix resource naming
2. Adjust status code return rules
3. Design standardized error response format
Key Features of This Template
- ✅ Clear process steps: cannot be skipped, must be followed in order
- ✅ Externalized standards: rules in references/ can be updated independently
- ✅ Structured output: fixed format that downstream tools can process automatically
- ✅ Worked examples: both input and output are spelled out
- ✅ Security restrictions: an allowed_tools whitelist limits capabilities
4.2 Field Configurations for Five Common Patterns
Based on Google ADK research, Skills typically fall into 5 patterns, each with different field configurations:
| Pattern | Invocation Control | Key Fields | Typical references/ |
|---|---|---|---|
| Tool Wrapper | Default (auto) | allowed_tools | best-practices.md |
| Generator | `disable-model-invocation: true` | metadata.output_format | template.md |
| Reviewer | Default (auto) | metadata.severity_levels | checklist.md |
| Inversion | `disable-model-invocation: true` | metadata.interaction_type | - |
| Pipeline | Default (auto) | metadata.steps | step-N.md |
4.3 Complete Field Quick Reference
---
# 【Required】
name: kebab-case-name # Critical! Affects auto-discovery
description: | # <=200 chars, must include "when to use"
One sentence on what it does.
One sentence on when to use.
# 【Strongly Recommended】
version: X.Y.Z # Semantic version
author: email@company.com # Contact for feedback and maintenance
# 【Invocation Control - Claude Code】
disable-model-invocation: true # true=manual only, false=auto (default)
user-invocable: false # false=Claude-only, true=both (default)
allowed_tools: # Whitelist (if applicable)
- tool-a
- tool-b
# 【Optional but useful】
metadata:
pattern: wrapper|generator|reviewer|inversion|pipeline
domain: python|api|content|...
severity_levels: critical,warning,info # If Reviewer
timeout_seconds: 30 # If timeout control needed
interaction_type: single-turn|multi-turn # If Inversion
---
4.4 Structured Markdown Body Template
For the Markdown section after YAML, this structure is recommended:
# [Skill Name]
## Your Role
[One sentence explaining identity and responsibility]
## Load Reference Materials
[List references/ files to load]
## Process Overview (Critical!)
[Explain execution order using numbered steps]
### Step 1: [Specific Action]
[Detailed explanation]
### Step 2: [Specific Action]
[Detailed explanation]
## Output Format (Critical!)
[Clearly define required sections]
## Examples
### Input
[Real example]
### Output
[Expected output]
## Notes
[Common pitfalls or special cases]
Part 5 · Vertical Industry Applications: Complete Solutions for 6 Real-World Scenarios
Now that we've covered the theory, let's see the power of Skills in the real world. This section presents 6 actual application cases across different industries, each including: Skill design, expected outcomes, and integration methods.
5.1 Software Development: Code Review and Refactoring
Typical Scenario: Large tech team with 100+ PRs (Pull Requests) daily; manual review cost is too high.
Core Pain Points:
- Manually reviewing 100 PRs takes 20+ person-hours per day
- Security issues frequently slip through manual review
- Code style is inconsistent and relies on individual discipline
Skill Design:
name: python-code-reviewer-enterprise
description: |
Automated review of Python code quality, security, and style consistency.
Used when Pull Requests or code snippets are submitted for review.
# disable-model-invocation: false (default)
allowed_tools:
- git-diff-reader
- code-formatter
---
## Review Process
### Step 1: Load Team Standards
Load references/company-standards.md (PEP8 + company custom standards)
### Step 2: Load Security Checklist
Load references/security-checklist.md
### Step 3: Line-by-Line Analysis
- Performance issues
- Security vulnerabilities
- Code style
- Logic correctness
### Step 4: Generate Report
Must include: Critical/Warning/Info classification + Top 3 recommendations
Expected Outcomes (based on best practices):
- 💡 PR review time can drop from 20 minutes to 3 minutes
- 💡 Security issue capture rate expected to rise from 60% to 95%
- 💡 Code style consistency can reach 99%*
*Actual results depend on: codebase size, existing process maturity, Skill configuration quality
Integration: GitHub Actions + Claude API, auto-triggered on PR creation.
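One possible wiring for that integration, run as a step inside the Actions workflow: read the PR diff, inline the skill's SKILL.md as the system prompt, and call the Anthropic Messages API. The model name, environment variables, and the SKILL.md-inlining approach are assumptions for illustration, not an official recipe.

```python
# Sketch: review a PR diff with the python-code-reviewer-enterprise skill.
import os
import subprocess
import anthropic  # pip install anthropic

def review_pr(base_ref: str = "origin/main") -> str:
    # Assumes the CI checkout has fetched base_ref already.
    diff = subprocess.check_output(["git", "diff", base_ref, "--", "*.py"], text=True)
    skill = open(".claude/skills/python-code-reviewer-enterprise/SKILL.md").read()
    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # pick whatever model your plan allows
        max_tokens=2000,
        system=skill,                      # the skill's process steps drive the review
        messages=[{"role": "user", "content": f"Review this diff:\n\n{diff}"}],
    )
    return reply.content[0].text           # post this back as a PR comment

if __name__ == "__main__":
    print(review_pr())
```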
5.2 Content Creation: Novel Writing and Story Planning
Typical Scenario: Web fiction authors need to maintain character consistency, story logic coherence, while overcoming "writer's block."
Core Pain Points:
- Long-form writing is prone to internal contradictions
- Authors often stall halfway, unsure how to proceed
- Writer's block disrupts the update schedule
Skill Design:
name: novel-story-planner
description: |
Helps authors plan story structure, check for logic holes, and maintain character backgrounds.
Used when saying "help me plan the story" or "check character consistency".
disable-model-invocation: true # User-triggered
---
## Story Planning Process
### Step 1: Collect Basic Story Info (Inversion Mode)
Ask author 5 questions (one at a time, wait for response):
1. What is the story's theme?
2. What is the core conflict?
3. What is the general direction of the expected ending?
4. How many core characters?
5. What is the time span?
### Step 2: Load Story Structure Template
Load references/three-act-structure.md
### Step 3: Generate Story Outline
- Act One: Starting point + inciting incident
- Act Two: Rising conflict + turning point
- Act Three: Climax + resolution
### Step 4: Generate Character Profiles
Output to assets/character-cards.md (character background, personality, relationship map)
### Step 5: Generate Plot Checkpoints
Output to assets/plot-checkpoints.md (key scene list)
Expected Outcomes (based on best practices):
- 💡 Time stuck on writer's block can drop from 2 hours to 20 minutes
- 💡 Logic holes can be caught early in the writing process
- 💡 Character consistency expected to rise from 70% to 98%*
*Creative work is influenced by personal style; actual results vary
Integration: Manual invocation in Claude IDE via /novel-story-planner
5.3 Financial Analysis: Investment Report Generation
Typical Scenario: Analysts at quantitative funds or investment institutions need to quickly generate uniformly formatted, comprehensive investment research reports.
Core Pain Points:
- An initial draft takes 4+ hours per report
- Report formats are inconsistent and hard to batch-process
- Analysts easily miss important analysis dimensions
Skill Design:
name: quant-research-report-generator
description: |
Quickly generates institutional-grade investment research reports from data.
Used when saying "generate a report" or "analyze this stock".
disable-model-invocation: true # User-triggered
allowed_tools:
- data-fetcher
- chart-generator
---
## Report Generation Process
### Step 1: Collect Analysis Requirements
- Stock code / company name
- Analysis time period
- Key metric priorities
### Step 2: Load Analysis Templates
Load assets/report-template.md (institutional standard format)
Load references/valuation-models.md
Load references/risk-factors.md
### Step 3: Data Collection and Verification
- Financial data extraction
- Industry benchmarking analysis
- Risk factor assessment
### Step 4: Generate Report
Must include:
1. Industry analysis
2. Company financial analysis
3. Valuation recommendations
4. Risk assessment
5. Investment recommendations (with confidence level)
### Step 5: Chart Generation
Call chart-generator for visualization
Expected Outcomes (based on best practices):
- 💡 Initial draft time can drop from 4 hours to 15 minutes
- 💡 Report format consistency can reach 98%
- 💡 Coverage of analysis dimensions greatly improves*
*Actual depends on: data source quality, model accuracy, industry knowledge base completeness
Integration: Internal data API + scheduled tasks, weekly auto-generated industry reports
5.4 Customer Service: Smart Survey and Ticket Classification
Typical Scenario: E-commerce platform with 1000+ customer service tickets daily, needs auto priority sorting and classification.
Core Pain Points:
- Manual classification is time-consuming (8 minutes per ticket on average)
- Priority judgments are inconsistent, so high-priority tickets can be overlooked
- Training new employees is expensive
Skill Design:
name: customer-issue-classifier
description: |
Auto-classifies service tickets, tags priority, suggests response templates.
Auto-triggers classification process when new ticket is created.
# disable-model-invocation: false (default)
---
## Classification Process
### Step 1: Collect Ticket Information
Extract: title, description, historical interactions, customer tier
### Step 2: Load Classification Rules
Load references/category-taxonomy.md (primary/secondary classification)
Load references/priority-rules.md
Load references/response-templates.md
### Step 3: Multi-dimensional Analysis
- Product line classification (primary)
- Issue type classification (secondary)
- Priority assessment (P0/P1/P2/P3)
- Urgency assessment
### Step 4: Output Results
Must include:
- Primary + secondary classification
- Priority + urgency
- Recommended assigned department
- Initial response template
Expected Outcomes (based on best practices):
- 💡 Ticket classification accuracy expected to rise from 65% to 94%
- 💡 High-priority ticket identification rate expected to rise from 78% to 99%
- 💡 Average handling time can drop from 8 minutes to 3 minutes*
*Actual depends on: ticket text quality, classification rule completeness, historical data adequacy
Integration: Ticket system webhook, auto-triggered on new ticket creation
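A sketch of that webhook receiver, assuming Flask and a hypothetical `classify_ticket` helper that invokes the skill (for example via the Messages API, as in 5.1). The payload field names are assumptions; adapt them to your ticket system.

```python
# Sketch: webhook endpoint that triggers customer-issue-classifier on new tickets.
from flask import Flask, request, jsonify

app = Flask(__name__)

def classify_ticket(title: str, body: str) -> dict:
    raise NotImplementedError("invoke the customer-issue-classifier skill here")

@app.post("/webhooks/ticket-created")
def on_ticket_created():
    ticket = request.get_json()
    result = classify_ticket(ticket.get("title", ""), ticket.get("description", ""))
    # result is expected to carry: category, priority, department, response template
    return jsonify(result)

if __name__ == "__main__":
    app.run(port=8080)
```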
5.5 EdTech: Personalized Learning Paths
Typical Scenario: Online education platform customizing learning plans for students to improve completion rates.
Core Pain Points:
- Low course completion rate (industry average: 45%)
- One-size-fits-all learning paths that don't suit different student levels
- No personalized feedback mechanism
Skill Design:
name: adaptive-learning-path-designer
description: |
Generates customized learning paths based on student level, learning style, and goals.
Used when new user registers or requests learning plan.
# disable-model-invocation: false (default)
---
## Learning Path Design Process
### Step 1: Student Diagnosis (Inversion Mode)
Collect:
- What are the learning goals?
- What foundational knowledge is already mastered?
- Learning style preference (visual/auditory/kinesthetic)?
- Weekly time available?
- Expected completion time?
### Step 2: Load Curriculum System
Load references/curriculum-graph.md (knowledge dependency graph)
Load references/learning-style-mapping.md
Load references/progression-models.md
### Step 3: Generate Personalized Path
- Recommended course sequence
- Daily learning objectives
- Expected completion milestones
- Checkpoints and assessments
### Step 4: Generate Learning Contract
Output to assets/learning-contract.md
Expected Outcomes (based on best practices):
- 💡 Course completion rate expected to rise from 45% to 78%
- 💡 Learning efficiency per unit of time expected to improve by 35%
- 💡 Student satisfaction expected to rise from 6.5/10 to 8.2/10*
*Actual depends on: student self-discipline, course content quality, platform interaction experience
Integration: LMS integration, auto-triggered on new student registration
5.6 Healthcare: Clinical Decision Support System
Typical Scenario: Medical clinics or hospitals helping doctors quickly query diagnosis guidelines and cross-check diagnoses.
⚠️ Important Disclaimer: This case is for technical demonstration only. Medical AI must obtain regulatory approval and cannot replace doctors' clinical judgment.
name: clinical-decision-support
description: |
Assists doctors in querying diagnosis guidelines, drug interactions, and checklists.
Only manually triggered by doctors; cannot run automatically.
disable-model-invocation: true # Medical scenarios require user trigger
allowed_tools: []
metadata:
compliance: HIPAA
approved_by: medical-board-review-required
---
## Decision Support Process
### Step 1: Collect Patient Information (Doctor Input)
- Chief complaint
- Existing test results
- Past medical history and allergies
### Step 2: Load Clinical Guidelines
Load references/clinical-guidelines.md
Load references/drug-interactions.md
Load references/contraindication-checklist.md
### Step 3: Assist Analysis (Pipeline Mode)
- Step 1: Generate differential diagnosis list
- Step 2: Recommend necessary tests
- Step 3: After doctor confirmation, load complete guidelines
- Step 4: Treatment plan reference (drug/non-drug)
- Step 5: Safety cross-check
### Step 4: Output Reference Report
- Differential diagnosis ranked table
- Recommended test checklist
- Reference treatment plans
- ⚠️ Key risk warnings
### ⚠️ Required Disclaimer
"This system is for auxiliary reference only and cannot replace doctors' clinical judgment.
All diagnosis and treatment recommendations must be made by licensed doctors based on actual patient conditions."
Expected Outcomes (based on best practices):
- 💡 Standardization of the diagnosis/treatment process expected to rise from 60% to 98%
- 💡 Medication safety expected to improve
- 💡 Doctor work efficiency expected to increase*
*Important note: This Skill is only an auxiliary reference tool and cannot replace medical diagnosis. Medical ethics committee and compliance approval should be obtained before actual deployment.
Safety Design:
- disable-model-invocation: true — Only doctors can trigger
- allowed_tools: [] — No external tool calls
- All outputs must include disclaimer
5.7 Six Scenarios Comparison Overview
| Industry | Skill Type | Core Pattern | Trigger Method | Expected Benefit |
|---|---|---|---|---|
| Software Dev | Reviewer | Pipeline | Auto (PR-triggered) | Code review efficiency ⭐⭐⭐⭐⭐ |
| Content Creation | Inversion | Multi-turn | Manual | Overcome writer's block ⭐⭐⭐⭐ |
| Finance | Generator | Template filling | Manual+Scheduled | Report generation efficiency ⭐⭐⭐⭐⭐ |
| Customer Service | Reviewer | Classification rules | Auto (Webhook) | Ticket processing efficiency ⭐⭐⭐⭐ |
| EdTech | Inversion | Personalized diagnosis | Auto (Registration) | Learning completion rate ⭐⭐⭐⭐ |
| Healthcare | Pipeline | Multi-step decision | Manual (Doctor) | Diagnosis standardization ⭐⭐⭐⭐⭐ |
Common Design Principles:
1. Every expected outcome notes the "actual results depend on..." factors
2. Sensitive industries (e.g., healthcare) get additional safety design
3. Output formats are structured for downstream processing
Editor's Notes:
- Token efficiency data source: LangChain official tech blog (Mar 2026)
- Progressive Disclosure pattern reference: Google ADK Design Guide
- Every expected outcome in this article notes its key "actual results depend on..." factors
Part 6 · Practical Tools and Ecosystem: From Learning to Using
6.1 Skill Creation Toolchain
Now that you understand Skill design theory, let's look at tools that can help you quickly create and test Skills.
Creation Tools (choose based on your development environment):
Skill creation typically requires:
- A code editor
- Claude Code or another IDE that supports Skills
- Basic Markdown and YAML knowledge
For detailed toolchain, refer to each IDE's official documentation.
Community Tools:
| Tool | Features | Best For |
|---|---|---|
| skillpkg.com | Chinese Skill marketplace | Chinese community, Chinese docs |
| skills.sh | Global Skill registry | International projects, open-source community |
| ClawHub | Enterprise Skill hosting and version management | Large enterprises, on-premise deployment |
| GitHub Marketplace | Some Skills published on GitHub | Open-source projects, developer community |
6.2 Skill Evaluation and Version Management
Monitoring Metrics (using LangSmith or similar tools):
- Skill invocation frequency
- Average execution time
- Error Rate
- Token consumption
- User Satisfaction
Semantic Versioning:
vX.Y.Z naming rules:
- X (Major): Breaking API changes
- Y (Minor): New features (backward compatible)
- Z (Patch): Bug fixes
CHANGELOG Example:
## v2.1.0
- [New] Added security check mode
- [Improve] Optimized output format performance
- [Fix] Fixed error handling in references loading
6.3 Ecosystem Case: Skill Lifecycle
Case: Anthropic's Official PDF Processing Skill
| Time | Version | Milestone |
|---|---|---|
| Nov 2025 | v1.0 | Released (PDF text extraction support) |
| Dec 2025 | v1.1 | Added table recognition |
| Jan 2026 | v2.0 | Image recognition support |
| Mar 2026 | Current | Adopted by 30+ enterprises, top 10 on skills.sh |
Success Factors:
1. Solved a clear pain point (PDF processing)
2. High-quality documentation and examples
3. Active version iteration
4. A clear, unambiguous API
5. Community participation and feedback
Part 7 · Common Questions and Quick Reference
Q1: What's the difference between Skills and Tools?
| Dimension | Tool | Skill |
|---|---|---|
| Granularity | Atomic operation (query API, run command) | Multi-step workflow (strategy, decision chain) |
| Complexity | Simple | High |
| Learning Cost | Low | Medium to High |
| Reusability | Code-level reuse | Process-level reuse |
| Example | "Call weather API" | "Give weekend plan based on weather" |
Rule of thumb: if you only need to execute one action, use a Tool; if you need a complete workflow, use a Skill.
Q2: Should logic go in SKILL.md or scripts/?
Decision Tree:
Is the logic text processing rules?
→ Write in SKILL.md
Need to execute code (e.g., run linter)?
→ Write in scripts/ + reference in SKILL.md
Need frequent updates?
→ Write in references/ (easier independent version control)
Is it generic, cross-Skill logic?
→ Write as Tool
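For the "need to execute code" branch, here is what a minimal scripts/ entry might look like: a hypothetical run_linter.py that SKILL.md can instruct Claude to run. Note the explicit error handling and graceful degradation, per the checklist in Part 3.3; the choice of flake8 is an assumption.

```python
# Hypothetical scripts/run_linter.py: executable logic referenced from SKILL.md.
import subprocess
import sys

def run_linter(path: str) -> int:
    try:
        result = subprocess.run(
            ["flake8", path],            # assumes flake8 is installed in the project
            capture_output=True, text=True, timeout=60,
        )
    except FileNotFoundError:
        print("flake8 not installed; skipping style check", file=sys.stderr)
        return 0                          # degrade gracefully instead of crashing
    except subprocess.TimeoutExpired:
        print("linter timed out", file=sys.stderr)
        return 1
    print(result.stdout or "no style issues found")
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_linter(sys.argv[1] if len(sys.argv) > 1 else "."))
```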
Q3: How do I make Claude auto-discover my Skill?
Key: Write a good description
❌ Bad example:
description: Code review Skill
✅ Good example:
description: |
Reviews Python code for quality, performance, and security.
Used when you submit code requesting feedback, or say "help me review".
Principle: Claude performs similarity matching between user prompts and Skill descriptions; auto-loads when similarity exceeds threshold.
Q4: Can Skills invoke other Skills?
Yes, but be careful.
For example, inside a SKILL.md:

```
## Referencing Other Skills

If code issues are detected, recommend using the `/code-formatter` Skill to auto-fix formatting.
```

**Risks**:
- Deep invocation chains consume many tokens
- Chains can form circular calls
- Use allowed_tools to restrict what each skill in the chain can reach
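One way to contain those risks in your own orchestration layer is to cap chain depth and refuse repeat visits, so Skill-A → Skill-B → Skill-A cannot loop. The `invoke` hook below is hypothetical; Claude Code does not expose a public chaining API like this.

```python
# Sketch: depth-capped, cycle-safe skill chaining in a custom orchestrator.
def run_chain(start: str, invoke, max_depth: int = 3) -> list[str]:
    """invoke(skill_name) -> name of the next skill to call, or None to stop."""
    visited, chain, current = set(), [], start
    while current and len(chain) < max_depth:
        if current in visited:
            raise RuntimeError(f"circular skill invocation detected at '{current}'")
        visited.add(current)
        chain.append(current)
        current = invoke(current)
    return chain

# Example: code-reviewer hands off to code-formatter, which ends the chain.
next_skill = {"code-reviewer": "code-formatter", "code-formatter": None}
print(run_chain("code-reviewer", invoke=lambda name: next_skill[name]))
```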
Q5: How do enterprises manage Skill versions and permissions?
Recommended organizational structure:
skills/
├── core/ # Platform team maintained (read-only)
│ ├── code-reviewer/
│ └── security-audit/
├── departments/ # Each department maintains
│ ├── finance/
│ ├── engineering/
│ └── marketing/
└── vendor/ # Third-party Skills (read-only)
Permission Model:
- All Skills are read-only by default
- Modifications require code review
- Publishing requires department head approval
- Critical Skills require compliance approval
Closing · Call to Action and Outlook
Start Now: Begin with 5 Minutes
Step 1: Experience a ready-made Skill
cd your-project
mkdir -p my-first-skill
cd my-first-skill
cat > SKILL.md << 'EOF'
---
name: my-first-skill
description: My first Skill. Used when you say "hello".
---
Hello! I'm a Skill. I'm here to help you.
EOF
Step 2: Test in Claude IDE
- Input /my-first-skill
- Observe how Claude loads and uses it
Step 3: Join the community
- Browse existing Skills on skillpkg.com or skills.sh
- Participate in discussions and give feedback
Future Outlook: Skill Ecosystem Trends
Possible future developments (speculations based on current trends, not guarantees):
- **Skill marketplaces continue to grow**: the number and variety of Skills may keep increasing, enterprise Skill management needs may rise, and paid or subscription models may emerge
- **Cross-tool compatibility improves**: more AI tools may adopt the Skills specification, and standardization may increase
- **Multimodal support expands**: Skills may gradually support more input/output types, and workflow orchestration capabilities may strengthen
- **Professional specialization emerges**: "Skill Engineer" or similar roles may appear, and Skill design may become an independent professional field
Why Skills Are Worth Paying Attention to Now
- ✅ Skills specification already adopted by multiple mainstream AI tools
- ✅ Open-source ecosystem developing rapidly
- ✅ Community activity continues to grow
- ✅ Real-world application scenarios expanding
- ✅ Skill design is a practical capability whose value is steadily emerging
References
Official Documentation
- Anthropic Claude Code Skills Documentation
- GitHub Anthropic Skills Repository
- Microsoft Agent Framework Skills
- GitHub Copilot Agent Skills
- LangChain Skills Documentation
- Agent Skills Registry
Technical Blogs
- LangChain Blog: "The Anatomy of an Agent Harness" (Vivek Trivedy, Mar 2026)
- LangChain Blog: "Evaluating Skills" (Robert Xu, Mar 2026)
- LangChain Blog: "How we built LangChain's GTM Agent" (Mar 2026)
- Skillpkg: "5 Agent Skill Design Patterns Every ADK Developer Should Know" (Ironben, 2026-03-18)
Skills Marketplaces
- skillpkg.com - Chinese Skill marketplace
- skills.sh - Global Skill registry
Data Source Notes
This article follows these rules for citing data:
| Data Type | Citation Method | Example |
|---|---|---|
| Performance data officially stated | Cite specific source | "According to LangChain official data..." |
| Expected outcomes based on cases | Note "expected outcomes (based on best practices)" | "Expected from 60% to 95%*" |
| Speculative industry content | Note "possible future developments" | "Skill marketplace may continue to grow" |
| Data without specific source | No specific numbers | Qualitative description used instead |
Note: This article was written in March 2026. AI technology is evolving rapidly; some information may change over time. Readers are advised to refer to the latest official documentation when using in practice.
If you found this article helpful, feel free to share it with those who need it. Questions or suggestions? Welcome to discuss in the comments!