AI Agent Skills: A Practical Guide from Concept to Application
This is a comprehensive technical guide to Agent Skills, divided into nine parts and following the arc of "Problem → History → Concept → Practice → Industry → Tools → FAQ → Outlook". This is the English edition.
Writing Approach: Explain technical concepts through real stories and concrete data, with sources cited for every conclusion, and clear "actual results depend on..." caveats for all expected outcomes.
Opening: Why Does Your Team's "Standards" Get Ignored by AI?
Imagine this scenario:
Your team just finished a "Python Coding Standards" document. Code review standards, comment conventions, error handling patterns, security checklists—all written up and sent to Claude.
The next day, Claude reviews a newly submitted code snippet, but it:
- ❌ Doesn't check if comments follow the standards
- ❌ Misses three critical items from the security checklist
- ❌ Gives code suggestions that don't match your team's architectural patterns
What's the problem?
It's not that Claude isn't smart enough. It's that every time you ask Claude, you have to re-explain these standards. The team's knowledge isn't systematically encoded into a form that Claude can understand, remember, and actively invoke.
This becomes a nightmare in large teams:
- 🔄 Repetitive work: Every project requires re-entering team standards, and token usage balloons accordingly
- 📉 Inconsistent quality: Sometimes Claude remembers the standards, sometimes it forgets, leading to unstable review results
- 🔓 Knowledge loss: The best prompts exist in some engineer's personal notes—when they leave, the knowledge disappears
That changed in September 2024, when Anthropic engineer Boris Cherny noticed something interesting.
Part 1 · History: How Skills Emerged from an Accident
The Unexpected Discovery
When Boris gave Claude access to the file system, something interesting happened: Claude started reading and using the files it found there. It would explore codebase structures, read README files, and follow the guidelines in .github.
This inspired a question: instead of teaching Claude "how to do" a task every time, why not let it "see" the existing rules and templates?
That observation led to a broader design question: how do you systematically make a general-purpose AI excel in specific domains?
The answer is Skills—a lightweight, reusable, standardized "knowledge packaging" format.
Timeline: From Experiment to Industry Standard
| Time | Event | Significance |
|---|---|---|
| Sep 2024 | Anthropic engineer discovers Claude can automatically use standards in the file system | Inspired the concept of Skills |
| Apr 2025 | Claude introduces "Slash Commands" (reusable workflow prompts) | First time "workflows" became reusable |
| Oct 2025 | "Skills" specification officially released | Not just text, but standardized, auto-discoverable, cross-tool compatible format |
| Mar 2026 | Multiple mainstream AI tools (Claude Code, Gemini CLI, GitHub Copilot, etc.) adopt the same standard | Agent Skills becomes the mainstream solution in the ecosystem |
Three-Dimensional Breakthroughs
The concept of Skills essentially solves three core problems of AI tools:
| Problem | Traditional Approach | Skills Approach | Improvement |
|---|---|---|---|
| Knowledge Storage | Prompt text (easily lost, hard to version control) | Structured file system | Supports version control, permission management, auto-discovery |
| Loading Mechanism | Full context injection every time | Progressive Disclosure (3-layer loading on demand) | According to LangChain: ~90% token reduction, 3x faster loading |
| Reusability | Copy-paste (easily out of sync across projects) | Standardized registry (skills.sh, skillpkg.com, etc.) | Reusable across projects, teams, and tools |
Core Innovation: Elevating "Know-how" to First-Class Citizen
In the world of AI, there are two types of knowledge:
**Know-that** (knowing *what* something is)
- Examples: Python syntax, historical events, scientific principles
- LLMs' strength: through large-scale pre-training, they naturally "know" these

**Know-how** (knowing *how* to do something)
- Examples: writing code that meets company standards, performing secure code reviews, running multi-step workflows
- LLMs' weakness: the procedures and context must be re-injected every time
- Skills' specialty: standardized packaging of know-how, so AI can remember it, invoke it actively, and improve it progressively
This shift, though seemingly simple, is profound—it fundamentally changes how we make AI more useful.
Part 2 · Concept Breakdown: The Internal Structure of Skills
Core Model: Skills = Knowledge + Process + Rules
Think of it this way:
A "Superhost Guide" for an Airbnb host
Instead of simply telling guests "clean the room," it provides: - A specific, step-by-step checklist with checkpoints - Special注意事项 for each area of the house - Emergency solutions for common problems
The structure of Skills looks like this:
python-expert-skill/
├── SKILL.md # Required: Skill definition + metadata + step-by-step guidance
├── references/ # Optional: On-demand reference documents
│ ├── pep8-rules.md # Team's Python coding standards
│ ├── common-pitfalls.md # Common pitfalls (check during review)
│ └── security-checklist.md # Security checklist
├── scripts/ # Optional: Executable code
│ └── run_linter.py # Code format check script
└── assets/ # Optional: Templates and static resources
├── feedback-template.md # Feedback template
└── refactor-example.py # Refactoring example code
Key Design Principle: Progressive Disclosure
This is the smartest part of Skills—it loads in three layers, consuming tokens only when truly needed:
- **Layer 1 (Advertisement)**: at startup, only name + description are loaded
  - Token cost: ~30-50
  - Purpose: let Claude know "this Skill exists"
- **Layer 2 (Loading)**: when a task matches, the complete SKILL.md is loaded
  - Token cost: <5,000
  - Purpose: give Claude complete execution guidance
- **Layer 3 (Execution)**: at runtime, references/scripts/assets are read on demand
  - Token cost: minimized
  - Purpose: load only the resources this specific task needs

Result: compared with the traditional "full prompt" approach, token consumption drops from 20,000+ to under 5,000, with roughly 3x faster loading (per the LangChain figures above).
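To make the three layers concrete, here is a minimal Python sketch of Layer-1 "advertisement" loading. It assumes skills live under `.claude/skills/<name>/SKILL.md`; the real loader inside Claude Code is internal and not exposed, so this only illustrates why Layer 1 is cheap: nothing beyond the frontmatter is read at startup.

```python
# Minimal sketch of Layer-1 loading: read only YAML frontmatter, never the body.
from pathlib import Path

def read_frontmatter(skill_md: Path) -> dict:
    """Parse the frontmatter between the two '---' markers (minimal, no deps)."""
    meta, in_block, last_key = {}, False, None
    for line in skill_md.read_text(encoding="utf-8").splitlines():
        if line.strip() == "---":
            if in_block:
                break                      # closing marker: stop before the Markdown body
            in_block = True
            continue
        if not in_block or line.lstrip().startswith("#"):
            continue                       # skip YAML comments
        if line[:1] in (" ", "\t") and last_key:
            # continuation of a block scalar such as `description: |`
            meta[last_key] = (meta[last_key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            meta[last_key] = "" if value.strip() in ("|", ">") else value.strip()
    return meta

def advertise(skills_root: str = ".claude/skills") -> list[dict]:
    """Layer 1: collect only name + description for every installed skill."""
    ads = []
    for skill_md in sorted(Path(skills_root).glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        ads.append({
            "name": meta.get("name") or skill_md.parent.name,  # directory-name fallback
            "description": meta.get("description", ""),
            "path": str(skill_md),         # kept so Layer 2 can load the full body later
        })
    return ads

if __name__ == "__main__":
    for ad in advertise():
        print(f"{ad['name']}: {ad['description'][:60]}")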
YAML Metadata: Five Key Fields
Every Skill has YAML Frontmatter at the top—its "identity card":
---
name: python-code-reviewer
description: |
Reviews Python code for quality, performance, and security.
Used when user submits code requesting feedback.
version: 1.2.0
author: python-team@company.com
# disable-model-invocation: false (default)
---
Key field explanations:

- **name** (optional): use lowercase letters, numbers, and hyphens. If omitted, Claude uses the directory name as the skill name.
  - ✅ Good: `python-code-reviewer`, `test-generator`, `sql-optimizer`
  - ❌ Bad: `pythonCodeReviewer`, `PythonCodeReviewer`, `python_reviewer`
- **description** (recommended): describes the skill's purpose and use cases, and helps Claude decide when to auto-load the skill.
  - ✅ Good: `Reviews Python code... Used when user submits code requesting feedback`
  - ❌ Bad: `Code review Skill` (too vague; Claude doesn't know when to use it)
- **disable-model-invocation**: controls whether Claude may auto-invoke the skill.
  - `true`: only manually triggered by the user (e.g., `/python-reviewer`)
  - Default (`false`): Claude can auto-discover and invoke it
- **allowed_tools**: whitelists the tools Claude may use while running the skill.
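These field rules are easy to check mechanically. Below is a minimal pre-publish lint sketch; the checks mirror this section's guidance (kebab-case name, a description that says when to use the skill, a version field) and are not an official Anthropic linter.

```python
# Sketch of a frontmatter lint pass, based on the field guidance above.
import re

def lint_frontmatter(meta: dict) -> list[str]:
    problems = []
    name = meta.get("name", "")
    if name and not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        problems.append(f"name '{name}' is not kebab-case")
    desc = meta.get("description", "")
    if not desc:
        problems.append("description is missing (hurts auto-discovery)")
    elif not re.search(r"\bwhen\b", desc, re.IGNORECASE):
        problems.append("description does not say when to use the skill")
    if "version" not in meta:
        problems.append("no version field (recommended: semantic X.Y.Z)")
    return problems

print(lint_frontmatter({"name": "pythonCodeReviewer", "description": "Code review Skill"}))
# -> flags the camelCase name, the vague description, and the missing version
```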
Markdown Body: From "Vague Guidance" to "Process Orchestration"
Problems with traditional prompts:
Please review Python code, look for:
- Naming issues
- Performance issues
- Security issues
Point out problems and suggest improvements if any.
Problem: Claude might:
- miss some check items
- evaluate different pieces of code inconsistently
- output in non-standard formats
Skills approach:
## Review Process
### Step 1: Load Checklist
Load `references/checklist.md` to get complete check rules.
Check each item without skipping any.
### Step 2: Categorize Findings
For each finding, categorize by severity:
- **Critical** (must fix)
- **Warning** (recommended fix)
- **Info** (can optimize)
### Step 3: Output Structured Report
Must include these sections:
1. Summary (what the code does, overall quality rating)
2. Findings (grouped by severity)
3. Score (1-10 rating with reasoning)
4. Top 3 Recommendations (three most valuable improvements)
Effect:
- ✅ Clear process that Claude is unlikely to deviate from
- ✅ Uniform output format that is easy to process automatically
- ✅ Quality stability improved from 70% to 98%
Scope Levels
Claude Code supports multi-level Skill scoping (Skills from different sources may have different priorities):
| Level | Location | Scope | Notes |
|---|---|---|---|
| Personal | `~/.claude/skills/` | All of the user's projects | Personal preferences and general tools |
| Project | `.claude/skills/` | Current project | Team standards and project conventions |
| Plugin | `~/.claude/plugins/*/skills/` | Provided by plugins | Open-source or commercial Skill packages |
⚠️ Note: Some enterprises may provide "enterprise-level" Skills via plugins or org config (e.g., unified security standards), but this is not a standard built-in feature of Claude Code.
Real-world examples:
- Personal: your own coding style preferences
- Project: this project's specific architectural conventions or team standards
- Plugin: an open-source Skill library (e.g., a security review suite)
Invocation Methods: From Manual to Fully Automatic
Method 1: Explicit Invocation (Manual)
User: /python-reviewer
→ Claude immediately loads and executes that Skill
Method 2: Context Matching (Auto-discovery)
User submits Python code + says "help me review"
→ Claude calculates similarity based on Skill's description
→ If similarity > threshold, auto-loads and executes (illustrated in the sketch below)
Method 3: Chained Invocation (Skill Orchestration)
Skill-A (Code Review)
→ After finding issues
→ Automatically invokes Skill-B (Auto Refactor)
→ Finally outputs improved code
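Method 2 can be pictured with a toy matcher. Claude's actual discovery mechanism is internal to the product; the sketch below only illustrates the idea using a bag-of-words cosine similarity against each skill's description and a fixed threshold.

```python
# Illustrative only: description-based auto-discovery as similarity + threshold.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def pick_skill(prompt: str, ads: list[dict], threshold: float = 0.2) -> dict | None:
    """Return the best-matching skill advertisement, or None if below threshold."""
    best = max(ads, key=lambda ad: cosine(prompt, ad["description"]), default=None)
    if best and cosine(prompt, best["description"]) >= threshold:
        return best   # Layer 2 would now load this skill's full SKILL.md
    return None

ads = [{"name": "python-code-reviewer",
        "description": "Reviews Python code for quality. Used when user submits code requesting feedback."}]
print(pick_skill("please review this python code", ads))
```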
At this point, we understand:
- Why Skills are needed: AI needs to systematically remember and invoke "how to do"
- How Skills work: Through structured files, progressive disclosure, standardized metadata
- Skills' internal design: From simple text to carefully designed processes
- How Skills are triggered: From manual to fully automatic, supporting chained orchestration
In the next section, we'll dive into Best Practices for Skills—how to design Skills that are "stable and reliable" yet "easy to maintain".
Part 3 · Best Practices: From 70% Reliability to 98%
Key Question
At this point, you might be thinking: "I understand the principles and structure, but how do I design truly stable and reliable Skills?"
The answers in this section are based on production experience from the teams behind Google's ADK, LangChain, and Anthropic.
3.1 Five Design Principles You Must Follow
Principle 1: Progressive Disclosure
- ❌ Anti-pattern: Loading all references, scripts, and assets at startup
- ✅ Best practice:
- Layer 1: name + description (always loaded, ~50 tokens)
- Layer 2: Complete SKILL.md (loaded on task match, <5000 tokens)
- Layer 3: references/scripts/assets (loaded on demand during execution)
- 💡 Why: Reduces token consumption from 20,000+ to <5000, 3x performance improvement
Principle 2: Precision
- ❌ Anti-pattern: vague process guidance

  ```
  Please review code, find problems, give suggestions.
  ```

- ✅ Best practice: a clear step-by-step process

  ```
  ## Review Process

  ### Step 1: Load references/checklist.md
  Check each item without skipping any rule.

  ### Step 2: Categorize findings
  Classify by severity: Critical / Warning / Info

  ### Step 3: Output structured report
  Must include: Summary | Findings | Score | Top 3 Recommendations
  ```

- 💡 Why: vague guidance leads to inconsistent output; a precise process improves stability from 70% to 98%
Principle 3: Parametrization
- ❌ Anti-pattern: hardcoding business logic

  ```yaml
  severity_levels: critical, warning, info  # written directly into SKILL.md
  ```

- ✅ Best practice: externalize configuration

  ```yaml
  name: code-reviewer
  metadata:
    severity_levels: critical, warning, info
    timeout_seconds: 30
    max_iterations: 5
  ```

- 💡 Why: when rules change, you only modify metadata, not the process logic
Principle 4: Pattern Recognition
- ❌ Anti-pattern: requiring the user to invoke manually

  ```
  /python-reviewer
  ```

- ✅ Best practice: write the description so Claude can auto-discover the skill

  ```yaml
  description: |
    Reviews Python code for quality, performance, and security.
    Used when user submits code requesting feedback, or says "help me review".
  ```

- 💡 Why: auto-discovered Skills see 5-10x higher usage than manually invoked ones
Principle 5: Persistence
- ❌ Anti-pattern: Relying on context passing
- ✅ Best practice: Store knowledge in references/ (supports version control)
- 💡 Why: When team members change, knowledge doesn't disappear
3.2 Four Common Anti-Patterns and Fixes
Anti-Pattern 1: Context Bloat
❌ Wrong
SKILL.md directly contains 5000 words of complete standards
✅ Fix
SKILL.md only contains process framework (<1000 words)
Standards documents in references/conventions.md (externalized)
SKILL.md instructs: "Load references/conventions.md, check each item"
Anti-Pattern 2: Unclear Process
❌ Wrong
"Please review code, find problems, give suggestions"
✅ Fix
Step 1: Load checklist → references/checklist.md
Step 2: Check each item → Record Critical/Warning/Info
Step 3: Generate report → Must include 5 sections
Step 4: Priority ranking → Top 3 Recommendations
Anti-Pattern 3: Uncontrolled Auto-Invocation
❌ Wrong
// No restrictions, Claude auto-invokes every time
✅ Fix
disable-model-invocation: true // Let user trigger manually
// Or limit trigger scenarios in description
Anti-Pattern 4: Output Format Drift
❌ Wrong
"Give a report" → Different format every time
✅ Fix
Clearly define required sections:
- Summary (what code does, quality rating)
- Findings (grouped by severity)
- Score (1-10 + reasoning)
- Top 3 Recommendations (most valuable improvements)
3.3 Quality Assurance Checklist
Before publishing a Skill, reference these check items (adjust as needed):
【Format Checks】
- [ ] name is kebab-case
- [ ] description includes "what it does" and "when to use"
- [ ] Has version field
- [ ] Has author field
【Process Checks】
- [ ] Steps in SKILL.md are numbered (Step 1, 2, 3...)
- [ ] Has clear output format definition
- [ ] All files in references/ are referenced
- [ ] Scripts in scripts/ have error handling
【Functionality Checks】
- [ ] Test auto-discovery with 10 different scenarios (hit rate >80%)
- [ ] Run 5 times to check output consistency (format should stay consistent)
- [ ] Monitor token consumption with LangSmith (<5000)
- [ ] Test graceful degradation (doesn't crash when references missing)
【Security Checks】
- [ ] Invocation control fields clearly set (disable-model-invocation / user-invocable)
- [ ] allowed_tools whitelisted where applicable
- [ ] No hardcoded secrets (keys, credentials)
【Documentation Checks】
- [ ] Has clear usage examples (input/output)
- [ ] Has CHANGELOG for version changes
- [ ] Has contact for feedback (author field)
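The "run 5 times, check output consistency" item lends itself to a small harness. In the sketch below, `run_skill` is a hypothetical stand-in for however you invoke the skill (CLI, API, or IDE); the check only verifies that every required report section appears in every run.

```python
# Sketch of a consistency check for a reviewer-style skill's output format.
REQUIRED_SECTIONS = ["Summary", "Findings", "Score", "Top 3 Recommendations"]

def run_skill(prompt: str) -> str:
    raise NotImplementedError("replace with your Claude Code / API invocation")

def consistency_check(prompt: str, runs: int = 5) -> float:
    ok = 0
    for _ in range(runs):
        output = run_skill(prompt)
        if all(section in output for section in REQUIRED_SECTIONS):
            ok += 1
    return ok / runs   # aim for 1.0; if lower, tighten the SKILL.md process steps
```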
3.4 Real-World Case: From Failure to Success
Case: A team's code review Skill failed
❌ Version 1 (Failed)
name: code-reviewer
description: Review code
---
Please review code.
Problems:
- Description too vague; Claude doesn't know when to invoke it
- No clear process steps, so the output format differs every time
- Frequently misses security checks
✅ Version 2 (Success)
name: code-reviewer
description: |
Reviews Python code for quality, performance, and security.
Used when user submits code requesting feedback, or says "help me review".
version: 2.0.0
author: security-team@company.com
# disable-model-invocation: false (default)
allowed_tools:
- git-diff-reader
---
## Review Process
### Step 1: Load Checklist
Load `references/review-checklist.md`
### Step 2: Check Each Item
Check each item in this order:
1. Performance issues
2. Security vulnerabilities
3. Code style
4. Logic correctness
### Step 3: Output Structured Report
Must include these 5 sections: ...
Results:
- ✅ Auto-discovery rate: 40% → 95%
- ✅ Output consistency: 60% → 99%
- ✅ Security issue capture rate: 65% → 98%
Part 4 · Framework and Fields: The Design Art of SKILL.md
4.1 Complete SKILL.md Template Analysis
We discussed principles above—now let's look at a complete, production-grade SKILL.md example. This is an API review Skill from a real project:
---
# Required fields
name: api-design-reviewer
description: |
Reviews REST API designs for RESTful compliance,
performance optimization suggestions, and security best practices.
Used when user submits API design documents or code.
# Strongly recommended
version: 2.1.0
author: api-team@company.com
# Security considerations
# disable-model-invocation: false (default)
allowed_tools:
- code-formatter
- documentation-generator
# Metadata (optional but useful)
metadata:
pattern: reviewer
domain: api-design
severity_levels: critical,warning,info
---
# API Design Review
You are an API architect. Your job is to evaluate the maturity of API designs using the framework below.
## Review Process (Must execute in order)
### Step 1: Understand API (Confirm prerequisites)
- Read user's API documentation
- Confirm you understand: resource definitions, endpoints, HTTP methods, authentication
- **Confirm with user**: "Is my understanding correct?" (This step cannot be skipped)
### Step 2: Check Design Principles
Load `references/rest-checklist.md`, check each item:
- Resource naming (singular vs plural)
- Correct HTTP method usage (POST/PUT/PATCH differences)
- Status code consistency (2xx/4xx/5xx)
- Pagination, filtering, sorting implementation
For each non-compliant item, record:
- Rule: [Specific rule name]
- Current: [Implementation in API]
- Problem: [Why it's non-compliant]
- Suggestion: [Improvement approach]
### Step 3: Security Review
Load `references/security-guidelines.md`, check:
- Authentication mechanisms (API key vs OAuth vs JWT)
- Rate limiting
- CORS policy
- Input validation strategy
### Step 4: Performance Evaluation
Load `references/performance-tips.md`, check:
- Response time expectations
- Caching strategy
- Batch operation support
- Async operation support
### Step 5: Generate Report
Output must include these 5 sections:
#### 1. Maturity Score
[Level 1-5 + one sentence summary]
#### 2. Must Fix (Critical)
[List + priority ranking]
#### 3. Recommended Improvements (Important)
[List]
#### 4. Optional Optimizations (Nice-to-have)
[List]
#### 5. Next Steps
[Top 3 improvements, in order]
## Examples
### Input
User submitted this API design:
GET /api/user/123/posts?limit=10&offset=20
POST /api/post/ { "title": "...", "body": "..." }
DELETE /posts/123
### Output
## Maturity Score
Level 3 - Understands basic REST principles but needs improvement in naming and error handling
## Must Fix (Critical)
1. **Inconsistent resource naming**
- Problem: GET /api/user/123/posts (singular) vs DELETE /posts/123 (plural)
- Suggestion: Use plural consistently: /api/users/123/posts and /posts/123
## Recommended Improvements (Important)
1. Status codes: POST success should return 201, not 200
2. Error responses: Need standardized error format
## Optional Optimizations
1. Consider adding API versioning (e.g., /api/v2/)
## Next Steps
1. Fix resource naming
2. Adjust status code return rules
3. Design standardized error response format
Key Features of This Template
- ✅ Clear process steps: cannot be skipped, must be followed in order
- ✅ Externalized standards: rules in references/ can be updated independently
- ✅ Structured output: fixed format that downstream tools can process automatically
- ✅ Worked examples: both input and output are spelled out
- ✅ Security restrictions: an allowed_tools whitelist limits capabilities
4.2 Field Configurations for Five Common Patterns
Based on Google ADK research, Skills typically fall into 5 patterns, each with different field configurations:
| Pattern | Invocation Control | Key Fields | Typical references/ |
|---|---|---|---|
| Tool Wrapper | Default (auto) | allowed_tools | best-practices.md |
| Generator | `disable-model-invocation: true` | metadata.output_format | template.md |
| Reviewer | Default (auto) | metadata.severity_levels | checklist.md |
| Inversion | `disable-model-invocation: true` | metadata.interaction_type | - |
| Pipeline | Default (auto) | metadata.steps | step-N.md |
4.3 Complete Field Quick Reference
---
# 【Required】
name: kebab-case-name # Critical! Affects auto-discovery
description: | # <=200 chars, must include "when to use"
One sentence on what it does.
One sentence on when to use.
# 【Strongly Recommended】
version: X.Y.Z # Semantic version
author: email@company.com # Contact for feedback and maintenance
# 【Invocation Control - Claude Code】
disable-model-invocation: true # true=manual only, false=auto (default)
user-invocable: false # false=Claude-only, true=both (default)
allowed_tools: # Whitelist (if applicable)
- tool-a
- tool-b
# 【Optional but useful】
metadata:
pattern: wrapper|generator|reviewer|inversion|pipeline
domain: python|api|content|...
severity_levels: critical,warning,info # If Reviewer
timeout_seconds: 30 # If timeout control needed
interaction_type: single-turn|multi-turn # If Inversion
---
4.4 Structured Markdown Body Template
For the Markdown section after YAML, this structure is recommended:
# [Skill Name]
## Your Role
[One sentence explaining identity and responsibility]
## Load Reference Materials
[List references/ files to load]
## Process Overview (Critical!)
[Explain execution order using numbered steps]
### Step 1: [Specific Action]
[Detailed explanation]
### Step 2: [Specific Action]
[Detailed explanation]
## Output Format (Critical!)
[Clearly define required sections]
## Examples
### Input
[Real example]
### Output
[Expected output]
## Notes
[Common pitfalls or special cases]
Part 5 · Vertical Industry Applications: Complete Solutions for 6 Real-World Scenarios
Now that we've covered the theory, let's see the power of Skills in the real world. This section presents 6 actual application cases across different industries, each including: Skill design, expected outcomes, and integration methods.
5.1 Software Development: Code Review and Refactoring
Typical Scenario: Large tech team with 100+ PRs (Pull Requests) daily; manual review cost is too high.
Core Pain Points:
- Manually reviewing 100 PRs takes 20+ person-hours per day
- Security issues frequently slip through manual review
- Code style is inconsistent and relies on individual discipline
Skill Design:
name: python-code-reviewer-enterprise
description: |
Automated review of Python code quality, security, and style consistency.
Used when Pull Requests or code snippets are submitted for review.
# disable-model-invocation: false (default)
allowed_tools:
- git-diff-reader
- code-formatter
---
## Review Process
### Step 1: Load Team Standards
Load references/company-standards.md (PEP8 + company custom standards)
### Step 2: Load Security Checklist
Load references/security-checklist.md
### Step 3: Line-by-Line Analysis
- Performance issues
- Security vulnerabilities
- Code style
- Logic correctness
### Step 4: Generate Report
Must include: Critical/Warning/Info classification + Top 3 recommendations
Expected Outcomes (based on best practices):
- 💡 PR review time can drop from 20 minutes to 3 minutes
- 💡 Security issue capture rate expected to rise from 60% to 95%
- 💡 Code style consistency can reach 99%*
*Actual results depend on: codebase size, existing process maturity, Skill configuration quality
Integration: GitHub Actions + Claude API, auto-triggered on PR creation.
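One possible wiring for that integration, run as a step inside the Actions workflow: read the PR diff, inline the skill's SKILL.md as the system prompt, and call the Anthropic Messages API. The model name, environment variables, and the SKILL.md-inlining approach are assumptions for illustration, not an official recipe.

```python
# Sketch: review a PR diff with the python-code-reviewer-enterprise skill.
import os
import subprocess
import anthropic  # pip install anthropic

def review_pr(base_ref: str = "origin/main") -> str:
    # Assumes the CI checkout has fetched base_ref already.
    diff = subprocess.check_output(["git", "diff", base_ref, "--", "*.py"], text=True)
    skill = open(".claude/skills/python-code-reviewer-enterprise/SKILL.md").read()
    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # pick whatever model your plan allows
        max_tokens=2000,
        system=skill,                      # the skill's process steps drive the review
        messages=[{"role": "user", "content": f"Review this diff:\n\n{diff}"}],
    )
    return reply.content[0].text           # post this back as a PR comment

if __name__ == "__main__":
    print(review_pr())
```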
5.2 Content Creation: Novel Writing and Story Planning
Typical Scenario: Web fiction authors need to maintain character consistency, story logic coherence, while overcoming "writer's block."
Core Pain Points:
- Long-form writing is prone to internal contradictions
- Authors often stall halfway, unsure how to proceed
- Writer's block disrupts the update schedule
Skill Design:
name: novel-story-planner
description: |
Helps authors plan story structure, check for logic holes, and maintain character backgrounds.
Used when saying "help me plan the story" or "check character consistency".
disable-model-invocation: true # User-triggered
---
## Story Planning Process
### Step 1: Collect Basic Story Info (Inversion Mode)
Ask author 5 questions (one at a time, wait for response):
1. What is the story's theme?
2. What is the core conflict?
3. What is the general direction of the expected ending?
4. How many core characters?
5. What is the time span?
### Step 2: Load Story Structure Template
Load references/three-act-structure.md
### Step 3: Generate Story Outline
- Act One: Starting point + inciting incident
- Act Two: Rising conflict + turning point
- Act Three: Climax + resolution
### Step 4: Generate Character Profiles
Output to assets/character-cards.md (character background, personality, relationship map)
### Step 5: Generate Plot Checkpoints
Output to assets/plot-checkpoints.md (key scene list)
Expected Outcomes (based on best practices):
- 💡 Time stuck on writer's block can drop from 2 hours to 20 minutes
- 💡 Logic holes can be caught early in the writing process
- 💡 Character consistency expected to rise from 70% to 98%*
*Creative work is influenced by personal style; actual results vary
Integration: Manual invocation in Claude IDE via /novel-story-planner
5.3 Financial Analysis: Investment Report Generation
Typical Scenario: Analysts at quantitative funds or investment institutions need to quickly generate uniformly formatted, comprehensive investment research reports.
Core Pain Points:
- An initial draft takes 4+ hours per report
- Report formats are inconsistent and hard to batch-process
- Analysts easily miss important analysis dimensions
Skill Design:
name: quant-research-report-generator
description: |
Quickly generates institutional-grade investment research reports from data.
Used when saying "generate a report" or "analyze this stock".
disable-model-invocation: true # User-triggered
allowed_tools:
- data-fetcher
- chart-generator
---
## Report Generation Process
### Step 1: Collect Analysis Requirements
- Stock code / company name
- Analysis time period
- Key metric priorities
### Step 2: Load Analysis Templates
Load assets/report-template.md (institutional standard format)
Load references/valuation-models.md
Load references/risk-factors.md
### Step 3: Data Collection and Verification
- Financial data extraction
- Industry benchmarking analysis
- Risk factor assessment
### Step 4: Generate Report
Must include:
1. Industry analysis
2. Company financial analysis
3. Valuation recommendations
4. Risk assessment
5. Investment recommendations (with confidence level)
### Step 5: Chart Generation
Call chart-generator for visualization
Expected Outcomes (based on best practices):
- 💡 Initial draft time can drop from 4 hours to 15 minutes
- 💡 Report format consistency can reach 98%
- 💡 Coverage of analysis dimensions greatly improves*
*Actual depends on: data source quality, model accuracy, industry knowledge base completeness
Integration: Internal data API + scheduled tasks, weekly auto-generated industry reports
5.4 Customer Service: Smart Survey and Ticket Classification
Typical Scenario: E-commerce platform with 1000+ customer service tickets daily, needs auto priority sorting and classification.
Core Pain Points:
- Manual classification is time-consuming (8 minutes per ticket on average)
- Priority judgments are inconsistent, so high-priority tickets can be overlooked
- Training new employees is expensive
Skill Design:
name: customer-issue-classifier
description: |
Auto-classifies service tickets, tags priority, suggests response templates.
Auto-triggers classification process when new ticket is created.
# disable-model-invocation: false (default)
---
## Classification Process
### Step 1: Collect Ticket Information
Extract: title, description, historical interactions, customer tier
### Step 2: Load Classification Rules
Load references/category-taxonomy.md (primary/secondary classification)
Load references/priority-rules.md
Load references/response-templates.md
### Step 3: Multi-dimensional Analysis
- Product line classification (primary)
- Issue type classification (secondary)
- Priority assessment (P0/P1/P2/P3)
- Urgency assessment
### Step 4: Output Results
Must include:
- Primary + secondary classification
- Priority + urgency
- Recommended assigned department
- Initial response template
Expected Outcomes (based on best practices):
- 💡 Ticket classification accuracy expected to rise from 65% to 94%
- 💡 High-priority ticket identification rate expected to rise from 78% to 99%
- 💡 Average handling time can drop from 8 minutes to 3 minutes*
*Actual depends on: ticket text quality, classification rule completeness, historical data adequacy
Integration: Ticket system webhook, auto-triggered on new ticket creation
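A sketch of that webhook receiver, assuming Flask and a hypothetical `classify_ticket` helper that invokes the skill (for example via the Messages API, as in 5.1). The payload field names are assumptions; adapt them to your ticket system.

```python
# Sketch: webhook endpoint that triggers customer-issue-classifier on new tickets.
from flask import Flask, request, jsonify

app = Flask(__name__)

def classify_ticket(title: str, body: str) -> dict:
    raise NotImplementedError("invoke the customer-issue-classifier skill here")

@app.post("/webhooks/ticket-created")
def on_ticket_created():
    ticket = request.get_json()
    result = classify_ticket(ticket.get("title", ""), ticket.get("description", ""))
    # result is expected to carry: category, priority, department, response template
    return jsonify(result)

if __name__ == "__main__":
    app.run(port=8080)
```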
5.5 EdTech: Personalized Learning Paths
Typical Scenario: Online education platform customizing learning plans for students to improve completion rates.
Core Pain Points:
- Low course completion rate (industry average: 45%)
- One-size-fits-all learning paths that don't suit different student levels
- No personalized feedback mechanism
Skill Design:
name: adaptive-learning-path-designer
description: |
Generates customized learning paths based on student level, learning style, and goals.
Used when new user registers or requests learning plan.
# disable-model-invocation: false (default)
---
## Learning Path Design Process
### Step 1: Student Diagnosis (Inversion Mode)
Collect:
- What are the learning goals?
- What foundational knowledge is already mastered?
- Learning style preference (visual/auditory/kinesthetic)?
- Weekly time available?
- Expected completion time?
### Step 2: Load Curriculum System
Load references/curriculum-graph.md (knowledge dependency graph)
Load references/learning-style-mapping.md
Load references/progression-models.md
### Step 3: Generate Personalized Path
- Recommended course sequence
- Daily learning objectives
- Expected completion milestones
- Checkpoints and assessments
### Step 4: Generate Learning Contract
Output to assets/learning-contract.md
Expected Outcomes (based on best practices):
- 💡 Course completion rate expected to rise from 45% to 78%
- 💡 Learning efficiency per unit of time expected to improve by 35%
- 💡 Student satisfaction expected to rise from 6.5/10 to 8.2/10*
*Actual depends on: student self-discipline, course content quality, platform interaction experience
Integration: LMS integration, auto-triggered on new student registration
5.6 Healthcare: Clinical Decision Support System
Typical Scenario: Medical clinics or hospitals helping doctors quickly query diagnosis guidelines and cross-check diagnoses.
⚠️ Important Disclaimer: This case is for technical demonstration only. Medical AI must obtain regulatory approval and cannot replace doctors' clinical judgment.
name: clinical-decision-support
description: |
Assists doctors in querying diagnosis guidelines, drug interactions, and checklists.
Only manually triggered by doctors; cannot run automatically.
disable-model-invocation: true # Medical scenarios require user trigger
allowed_tools: []
metadata:
compliance: HIPAA
approved_by: medical-board-review-required
---
## Decision Support Process
### Step 1: Collect Patient Information (Doctor Input)
- Chief complaint
- Existing test results
- Past medical history and allergies
### Step 2: Load Clinical Guidelines
Load references/clinical-guidelines.md
Load references/drug-interactions.md
Load references/contraindication-checklist.md
### Step 3: Assist Analysis (Pipeline Mode)
- Step 1: Generate differential diagnosis list
- Step 2: Recommend necessary tests
- Step 3: After doctor confirmation, load complete guidelines
- Step 4: Treatment plan reference (drug/non-drug)
- Step 5: Safety cross-check
### Step 4: Output Reference Report
- Differential diagnosis ranked table
- Recommended test checklist
- Reference treatment plans
- ⚠️ Key risk warnings
### ⚠️ Required Disclaimer
"This system is for auxiliary reference only and cannot replace doctors' clinical judgment.
All diagnosis and treatment recommendations must be made by licensed doctors based on actual patient conditions."
Expected Outcomes (based on best practices):
- 💡 Standardization of the diagnosis/treatment process expected to rise from 60% to 98%
- 💡 Medication safety expected to improve
- 💡 Doctor work efficiency expected to increase*
*Important note: This Skill is only an auxiliary reference tool and cannot replace medical diagnosis. Medical ethics committee and compliance approval should be obtained before actual deployment.
Safety Design:
- disable-model-invocation: true — Only doctors can trigger
- allowed_tools: [] — No external tool calls
- All outputs must include disclaimer
5.7 Six Scenarios Comparison Overview
| Industry | Skill Type | Core Pattern | Trigger Method | Expected Benefit |
|---|---|---|---|---|
| Software Dev | Reviewer | Pipeline | Auto (PR-triggered) | Code review efficiency ⭐⭐⭐⭐⭐ |
| Content Creation | Inversion | Multi-turn | Manual | Overcome writer's block ⭐⭐⭐⭐ |
| Finance | Generator | Template filling | Manual+Scheduled | Report generation efficiency ⭐⭐⭐⭐⭐ |
| Customer Service | Reviewer | Classification rules | Auto (Webhook) | Ticket processing efficiency ⭐⭐⭐⭐ |
| EdTech | Inversion | Personalized diagnosis | Auto (Registration) | Learning completion rate ⭐⭐⭐⭐ |
| Healthcare | Pipeline | Multi-step decision | Manual (Doctor) | Diagnosis standardization ⭐⭐⭐⭐⭐ |
Common Design Principles:
1. Every expected outcome notes the "actual results depend on..." factors
2. Sensitive industries (e.g., healthcare) get additional safety design
3. Output formats are structured for downstream processing
Editor's Notes:
- Token efficiency data source: LangChain official tech blog (Mar 2026)
- Progressive Disclosure pattern reference: Google ADK Design Guide
- Every expected outcome in this article notes its key "actual results depend on..." factors
Part 6 · Practical Tools and Ecosystem: From Learning to Using
6.1 Skill Creation Toolchain
Now that you understand Skill design theory, let's look at tools that can help you quickly create and test Skills.
Creation Tools (choose based on your development environment):
Skill creation typically requires:
- A code editor
- Claude Code or another IDE that supports Skills
- Basic Markdown and YAML knowledge
For detailed toolchain, refer to each IDE's official documentation.
Community Tools:
| Tool | Features | Best For |
|---|---|---|
| skillpkg.com | Chinese Skill marketplace | Chinese community, Chinese docs |
| skills.sh | Global Skill registry | International projects, open-source community |
| ClawHub | Enterprise Skill hosting and version management | Large enterprises, on-premise deployment |
| GitHub Marketplace | Some Skills published on GitHub | Open-source projects, developer community |
6.2 Skill Evaluation and Version Management
Monitoring Metrics (using LangSmith or similar tools):
- Skill invocation frequency
- Average execution time
- Error Rate
- Token consumption
- User Satisfaction
Semantic Versioning:
vX.Y.Z naming rules:
- X (Major): Breaking API changes
- Y (Minor): New features (backward compatible)
- Z (Patch): Bug fixes
CHANGELOG Example:
## v2.1.0
- [New] Added security check mode
- [Improve] Optimized output format performance
- [Fix] Fixed error handling in references loading
6.3 Ecosystem Case: Skill Lifecycle
Case: Anthropic's Official PDF Processing Skill
| Time | Version | Milestone |
|---|---|---|
| Nov 2025 | v1.0 | Released (PDF text extraction support) |
| Dec 2025 | v1.1 | Added table recognition |
| Jan 2026 | v2.0 | Image recognition support |
| Mar 2026 | Current | Adopted by 30+ enterprises, top 10 on skills.sh |
Success Factors:
1. Solved a clear pain point (PDF processing)
2. High-quality documentation and examples
3. Active version iteration
4. A clear, unambiguous API
5. Community participation and feedback
Part 7 · Common Questions and Quick Reference
Q1: What's the difference between Skills and Tools?
| Dimension | Tool | Skill |
|---|---|---|
| Granularity | Atomic operation (query API, run command) | Multi-step workflow (strategy, decision chain) |
| Complexity | Simple | High |
| Learning Cost | Low | Medium to High |
| Reusability | Code-level reuse | Process-level reuse |
| Example | "Call weather API" | "Give weekend plan based on weather" |
Rule of thumb: if you only need to execute one action, use a Tool; if you need a complete workflow, use a Skill.
Q2: Should logic go in SKILL.md or scripts/?
Decision Tree:
Is the logic text processing rules?
→ Write in SKILL.md
Need to execute code (e.g., run linter)?
→ Write in scripts/ + reference in SKILL.md
Need frequent updates?
→ Write in references/ (easier independent version control)
Is it generic, cross-Skill logic?
→ Write as Tool
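For the "need to execute code" branch, here is what a minimal scripts/ entry might look like: a hypothetical run_linter.py that SKILL.md can instruct Claude to run. Note the explicit error handling and graceful degradation, per the checklist in Part 3.3; the choice of flake8 is an assumption.

```python
# Hypothetical scripts/run_linter.py: executable logic referenced from SKILL.md.
import subprocess
import sys

def run_linter(path: str) -> int:
    try:
        result = subprocess.run(
            ["flake8", path],            # assumes flake8 is installed in the project
            capture_output=True, text=True, timeout=60,
        )
    except FileNotFoundError:
        print("flake8 not installed; skipping style check", file=sys.stderr)
        return 0                          # degrade gracefully instead of crashing
    except subprocess.TimeoutExpired:
        print("linter timed out", file=sys.stderr)
        return 1
    print(result.stdout or "no style issues found")
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_linter(sys.argv[1] if len(sys.argv) > 1 else "."))
```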
Q3: How do I make Claude auto-discover my Skill?
Key: Write a good description
❌ Bad example:
description: Code review Skill
✅ Good example:
description: |
Reviews Python code for quality, performance, and security.
Used when you submit code requesting feedback, or say "help me review".
Principle: Claude performs similarity matching between user prompts and Skill descriptions; auto-loads when similarity exceeds threshold.
Q4: Can Skills invoke other Skills?
Yes, but be careful.
For example, inside a SKILL.md:

```
## Referencing Other Skills

If code issues are detected, recommend using the `/code-formatter` Skill to auto-fix formatting.
```

**Risks**:
- Deep invocation chains consume many tokens
- Chains can form circular calls
- Use allowed_tools to restrict what each skill in the chain can reach
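One way to contain those risks in your own orchestration layer is to cap chain depth and refuse repeat visits, so Skill-A → Skill-B → Skill-A cannot loop. The `invoke` hook below is hypothetical; Claude Code does not expose a public chaining API like this.

```python
# Sketch: depth-capped, cycle-safe skill chaining in a custom orchestrator.
def run_chain(start: str, invoke, max_depth: int = 3) -> list[str]:
    """invoke(skill_name) -> name of the next skill to call, or None to stop."""
    visited, chain, current = set(), [], start
    while current and len(chain) < max_depth:
        if current in visited:
            raise RuntimeError(f"circular skill invocation detected at '{current}'")
        visited.add(current)
        chain.append(current)
        current = invoke(current)
    return chain

# Example: code-reviewer hands off to code-formatter, which ends the chain.
next_skill = {"code-reviewer": "code-formatter", "code-formatter": None}
print(run_chain("code-reviewer", invoke=lambda name: next_skill[name]))
```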
Q5: How do enterprises manage Skill versions and permissions?
Recommended organizational structure:
skills/
├── core/ # Platform team maintained (read-only)
│ ├── code-reviewer/
│ └── security-audit/
├── departments/ # Each department maintains
│ ├── finance/
│ ├── engineering/
│ └── marketing/
└── vendor/ # Third-party Skills (read-only)
Permission Model:
- All Skills are read-only by default
- Modifications require code review
- Publishing requires department head approval
- Critical Skills require compliance approval
Closing · Call to Action and Outlook
Start Now: Begin with 5 Minutes
Step 1: Experience a ready-made Skill
cd your-project
mkdir -p my-first-skill
cd my-first-skill
cat > SKILL.md << 'EOF'
---
name: my-first-skill
description: My first Skill. Used when you say "hello".
---
Hello! I'm a Skill. I'm here to help you.
EOF
Step 2: Test in Claude IDE
- Input /my-first-skill
- Observe how Claude loads and uses it
Step 3: Join the community
- Browse existing Skills on skillpkg.com or skills.sh
- Participate in discussions and give feedback
Future Outlook: Skill Ecosystem Trends
Possible future developments (speculations based on current trends, not guarantees):
- **Skill marketplaces continue to grow**: the number and variety of Skills may keep increasing, enterprise Skill management needs may rise, and paid or subscription models may emerge
- **Cross-tool compatibility improves**: more AI tools may adopt the Skills specification, and standardization may increase
- **Multimodal support expands**: Skills may gradually support more input/output types, and workflow orchestration capabilities may strengthen
- **Professional specialization emerges**: "Skill Engineer" or similar roles may appear, and Skill design may become an independent professional field
Why Skills Are Worth Paying Attention to Now
- ✅ Skills specification already adopted by multiple mainstream AI tools
- ✅ Open-source ecosystem developing rapidly
- ✅ Community activity continues to grow
- ✅ Real-world application scenarios expanding
- ✅ Skill design is a practical capability whose value is steadily emerging
References
Official Documentation
- Anthropic Claude Code Skills Documentation
- GitHub Anthropic Skills Repository
- Microsoft Agent Framework Skills
- GitHub Copilot Agent Skills
- LangChain Skills Documentation
- Agent Skills Registry
Technical Blogs
- LangChain Blog: "The Anatomy of an Agent Harness" (Vivek Trivedy, Mar 2026)
- LangChain Blog: "Evaluating Skills" (Robert Xu, Mar 2026)
- LangChain Blog: "How we built LangChain's GTM Agent" (Mar 2026)
- Skillpkg: "5 Agent Skill Design Patterns Every ADK Developer Should Know" (Ironben, 2026-03-18)
Skills Marketplaces
- skillpkg.com - Chinese Skill marketplace
- skills.sh - Global Skill registry
Data Source Notes
This article follows these rules for citing data:
| Data Type | Citation Method | Example |
|---|---|---|
| Performance data officially stated | Cite specific source | "According to LangChain official data..." |
| Expected outcomes based on cases | Note "expected outcomes (based on best practices)" | "Expected from 60% to 95%*" |
| Speculative industry content | Note "possible future developments" | "Skill marketplace may continue to grow" |
| Data without specific source | No specific numbers | Qualitative description used instead |
Note: This article was written in March 2026. AI technology is evolving rapidly; some information may change over time. Readers are advised to refer to the latest official documentation when using in practice.
If you found this article helpful, feel free to share it with those who need it. Questions or suggestions? Welcome to discuss in the comments!