prompt-engineering
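💡 Summary
A comprehensive guide and toolkit of advanced prompt engineering patterns and best practices for improving large language model interactions and agent performance.
🎯 Target audience
🤖 AI roast: "This is a well-organized prompt engineering textbook that forgot to include the 'engineering' part. Where are the actual build tools?"
The README describes patterns for generating and executing code (for example, the Python samples in the templates). This creates a risk of indirect code injection if the skill's output is executed without sandboxing. Mitigation: treat all LLM code generated by this skill as untrusted, and execute it only in an isolated, ephemeral environment.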
name: prompt-engineering
description: Use this skill when writing commands, hooks, or skills for an agent, prompts for sub-agents, or any other LLM interaction, including optimizing prompts, improving LLM outputs, or designing production prompt templates.
Prompt Engineering Patterns
Advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability.
Core Capabilities
1. Few-Shot Learning
Teach the model by showing examples instead of explaining rules. Include 2-5 input-output pairs that demonstrate the desired behavior. Use when you need consistent formatting, specific reasoning patterns, or handling of edge cases. More examples improve accuracy but consume tokens—balance based on task complexity.
Example:
Extract key information from support tickets:

Input: "My login doesn't work and I keep getting error 403"
Output: {"issue": "authentication", "error_code": "403", "priority": "high"}

Input: "Feature request: add dark mode to settings"
Output: {"issue": "feature_request", "error_code": null, "priority": "low"}

Now process: "Can't upload files larger than 10MB, getting timeout"
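The few-shot prompt above can also be assembled programmatically, which keeps the examples in one place and the formatting uniform. The following is a minimal sketch; `build_few_shot_prompt` and `EXAMPLES` are hypothetical names introduced for illustration, not part of the original skill.

```python
import json

# Example pairs copied from the support-ticket prompt above.
EXAMPLES = [
    ("My login doesn't work and I keep getting error 403",
     {"issue": "authentication", "error_code": "403", "priority": "high"}),
    ("Feature request: add dark mode to settings",
     {"issue": "feature_request", "error_code": None, "priority": "low"}),
]


def build_few_shot_prompt(instruction, examples, new_input):
    """Render the instruction, the input/output pairs, and the new input."""
    lines = [instruction, ""]
    for text, output in examples:
        lines.append(f'Input: "{text}"')
        # json.dumps renders None as null, matching the expected output format.
        lines.append(f"Output: {json.dumps(output)}")
        lines.append("")
    lines.append(f'Now process: "{new_input}"')
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Extract key information from support tickets:",
    EXAMPLES,
    "Can't upload files larger than 10MB, getting timeout",
)
print(prompt)
```

Serializing the outputs with `json.dumps` means every example the model sees is valid JSON, which is the format it is being asked to imitate.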
2. Chain-of-Thought Prompting
Request step-by-step reasoning before the final answer. Add "Let's think step by step" (zero-shot) or include example reasoning traces (few-shot). Use for complex problems requiring multi-step logic, mathematical reasoning, or when you need to verify the model's thought process. The reported accuracy gain on analytical tasks is 30-50%, though the figure is unsourced here and varies by model and task.
Example:
Analyze this bug report and determine root cause.

Think step by step:
1. What is the expected behavior?
2. What is the actual behavior?
3. What changed recently that could cause this?
4. What components are involved?
5. What is the most likely root cause?

Bug: "Users can't save drafts after the cache update deployed yesterday"
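When the reasoning trace is consumed by code, it helps to ask for the final answer on a marked line so it can be separated from the reasoning. The sketch below assumes a "Root cause:" marker line, a convention added here for illustration and not part of the original prompt; `build_cot_prompt` and `extract_final_answer` are hypothetical helpers.

```python
REASONING_STEPS = [
    "What is the expected behavior?",
    "What is the actual behavior?",
    "What changed recently that could cause this?",
    "What components are involved?",
    "What is the most likely root cause?",
]

# Assumed convention: the model is asked to end with a line using this marker.
ANSWER_MARKER = "Root cause:"


def build_cot_prompt(bug_report, steps=REASONING_STEPS, marker=ANSWER_MARKER):
    """Build a step-by-step prompt that ends with a marked answer line."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        "Analyze this bug report and determine root cause.\n\n"
        f"Think step by step:\n{numbered}\n\n"
        f'Finish with one line that starts with "{marker}".\n\n'
        f'Bug: "{bug_report}"'
    )


def extract_final_answer(response, marker=ANSWER_MARKER):
    """Return the text after the last marker line, or None if it is absent."""
    for line in reversed(response.splitlines()):
        stripped = line.strip()
        if stripped.startswith(marker):
            return stripped[len(marker):].strip()
    return None


prompt = build_cot_prompt(
    "Users can't save drafts after the cache update deployed yesterday"
)
# Hand-written stand-in for a model reply, used only to exercise the parser.
sample_response = (
    "1. Drafts should save.\n"
    "2. Saving fails.\n"
    "Root cause: stale cache keys after the update"
)
answer = extract_final_answer(sample_response)
```

Returning `None` when the marker is missing lets the caller retry or flag the response instead of treating the whole trace as the answer.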
3. Prompt Optimization
Systematically improve prompts through testing and refinement. Start simple, measure performance (accuracy, consistency, token usage), then iterate. Test on diverse inputs including edge cases. Use A/B testing to compare variations. Critical for production prompts where consistency and cost matter.
Example:
Version 1 (Simple): "Summarize this article"
→ Result: Inconsistent length, misses key points

Version 2 (Add constraints): "Summarize in 3 bullet points"
→ Result: Better structure, but still misses nuance

Version 3 (Add reasoning): "Identify the 3 main findings, then summarize each"
→ Result: Consistent, accurate, captures key information
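Comparing versions is easier to repeat when the measurement is scripted. The sketch below scores two prompt variants against the same test cases; `fake_model` is a stand-in for a real LLM call and `bullet_count_score` is a deliberately simple metric, both assumptions made for illustration.

```python
def evaluate(prompt_template, cases, model, score):
    """Return the mean score of one prompt variant over all test cases."""
    total = 0.0
    for article, expected in cases:
        output = model(prompt_template.format(article=article))
        total += score(output, expected)
    return total / len(cases)


def fake_model(prompt):
    # Stand-in for a real LLM call: produces bullets only when asked for them.
    if "bullet" in prompt:
        return "- finding A\n- finding B\n- finding C"
    return "A long unstructured summary."


def bullet_count_score(output, expected_bullets):
    # 1.0 when the output has exactly the expected number of bullet lines.
    bullets = [line for line in output.splitlines() if line.startswith("- ")]
    return 1.0 if len(bullets) == expected_bullets else 0.0


variants = {
    "v1": "Summarize this article:\n{article}",
    "v2": "Summarize in 3 bullet points:\n{article}",
}
# Each case pairs an input with the number of bullets expected in the output.
cases = [("Article text one.", 3), ("Article text two.", 3)]

results = {
    name: evaluate(template, cases, fake_model, bullet_count_score)
    for name, template in variants.items()
}
best = max(results, key=results.get)
print(results, best)
```

In practice the scoring function would measure whatever matters for the task (accuracy, consistency, token usage), and the cases would include edge inputs.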
4. Template Systems
Build reusable prompt structures with variables, conditional sections, and modular components. Use for multi-turn conversations, role-based interactions, or when the same pattern applies to different inputs. Reduces duplication and ensures consistency across similar tasks.
Example:
# Reusable code review template
template = """
Review this {language} code for {focus_area}.

Code:
{code_block}

Provide feedback on:
{checklist}
"""

# Placeholder input so the snippet runs on its own
user_code = 'query = "SELECT * FROM users WHERE id = " + user_id'

# Usage
prompt = template.format(
    language="Python",
    focus_area="security vulnerabilities",
    code_block=user_code,
    checklist="1. SQL injection\n2. XSS risks\n3. Authentication"
)
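The template above covers variables; conditional sections and modular components need a little more than `str.format`. One possible approach, sketched here with a hypothetical `render_review_prompt` function, is to build the prompt from a list of sections and include the optional ones only when they have content.

```python
def render_review_prompt(language, focus_area, code_block,
                         checklist=None, context=None):
    """Assemble a review prompt, skipping optional sections that are empty."""
    sections = [f"Review this {language} code for {focus_area}."]
    if context:
        # Conditional section: only rendered when background is supplied.
        sections.append(f"Context:\n{context}")
    sections.append(f"Code:\n{code_block}")
    if checklist:
        items = "\n".join(
            f"{i}. {item}" for i, item in enumerate(checklist, start=1)
        )
        sections.append(f"Provide feedback on:\n{items}")
    return "\n\n".join(sections)


full = render_review_prompt(
    "Python",
    "security vulnerabilities",
    "print(1)",
    checklist=["SQL injection", "XSS risks"],
)
bare = render_review_prompt("Python", "style", "print(1)")
print(full)
```

Passing the checklist as a list rather than a pre-numbered string keeps numbering consistent across every prompt built from the template.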
5. System Prompt Design
Set global behavior and constraints that persist across the conversation. Define the model's role, expertise level, output format, and safety guidelines. Use system prompts for stable instructions that shouldn't change turn-to-turn, freeing up user message tokens for variable content.
Example:
System: You are a senior backend engineer specializing in API design.

Rules:
- Always consider scalability and performance
- Suggest RESTful patterns by default
- Flag security concerns immediately
- Provide code examples in Python
- Use early return pattern

Format responses as:
1. Analysis
2. Recommendation
3. Code example
4. Trade-offs
Key Patterns
Progressive Disclosure
Start with simple prompts, add complexity only when needed:
- Level 1: Direct instruction
  - "Summarize this article"
- Level 2: Add constraints
  - "Summarize this article in 3 bullet points, focusing on key findings"
- Level 3: Add reasoning
  - "Read this article, identify the main findings, then summarize in 3 bullet points"
- Level 4: Add examples
  - Include 2-3 example summaries with input-output pairs
Instruction Hierarchy
[System Context] → [Task Instruction] → [Examples] → [Input Data] → [Output Format]
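The ordering above can be enforced in code so that sections are always emitted in the same sequence regardless of how the caller supplies them. This is a minimal sketch; the section keys and the `assemble_prompt` helper are hypothetical names chosen to mirror the hierarchy.

```python
# Section keys in the order given by the hierarchy above.
SECTION_ORDER = [
    "system_context",
    "task_instruction",
    "examples",
    "input_data",
    "output_format",
]


def assemble_prompt(parts):
    """Join the supplied sections in hierarchy order, skipping empty ones."""
    unknown = set(parts) - set(SECTION_ORDER)
    if unknown:
        raise ValueError(f"Unknown sections: {sorted(unknown)}")
    return "\n\n".join(
        parts[name] for name in SECTION_ORDER if parts.get(name)
    )


# Sections are deliberately given out of order; examples are omitted.
parts = {
    "output_format": "Reply in JSON.",
    "task_instruction": "Classify the ticket.",
    "system_context": "You are a support triage assistant.",
    "input_data": "Ticket: login fails",
}
prompt = assemble_prompt(parts)
print(prompt)
```

Rejecting unknown keys catches typos such as `output_fromat` that would otherwise silently drop a section.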
Error Recovery
Build prompts that gracefully handle failures:
- Include fallback instructions
- Request confidence scores
- Ask for alternative interpretations when uncertain
- Specify how to indicate missing information
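The calling code needs matching handling for these prompt-side measures. The sketch below assumes the prompt asked for a JSON reply with `answer`, `confidence`, and `missing` fields; that schema, the 0.7 threshold, and `parse_with_fallback` are illustrative assumptions rather than anything the skill prescribes.

```python
import json


def parse_with_fallback(raw, min_confidence=0.7):
    """Parse a JSON reply; return a fallback record when it is unusable."""
    fallback = {
        "answer": None,
        "confidence": 0.0,
        "missing": ["unparseable response"],
        "needs_review": True,
    }
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict) or "answer" not in data:
        return fallback
    confidence = data.get("confidence", 0.0)
    if not isinstance(confidence, (int, float)):
        return fallback
    # Low-confidence answers are kept but flagged for a human or a retry.
    data["needs_review"] = confidence < min_confidence
    data.setdefault("missing", [])
    return data


good = parse_with_fallback('{"answer": "cache bug", "confidence": 0.9}')
low = parse_with_fallback('{"answer": "unsure", "confidence": 0.4}')
bad = parse_with_fallback("Sorry, I cannot answer that.")
```

Every path returns the same set of keys, so downstream code does not need a separate branch for failures.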
Best Practices
- Be Specific: Vague prompts produce inconsistent results
- Show, Don't Tell: Examples are more effective than descriptions
- Test Extensively: Evaluate on diverse, representative inputs
- Iterate Rapidly: Small changes can have large impacts
- Monitor Performance: Track metrics in production
- Version Control: Treat prompts as code with proper versioning
- Document Intent: Explain why prompts are structured as they are
Common Pitfalls
- Over-engineering: Starting with complex prompts before trying simple ones
- Example pollution: Using examples that don't match the target task
- Context overflow: Exceeding token limits with excessive examples
- Ambiguous instructions: Leaving room for multiple interpretations
- Ignoring edge cases: Not testing on unusual or boundary inputs
Integration Patterns
With RAG Systems
# Combine retrieved context with prompt engineering
# retrieved_context, few_shot_examples, and user_question are supplied by the caller
prompt = f"""Given the following context:
{retrieved_context}

{few_shot_examples}

Question: {user_question}

Provide a detailed answer based solely on the context above. If the context doesn't contain enough information, explicitly state what's missing."""
With Validation
# Add self-verification step
# main_task_prompt is the task prompt built earlier by the caller
prompt = f"""{main_task_prompt}

After generating your response, verify it meets these criteria:
1. Answers the question directly
2. Uses only information from provided context
3. Cites specific sources
4. Acknowledges any uncertainty

If verification fails, revise your response."""
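Self-verification inside the prompt can be paired with a check in code, since some criteria are mechanically testable. The sketch below checks only the citation criterion and assumes sources are cited as bracketed numbers such as `[1]`; that citation format, `check_response`, `generate_with_verification`, and `fake_model` are all assumptions for illustration.

```python
import re


def check_response(response, num_sources):
    """Return a list of failed criteria; an empty list means it passed."""
    problems = []
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", response)}
    if not cited:
        problems.append("no sources cited")
    elif min(cited) < 1 or max(cited) > num_sources:
        problems.append("cites a source that was not provided")
    return problems


def generate_with_verification(model, prompt, num_sources, max_attempts=2):
    """Call the model, then ask for a revision while checks keep failing."""
    response = model(prompt)
    for _ in range(max_attempts):
        problems = check_response(response, num_sources)
        if not problems:
            return response, []
        feedback = "; ".join(problems)
        prompt = (
            f"{prompt}\n\nYour previous answer failed these checks: "
            f"{feedback}. Revise it."
        )
        response = model(prompt)
    return response, check_response(response, num_sources)


def fake_model(prompt):
    # Stand-in for a real LLM call: cites a source only after being corrected.
    if "Revise" in prompt:
        return "Drafts fail after the cache update [1]."
    return "Drafts fail after the cache update."


final, remaining = generate_with_verification(
    fake_model, "Why do drafts fail?", num_sources=2
)
```

Bounding the number of attempts keeps a stubborn failure from turning into an unbounded loop of model calls.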
Performance Optimization
Token Efficiency
- Remove redundant words and phrases
- Use abbreviations consistently after first definition
- Consolidate similar instructions
- Move stable content to system prompts
Latency Reduction
- Minimize prompt length without sacrificing quality
- Use streaming for long-form outputs
- Cache common prompt prefixes
- Batch similar requests when possible
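As one way to act on the last two points, requests that share a prompt prefix can be grouped so the prefix is handled together and the payloads go out in batches. This is a sketch under stated assumptions: `batch_by_prefix` is a hypothetical helper, and whether a shared prefix is actually cached depends on the API being called.

```python
from collections import defaultdict


def batch_by_prefix(requests, batch_size):
    """Group (prefix, payload) requests by prefix, then split into batches."""
    groups = defaultdict(list)
    for prefix, payload in requests:
        groups[prefix].append(payload)
    batches = []
    for prefix, payloads in groups.items():
        # Slice each group into chunks of at most batch_size payloads.
        for start in range(0, len(payloads), batch_size):
            batches.append((prefix, payloads[start:start + batch_size]))
    return batches


requests = [
    ("Summarize:", "article 1"),
    ("Classify:", "ticket 1"),
    ("Summarize:", "article 2"),
    ("Summarize:", "article 3"),
]
batches = batch_by_prefix(requests, batch_size=2)
print(batches)
```

Groups are emitted in the order their prefix first appears, so the output is deterministic for a given input list.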
Agent Prompting Best Practices
Based on Anthropic's official best practices for agent prompting.
Core principles
Context Window
The “context window” is the total amount of text a language model can look back on and reference when generating new text, plus the new text it generates. It is distinct from the large corpus of data the model was trained on and instead acts as a “working memory” for the model. A larger context window allows the model to understand and respond to more complex and lengthy prompts, while a smaller context window may limit the model’s ability to handle longer prompts or maintain coherence over extended conversations.
- Progressive token accumulation: As the conversation advances through turns, each user message and assistant response accumulates within the context window.
- Linear growth pattern: Context usage grows linearly with each turn, because previous turns are preserved completely.
- 200K token capacity: The total available context window (200,000 tokens) represents the maximum capacity for storing conversation history and generating new output from Claude.
- Input-output flow: Each turn consists of:
  - Input phase: Contains all previous conversation history plus the current user message
  - Output phase: Generates a text response that becomes part of a future input
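The accumulation described above can be traced with simple arithmetic. The sketch below tracks input size, running total, and remaining capacity per turn against the 200,000-token limit; the per-turn token counts are made-up illustrative values, `context_usage` is a hypothetical helper, and the system prompt is ignored for simplicity.

```python
CONTEXT_LIMIT = 200_000  # total capacity stated above


def context_usage(turns, limit=CONTEXT_LIMIT):
    """Return (turn, input_tokens, total_tokens, remaining) for each turn."""
    history = 0
    rows = []
    for number, (user_tokens, assistant_tokens) in enumerate(turns, start=1):
        # Input phase: all previous history plus the current user message.
        input_tokens = history + user_tokens
        # Output phase: the response is added and becomes future input.
        total = input_tokens + assistant_tokens
        if total > limit:
            raise ValueError(
                f"Turn {number} needs {total} tokens, over the {limit} limit"
            )
        rows.append((number, input_tokens, total, limit - total))
        history = total
    return rows


# (user_tokens, assistant_tokens) per turn; illustrative numbers only.
turns = [(1000, 500), (200, 800), (300, 700)]
rows = context_usage(turns)
for row in rows:
    print(row)
```

Each turn's input includes everything that came before it, which is why long conversations run out of room even when individual messages are short.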
Concise is key
The context window is a public good. Your prompt, command, or skill shares the context window with everything else Claude needs to know, including:
- The system prompt
- Conversation history
- Other commands, skills, hooks, metadata
- Your actual request
Default assumption: Claude is already very smart
Only add context Claude doesn't already have. Challenge each piece of information:
- "Does Claude really need this explanation?"
- "Can I assume Claude knows this?"
- "Does this paragraph justify its token cost?"
Good example: Concise (approximately 50 tokens):
## Extract PDF text

Use pdfplumber for text extraction:

```python
import pdfplumber

with pdfplumber.open("file.pdf") as pdf:
    text = pdf.pages[0].extract_text()
```
Bad example: Too verbose (approximately 150 tokens):
## Extract PDF text

PDF (Portable Document Format) files are a common file format that contains text, images, and other content. To extract text from a PDF, you'll need to use a library. There are many libraries available for PDF processing, but we recommend pdfplumber because it's easy to use and handles most cases well. First, you'll need to install it using pip. Then you can use the code below...
The concise version assumes Claude knows what PDFs are and how libraries work.
Set appropriate degrees of freedom
Match the level of specificity to the task's fragility and variability.
High freedom (text-based instructions):
Use when:
- Multiple approaches are valid
- Decisions depend on context
- Heuristics guide the approach
Example:
## Code review process

1. Analyze the code structure and organization
2. Check for potential bugs or edge cases
3. Su
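Pros
- Provides structured, actionable patterns (for example, few-shot learning and chain-of-thought).
- Emphasizes best practices such as conciseness and iterative testing.
- Includes practical integration examples for RAG and validation.
- Gives clear guidance on token efficiency and context management.
Cons
- Primarily documentation and guidance rather than an executable library or tool.
- Lacks concrete, runnable code for direct automation.
- Limited novelty, since it compiles established, publicly available knowledge.
- Missing installation and API details, so it is incomplete.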
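Disclaimer: This content comes from an open-source GitHub project and is provided for display and scoring analysis only.
Copyright belongs to the original author, NeoLabHQ.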
