Co-Pilot / Assisted
Updated a month ago

terraform-skill

antonbabenko
0.7k
antonbabenko/terraform-skill
86
Agent score

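💡 Summary

A comprehensive guide to Terraform and OpenTofu best practices, covering testing strategy, module architecture, CI/CD, and production patterns.

🎯 Who it's for

  • Infrastructure engineers
  • DevOps practitioners
  • Platform teams
  • SREs managing IaC
  • Cloud architects

🤖 AI roast: This skill is like a seasoned DevOps engineer who never stops talking about best practices, but at least they are all correct.

Security analysis: medium risk

The skill recommends security scanning tools (trivy, checkov) but does not require them. The main risk is that users adopt the architecture patterns without implementing the recommended scanning, which leads to insecure IaC deployments. Mitigation: always integrate security scanning into the CI/CD pipeline as a mandatory gate.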


name: terraform-skill
description: Use when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, implementing security scanning (trivy, checkov), or making infrastructure-as-code architecture decisions
license: Apache-2.0
metadata:
  author: Anton Babenko
  version: 1.5.0

Terraform Skill for Claude

Comprehensive Terraform and OpenTofu guidance covering testing, modules, CI/CD, and production patterns. Based on terraform-best-practices.com and enterprise experience.

When to Use This Skill

Activate this skill when:

  • Creating new Terraform or OpenTofu configurations or modules
  • Setting up testing infrastructure for IaC code
  • Deciding between testing approaches (validate, plan, frameworks)
  • Structuring multi-environment deployments
  • Implementing CI/CD for infrastructure-as-code
  • Reviewing or refactoring existing Terraform/OpenTofu projects
  • Choosing between module patterns or state management approaches

Don't use this skill for:

  • Basic Terraform/OpenTofu syntax questions (Claude knows this)
  • Provider-specific API reference (link to docs instead)
  • Cloud platform questions unrelated to Terraform/OpenTofu

Core Principles

1. Code Structure Philosophy

Module Hierarchy:

| Type | When to Use | Scope |
|------|-------------|-------|
| Resource Module | Single logical group of connected resources | VPC + subnets, Security group + rules |
| Infrastructure Module | Collection of resource modules for a purpose | Multiple resource modules in one region/account |
| Composition | Complete infrastructure | Spans multiple regions/accounts |

Hierarchy: Resource → Resource Module → Infrastructure Module → Composition

Directory Structure:

environments/        # Environment-specific configurations
├── prod/
├── staging/
└── dev/

modules/            # Reusable modules
├── networking/
├── compute/
└── data/

examples/           # Module usage examples (also serve as tests)
├── complete/
└── minimal/

Key principle from terraform-best-practices.com:

  • Separate environments (prod, staging) from modules (reusable components)
  • Use examples/ as both documentation and integration test fixtures
  • Keep modules small and focused (single responsibility)
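
As a minimal sketch of this separation, an environment root might do little more than wire reusable modules together. The file path follows the directory layout above, but the module inputs and outputs shown here (`name`, `cidr_block`, `subnet_ids`, `private_subnet_ids`) are hypothetical and not taken from the skill.

```hcl
# environments/prod/main.tf (illustrative; input and output names are assumptions)

module "networking" {
  source = "../../modules/networking"

  name       = "prod"
  cidr_block = "10.0.0.0/16"
}

module "compute" {
  source = "../../modules/compute"

  name = "prod"

  # Consumes an output that the networking module is assumed to expose
  subnet_ids = module.networking.private_subnet_ids
}
```

The environment directory holds only values and wiring, so the same modules can be reused from staging/ and dev/ with different inputs.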

For detailed module architecture, see: Code Patterns: Module Types & Hierarchy

2. Naming Conventions

Resources:

# Good: Descriptive, contextual
resource "aws_instance" "web_server" { }
resource "aws_s3_bucket" "application_logs" { }

# Good: "this" for singleton resources (only one of that type)
resource "aws_vpc" "this" { }
resource "aws_security_group" "this" { }

# Avoid: Generic names for non-singletons
resource "aws_instance" "main" { }
resource "aws_s3_bucket" "bucket" { }

Singleton Resources:

Use "this" when your module creates only one resource of that type:

✅ DO:

resource "aws_vpc" "this" {}            # Module creates one VPC
resource "aws_security_group" "this" {} # Module creates one SG

❌ DON'T use "this" for multiple resources:

resource "aws_subnet" "this" {} # If creating multiple subnets

Use descriptive names when creating multiple resources of the same type.

Variables:

# Prefix with context when needed
var.vpc_cidr_block          # Not just "cidr"
var.database_instance_class # Not just "instance_class"

Files:

  • main.tf - Primary resources
  • variables.tf - Input variables
  • outputs.tf - Output values
  • versions.tf - Provider versions
  • data.tf - Data sources (optional)

Testing Strategy Framework

Decision Matrix: Which Testing Approach?

| Your Situation | Recommended Approach | Tools | Cost |
|----------------|---------------------|-------|------|
| Quick syntax check | Static analysis | terraform validate, fmt | Free |
| Pre-commit validation | Static + lint | validate, tflint, trivy, checkov | Free |
| Terraform 1.6+, simple logic | Native test framework | Built-in terraform test | Free-Low |
| Pre-1.6, or Go expertise | Integration testing | Terratest | Low-Med |
| Security/compliance focus | Policy as code | OPA, Sentinel | Free |
| Cost-sensitive workflow | Mock providers (1.7+) | Native tests + mocking | Free |
| Multi-cloud, complex | Full integration | Terratest + real infra | Med-High |
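
To make the "Native tests + mocking" row concrete, here is a minimal sketch of a native test file that uses a mock provider (Terraform 1.7+). It assumes a hypothetical module that declares a `name` variable and an `aws_s3_bucket.this` resource whose `bucket` argument is set from that variable; neither is defined by the skill.

```hcl
# tests/naming.tftest.hcl (illustrative; module internals are assumptions)

# Replace the real AWS provider with a mock so no credentials
# or real infrastructure are needed
mock_provider "aws" {}

variables {
  name = "example-logs"
}

run "bucket_name_follows_input" {
  command = plan

  assert {
    condition     = aws_s3_bucket.this.bucket == "example-logs"
    error_message = "Bucket name should be taken from var.name."
  }
}
```

Running `terraform test` from the module root picks up files under tests/ by default.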

Testing Pyramid for Infrastructure

        /\
       /  \          End-to-End Tests (Expensive)
      /____\         - Full environment deployment
     /      \        - Production-like setup
    /________\
   /          \      Integration Tests (Moderate)
  /____________\     - Module testing in isolation
 /              \    - Real resources in test account
/________________\   Static Analysis (Cheap)
                     - validate, fmt, lint
                     - Security scanning

Native Test Best Practices (1.6+)

Before generating test code:

  1. Validate schemas with Terraform MCP:

    Search provider docs → Get resource schema → Identify block types
    
  2. Choose correct command mode:

    • command = plan - Fast, for input validation
    • command = apply - Required for computed values and set-type blocks
  3. Handle set-type blocks correctly:

    • Cannot index with [0]
    • Use for expressions to iterate
    • Or use command = apply to materialize

Common patterns:

  • S3 encryption rules: set (use for expressions)
  • Lifecycle transitions: set (use for expressions)
  • IAM policy statements: set (use for expressions)
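
A sketch of the set-type handling described above, assuming the module under test declares a hypothetical `aws_s3_bucket_server_side_encryption_configuration.this` resource in which every rule sets a default encryption block. The `rule` block is a set, so the assertion iterates over it rather than indexing it.

```hcl
# tests/encryption.tftest.hcl (illustrative; the resource address is an assumption)

run "default_encryption_is_kms" {
  command = plan

  assert {
    # "rule" is a set, so rule[0] is not valid; iterate with a for expression
    condition = alltrue([
      for r in aws_s3_bucket_server_side_encryption_configuration.this.rule :
      r.apply_server_side_encryption_by_default[0].sse_algorithm == "aws:kms"
    ])
    error_message = "Every encryption rule must use aws:kms."
  }
}
```

If the asserted value were only known after apply, the same assertion would need `command = apply`, as noted in step 2.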

For detailed testing guides, see:

Code Structure Standards

Resource Block Ordering

Strict ordering for consistency:

  1. count or for_each FIRST (blank line after)
  2. Other arguments
  3. tags as last real argument
  4. depends_on after tags (if needed)
  5. lifecycle at the very end (if needed)

# ✅ GOOD - Correct ordering
resource "aws_nat_gateway" "this" {
  count = var.create_nat_gateway ? 1 : 0

  allocation_id = aws_eip.this[0].id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.name}-nat"
  }

  depends_on = [aws_internet_gateway.this]

  lifecycle {
    create_before_destroy = true
  }
}

Variable Block Ordering

  1. description (ALWAYS required)
  2. type
  3. default
  4. validation
  5. nullable (when setting to false)

variable "environment" {
  description = "Environment name for resource tagging"
  type        = string
  default     = "dev"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be one of: dev, staging, prod."
  }

  nullable = false
}

For complete structure guidelines, see: Code Patterns: Block Ordering & Structure

Count vs For_Each: When to Use Each

Quick Decision Guide

| Scenario | Use | Why |
|----------|-----|-----|
| Boolean condition (create or don't) | count = condition ? 1 : 0 | Simple on/off toggle |
| Simple numeric replication | count = 3 | Fixed number of identical resources |
| Items may be reordered/removed | for_each = toset(list) | Stable resource addresses |
| Reference by key | for_each = map | Named access to resources |
| Multiple named resources | for_each | Better maintainability |

Common Patterns

Boolean conditions:

# ✅ GOOD - Boolean condition
resource "aws_nat_gateway" "this" {
  count = var.create_nat_gateway ? 1 : 0
  # ...
}

Stable addressing with for_each:

# ✅ GOOD - Removing "us-east-1b" only affects that subnet
resource "aws_subnet" "private" {
  for_each = toset(var.availability_zones)

  availability_zone = each.key
  # ...
}

# ❌ BAD - Removing middle AZ recreates all subsequent subnets
resource "aws_subnet" "private" {
  count = length(var.availability_zones)

  availability_zone = var.availability_zones[count.index]
  # ...
}
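
The "Reference by key" row of the decision guide can be sketched with a map-driven for_each. The variable name `private_subnets`, its object shape, and the `aws_vpc.this` reference are assumptions made for this example.

```hcl
# Illustrative only: variable name, object shape, and aws_vpc.this are assumptions

variable "private_subnets" {
  description = "Private subnets keyed by a stable name"
  type = map(object({
    cidr_block        = string
    availability_zone = string
  }))
}

resource "aws_subnet" "private" {
  for_each = var.private_subnets

  vpc_id            = aws_vpc.this.id
  cidr_block        = each.value.cidr_block
  availability_zone = each.value.availability_zone

  tags = {
    Name = each.key
  }
}

output "private_subnet_ids" {
  description = "Subnet IDs keyed by subnet name"
  value       = { for name, subnet in aws_subnet.private : name => subnet.id }
}
```

Each subnet is then addressable by name, for example `aws_subnet.private["app-a"]`, and removing one map entry affects only that subnet.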

For migration guides and detailed examples, see: Code Patterns: Count vs For_Each
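
One way to migrate an existing count-based resource to for_each without destroying and recreating it is a set of `moved` blocks (Terraform 1.1+). This is a sketch rather than the skill's own migration guide, and the index-to-key mapping below is an assumption that depends on the original order of `var.availability_zones`.

```hcl
# Illustrative only: map each old index to the key it will have under for_each

moved {
  from = aws_subnet.private[0]
  to   = aws_subnet.private["us-east-1a"]
}

moved {
  from = aws_subnet.private[1]
  to   = aws_subnet.private["us-east-1b"]
}
```

After adding the blocks, `terraform plan` should report the addresses as moved rather than planning a destroy and create.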

Locals for Dependency Management

Use locals to ensure correct resource deletion order:

# Problem: Subnets might be deleted after CIDR blocks, causing errors
# Solution: Use try() in locals to hint deletion order
locals {
  # References secondary CIDR first, falling back to VPC
  # Forces Terraform to delete subnets before CIDR association
  vpc_id = try(
    aws_vpc_ipv4_cidr_block_association.this[0].vpc_id,
    aws_vpc.this.id,
    ""
  )
}

resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_vpc_ipv4_cidr_block_association" "this" {
  count = var.add_secondary_cidr ? 1 : 0

  vpc_id     = aws_vpc.this.id
  cidr_block = "10.1.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id     = local.vpc_id # Uses local, not direct reference
  cidr_block = "10.1.0.0/24"
}

Why this matters:

  • Prevents deletion errors when destroying infrastructure
  • Ensures correct dependency order without explicit depends_on
  • Particularly useful for VPC configurations with secondary CIDR blocks

For detailed examples, see: Code Patterns: Locals for Dependency Management

Module Development

Standard Module Structure

my-module/
├── README.md           # Usage documentation
├── main.tf             # Primary resources
├── variables.tf        # Input variables with descriptions
├── outputs.tf          # Output values
├── versions.tf    
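
Five-dimension analysis
Clarity 9/10
Novelty 7/10
Practicality 10/10
Completeness 9/10
Maintainability 8/10

Pros and cons

Pros

  • Comprehensive coverage of real-world IaC challenges
  • Clear decision frameworks for testing and architecture
  • Mature best practices based on terraform-best-practices.com
  • Practical examples that highlight anti-patterns

Cons

  • Assumes intermediate-to-advanced Terraform knowledge
  • May be too complex for beginners
  • Relies heavily on AWS examples
  • Requires users to navigate across multiple reference documents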

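Disclaimer: This content comes from an open-source GitHub project and is shown for display and rating analysis only.

Copyright belongs to the original author, antonbabenko.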