Co-Pilot / Assistive

Updated 4 months ago

terraform-skill

Name: terraform-skill
Rating: 4.3 (653 reviews)
Author: antonbabenko

antonbabenko/terraform-skill

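Agent Rating

💡 Summary

A comprehensive guide to Terraform and OpenTofu best practices, covering testing strategies, module architecture, CI/CD, and production patterns.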

🎯 Target Audience

Infrastructure engineers
DevOps practitioners
Platform teams
SREs managing IaC
Cloud architects

🤖 AI Roast: "This skill is like a seasoned DevOps engineer who never stops talking about best practices, but at least they are all correct."

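Security Analysis: Medium Risk

The skill recommends security scanning tools (trivy, checkov) but does not require them. The main risk is that users follow the architecture patterns without implementing the recommended security scanning, which leads to insecure IaC deployments. Mitigation: always integrate security scanning into the CI/CD pipeline as a mandatory gate.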

name: terraform-skill
description: Use when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, implementing security scanning (trivy, checkov), or making infrastructure-as-code architecture decisions
license: Apache-2.0
metadata:
  author: Anton Babenko
  version: 1.5.0

Terraform Skill for Claude

Comprehensive Terraform and OpenTofu guidance covering testing, modules, CI/CD, and production patterns. Based on terraform-best-practices.com and enterprise experience.

When to Use This Skill

Activate this skill when:

Creating new Terraform or OpenTofu configurations or modules
Setting up testing infrastructure for IaC code
Deciding between testing approaches (validate, plan, frameworks)
Structuring multi-environment deployments
Implementing CI/CD for infrastructure-as-code
Reviewing or refactoring existing Terraform/OpenTofu projects
Choosing between module patterns or state management approaches

Don't use this skill for:

Basic Terraform/OpenTofu syntax questions (Claude knows this)
Provider-specific API reference (link to docs instead)
Cloud platform questions unrelated to Terraform/OpenTofu

Core Principles

1. Code Structure Philosophy

Module Hierarchy:

| Type | When to Use | Scope |
|------|-------------|-------|
| Resource Module | Single logical group of connected resources | VPC + subnets, Security group + rules |
| Infrastructure Module | Collection of resource modules for a purpose | Multiple resource modules in one region/account |
| Composition | Complete infrastructure | Spans multiple regions/accounts |

Hierarchy: Resource → Resource Module → Infrastructure Module → Composition

Directory Structure:

environments/        # Environment-specific configurations
├── prod/
├── staging/
└── dev/

modules/            # Reusable modules
├── networking/
├── compute/
└── data/

examples/           # Module usage examples (also serve as tests)
├── complete/
└── minimal/

Key principles from terraform-best-practices.com:

Separate environments (prod, staging) from modules (reusable components)
Use examples/ as both documentation and integration test fixtures
Keep modules small and focused (single responsibility)
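
As a rough illustration of the environment/module split described above, an environment directory might call a shared module like this. The file path, the `name` input, and the `vpc_cidr_block` input are assumptions for the sketch; the real inputs depend on what the module declares.

```hcl
# environments/prod/main.tf (hypothetical file)

module "networking" {
  # Relative path from environments/prod/ to the reusable module
  source = "../../modules/networking"

  # Illustrative inputs; the module must declare matching variables
  name           = "prod"
  vpc_cidr_block = "10.0.0.0/16"
}
```

A staging environment would reuse the same module with different inputs, which keeps environment-specific values out of the module itself.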

For detailed module architecture, see: Code Patterns: Module Types & Hierarchy

2. Naming Conventions

Resources:

# Good: Descriptive, contextual
resource "aws_instance" "web_server" { }
resource "aws_s3_bucket" "application_logs" { }

# Good: "this" for singleton resources (only one of that type)
resource "aws_vpc" "this" { }
resource "aws_security_group" "this" { }

# Avoid: Generic names for non-singletons
resource "aws_instance" "main" { }
resource "aws_s3_bucket" "bucket" { }

Singleton Resources:

Use "this" when your module creates only one resource of that type:

✅ DO:

resource "aws_vpc" "this" {}           # Module creates one VPC
resource "aws_security_group" "this" {}  # Module creates one SG

❌ DON'T use "this" for multiple resources:

resource "aws_subnet" "this" {}  # If creating multiple subnets

Use descriptive names when creating multiple resources of the same type.

Variables:

# Prefix with context when needed
var.vpc_cidr_block          # Not just "cidr"
var.database_instance_class # Not just "instance_class"

Files:

main.tf - Primary resources
variables.tf - Input variables
outputs.tf - Output values
versions.tf - Terraform and provider version constraints
data.tf - Data sources (optional)
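
A minimal versions.tf might look like the sketch below. The specific version constraints are assumptions chosen for illustration (1.6.0 matches the native test framework mentioned later); pick constraints that fit your project.

```hcl
# versions.tf
terraform {
  # Assumed minimum; 1.6+ is needed for the native test framework
  required_version = ">= 1.6.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # illustrative constraint, not a recommendation
    }
  }
}
```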

Testing Strategy Framework

Decision Matrix: Which Testing Approach?

| Your Situation | Recommended Approach | Tools | Cost |
|----------------|---------------------|-------|------|
| Quick syntax check | Static analysis | terraform validate, fmt | Free |
| Pre-commit validation | Static + lint | validate, tflint, trivy, checkov | Free |
| Terraform 1.6+, simple logic | Native test framework | Built-in terraform test | Free-Low |
| Pre-1.6, or Go expertise | Integration testing | Terratest | Low-Med |
| Security/compliance focus | Policy as code | OPA, Sentinel | Free |
| Cost-sensitive workflow | Mock providers (1.7+) | Native tests + mocking | Free |
| Multi-cloud, complex | Full integration | Terratest + real infra | Med-High |
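
To make the "Mock providers (1.7+)" row concrete, here is a minimal sketch of a native test that runs without cloud credentials. The file name and run label are hypothetical, and it assumes the module under test declares `var.create_nat_gateway` and a counted `aws_nat_gateway.this`, as in the examples later in this guide.

```hcl
# tests/unit.tftest.hcl (hypothetical file name)

# Replace the real AWS provider with a mock so no API calls are made
mock_provider "aws" {}

run "nat_gateway_disabled" {
  command = plan

  variables {
    create_nat_gateway = false
  }

  assert {
    # With count = 0 the resource is an empty tuple
    condition     = length(aws_nat_gateway.this) == 0
    error_message = "No NAT gateway should be planned when create_nat_gateway is false."
  }
}
```

Running `terraform test` from the module root picks up files ending in `.tftest.hcl`.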

Testing Pyramid for Infrastructure

        /\
       /  \          End-to-End Tests (Expensive)
      /____\         - Full environment deployment
     /      \        - Production-like setup
    /________\
   /          \      Integration Tests (Moderate)
  /____________\     - Module testing in isolation
 /              \    - Real resources in test account
/________________\   Static Analysis (Cheap)
                     - validate, fmt, lint
                     - Security scanning

Native Test Best Practices (1.6+)

Before generating test code:

Validate schemas with Terraform MCP:

Search provider docs → Get resource schema → Identify block types

Choose correct command mode:

- command = plan - Fast, for input validation
- command = apply - Required for computed values and set-type blocks

Handle set-type blocks correctly:

- Cannot index with [0]
- Use for expressions to iterate
- Or use command = apply to materialize

Common patterns:

S3 encryption rules: set (use for expressions)
Lifecycle transitions: set (use for expressions)
IAM policy statements: set (use for expressions)
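
The set-type handling above can be sketched as a native test assertion that iterates with for expressions instead of indexing with [0]. This assumes the module declares `aws_s3_bucket_server_side_encryption_configuration.this` with a literal `sse_algorithm`, and that the provider exposes `rule` as a set with `apply_server_side_encryption_by_default` as a nested list; verify the schema for your provider version before relying on it.

```hcl
# tests/encryption.tftest.hcl (hypothetical file name)

run "encryption_uses_kms" {
  command = plan

  assert {
    # rule is a set, so iterate over it rather than writing rule[0]
    condition = alltrue([
      for rule in aws_s3_bucket_server_side_encryption_configuration.this.rule :
      alltrue([
        for d in rule.apply_server_side_encryption_by_default :
        d.sse_algorithm == "aws:kms"
      ])
    ])
    error_message = "All encryption rules must use aws:kms."
  }
}
```

Note that alltrue returns true for an empty list, so pair this with a length check if an empty rule set should fail the test.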

For detailed testing guides, see:

Testing Frameworks Guide - Deep dive into static analysis, native tests, and Terratest
Quick Reference - Decision flowchart and command cheat sheet

Code Structure Standards

Resource Block Ordering

Strict ordering for consistency:

count or for_each FIRST (blank line after)
Other arguments
tags as last real argument
depends_on after tags (if needed)
lifecycle at the very end (if needed)

# ✅ GOOD - Correct ordering
resource "aws_nat_gateway" "this" {
  count = var.create_nat_gateway ? 1 : 0

  allocation_id = aws_eip.this[0].id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "${var.name}-nat"
  }

  depends_on = [aws_internet_gateway.this]

  lifecycle {
    create_before_destroy = true
  }
}

Variable Block Ordering

description (ALWAYS required)
type
default
validation
nullable (when setting to false)

variable "environment" {
  description = "Environment name for resource tagging"
  type        = string
  default     = "dev"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be one of: dev, staging, prod."
  }

  nullable = false
}

For complete structure guidelines, see: Code Patterns: Block Ordering & Structure

Count vs For_Each: When to Use Each

Quick Decision Guide

| Scenario | Use | Why |
|----------|-----|-----|
| Boolean condition (create or don't) | count = condition ? 1 : 0 | Simple on/off toggle |
| Simple numeric replication | count = 3 | Fixed number of identical resources |
| Items may be reordered/removed | for_each = toset(list) | Stable resource addresses |
| Reference by key | for_each = map | Named access to resources |
| Multiple named resources | for_each | Better maintainability |

Common Patterns

Boolean conditions:

# ✅ GOOD - Boolean condition
resource "aws_nat_gateway" "this" {
  count = var.create_nat_gateway ? 1 : 0
  # ...
}

Stable addressing with for_each:

# ✅ GOOD - Removing "us-east-1b" only affects that subnet
resource "aws_subnet" "private" {
  for_each = toset(var.availability_zones)

  availability_zone = each.key
  # ...
}

# ❌ BAD - Removing middle AZ recreates all subsequent subnets
resource "aws_subnet" "private" {
  count = length(var.availability_zones)

  availability_zone = var.availability_zones[count.index]
  # ...
}
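
The "Reference by key" row in the decision guide can be illustrated with a map-driven for_each. This is a sketch: the variable name, CIDR values, and output name are assumptions, and it expects an `aws_vpc.this` resource to exist elsewhere in the module.

```hcl
variable "private_subnets" {
  description = "Map of availability zone to CIDR block for private subnets"
  type        = map(string)
  default = {
    "us-east-1a" = "10.0.1.0/24"
    "us-east-1b" = "10.0.2.0/24"
  }
}

resource "aws_subnet" "private" {
  for_each = var.private_subnets

  vpc_id            = aws_vpc.this.id # assumes aws_vpc.this is declared in the module
  availability_zone = each.key
  cidr_block        = each.value
}

output "private_subnet_id_us_east_1a" {
  description = "ID of the private subnet in us-east-1a"
  # Named access by map key instead of a positional index
  value = aws_subnet.private["us-east-1a"].id
}
```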

For migration guides and detailed examples, see: Code Patterns: Count vs For_Each
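
When an existing resource is switched from count to for_each, one way to avoid destroying and recreating it is a moved block (available since Terraform 1.1). The sketch below assumes the subnets were previously indexed in the order us-east-1a, us-east-1b; adjust the mapping to match your actual state.

```hcl
# Map old index-based addresses to new key-based addresses
moved {
  from = aws_subnet.private[0]
  to   = aws_subnet.private["us-east-1a"]
}

moved {
  from = aws_subnet.private[1]
  to   = aws_subnet.private["us-east-1b"]
}
```

After adding these blocks, `terraform plan` should report the resources as moved rather than as a destroy and create pair.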

Locals for Dependency Management

Use locals to ensure correct resource deletion order:

# Problem: Subnets might be deleted after CIDR blocks, causing errors
# Solution: Use try() in locals to hint deletion order

locals {
  # References secondary CIDR first, falling back to VPC
  # Forces Terraform to delete subnets before CIDR association
  vpc_id = try(
    aws_vpc_ipv4_cidr_block_association.this[0].vpc_id,
    aws_vpc.this.id,
    ""
  )
}

resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_vpc_ipv4_cidr_block_association" "this" {
  count = var.add_secondary_cidr ? 1 : 0

  vpc_id     = aws_vpc.this.id
  cidr_block = "10.1.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id     = local.vpc_id  # Uses local, not direct reference
  cidr_block = "10.1.0.0/24"
}

Why this matters:

Prevents deletion errors when destroying infrastructure
Ensures correct dependency order without explicit depends_on
Particularly useful for VPC configurations with secondary CIDR blocks

For detailed examples, see: Code Patterns: Locals for Dependency Management

Module Development

Standard Module Structure

my-module/
├── README.md           # Usage documentation
├── main.tf             # Primary resources
├── variables.tf        # Input variables with descriptions
├── outputs.tf          # Output values
├── versions.tf

Five-Dimension Analysis

Clarity: 9/10

Innovation: 7/10

Practicality: 10/10

Completeness: 9/10

Maintainability: 8/10

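Pros and Cons

Pros

Comprehensive coverage of real-world IaC challenges
Clear decision frameworks for testing and architecture
Proven best practices based on terraform-best-practices.com
Practical examples with anti-patterns highlighted

Cons

Assumes intermediate-to-advanced Terraform knowledge
May be too complex for beginners
Relies heavily on AWS examples
Requires navigating across multiple reference documents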