itsmostafa/aws-agent-skills/skills/bedrock

name: bedrock
description: AWS Bedrock foundation models for generative AI. Use when invoking foundation models, building AI applications, creating embeddings, configuring model access, or implementing RAG patterns.
last_updated: "2026-01-07"
doc_source: https://docs.aws.amazon.com/bedrock/latest/userguide/

AWS Bedrock

Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.

Core Concepts

Foundation Models

Pre-trained models available through Bedrock:

  • Claude (Anthropic): Text generation, analysis, coding
  • Titan (Amazon): Text, embeddings, image generation
  • Llama (Meta): Open-weight text generation
  • Mistral: Efficient text generation
  • Stable Diffusion (Stability AI): Image generation

Model Access

Models must be enabled in your account before use:

  • Request access in the Bedrock console
  • Some models require acceptance of EULAs
  • Access is region-specific

Inference Types

| Type | Use Case | Pricing |
|------|----------|---------|
| On-Demand | Variable workloads | Per token |
| Provisioned Throughput | Consistent high-volume | Hourly commitment |
| Batch Inference | Async large-scale | Discounted per token |

Common Patterns

Invoke Model (Text Generation)

AWS CLI:

    # Invoke Claude
    # AWS CLI v2 expects blob parameters as base64 by default;
    # raw-in-base64-out lets the JSON body be passed as plain text.
    aws bedrock-runtime invoke-model \
        --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
        --content-type application/json \
        --accept application/json \
        --cli-binary-format raw-in-base64-out \
        --body '{
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [
                {"role": "user", "content": "Explain AWS Lambda in 3 sentences."}
            ]
        }' \
        response.json

    # Print only the generated text
    jq -r '.content[0].text' response.json

boto3:

    import boto3
    import json

    bedrock = boto3.client('bedrock-runtime')

    def invoke_claude(prompt, max_tokens=1024):
        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                'anthropic_version': 'bedrock-2023-05-31',
                'max_tokens': max_tokens,
                'messages': [
                    {'role': 'user', 'content': prompt}
                ]
            })
        )

        # The response body is a stream; read it once and parse the JSON
        result = json.loads(response['body'].read())
        return result['content'][0]['text']

    # Usage
    response = invoke_claude('What is Amazon S3?')
    print(response)

Streaming Response

    import boto3
    import json

    bedrock = boto3.client('bedrock-runtime')

    def stream_claude(prompt):
        response = bedrock.invoke_model_with_response_stream(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                'anthropic_version': 'bedrock-2023-05-31',
                'max_tokens': 1024,
                'messages': [
                    {'role': 'user', 'content': prompt}
                ]
            })
        )

        # Each event carries a JSON chunk; only content_block_delta chunks hold text
        for event in response['body']:
            chunk = json.loads(event['chunk']['bytes'])
            if chunk['type'] == 'content_block_delta':
                yield chunk['delta'].get('text', '')

    # Usage
    for text in stream_claude('Write a haiku about cloud computing.'):
        print(text, end='', flush=True)

Generate Embeddings

    import boto3
    import json

    bedrock = boto3.client('bedrock-runtime')

    def get_embedding(text):
        response = bedrock.invoke_model(
            modelId='amazon.titan-embed-text-v2:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                'inputText': text,
                'dimensions': 1024,
                'normalize': True
            })
        )

        result = json.loads(response['body'].read())
        return result['embedding']

    # Usage
    embedding = get_embedding('AWS Lambda is a serverless compute service.')
    print(f'Embedding dimension: {len(embedding)}')
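In a RAG pattern, the retrieval step usually ranks stored embeddings by their similarity to a query embedding. The following is a minimal pure-Python sketch of that ranking, assuming vectors shaped like those returned by get_embedding. The names cosine_similarity and top_k and the 3-dimensional toy vectors are illustrative assumptions, not part of any Bedrock API; a production system would more likely use a vector store.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=3):
    """Return the k (doc_id, score) pairs most similar to query_vec.

    docs maps a document id to its embedding vector.
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors stand in for real embeddings (hypothetical data)
docs = {
    'lambda': [1.0, 0.0, 0.0],
    's3': [0.0, 1.0, 0.0],
    'fargate': [0.8, 0.2, 0.0],
}
ranked = top_k([1.0, 0.1, 0.0], docs, k=2)
print([doc_id for doc_id, _ in ranked])
```

When embeddings are requested with normalize set to True, the vectors already have unit length, so the plain dot product gives the same ranking.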

Conversation with History

    import boto3
    import json

    bedrock = boto3.client('bedrock-runtime')

    class Conversation:
        def __init__(self, system_prompt=None):
            self.messages = []
            self.system = system_prompt

        def chat(self, user_message):
            self.messages.append({
                'role': 'user',
                'content': user_message
            })

            body = {
                'anthropic_version': 'bedrock-2023-05-31',
                'max_tokens': 1024,
                'messages': self.messages
            }

            if self.system:
                body['system'] = self.system

            response = bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                contentType='application/json',
                accept='application/json',
                body=json.dumps(body)
            )

            result = json.loads(response['body'].read())
            assistant_message = result['content'][0]['text']

            # Store the reply so the next turn sees the full history
            self.messages.append({
                'role': 'assistant',
                'content': assistant_message
            })

            return assistant_message

    # Usage
    conv = Conversation(system_prompt='You are an AWS solutions architect.')
    print(conv.chat('What database should I use for a chat application?'))
    print(conv.chat('What about for time-series data?'))

List Available Models

    # List all foundation models
    aws bedrock list-foundation-models \
        --query 'modelSummaries[*].[modelId,modelName,providerName]' \
        --output table

    # Filter by provider
    aws bedrock list-foundation-models \
        --by-provider anthropic \
        --query 'modelSummaries[*].modelId'

    # Get model details
    aws bedrock get-foundation-model \
        --model-identifier anthropic.claude-3-sonnet-20240229-v1:0

Request Model Access

    # List the agreement offers available for a model
    aws bedrock list-foundation-model-agreement-offers \
        --model-id anthropic.claude-3-sonnet-20240229-v1:0

CLI Reference

Bedrock (Control Plane)

| Command | Description |
|---------|-------------|
| aws bedrock list-foundation-models | List available models |
| aws bedrock get-foundation-model | Get model details |
| aws bedrock list-custom-models | List fine-tuned models |
| aws bedrock create-model-customization-job | Start fine-tuning |
| aws bedrock list-provisioned-model-throughputs | List provisioned capacity |

Bedrock Runtime (Data Plane)

| Command | Description |
|---------|-------------|
| aws bedrock-runtime invoke-model | Invoke model synchronously |
| aws bedrock-runtime invoke-model-with-response-stream | Invoke with streaming |
| aws bedrock-runtime converse | Multi-turn conversation API |
| aws bedrock-runtime converse-stream | Streaming conversation |
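The converse command has a boto3 counterpart that uses one request shape across models, unlike the model-specific JSON bodies passed to invoke-model. The sketch below assumes a bedrock-runtime client is created elsewhere and passed in; the helper names build_converse_request and converse_text are hypothetical and only illustrate the request and response layout.

```python
def build_converse_request(model_id, prompt, system_prompt=None, max_tokens=1024):
    """Build keyword arguments for the bedrock-runtime converse call."""
    request = {
        'modelId': model_id,
        # Converse content is a list of blocks rather than a bare string
        'messages': [{'role': 'user', 'content': [{'text': prompt}]}],
        'inferenceConfig': {'maxTokens': max_tokens},
    }
    if system_prompt:
        request['system'] = [{'text': system_prompt}]
    return request

def converse_text(client, model_id, prompt, system_prompt=None, max_tokens=1024):
    """Send one user turn and return the first text block of the reply."""
    request = build_converse_request(model_id, prompt, system_prompt, max_tokens)
    response = client.converse(**request)
    return response['output']['message']['content'][0]['text']

# Usage (requires AWS credentials and model access):
#   import boto3
#   client = boto3.client('bedrock-runtime')
#   print(converse_text(client,
#                       'anthropic.claude-3-sonnet-20240229-v1:0',
#                       'Explain AWS Lambda in 3 sentences.'))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub in tests.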

Bedrock Agent Runtime

| Command | Description |
|---------|-------------|
| aws bedrock-agent-runtime invoke-agent | Invoke a Bedrock agent |
| aws bedrock-agent-runtime retrieve | Query knowledge base |
| aws bedrock-agent-runtime retrieve-and-generate | RAG query |
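The retrieve-and-generate command can also be called from boto3 through the bedrock-agent-runtime client. This is a sketch of a RAG query against an existing knowledge base; the helper names build_rag_request and ask_knowledge_base are hypothetical, and the knowledge base ID and model ARN must come from your own account.

```python
def build_rag_request(question, knowledge_base_id, model_arn):
    """Build keyword arguments for bedrock-agent-runtime retrieve_and_generate."""
    return {
        'input': {'text': question},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
            },
        },
    }

def ask_knowledge_base(client, question, knowledge_base_id, model_arn):
    """Run one RAG query and return the generated answer text."""
    request = build_rag_request(question, knowledge_base_id, model_arn)
    response = client.retrieve_and_generate(**request)
    return response['output']['text']

# Usage (requires AWS credentials and an existing knowledge base):
#   import boto3
#   client = boto3.client('bedrock-agent-runtime')
#   answer = ask_knowledge_base(
#       client,
#       'What is our data retention policy?',
#       knowledge_base_id='<your-knowledge-base-id>',
#       model_arn='arn:aws:bedrock:us-east-1::foundation-model/'
#                 'anthropic.claude-3-sonnet-20240229-v1:0',
#   )
```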

Best Practices

Cost Optimization

  • Use appropriate models: Smaller models for simple tasks
  • Set max_tokens: Limit output length when possible
  • Cache responses: For repeated identical queries
  • Batch when possible: Use batch inference for bulk processing
  • Monitor usage: Set up CloudWatch alarms for cost
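The "Cache responses" item can be sketched as a small in-memory cache keyed on the model ID and a canonical form of the request body. ResponseCache and its invoke callable are hypothetical names for illustration. The sketch assumes a single process, no eviction, and that identical requests may safely share one answer, which is most reasonable for deterministic settings such as temperature 0.

```python
import hashlib
import json

class ResponseCache:
    """Cache results of invoke(model_id, body) for identical requests."""

    def __init__(self, invoke):
        self._invoke = invoke  # callable(model_id, body_dict) -> result
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model_id, body):
        # sort_keys makes the key independent of dict insertion order
        canonical = json.dumps(body, sort_keys=True, separators=(',', ':'))
        payload = f'{model_id}\n{canonical}'.encode('utf-8')
        return hashlib.sha256(payload).hexdigest()

    def get(self, model_id, body):
        key = self._key(model_id, body)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._invoke(model_id, body)
        self._store[key] = result
        return result

# Usage with a stand-in for a real Bedrock call
def fake_invoke(model_id, body):
    return f"{model_id}:{body['messages'][0]['content']}"

cache = ResponseCache(fake_invoke)
body = {'max_tokens': 64, 'messages': [{'role': 'user', 'content': 'hi'}]}
first = cache.get('model-a', body)
second = cache.get('model-a', body)
print(cache.hits, cache.misses)
```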

Performance

  • Use streaming: For better user experience with long outputs
  • Connection pooling: Reuse boto3 clients
  • Regional deployment: Use closest region to reduce latency
  • Provisioned throughput: For consistent high-volume workloads

Security

  • Least privilege IAM: Only grant needed model access
  • VPC endpoints: Keep traffic private
  • Guardrails: Implement content filtering
  • Audit with CloudTrail: Track model invocations

IAM Permissions

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream"
          ],
          "Resource": [
            "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
            "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
          ]
        }
      ]
    }
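Because access is region-specific and least privilege means listing only the models actually used, it can help to generate this policy rather than hand-edit ARNs. The sketch below builds the same policy document from a region and a list of model IDs, following the ARN layout shown in the policy above; build_invoke_policy is a hypothetical helper, not an AWS API.

```python
import json

def build_invoke_policy(region, model_ids, allow_streaming=True):
    """Return an IAM policy dict allowing invocation of specific foundation models."""
    if not model_ids:
        raise ValueError('model_ids must not be empty')
    actions = ['bedrock:InvokeModel']
    if allow_streaming:
        actions.append('bedrock:InvokeModelWithResponseStream')
    # Foundation-model ARNs have an empty account field, hence the "::"
    resources = [
        f'arn:aws:bedrock:{region}::foundation-model/{model_id}'
        for model_id in model_ids
    ]
    return {
        'Version': '2012-10-17',
        'Statement': [
            {'Effect': 'Allow', 'Action': actions, 'Resource': resources}
        ],
    }

policy = build_invoke_policy(
    'us-east-1',
    ['anthropic.claude-3-sonnet-20240229-v1:0', 'amazon.titan-embed-text-v2:0'],
)
print(json.dumps(policy, indent=2))
```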

Troubleshooting

AccessDeniedException

Causes:

  • Model access not enabled in console
  • IAM policy missing bedrock:InvokeModel
  • Wrong model ID or region

Debug:

    # Confirm the model ID is offered in the current region
    # (an empty result means the ID or region is wrong)
    aws bedrock list-foundation-models \
        --query 'modelSummaries[?modelId==`anthropic.claude-3-sonnet-20240229-v1:0`]'

    # Test IAM permissions
    aws iam simulate-principal-policy \
        --policy-source-arn arn:aws:iam::123456789012:role/my-role \
        --action-names bedrock:InvokeModel \
        --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"

ModelNotReadyException

Cause: Model is still being provisioned or temporarily unavailable.

Solution: Implement retry with exponential backoff:

    import json
    import time

    from botocore.exceptions import ClientError

    def invoke_with_retry(bedrock, body, max_retries=3):
        for attempt in range(max_retries):
            try:
                return bedrock.invoke_model(
                    modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                    body=json.dumps(body)
                )
            except ClientError as e:
                if e.response['Error']['Code'] == 'ModelNotReadyException':
                    # Wait 1s, 2s, 4s, ... before the next attempt
                    time.sleep(2 ** attempt)
                else:
                    # Any other error is not retryable here
                    raise
        raise Exception('Max retries exceeded')

ThrottlingException

Causes:

  • Exceeded on-demand quota
  • Too many concurrent requests

Solutions:

  • Request quota increase
  • Implement exponential backoff
  • Consider provisioned throughput
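The exponential backoff suggested for throttling can be sketched as a generic wrapper with random jitter, so that many throttled clients do not retry in lockstep. The names call_with_backoff, error_code, and RETRYABLE_CODES are hypothetical. The sketch assumes exceptions carry a botocore-style response dict with an Error.Code entry, as botocore's ClientError does, and the delay values are placeholders to tune.

```python
import random
import time

RETRYABLE_CODES = {'ThrottlingException', 'ModelNotReadyException'}

def error_code(exc):
    """Extract the AWS error code from a botocore-style exception, if present."""
    response = getattr(exc, 'response', None) or {}
    return response.get('Error', {}).get('Code')

def call_with_backoff(fn, max_retries=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call fn(), retrying retryable errors with capped exponential backoff.

    Each wait is drawn uniformly from [0, min(max_delay, base_delay * 2**attempt)].
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            # Re-raise non-retryable errors and the final failed attempt
            if error_code(exc) not in RETRYABLE_CODES or attempt == max_retries:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))

# Usage (assuming the invoke_claude helper defined earlier in this document):
#   text = call_with_backoff(lambda: invoke_claude('What is Amazon S3?'))
```

boto3 also has built-in retry behavior that can be configured through botocore's Config object (for example, the retries setting with a mode and max_attempts), which may be enough on its own for many workloads.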

ValidationException

Common issues:

  • Invalid model ID
  • Malformed request body
  • max_tokens exceeds model limit

Debug:

    # Check model-specific requirements
    aws bedrock get-foundation-model \
        --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
        --query 'modelDetails.inferenceTypesSupported'

Disclaimer: This content is sourced from GitHub open source projects for display and rating purposes only.

Copyright belongs to the original author itsmostafa.
