Co-Pilot / Assistive
Updated 24 days ago

video-agent-skill

donghaozhang
donghaozhang/video-agent-skill
Agent score: 80

💡 Summary

A comprehensive AI content generation suite offering a range of models for text, image, and video processing.

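🎯 Who It's For

  • Content creators who want to automate video production
  • Developers integrating AI models into applications
  • Marketers who need fast visual content generation
  • Educators creating engaging multimedia materials
  • Researchers exploring what AI can do in media generation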

🤖 AI roast: This README is like a Swiss Army knife: useful, but sometimes a bit overwhelming.

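Security analysis: medium risk

The README relies on API keys for several services, which is a risk if they are handled carelessly. Keep keys in environment variables and avoid hardcoding sensitive information.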

AI Content Generation Suite

A comprehensive AI content generation package with multiple providers and services, consolidated into a single installable package.

Python 3.10+ · License: MIT · Code style: black · PyPI

⚡ Production-ready Python package with comprehensive CLI, parallel execution, and enterprise-grade architecture

🎨 Available AI Models

40+ AI models across 8 categories - showing top picks below. See full models reference for complete list.

Text-to-Image (Top Picks)

| Model | Cost | Best For |
|-------|------|----------|
| nano_banana_pro | $0.002 | Fast & high-quality |
| gpt_image_1_5 | $0.003 | GPT-powered generation |

Image-to-Video (Top Picks)

| Model | Cost | Best For |
|-------|------|----------|
| sora_2 | $0.40-1.20 | OpenAI quality |
| kling_2_6_pro | $0.50-1.00 | Professional quality |

Text-to-Video (Top Picks)

| Model | Cost | Best For |
|-------|------|----------|
| sora_2 | $0.40-1.20 | OpenAI quality |
| kling_2_6_pro | $0.35-1.40 | Quality + audio |

💡 Cost-Saving Tip: Use the --mock flag for free validation:

    ai-content-pipeline generate-image --text "test" --mock

🏷️ Latest Release

What's New in v1.0.18

  • ✅ Automated PyPI publishing via GitHub Actions
  • 🔧 Consolidated setup files for cleaner package structure
  • 🎯 All 40+ AI models with comprehensive parallel processing support
  • 📦 Improved CI/CD workflow with skip-existing option

🚀 FLAGSHIP: AI Content Pipeline

The unified AI content generation pipeline with parallel execution support, multi-model integration, and YAML-based configuration.

Core Capabilities

  • 🔄 Unified Pipeline Architecture - YAML/JSON-based configuration for complex multi-step workflows
  • ⚡ Parallel Execution Engine - 2-3x performance improvement with thread-based parallel processing
  • 🎯 Type-Safe Configuration - Pydantic models with comprehensive validation
  • 💰 Cost Management - Real-time cost estimation and tracking across all services
  • 📊 Rich Logging - Beautiful console output with progress tracking and performance metrics

AI Service Integrations

  • 🖼️ FAL AI - Text-to-image, image-to-image, text-to-video, video generation, avatar creation
  • 🗣️ ElevenLabs - Professional text-to-speech with 20+ voice options
  • 🎥 Google Vertex AI - Veo video generation and Gemini text generation
  • 🔗 OpenRouter - Alternative TTS and chat completion services

Developer Experience

  • 🛠️ Professional CLI - Comprehensive command-line interface with Click
  • 📦 Modular Architecture - Clean separation of concerns with extensible design
  • 🧪 Comprehensive Testing - Unit and integration tests with pytest
  • 📚 Type Hints - Full type coverage for excellent IDE support

📦 Installation

Quick Start

    # Install from PyPI
    pip install video-ai-studio

    # Or install in development mode
    pip install -e .

🔑 API Keys Setup

After installation, you need to configure your API keys:

  1. Download the example configuration:

    # Option 1: Download from GitHub
    curl -o .env https://raw.githubusercontent.com/donghaozhang/video-agent-skill/main/.env.example

    # Option 2: Create manually
    touch .env

  2. Add your API keys to .env:

    # Required for most functionality
    FAL_KEY=your_fal_api_key_here

    # Optional - add as needed
    GEMINI_API_KEY=your_gemini_api_key_here
    OPENROUTER_API_KEY=your_openrouter_api_key_here
    ELEVENLABS_API_KEY=your_elevenlabs_api_key_here

  3. Get API keys from:

    • FAL AI: https://fal.ai/dashboard (required for most models)
    • Google Gemini: https://makersuite.google.com/app/apikey
    • OpenRouter: https://openrouter.ai/keys
    • ElevenLabs: https://elevenlabs.io/app/settings
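Before running any paid command, it can help to confirm that the keys are actually in place. The following is a minimal sketch using only the standard library. `parse_env_text` and `missing_keys` are hypothetical helper names, not part of the package, and treating only `FAL_KEY` as required is an assumption based on the "required for most functionality" comment in the `.env` example.

```python
import os

# Key names come from the .env example above. Treating only FAL_KEY as
# required is an assumption, not something the package is known to enforce.
REQUIRED_KEYS = ("FAL_KEY",)
OPTIONAL_KEYS = ("GEMINI_API_KEY", "OPENROUTER_API_KEY", "ELEVENLABS_API_KEY")


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, names):
    """Return the names that are absent, empty, or still a your_... placeholder."""
    return [
        name for name in names
        if not values.get(name) or values[name].startswith("your_")
    ]


if __name__ == "__main__":
    env = dict(os.environ)
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            # Values already set in the real environment take precedence.
            env = {**parse_env_text(handle.read()), **env}
    print("missing required:", missing_keys(env, REQUIRED_KEYS))
    print("missing optional:", missing_keys(env, OPTIONAL_KEYS))
```

Run from the project root, the script only reports which key names are missing and never prints the key values themselves.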

📋 Dependencies

The package installs core dependencies automatically. See requirements.txt for the complete list.

🛠️ Quick Start

Console Commands

    # List all available AI models
    ai-content-pipeline list-models

    # Generate image from text
    ai-content-pipeline generate-image --text "epic space battle" --model flux_dev

    # Create video (text → image → video)
    ai-content-pipeline create-video --text "serene mountain lake"

    # Run custom pipeline from YAML config
    ai-content-pipeline run-chain --config config.yaml --input "cyberpunk city"

    # Create example configurations
    ai-content-pipeline create-examples

    # Shortened command alias
    aicp --help

Python API

    from packages.core.ai_content_pipeline.pipeline.manager import AIPipelineManager

    # Initialize manager
    manager = AIPipelineManager()

    # Quick video creation
    result = manager.quick_create_video(
        text="serene mountain lake",
        image_model="flux_dev",
        video_model="auto"
    )

    # Run custom chain
    chain = manager.create_chain_from_config("config.yaml")
    result = manager.execute_chain(chain, "input text")

📚 Package Structure

Core Packages

Provider Packages

Google Services

  • google-veo - Google Veo video generation (Vertex AI)

FAL AI Services

Service Packages

🔧 Configuration

Environment Setup

Create a .env file in the project root:

    # FAL AI API Configuration
    FAL_KEY=your_fal_api_key

    # Google Cloud Configuration (for Veo)
    PROJECT_ID=your-project-id
    OUTPUT_BUCKET_PATH=gs://your-bucket/veo_output/

    # ElevenLabs Configuration
    ELEVENLABS_API_KEY=your_elevenlabs_api_key

    # Optional: Gemini for AI analysis
    GEMINI_API_KEY=your_gemini_api_key

    # Optional: OpenRouter for additional models
    OPENROUTER_API_KEY=your_openrouter_api_key

YAML Pipeline Configuration

    name: "Text to Video Pipeline"
    description: "Generate video from text prompt"
    steps:
      - name: "generate_image"
        type: "text_to_image"
        model: "flux_dev"
        aspect_ratio: "16:9"
      - name: "create_video"
        type: "image_to_video"
        model: "kling_video"
        input_from: "generate_image"
        duration: 8
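The `input_from` field links one step to the output of another, so a misspelled step name can break a chain before any model is called. The sketch below shows one way to check those links ahead of time. It mirrors the YAML example as a plain Python dict to avoid needing a YAML parser. `check_step_references` is a hypothetical helper, not the package's own validation, and the rule that `input_from` must name an earlier step is an assumption inferred from the example.

```python
# Plain-dict mirror of the YAML example above (no YAML parser needed here).
PIPELINE = {
    "name": "Text to Video Pipeline",
    "steps": [
        {"name": "generate_image", "type": "text_to_image",
         "model": "flux_dev", "aspect_ratio": "16:9"},
        {"name": "create_video", "type": "image_to_video",
         "model": "kling_video", "input_from": "generate_image", "duration": 8},
    ],
}


def check_step_references(config):
    """Return a list of problems with step names and input_from links."""
    problems = []
    seen = set()  # names of steps defined so far, in order
    for index, step in enumerate(config.get("steps", [])):
        name = step.get("name")
        if not name:
            problems.append(f"step {index} has no name")
        elif name in seen:
            problems.append(f"duplicate step name: {name}")
        source = step.get("input_from")
        # Assumed rule: a step may only read from a step defined before it.
        if source is not None and source not in seen:
            problems.append(
                f"step {name!r} reads from unknown or later step {source!r}"
            )
        if name:
            seen.add(name)
    return problems


if __name__ == "__main__":
    print(check_step_references(PIPELINE) or "no problems found")
```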

Parallel Execution

Enable parallel processing for 2-3x speedup:

    # Enable parallel execution
    PIPELINE_PARALLEL_ENABLED=true ai-content-pipeline run-chain --config config.yaml

Example parallel pipeline configuration:

    name: "Parallel Processing Example"
    steps:
      - type: "parallel_group"
        steps:
          - type: "text_to_image"
            model: "flux_schnell"
            params:
              prompt: "A cat"
          - type: "text_to_image"
            model: "flux_schnell"
            params:
              prompt: "A dog"
          - type: "text_to_image"
            model: "flux_schnell"
            params:
              prompt: "A bird"
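To illustrate why thread-based parallelism helps with independent, network-bound steps like the three prompts above, here is a small standalone sketch using `concurrent.futures` from the standard library. It does not use the package's engine: `fake_text_to_image` is a stand-in that sleeps instead of calling an API, and the way `parallel_enabled` reads `PIPELINE_PARALLEL_ENABLED` is an assumption about how such a flag could be parsed.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

PROMPTS = ["A cat", "A dog", "A bird"]


def fake_text_to_image(prompt):
    """Stand-in for a network-bound model call; sleeps instead of calling an API."""
    time.sleep(0.2)
    return f"image for {prompt!r}"


def parallel_enabled(environ=os.environ):
    """Assumed parsing: only the value 'true' (any letter case) enables it."""
    return environ.get("PIPELINE_PARALLEL_ENABLED", "").strip().lower() == "true"


def run_group(prompts, parallel):
    """Run every prompt, either one after another or on a thread pool."""
    if not parallel:
        return [fake_text_to_image(p) for p in prompts]
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        # Executor.map returns results in input order, not completion order.
        return list(pool.map(fake_text_to_image, prompts))


if __name__ == "__main__":
    start = time.perf_counter()
    results = run_group(PROMPTS, parallel_enabled())
    print(results)
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

With the flag unset, the three simulated calls run back to back. With it set to `true`, they overlap, so the elapsed time is close to that of a single call. Real speedups depend on the provider's latency and rate limits.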

💰 Cost Management

Cost Estimation

Always estimate costs before running pipelines:

    # Estimate cost for a pipeline
    ai-content-pipeline estimate-cost --config config.yaml

Typical Costs

  • Text-to-Image: $0.001-0.004 per image
  • Image-to-Image: $0.01-0.05 per modification
  • Text-to-Video: $0.08-6.00 per video (model dependent)
  • Avatar Generation: $0.02-0.05 per video
  • Text-to-Speech: Varies by usage (ElevenLabs pricing)
  • Video Processing: $0.05-2.50 per video (model dependent)
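The ranges above are wide, so a quick back-of-the-envelope bound can be useful before committing to a batch. The sketch below multiplies the listed per-item ranges by planned job counts. It is only arithmetic on the figures in this list, not the package's `estimate-cost` logic, and `estimate_range` and the dictionary keys are hypothetical names.

```python
# (low, high) USD per item, copied from the "Typical Costs" list above.
# Text-to-speech is left out because the list gives no figure for it.
TYPICAL_COSTS = {
    "text_to_image": (0.001, 0.004),
    "image_to_image": (0.01, 0.05),
    "text_to_video": (0.08, 6.00),
    "avatar": (0.02, 0.05),
    "video_processing": (0.05, 2.50),
}


def estimate_range(job_counts):
    """Return (low, high) total cost in USD for a dict of {job_type: count}."""
    low = sum(TYPICAL_COSTS[kind][0] * count for kind, count in job_counts.items())
    high = sum(TYPICAL_COSTS[kind][1] * count for kind, count in job_counts.items())
    return round(low, 4), round(high, 4)


if __name__ == "__main__":
    # 20 images plus 2 text-to-video clips.
    low, high = estimate_range({"text_to_image": 20, "text_to_video": 2})
    print(f"expected cost between ${low:.2f} and ${high:.2f}")
```

For 20 images and 2 text-to-video clips the bound runs from $0.18 to $12.08, almost all of it from the video model choice, which is why a per-pipeline estimate is worth running first.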

Cost-Conscious Usage

  • Use cheaper models for prototyping (flux_schnell, hailuo)
  • Test with small batches before large-scale generation
  • Monitor costs with built-in tracking

🧪 Testing

    # Quick tests
    python tests/run_all_tests.py --quick

📋 See tests/README.md for complete testing guide.

💰 Cost Management

Estimation

  • FAL AI Video: ~$0.05-0.10 per video
  • FAL AI Text-to-Video: ~$0.08 (MiniMax) to $2.50-6.00 (Google Veo 3)
  • FAL AI Avatar: ~$0.02-0.05 per video
  • FAL AI Images: ~$0.001-0.01 per image
  • Text-to-Speech: Varies by usage (ElevenLabs pricing)

Best Practices

  1. Always run test_setup.py first (FREE)
  2. Use cost estimation in pipeline manager
  3. Start wi
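Five-Dimension Analysis
Clarity: 8/10
Innovation: 7/10
Practicality: 9/10
Completeness: 8/10
Maintainability: 8/10

Pros and Cons

Pros

  • Supports many AI models for generating varied content
  • Offers parallel processing for better performance
  • Comprehensive, easy-to-use command-line interface

Cons

  • Setup is complex and requires multiple API keys
  • Costs can be high depending on usage
  • Steep learning curve for new users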

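Disclaimer: This content comes from an open-source GitHub project and is shown for display and scoring analysis only.

Copyright belongs to the original author, donghaozhang.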