video-agent-skill
donghaozhang/video-agent-skill · Agent Score: 80
Updated 24 days ago

💡 Summary

A comprehensive AI content generation suite offering multiple models for text, image, and video processing.

🎯 Target Audience

  • Content creators looking to automate video production
  • Developers integrating AI models into applications
  • Marketers needing quick visual content generation
  • Educators creating engaging multimedia materials
  • Researchers exploring AI capabilities in media generation

🤖 AI Roast: “This README is like a Swiss Army knife—useful, but a bit overwhelming at times.”

Security Analysis: Medium Risk

The README indicates that API keys for various services are required, which poses a risk if they are not securely managed. Store keys in environment variables and avoid hardcoding sensitive information.
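As a minimal sketch of that advice, the helper below reads a key from the environment and fails fast when it is missing (the function name is ours for illustration, not part of this package):

```python
import os

def get_api_key(name: str) -> str:
    """Read an API key from the environment instead of hardcoding it."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# FAL_KEY is the variable this README documents for FAL AI access;
# reading it at startup surfaces a missing key immediately rather than
# failing mid-pipeline.
# fal_key = get_api_key("FAL_KEY")
```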

AI Content Generation Suite

A comprehensive AI content generation package with multiple providers and services, consolidated into a single installable package.

Python 3.10+ License: MIT Code style: black PyPI

⚡ Production-ready Python package with comprehensive CLI, parallel execution, and enterprise-grade architecture

🎬 Demo Video

AI Content Generation Suite Demo

Click to watch the complete demo of AI Content Generation Suite in action

🎨 Available AI Models

40+ AI models across 8 categories; top picks are shown below. See the full models reference for the complete list.

Text-to-Image (Top Picks)

| Model | Cost | Best For |
|-------|------|----------|
| nano_banana_pro | $0.002 | Fast & high-quality |
| gpt_image_1_5 | $0.003 | GPT-powered generation |

Image-to-Video (Top Picks)

| Model | Cost | Best For |
|-------|------|----------|
| sora_2 | $0.40-1.20 | OpenAI quality |
| kling_2_6_pro | $0.50-1.00 | Professional quality |

Text-to-Video (Top Picks)

| Model | Cost | Best For |
|-------|------|----------|
| sora_2 | $0.40-1.20 | OpenAI quality |
| kling_2_6_pro | $0.35-1.40 | Quality + audio |

💡 Cost-Saving Tip: Use the --mock flag for FREE validation: `ai-content-pipeline generate-image --text "test" --mock`

📚 View all 40+ models →

๐Ÿท๏ธ Latest Release

PyPI Version GitHub Release

What's New in v1.0.18

  • ✅ Automated PyPI publishing via GitHub Actions
  • 🔧 Consolidated setup files for cleaner package structure
  • 🎯 All 40+ AI models with comprehensive parallel processing support
  • 📦 Improved CI/CD workflow with skip-existing option

🚀 FLAGSHIP: AI Content Pipeline

The unified AI content generation pipeline with parallel execution support, multi-model integration, and YAML-based configuration.

Core Capabilities

  • 🔄 Unified Pipeline Architecture - YAML/JSON-based configuration for complex multi-step workflows
  • ⚡ Parallel Execution Engine - 2-3x performance improvement with thread-based parallel processing
  • 🎯 Type-Safe Configuration - Pydantic models with comprehensive validation
  • 💰 Cost Management - Real-time cost estimation and tracking across all services
  • 📊 Rich Logging - Beautiful console output with progress tracking and performance metrics
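To illustrate the validated-configuration idea above (the package itself uses Pydantic; this sketch uses only the standard library, and the field names and step-type set are illustrative assumptions):

```python
from dataclasses import dataclass

# Step types named elsewhere in this README; treat the set as illustrative.
VALID_STEP_TYPES = {"text_to_image", "image_to_image", "text_to_video", "image_to_video"}

@dataclass
class StepConfig:
    name: str
    type: str
    model: str

    def __post_init__(self) -> None:
        # Reject unknown step types at construction time, the same
        # fail-early behavior a Pydantic model would provide.
        if self.type not in VALID_STEP_TYPES:
            raise ValueError(f"Unknown step type: {self.type!r}")

step = StepConfig(name="generate_image", type="text_to_image", model="flux_dev")
```

Validating at construction time means a typo in a pipeline config fails before any paid API call is made.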

AI Service Integrations

  • 🖼️ FAL AI - Text-to-image, image-to-image, text-to-video, video generation, avatar creation
  • 🗣️ ElevenLabs - Professional text-to-speech with 20+ voice options
  • 🎥 Google Vertex AI - Veo video generation and Gemini text generation
  • 🔗 OpenRouter - Alternative TTS and chat completion services

Developer Experience

  • 🛠️ Professional CLI - Comprehensive command-line interface with Click
  • 📦 Modular Architecture - Clean separation of concerns with extensible design
  • 🧪 Comprehensive Testing - Unit and integration tests with pytest
  • 📚 Type Hints - Full type coverage for excellent IDE support

📦 Installation

Quick Start

```bash
# Install from PyPI
pip install video-ai-studio

# Or install in development mode
pip install -e .
```

🔑 API Keys Setup

After installation, you need to configure your API keys:

  1. Download the example configuration:

```bash
# Option 1: Download from GitHub
curl -o .env https://raw.githubusercontent.com/donghaozhang/video-agent-skill/main/.env.example

# Option 2: Create manually
touch .env
```
  2. Add your API keys to .env:

```bash
# Required for most functionality
FAL_KEY=your_fal_api_key_here

# Optional - add as needed
GEMINI_API_KEY=your_gemini_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
```
  3. Get API keys from:

    • FAL AI: https://fal.ai/dashboard (required for most models)
    • Google Gemini: https://makersuite.google.com/app/apikey
    • OpenRouter: https://openrouter.ai/keys
    • ElevenLabs: https://elevenlabs.io/app/settings

📋 Dependencies

The package installs core dependencies automatically. See requirements.txt for the complete list.

🛠️ Quick Start

Console Commands

```bash
# List all available AI models
ai-content-pipeline list-models

# Generate image from text
ai-content-pipeline generate-image --text "epic space battle" --model flux_dev

# Create video (text → image → video)
ai-content-pipeline create-video --text "serene mountain lake"

# Run custom pipeline from YAML config
ai-content-pipeline run-chain --config config.yaml --input "cyberpunk city"

# Create example configurations
ai-content-pipeline create-examples

# Shortened command alias
aicp --help
```

Python API

```python
from packages.core.ai_content_pipeline.pipeline.manager import AIPipelineManager

# Initialize manager
manager = AIPipelineManager()

# Quick video creation
result = manager.quick_create_video(
    text="serene mountain lake",
    image_model="flux_dev",
    video_model="auto"
)

# Run custom chain
chain = manager.create_chain_from_config("config.yaml")
result = manager.execute_chain(chain, "input text")
```

📚 Package Structure

Core Packages

Provider Packages

Google Services

  • google-veo - Google Veo video generation (Vertex AI)

FAL AI Services

Service Packages

🔧 Configuration

Environment Setup

Create a .env file in the project root:

```bash
# FAL AI API Configuration
FAL_KEY=your_fal_api_key

# Google Cloud Configuration (for Veo)
PROJECT_ID=your-project-id
OUTPUT_BUCKET_PATH=gs://your-bucket/veo_output/

# ElevenLabs Configuration
ELEVENLABS_API_KEY=your_elevenlabs_api_key

# Optional: Gemini for AI analysis
GEMINI_API_KEY=your_gemini_api_key

# Optional: OpenRouter for additional models
OPENROUTER_API_KEY=your_openrouter_api_key
```

YAML Pipeline Configuration

```yaml
name: "Text to Video Pipeline"
description: "Generate video from text prompt"
steps:
  - name: "generate_image"
    type: "text_to_image"
    model: "flux_dev"
    aspect_ratio: "16:9"
  - name: "create_video"
    type: "image_to_video"
    model: "kling_video"
    input_from: "generate_image"
    duration: 8
```
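The `input_from` field chains one step's output into the next. A hedged sketch of how such a config can be checked once deserialized (the helper function is ours, not the package's API):

```python
# The YAML config deserializes to a plain dict of this shape.
config = {
    "name": "Text to Video Pipeline",
    "steps": [
        {"name": "generate_image", "type": "text_to_image", "model": "flux_dev"},
        {"name": "create_video", "type": "image_to_video", "model": "kling_video",
         "input_from": "generate_image"},
    ],
}

def execution_order(steps):
    """Return step names in order, checking that every input_from
    refers to a step that appears earlier in the list."""
    seen, order = set(), []
    for step in steps:
        source = step.get("input_from")
        if source is not None and source not in seen:
            raise ValueError(f"Step {step['name']!r} depends on unknown step {source!r}")
        seen.add(step["name"])
        order.append(step["name"])
    return order

# execution_order(config["steps"]) yields ["generate_image", "create_video"]
```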

Parallel Execution

Enable parallel processing for 2-3x speedup:

```bash
# Enable parallel execution
PIPELINE_PARALLEL_ENABLED=true ai-content-pipeline run-chain --config config.yaml
```

Example parallel pipeline configuration:

```yaml
name: "Parallel Processing Example"
steps:
  - type: "parallel_group"
    steps:
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A cat"
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A dog"
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A bird"
```
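The parallel_group semantics can be sketched with a standard thread pool; `run_step` below is a stand-in for a real model call, not the package's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def run_step(step: dict) -> str:
    # Stand-in for a real text_to_image call; just echoes the prompt.
    return f"image for {step['params']['prompt']}"

group = [
    {"type": "text_to_image", "model": "flux_schnell", "params": {"prompt": "A cat"}},
    {"type": "text_to_image", "model": "flux_schnell", "params": {"prompt": "A dog"}},
    {"type": "text_to_image", "model": "flux_schnell", "params": {"prompt": "A bird"}},
]

# Steps in the group run concurrently; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=len(group)) as pool:
    results = list(pool.map(run_step, group))
```

Because each step is an independent API call that mostly waits on the network, threads are enough to overlap the latency, which is where the quoted 2-3x speedup comes from.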

💰 Cost Management

Cost Estimation

Always estimate costs before running pipelines:

```bash
# Estimate cost for a pipeline
ai-content-pipeline estimate-cost --config config.yaml
```

Typical Costs

  • Text-to-Image: $0.001-0.004 per image
  • Image-to-Image: $0.01-0.05 per modification
  • Text-to-Video: $0.08-6.00 per video (model dependent)
  • Avatar Generation: $0.02-0.05 per video
  • Text-to-Speech: Varies by usage (ElevenLabs pricing)
  • Video Processing: $0.05-2.50 per video (model dependent)
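A back-of-the-envelope estimator built from the ranges above; the per-unit prices are midpoint assumptions for illustration, not the package's real pricing table:

```python
# Rough midpoint prices drawn from the ranges listed above (assumptions).
PRICE_TABLE = {
    "text_to_image": 0.002,
    "image_to_image": 0.03,
    "text_to_video": 1.00,
}

def estimate_cost(steps) -> float:
    """Sum a rough per-step cost; unknown step types contribute nothing."""
    return sum(PRICE_TABLE.get(step["type"], 0.0) for step in steps)

pipeline = [{"type": "text_to_image"}, {"type": "text_to_video"}]
total = estimate_cost(pipeline)  # roughly $1.00, dominated by the video step
```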

Cost-Conscious Usage

  • Use cheaper models for prototyping (flux_schnell, hailuo)
  • Test with small batches before large-scale generation
  • Monitor costs with built-in tracking

🧪 Testing

```bash
# Quick tests
python tests/run_all_tests.py --quick
```

📋 See tests/README.md for complete testing guide.

💰 Cost Management

Estimation

  • FAL AI Video: ~$0.05-0.10 per video
  • FAL AI Text-to-Video: ~$0.08 (MiniMax) to $2.50-6.00 (Google Veo 3)
  • FAL AI Avatar: ~$0.02-0.05 per video
  • FAL AI Images: ~$0.001-0.01 per image
  • Text-to-Speech: Varies by usage (ElevenLabs pricing)

Best Practices

  1. Always run test_setup.py first (FREE)
  2. Use cost estimation in pipeline manager
  3. Start with small batches before large-scale generation
5-Dim Analysis

  • Clarity: 8/10
  • Novelty: 7/10
  • Utility: 9/10
  • Completeness: 8/10
  • Maintainability: 8/10
Pros & Cons

Pros

  • Supports multiple AI models for diverse content generation
  • Offers parallel processing for improved performance
  • Comprehensive CLI for ease of use

Cons

  • Complex setup process with multiple API keys
  • Potentially high costs depending on usage
  • Steep learning curve for new users

Disclaimer: This content is sourced from GitHub open source projects for display and rating purposes only.

Copyright belongs to the original author donghaozhang.