Co-Pilot / Assistive
Updated 25 days ago

wanx-img

mebusw/wanx-img
Agent score: 76

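💡 Summary

This skill uses Alibaba's WanX text-to-image model to generate and edit images.

🎯 Target audience

Graphic designers looking for AI-assisted image creation. Developers integrating image generation into applications. Content creators who need distinctive visuals. Researchers exploring AI in the visual arts.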

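🤖 AI roast: The README suggests accessing image URLs and requires API credentials, which could create a risk of unauthorized access. Manage environment variables carefully to reduce exposure of sensitive information.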

Security analysis: medium risk



name: wanx-img
description: Generate or edit images using WanX (Alibaba's text-to-image model)
allowed-tools: Read, Bash

Overview

This skill provides commands for generating and editing images using the WanX model from Alibaba.

Workflow

  1. Determine the user's intent: generate a new image, or edit the given images.
  2. Choose which version of the visual model to use. Use the highest version by default; if it is not available or is rejected by the provider, downgrade to another version.
  3. If the user provides image URLs or paths, do not read the files; pass them straight to the scripts.
  4. If the user specifies the image size in pixels, pass it to the scripts. If the user specifies an aspect ratio, convert it to an image size first, then pass that to the scripts.
  5. Run the appropriate script to generate or edit images with the user's prompt, synchronously by default.
  6. Output the original prompt, the extended actual prompt, the image size, and the full URLs of the generated images (do not drop any URL parameters, for example Signature).
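
Step 6 asks that URL parameters such as Signature survive intact. A minimal sketch of a check for that, assuming the generated URL carries its credentials in the query string; the helper name `missing_query_params`, the example.com URL, and the `Expires` parameter are illustrative assumptions, not part of the skill:

```python
from urllib.parse import parse_qs, urlsplit


def missing_query_params(original_url: str, reported_url: str) -> set:
    """Return query parameter names present in the original URL but lost in the reported one."""
    original = set(parse_qs(urlsplit(original_url).query))
    reported = set(parse_qs(urlsplit(reported_url).query))
    return original - reported


# Hypothetical signed URL, used only for illustration.
full_url = "https://example.com/out.png?Expires=1700000000&Signature=abc%3D"
truncated_url = "https://example.com/out.png"

print(missing_query_params(full_url, full_url))       # nothing lost
print(missing_query_params(full_url, truncated_url))  # both parameters lost
```

A non-empty result means the reported URL was truncated and would likely fail to load.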

Conversion between ratio and image size

The total number of pixels must be between [1280*1280, 1440*1440] and the aspect ratio must be between [1:4, 4:1]. For example, 768*2700 meets the requirements. The default value is 1280*1280.

| aspect ratio | image size |
| --- | --- |
| 1:1 | 1280*1280 |
| 3:4 | 1104*1472 |
| 4:3 | 1472*1104 |
| 9:16 | 960*1696 |
| 16:9 | 1696*960 |
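
The conversion can be sketched as a lookup over the table plus a range check for custom sizes. The function names `size_for_ratio` and `is_valid_custom_size` are hypothetical; the sizes and limits are copied from the text above.

```python
from fractions import Fraction

# Recommended sizes copied from the table above, as (width, height).
RATIO_TO_SIZE = {
    "1:1": (1280, 1280),
    "3:4": (1104, 1472),
    "4:3": (1472, 1104),
    "9:16": (960, 1696),
    "16:9": (1696, 960),
}

MIN_PIXELS = 1280 * 1280  # stated lower bound on total pixels
MAX_PIXELS = 1440 * 1440  # stated upper bound on total pixels


def size_for_ratio(ratio: str) -> str:
    """Return the recommended 'W*H' string for a ratio listed in the table."""
    width, height = RATIO_TO_SIZE[ratio]
    return f"{width}*{height}"


def is_valid_custom_size(width: int, height: int) -> bool:
    """Check a custom size against the stated pixel and aspect-ratio limits."""
    pixels_ok = MIN_PIXELS <= width * height <= MAX_PIXELS
    ratio_ok = Fraction(1, 4) <= Fraction(width, height) <= Fraction(4, 1)
    return pixels_ok and ratio_ok


print(size_for_ratio("3:4"))
print(is_valid_custom_size(768, 2700))
```

The strict check is meant for user-supplied sizes only. The table's non-square entries fall slightly below 1280*1280 total pixels (1104*1472 is 1,625,088), so they are returned from the lookup without validation; how much tolerance the provider allows is not stated here.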

Available Scripts

  • wanx2.6-text-to-image-v2-demo.py - Generate images from text prompts, using WanX visual model version 2.6
  • wanx2.6-image-edit-demo.py - Edit images with text prompts, optionally with reference/mask images, using WanX visual model version 2.6
  • wanx2.5-text-to-image-v2-demo.py - Generate images from text prompts, using WanX visual model version 2.5
  • wanx2.5-image-edit-demo.py - Edit images with text prompts, optionally with reference/mask images, using WanX visual model version 2.5
  • wanx2.2-text-to-image-v2-demo.py - Generate images from text prompts, using WanX visual model version 2.2
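
One way to express "highest version first, then downgrade" over this list is sketched below. The `run` callback is a hypothetical stand-in for whatever actually executes a script and reports success; only the script names come from the list above.

```python
from typing import Callable, Optional

# Script names copied from the list above, highest version first.
SCRIPTS = {
    "generate": [
        "wanx2.6-text-to-image-v2-demo.py",
        "wanx2.5-text-to-image-v2-demo.py",
        "wanx2.2-text-to-image-v2-demo.py",
    ],
    "edit": [
        "wanx2.6-image-edit-demo.py",
        "wanx2.5-image-edit-demo.py",
    ],
}


def run_with_fallback(intent: str, run: Callable[[str], bool]) -> Optional[str]:
    """Try each script for the intent, newest first; return the first that succeeds.

    `run` is a hypothetical callback that executes one script and returns
    True on success, or False if that version is unavailable or rejected.
    """
    for script in SCRIPTS[intent]:
        if run(script):
            return script
    return None


# Simulate a provider that rejects version 2.6 but accepts 2.5.
chosen = run_with_fallback("generate", lambda script: "2.6" not in script)
print(chosen)
```

Returning `None` when every version fails leaves it to the caller to report the failure to the user.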

Setting Up

On first use, change into the skill directory and install the dependencies:

cd ~/.claude/skills/wanx-img
python3 -m venv ~/.pyenv/versions/py312-ai-rag
source ~/.pyenv/versions/py312-ai-rag/bin/activate
pip install dashscope python-dotenv argparse
cp .env.example .env

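For later runs of the Python scripts, first run source ~/.pyenv/versions/py312-ai-rag/bin/activate, then run the script. If the python or python3 command cannot find the required packages, use ~/.pyenv/versions/py312-ai-rag/bin/python as the interpreter instead.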

Usage Examples

  • Use custom prompt to generate image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-text-to-image-v2-demo.py" --prompt "一只可爱的猫咪在花园里玩耍. ar 3:4" --size "960*1280"
  • Use synchronous call with custom prompt and negative prompt to generate image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-text-to-image-v2-demo.py" -p "美丽的日落风景" -n "人物" --sync
  • Use custom prompt and referencing images to edit image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-image-edit-demo.py" --prompt "参考图1的风格和图2的背景,生成番茄炒蛋" --images http://1.img http://2.img -m http://3.img -b http://4.img
  • Use synchronous call with custom prompt to edit image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-image-edit-demo.py" -p "参考图1的风格和图2的背景,生成番茄炒蛋" --sync
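
The text-to-image invocations above could also be assembled programmatically. This is a minimal sketch that uses only flags appearing in the examples (`--prompt`, `-n`, `--size`, `--sync`); the helper name `build_generate_command` is hypothetical, and whether the script accepts every combination of these flags is an assumption.

```python
import os
from typing import List, Optional

# Interpreter path from the Setting Up section; "~" is expanded explicitly
# because it is not expanded when a command runs without a shell.
PYTHON = os.path.expanduser("~/.pyenv/versions/py312-ai-rag/bin/python")
SCRIPT = "./scripts/wanx2.6-text-to-image-v2-demo.py"


def build_generate_command(
    prompt: str,
    size: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    sync: bool = True,
) -> List[str]:
    """Build the argument list for a text-to-image run (hypothetical helper)."""
    argv = [PYTHON, SCRIPT, "--prompt", prompt]
    if negative_prompt:
        argv += ["-n", negative_prompt]
    if size:
        argv += ["--size", size]
    if sync:
        argv.append("--sync")
    return argv


command = build_generate_command("美丽的日落风景", negative_prompt="人物")
print(command[1:])
```

Passing the resulting list to `subprocess.run(command, check=True)` avoids shell quoting problems with prompts that contain spaces or punctuation.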

Requirements

  • Python 3.12+
  • LLM API credentials configured in demo scripts
  • DashScope Python SDK 1.25.8+
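
Five-dimension analysis
Clarity 8/10
Novelty 6/10
Practicality 9/10
Completeness 8/10
Maintainability 7/10

Pros and cons

Pros

  • Uses advanced AI for image generation.
  • Supports both image creation and editing.
  • Flexible aspect ratio and size options.

Cons

  • Requires a specific Python version and dependencies.
  • Limited to the capabilities of the WanX model.
  • Setup may be complex for non-technical users.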


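Disclaimer: This content comes from an open-source GitHub project and is provided for display and scoring analysis only.

Copyright belongs to the original author, mebusw.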