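💡 Summary
This skill uses Alibaba's WanX text-to-image model to generate and edit images.
🎯 Target audience
🤖 AI comment: "The README suggests accessing image URLs and requires API credentials, which may create a risk of unauthorized access. Manage environment variables carefully to reduce exposure of sensitive information."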
name: wanx-img
description: Generate or edit images using WanX (Alibaba's text-to-image model)
allowed-tools: Read, Bash
Overview
This skill provides commands for generating and editing images using the WanX model from Alibaba.
Workflow
- Determine the user's intent: generate a new image, or edit the given images.
- Decide which version of the visual model to use. Use the highest version by default; if it is not available or is rejected by the provider, downgrade to a lower version.
- If the user provides image URLs or paths, do not read the files; just pass them to the scripts.
- If the user specifies the image size in pixels, pass it to the scripts. If the user specifies an aspect ratio, convert it to an image size first, then pass that to the scripts.
- Run the appropriate script to generate or edit images with the user's prompt, synchronously by default.
- Output the original prompts, the extended actual prompts, the image size, and the full URLs of the generated images (do not drop any URL parameters, for example Signature).
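The version fallback described in the workflow could be sketched roughly as follows. This is a minimal sketch and not part of the skill itself: the `run_script` and `run_with_fallback` helpers are hypothetical, and the code assumes that a demo script exits with a non-zero status when the model version is unavailable or rejected by the provider. Only the script names come from this skill.

```python
import subprocess
import sys

# Script names are taken from this skill, ordered from highest to lowest version.
TEXT_TO_IMAGE_SCRIPTS = [
    "./scripts/wanx2.6-text-to-image-v2-demo.py",
    "./scripts/wanx2.5-text-to-image-v2-demo.py",
    "./scripts/wanx2.2-text-to-image-v2-demo.py",
]


def run_script(script, args):
    """Run one demo script and return (exit_code, stdout)."""
    result = subprocess.run(
        [sys.executable, script, *args],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def run_with_fallback(scripts, args, runner=run_script):
    """Try each script in order; return (script, stdout) for the first success.

    Assumption: a script signals failure with a non-zero exit code.
    """
    failures = []
    for script in scripts:
        code, output = runner(script, args)
        if code == 0:
            return script, output
        failures.append((script, code))
    raise RuntimeError(f"all versions failed: {failures}")
```

A call such as `run_with_fallback(TEXT_TO_IMAGE_SCRIPTS, ["--prompt", "美丽的日落风景", "--sync"])` would try version 2.6 first and move down the list only if it fails. The `runner` parameter exists so the fallback logic can be exercised without calling the real API.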
Conversion between ratio and image size
The total number of pixels must be between [1280*1280, 1440*1440] and the aspect ratio must be between [1:4, 4:1]. For example, 768*2700 meets the requirements. The default value is 1280*1280.

| aspect ratio | image size |
| --- | --- |
| 1:1 | 1280*1280 |
| 3:4 | 1104*1472 |
| 4:3 | 1472*1104 |
| 9:16 | 960*1696 |
| 16:9 | 1696*960 |
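One way to turn an aspect ratio into a size string is sketched below. It is only an illustration under stated assumptions: the `size_for_ratio` helper is hypothetical, the preset sizes are copied from the table and returned as-is, and for any other ratio the sketch aims at the middle of the allowed pixel range, which is a choice made here rather than something the skill prescribes.

```python
import math

MIN_PIXELS = 1280 * 1280
MAX_PIXELS = 1440 * 1440

# Presets copied verbatim from the table; they are returned as-is
# without re-checking the pixel range.
PRESETS = {
    (1, 1): (1280, 1280),
    (3, 4): (1104, 1472),
    (4, 3): (1472, 1104),
    (9, 16): (960, 1696),
    (16, 9): (1696, 960),
}


def size_for_ratio(ratio_w, ratio_h):
    """Return a "width*height" string for the integer ratio ratio_w:ratio_h."""
    if ratio_w <= 0 or ratio_h <= 0:
        raise ValueError("ratio parts must be positive")
    if not 1 / 4 <= ratio_w / ratio_h <= 4:
        raise ValueError("aspect ratio must be between 1:4 and 4:1")

    # Reduce the ratio so that, for example, 6:8 matches the 3:4 preset.
    divisor = math.gcd(ratio_w, ratio_h)
    key = (ratio_w // divisor, ratio_h // divisor)
    if key in PRESETS:
        width, height = PRESETS[key]
        return f"{width}*{height}"

    # Assumption: target the middle of the allowed range so that rounding
    # to whole pixels cannot push the total outside it.
    target = (MIN_PIXELS + MAX_PIXELS) / 2
    width = round(math.sqrt(target * ratio_w / ratio_h))
    height = round(math.sqrt(target * ratio_h / ratio_w))
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        raise ValueError(f"computed size {width}*{height} is out of range")
    return f"{width}*{height}"
```

For example, `size_for_ratio(16, 9)` returns `"1696*960"` from the presets, and the result can be passed directly to a script's `--size` option.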
Available Scripts
- wanx2.6-text-to-image-v2-demo.py - Generate images from text prompts, using WanX visual model version 2.6
- wanx2.6-image-edit-demo.py - Edit images with text prompts, optionally with reference/mask images, using WanX visual model version 2.6
- wanx2.5-text-to-image-v2-demo.py - Generate images from text prompts, using WanX visual model version 2.5
- wanx2.5-image-edit-demo.py - Edit images with text prompts, optionally with reference/mask images, using WanX visual model version 2.5
- wanx2.2-text-to-image-v2-demo.py - Generate images from text prompts, using WanX visual model version 2.2
Setting Up
For first-time use, change into the skill directory and install the dependencies:
cd ~/.claude/skills/wanx-img
python3 -m venv ~/.pyenv/versions/py312-ai-rag
source ~/.pyenv/versions/py312-ai-rag/bin/activate
pip install dashscope python-dotenv argparse
cp .env.example .env
For later runs of the Python scripts, first run source ~/.pyenv/versions/py312-ai-rag/bin/activate, then run the script. If the python or python3 command cannot find the required packages, use ~/.pyenv/versions/py312-ai-rag/bin/python as the interpreter instead.
Usage Examples
- Use custom prompt to generate image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-text-to-image-v2-demo.py" --prompt "一只可爱的猫咪在花园里玩耍. ar 3:4" --size "960*1280"
- Use synchronous call with custom prompt and negative prompt to generate image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-text-to-image-v2-demo.py" -p "美丽的日落风景" -n "人物" --sync
- Use custom prompt and reference images to edit image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-image-edit-demo.py" --prompt "参考图1的风格和图2的背景,生成番茄炒蛋" --images http://1.img http://2.img -m http://3.img -b http://4.img
- Use synchronous call with custom prompt to edit image
~/.pyenv/versions/py312-ai-rag/bin/python "./scripts/wanx2.6-image-edit-demo.py" -p "参考图1的风格和图2的背景,生成番茄炒蛋" --sync
Requirements
- Python 3.12+
- LLM API credentials configured in demo scripts
- DashScope Python SDK 1.25.8+
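The version requirements above can be checked before running any script. The following is a minimal sketch using only the standard library; the `parse_version` and `check_requirements` helpers are hypothetical, and the version parsing is deliberately simplified (it ignores pre-release suffixes), so treat it as an illustration rather than a complete check.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 12)
MIN_DASHSCOPE = (1, 25, 8)


def parse_version(text):
    """Turn "1.25.8" into (1, 25, 8); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(str(n) for n in sys.version_info[:3])
        problems.append(f"Python 3.12+ is required, found {found}")
    try:
        installed = metadata.version("dashscope")
    except metadata.PackageNotFoundError:
        problems.append("dashscope is not installed")
    else:
        if parse_version(installed) < MIN_DASHSCOPE:
            problems.append(f"dashscope 1.25.8+ is required, found {installed}")
    return problems
```

Running `check_requirements()` inside the virtual environment created during setup should return an empty list once the dependencies are installed.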
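Pros
- Uses advanced AI for image generation.
- Supports both image creation and editing.
- Flexible aspect ratio and size options.
Cons
- Requires a specific Python version and dependencies.
- Limited to the capabilities of the WanX model.
- Setup may be complicated for non-technical users.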
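Disclaimer: This content comes from an open-source GitHub project and is provided for display and rating analysis only.
Copyright belongs to the original author, mebusw.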
