listenhub

Co-Pilot / Assistive
Updated a month ago

marswaveai/skills/skills/listenhub
Agent score: 80
💡 Summary

ListenHub easily turns text and URLs into podcasts, explainer videos, and images.

🎯 Who It's For

  • Content creators who want to produce audio/video content quickly.
  • Educators who want to create engaging explainer videos.
  • Marketers who need to generate podcasts for brand storytelling.
  • Developers looking for a simple API for multimedia content generation.
  • Students who want to convert articles to audio for easier consumption.

🤖 AI Hot Take: It looks capable, but don't let the configuration scare people off.

Security Analysis: Medium Risk

Risk: Medium. Recommended checks: whether it executes shell/command-line instructions; whether it makes outbound network requests (SSRF/data exfiltration); how API keys/tokens are obtained and stored, and whether they can leak; file read/write scope and path-traversal risk. Run with least privilege, and audit the code and dependencies before enabling in production.


name: listenhub
description: |
  Explain anything — turn ideas into podcasts, explainer videos, or voice narration. Use when the user wants to "make a podcast", "create an explainer video", "read this aloud", "generate an image", or share knowledge in audio/visual form. Supports: topic descriptions, YouTube links, article URLs, plain text, and image prompts.

Four modes, one entry point:

  • Podcast — Two-person dialogue, ideal for deep discussions
  • Explain — Single narrator + AI visuals, ideal for product intros
  • TTS/Flow Speech — Pure voice reading, ideal for articles
  • Image Generation — AI image creation, ideal for creative visualization

Users don't need to remember APIs, modes, or parameters. Just say what you want.

⛔ Hard Constraints (Inviolable)

The scripts are the ONLY interface. Period.

┌─────────────────────────────────────────────────────────┐
│  AI Agent  ──▶  ./scripts/*.sh  ──▶  ListenHub API     │
│                      ▲                                  │
│                      │                                  │
│            This is the ONLY path.                       │
│            Direct API calls are FORBIDDEN.              │
└─────────────────────────────────────────────────────────┘

MUST:

  • Execute functionality ONLY through provided scripts in **/skills/listenhub/scripts/
  • Pass user intent as script arguments exactly as documented
  • Trust script outputs; do not second-guess internal logic

MUST NOT:

  • Write curl commands to ListenHub/Marswave API directly
  • Construct JSON bodies for API calls manually
  • Guess or fabricate speakerIds, endpoints, or API parameters
  • Assume API structure based on patterns or web searches
  • Hallucinate features not exposed by existing scripts

Why: The API is proprietary. Endpoints, parameters, and speakerIds are NOT publicly documented. Web searches will NOT find this information. Any attempt to bypass scripts will produce incorrect, non-functional code.

Script Location

Scripts are located at **/skills/listenhub/scripts/ relative to your working context.

Different AI clients use different dot-directories:

  • Claude Code: .claude/skills/listenhub/scripts/
  • Other clients: may vary (.cursor/, .windsurf/, etc.)

Resolution: Use glob pattern **/skills/listenhub/scripts/*.sh to locate scripts reliably, or resolve from the SKILL.md file's own path.
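As a sketch of that resolution step (the search root and depth limit here are illustrative assumptions, not part of the skill):

```shell
# Sketch: locate the listenhub scripts directory regardless of which
# client dot-directory is in use (.claude/, .cursor/, .windsurf/, ...).
resolve_scripts_dir() {
  root="${1:-.}"
  # find descends into hidden directories by default, so dot-dirs are covered
  find "$root" -maxdepth 6 -type d -path '*/skills/listenhub/scripts' \
    2>/dev/null | head -n 1
}
```

The first match would then serve as `$SCRIPTS` for all later invocations.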

Private Data (Cannot Be Searched)

The following are internal implementation details that AI cannot reliably know:

| Category | Examples | How to Obtain |
|----------|----------|---------------|
| API Base URL | api.marswave.ai/... | ✗ Cannot — internal to scripts |
| Endpoints | podcast/episodes, etc. | ✗ Cannot — internal to scripts |
| Speaker IDs | cozy-man-english, etc. | ✓ Call get-speakers.sh |
| Request schemas | JSON body structure | ✗ Cannot — internal to scripts |
| Response formats | Episode ID, status codes | ✓ Documented per script |

Rule: If information is not in this SKILL.md or retrievable via a script (like get-speakers.sh), assume you don't know it.

Design Philosophy

Hide complexity, reveal magic.

Users don't need to know: Episode IDs, API structure, polling mechanisms, credits, endpoint differences. Users only need: Say idea → wait a moment → get the link.

Environment

ListenHub API Key

API key stored in $LISTENHUB_API_KEY. Check on first use:

source ~/.zshrc 2>/dev/null; [ -n "$LISTENHUB_API_KEY" ] && echo "ready" || echo "need_setup"

If setup needed, guide user:

  1. Visit https://listenhub.ai/zh/settings/api-keys
  2. Paste key (only the lh_sk_... part)
  3. Auto-save to ~/.zshrc
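A minimal sketch of the check-and-save step (the function name and the parameterized rc-file path are illustrative; the skill's actual setup flow may differ):

```shell
# Sketch: persist the ListenHub key to a shell rc file, idempotently.
# save_listenhub_key <key> <rcfile>
save_listenhub_key() {
  key="$1"; rcfile="$2"
  # only append if not already present, so repeated setup is harmless
  if ! grep -q 'LISTENHUB_API_KEY' "$rcfile" 2>/dev/null; then
    printf 'export LISTENHUB_API_KEY=%s\n' "$key" >> "$rcfile"
  fi
  export LISTENHUB_API_KEY="$key"
  echo "ready"
}
```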

Labnana API Key (for Image Generation)

API key stored in $LABNANA_API_KEY, output path in $LABNANA_OUTPUT_DIR.

On first image generation, the script auto-guides configuration:

  1. Visit https://labnana.com/api-keys (requires subscription)
  2. Paste API key
  3. Configure output path (default: ~/Downloads)
  4. Auto-save to shell rc file

Security: Never expose full API keys in output.

Mode Detection

Auto-detect mode from user input:

→ Podcast (Two-person dialogue)

  • Keywords: "podcast", "chat about", "discuss", "debate", "dialogue"
  • Use case: Topic exploration, opinion exchange, deep analysis
  • Feature: Two voices, interactive feel

→ Explain (Explainer video)

  • Keywords: "explain", "introduce", "video", "explainer", "tutorial"
  • Use case: Product intro, concept explanation, tutorials
  • Feature: Single narrator + AI-generated visuals, can export video

→ TTS (Text-to-speech)

  • Keywords: "read aloud", "convert to speech", "tts", "voice"
  • Use case: Article to audio, note review, document narration
  • Feature: Fastest (1-2 min), pure audio

→ Image Generation

  • Keywords: "generate image", "draw", "create picture", "visualize"
  • Use case: Creative visualization, concept art, illustrations
  • Feature: AI image generation via Labnana API, multiple resolutions and aspect ratios

Default: If unclear, ask user which format they prefer.

Explicit override: User can say "make it a podcast" / "I want explainer video" / "just voice" / "generate image" to override auto-detection.
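The keyword routing above could be sketched as follows (the heuristics are illustrative only, not the skill's actual detection logic):

```shell
# Sketch: naive keyword-based mode detection. A real agent should also
# weigh conversational context; unknown inputs fall through to "ask".
detect_mode() {
  input=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$input" in
    *podcast*|*discuss*|*debate*|*dialogue*) echo "podcast" ;;
    *explain*|*video*|*tutorial*)            echo "explain" ;;
    *aloud*|*tts*|*speech*|*voice*)          echo "tts" ;;
    *image*|*draw*|*picture*|*visualize*)    echo "image" ;;
    *)                                       echo "ask" ;;
  esac
}
```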

Interaction Flow

Step 1: Receive input + detect mode

→ Got it! Preparing...
  Mode: Two-person podcast
  Topic: Latest developments in Manus AI

For URLs, identify type:

  • youtu.be/XXX → convert to https://www.youtube.com/watch?v=XXX
  • Other URLs → use directly
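A sketch of that normalization (handles shortened youtu.be links only; query strings are stripped for simplicity):

```shell
# Sketch: expand youtu.be short links; pass other URLs through unchanged.
normalize_url() {
  case "$1" in
    https://youtu.be/*|http://youtu.be/*)
      id="${1##*/}"       # take the path segment after the last slash
      id="${id%%\?*}"     # drop any query string
      echo "https://www.youtube.com/watch?v=$id"
      ;;
    *) echo "$1" ;;
  esac
}
```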

Step 2: Submit generation

→ Generation submitted

  Estimated time:
  • Podcast: 2-3 minutes
  • Explain: 3-5 minutes
  • TTS: 1-2 minutes

  You can:
  • Wait and ask "done yet?"
  • Check listenhub.ai/zh/app/library
  • Do other things, ask later

Internally remember Episode ID for status queries.

Step 3: Query status

When user says "done yet?" / "ready?" / "check status":

  • Success: Show result + next options
  • Processing: "Still generating, wait another minute?"
  • Failed: "Generation failed, content might be unparseable. Try another?"

Step 4: Show results

Podcast result:

✓ Podcast generated!

  "{title}"

  Listen: https://listenhub.ai/zh/app/library

  Duration: ~{duration} minutes

  Need to download? Just say so.

Explain result:

✓ Explainer video generated!

  "{title}"

  Watch: https://listenhub.ai/zh/app/explainer-video/slides/{episodeId}

  Duration: ~{duration} minutes

  Need to download audio? Just say so.

Image result:

✓ Image generated!

  ~/Downloads/labnana-{timestamp}.jpg

Important: Prioritize web experience. Only provide download URLs when user explicitly requests.

Script Reference

All scripts are curl-based (no extra dependencies). Locate via **/skills/listenhub/scripts/*.sh.

⚠️ Long-running Tasks: Generation may take 1-5 minutes. Use your CLI client's native background execution feature:

  • Claude Code: set run_in_background: true in Bash tool
  • Other CLIs: use built-in async/background job management if available

Invocation pattern: $SCRIPTS/script-name.sh [args]

Where $SCRIPTS = resolved path to **/skills/listenhub/scripts/

Podcast (One-Stage)

$SCRIPTS/create-podcast.sh "query" [mode] [source_url]
# mode: quick (default) | deep | debate
# source_url: optional URL for content analysis
# Examples:
$SCRIPTS/create-podcast.sh "The future of AI development" deep
$SCRIPTS/create-podcast.sh "Analyze this article" deep "https://example.com/article"

Podcast (Two-Stage: Text → Audio)

For advanced workflows requiring script editing between generation:

# Stage 1: Generate text content
$SCRIPTS/create-podcast-text.sh "query" [mode] [source_url]
# Returns: episode_id + scripts array

# Stage 2: Generate audio from text
$SCRIPTS/create-podcast-audio.sh "<episode-id>" [modified_scripts.json]
# Without scripts file: uses original scripts
# With scripts file: uses modified scripts

Speech (Multi-Speaker)

$SCRIPTS/create-speech.sh <scripts_json_file>
# Or pipe: echo '{"scripts":[...]}' | $SCRIPTS/create-speech.sh -
# scripts.json format:
# {
#   "scripts": [
#     {"content": "Script content here", "speakerId": "speaker-id"},
#     ...
#   ]
# }

Get Available Speakers

$SCRIPTS/get-speakers.sh [language]
# language: zh (default) | en

Response structure (for AI parsing):

{
  "code": 0,
  "data": {
    "items": [
      {
        "name": "Yuanye",
        "speakerId": "cozy-man-english",
        "gender": "male",
        "language": "zh"
      }
    ]
  }
}

Usage: When user requests specific voice characteristics (gender, style), call this script first to discover available speakerId values. NEVER hardcode or assume speakerIds.
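When a JSON parser such as jq is unavailable, a rough extraction of speakerId values from the get-speakers.sh response could look like this (a sketch that assumes the exact layout shown above; a proper parser is preferable):

```shell
# Sketch: pull speakerId values out of the JSON response on stdin.
# Fragile by design: relies on the "speakerId": "value" key layout.
extract_speaker_ids() {
  grep -o '"speakerId"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*"\([^"]*\)"$/\1/'
}
```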

Explain

$SCRIPTS/create-explainer.sh "<topic>" [mode]
# mode: info (default) | story

# Generate video file (optional)
$SCRIPTS/generate-video.sh "<episode-id>"

TTS

$SCRIPTS/create-tts.sh "<text>" [mode]
# mode: smart (default) | direct

Image Generation

$SCRIPTS/generate-image.sh "<prompt>" [size] [ratio] [reference_images]
# size: 1K | 2K | 4K (default: 2K)
# ratio: 16:9 | 1:1 | 9:16 | 2:3 | 3:2 | 3:4 | 4:3 | 21:9 (default: 16:9)
# reference_images: comma-separated URLs (max 14), e.g. "url1,url2"
#   - Provides visual guidance for style, composition, or content
#   - Supports jpg, png, gif, webp, bmp formats
#   - URLs must be publicly accessible
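A hypothetical client-side check before invoking the script (the accepted values simply mirror the list above; the script itself may validate differently):

```shell
# Sketch: validate size and ratio arguments before calling generate-image.sh.
# valid_image_args <size> <ratio>
valid_image_args() {
  case "$1" in 1K|2K|4K) ;; *) return 1 ;; esac
  case "$2" in 16:9|1:1|9:16|2:3|3:2|3:4|4:3|21:9) ;; *) return 1 ;; esac
  return 0
}
```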

Check Status

$SCRIPTS/check-status.sh "<episode-id>" <type>
# type: podcast | explainer | tts
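Status polling could be wrapped in a loop like this (a sketch: the "success"/"failed" markers in the output are assumptions about check-status.sh, and the attempt/interval defaults are illustrative):

```shell
# Sketch: poll check-status.sh until the episode succeeds, fails, or the
# attempt budget runs out. Assumes $SCRIPTS points at the scripts directory.
# poll_status <episode-id> <type> [attempts] [interval-sec]
poll_status() {
  id="$1"; type="$2"; attempts="${3:-10}"; interval="${4:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$SCRIPTS/check-status.sh" "$id" "$type")
    case "$status" in
      *success*) echo "done";   return 0 ;;
      *failed*)  echo "failed"; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timeout"; return 1
}
```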

Language Adaptation

Automatic Language Detection: Adapt output language based on user input and context.

Detection Rules:

  1. User Input Language: If user writes in Chinese, respond in Chinese. If user writes in English, respond in English.
  2. **Context
Five-Dimension Analysis

  • Clarity: 8/10
  • Innovation: 8/10
  • Practicality: 9/10
  • Completeness: 8/10
  • Maintainability: 7/10
Pros & Cons

Pros

  • User-friendly interface with simple commands.
  • Supports multiple content formats, including podcasts and images.
  • Automatic mode detection makes it convenient to use.

Cons

  • Functionality is limited to the predefined scripts.
  • No direct API access for advanced users.
  • Operation depends on external API keys.

Related Skills

pytorch (Code Lib, 92/100)

"It's the Swiss Army knife of deep learning, but good luck finding the one of its 47 installation methods that won't break your system."

agno (Code Lib, 90/100)

"It promises to be the Kubernetes of the agent space, but that depends on whether developers have the patience to learn yet another orchestration layer."

nuxt-skills (Co-Pilot, 90/100)

"It's essentially a well-organized cheat sheet that turns your AI assistant into a parrot for the Nuxt framework."

Disclaimer: This content comes from an open-source GitHub project and is provided for display and rating analysis only.

Copyright belongs to the original author, marswaveai.