Co-Pilot / Assistive
Updated 24 days ago

mcp-browser-use

Saik0s
0.9k
saik0s/mcp-browser-use
Agent score: 82

💡 Summary

This project lets AI assistants automate web browsing tasks through an MCP server.

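🎯 Who it's for

  • AI developers who want to integrate browser automation
  • Data analysts who need to automate web data extraction
  • Researchers conducting multi-source web research
  • Product managers who want to simplify web interactions
  • Tech enthusiasts exploring AI capabilities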

🤖 AI roast: It looks capable, but don't let the configuration scare people off.

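Security Analysis: Medium Risk

Risk: Medium. Recommended checks: whether it executes shell or command-line instructions; whether it makes outbound network requests (SSRF, data exfiltration); how API keys and tokens are obtained, stored, and potentially leaked; the scope of file reads and writes, and path traversal risk. Run with least privilege, and audit the code and dependencies before enabling it in production.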

mcp-server-browser-use

MCP server that gives AI assistants the power to control a web browser.

What is this?

This wraps browser-use as an MCP server, letting Claude (or any MCP client) automate a real browser—navigate pages, fill forms, click buttons, extract data, and more.

Why HTTP instead of stdio?

Browser automation tasks take 30-120+ seconds. The standard MCP stdio transport has timeout issues with long-running operations—connections drop mid-task. HTTP transport solves this by running as a persistent daemon that handles requests reliably regardless of duration.


Installation

Claude Code Plugin (Recommended)

Install as a Claude Code plugin for automatic setup:

    # Install the plugin
    /plugin install browser-use/mcp-browser-use

The plugin automatically:

  • Installs Playwright browsers on first run
  • Starts the HTTP daemon when Claude Code starts
  • Registers the MCP server with Claude

Set your API key (the browser agent needs an LLM to decide actions):

    # Set API key (environment variable - recommended)
    export GEMINI_API_KEY=your-key-here

    # Or use config file
    mcp-server-browser-use config set -k llm.api_key -v your-key-here

That's it! Claude can now use browser automation tools.

Manual Installation

For other MCP clients or standalone use:

    # Clone and install (git clone creates a directory named after the repo)
    git clone https://github.com/Saik0s/mcp-browser-use.git
    cd mcp-browser-use
    uv sync

    # Install browser
    uv run playwright install chromium

    # Start the server
    uv run mcp-server-browser-use server

Add to Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

    {
      "mcpServers": {
        "browser-use": {
          "type": "streamable-http",
          "url": "http://localhost:8383/mcp"
        }
      }
    }

For MCP clients that don't support HTTP transport, use mcp-remote as a proxy:

    {
      "mcpServers": {
        "browser-use": {
          "command": "npx",
          "args": ["mcp-remote", "http://localhost:8383/mcp"]
        }
      }
    }
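If the client config already lists other servers, the entry has to be merged in rather than pasted over the file. A minimal Python sketch of that merge follows; the function name `add_browser_use_server` and its `use_proxy` flag are hypothetical helpers for illustration, not part of this project.

```python
import json
from copy import deepcopy


def add_browser_use_server(config: dict,
                           url: str = "http://localhost:8383/mcp",
                           use_proxy: bool = False) -> dict:
    """Return a copy of an MCP client config with a "browser-use" entry added.

    Hypothetical helper: it only reproduces the two JSON shapes shown above.
    """
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    if use_proxy:
        # Clients without HTTP transport go through the mcp-remote proxy.
        servers["browser-use"] = {"command": "npx", "args": ["mcp-remote", url]}
    else:
        servers["browser-use"] = {"type": "streamable-http", "url": url}
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "example"}}}
    print(json.dumps(add_browser_use_server(existing), indent=2))
```

The helper returns a new dictionary and leaves the input untouched, so the caller decides when to write the result back to `claude_desktop_config.json`.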

Web UI

Access the task viewer at http://localhost:8383 when the daemon is running.

Features:

  • Real-time task list with status and progress
  • Task details with execution logs
  • Server health status and uptime
  • Running tasks monitoring

The web UI provides visibility into browser automation tasks without requiring CLI commands.
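Because the task viewer is only reachable while the daemon is running, a script may want to check the port before opening the page. The Python sketch below does this with a plain TCP connection attempt; `daemon_is_listening` is a hypothetical name, and the check only shows that something accepts connections on the port, not that it is this server.

```python
import socket


def daemon_is_listening(host: str = "127.0.0.1",
                        port: int = 8383,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if daemon_is_listening():
        print("Port 8383 is open; try http://localhost:8383")
    else:
        print("Nothing is listening on port 8383; start the server first")
```

For an authoritative answer, `mcp-server-browser-use status` remains the documented way to check whether the daemon is running.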


Web Dashboard

Access the full-featured dashboard at http://localhost:8383/dashboard when the daemon is running.

Features:

  • Tasks Tab: Complete task history with filtering, real-time status updates, and detailed execution logs
  • Skills Tab: Browse, inspect, and manage learned skills with usage statistics
  • History Tab: Historical view of all completed tasks with filtering by status and time

Key Capabilities:

  • Run existing skills directly from the dashboard with custom parameters
  • Start learning sessions to capture new skills
  • Delete outdated or invalid skills
  • Monitor running tasks with live progress updates
  • View full task results and error details

The dashboard provides a comprehensive web interface for managing all aspects of browser automation without CLI commands.


Configuration

Settings are stored in ~/.config/mcp-server-browser-use/config.json.

View current config:

mcp-server-browser-use config view

Change settings:

    mcp-server-browser-use config set -k llm.provider -v openai
    mcp-server-browser-use config set -k llm.model_name -v gpt-4o
    # Note: Set API keys via environment variables (e.g., ANTHROPIC_API_KEY) for better security
    # mcp-server-browser-use config set -k llm.api_key -v sk-...
    mcp-server-browser-use config set -k browser.headless -v false
    mcp-server-browser-use config set -k agent.max_steps -v 30

Settings Reference

| Key | Default | Description |
|-----|---------|-------------|
| llm.provider | google | LLM provider (anthropic, openai, google, azure_openai, groq, deepseek, cerebras, ollama, bedrock, browser_use, openrouter, vercel) |
| llm.model_name | gemini-3-flash-preview | Model for the browser agent |
| llm.api_key | - | API key for the provider (prefer env vars: GEMINI_API_KEY, ANTHROPIC_API_KEY, etc.) |
| browser.headless | true | Run browser without GUI |
| browser.cdp_url | - | Connect to existing Chrome (e.g., http://localhost:9222) |
| browser.user_data_dir | - | Chrome profile directory for persistent logins/cookies |
| browser.chromium_sandbox | true | Enable Chromium sandboxing for security |
| agent.max_steps | 20 | Max steps per browser task |
| agent.use_vision | true | Enable vision capabilities for the agent |
| research.max_searches | 5 | Max searches per research task |
| research.search_timeout | - | Timeout for individual searches |
| server.host | 127.0.0.1 | Server bind address |
| server.port | 8383 | Server port |
| server.results_dir | - | Directory to save results |
| server.auth_token | - | Auth token for non-localhost connections |
| skills.enabled | false | Enable skills system (beta - disabled by default) |
| skills.directory | ~/.config/browser-skills | Skills storage location |
| skills.validate_results | true | Validate skill execution results |

Config Priority

Environment Variables > Config File > Defaults

Environment variables use prefix MCP_ + section + _ + key (e.g., MCP_LLM_PROVIDER).
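The precedence rule and the naming scheme can be sketched in a few lines of Python. This is an illustration of the rule as stated, not the server's actual loader: `env_var_name` and `resolve` are hypothetical names, the config file is modelled as a flat dictionary with dotted keys, and the treatment of keys that contain underscores (such as `llm.model_name`) is an assumption.

```python
import os
from typing import Any, Mapping, Optional

# A few defaults copied from the settings reference above.
DEFAULTS = {"llm.provider": "google", "server.port": 8383, "agent.max_steps": 20}


def env_var_name(key: str) -> str:
    """Map a dotted key to MCP_ + section + _ + key, e.g. llm.provider -> MCP_LLM_PROVIDER."""
    section, _, name = key.partition(".")
    return f"MCP_{section}_{name}".upper()


def resolve(key: str,
            file_config: Mapping[str, Any],
            env: Optional[Mapping[str, str]] = None,
            defaults: Mapping[str, Any] = DEFAULTS) -> Any:
    """Resolve a setting: environment variable first, then config file, then default."""
    env = os.environ if env is None else env
    var = env_var_name(key)
    if var in env:
        return env[var]  # note: environment values are always strings
    if key in file_config:
        return file_config[key]
    return defaults.get(key)


if __name__ == "__main__":
    file_config = {"llm.provider": "openai"}
    print(env_var_name("llm.provider"))
    print(resolve("llm.provider", file_config, env={"MCP_LLM_PROVIDER": "anthropic"}))
    print(resolve("llm.provider", file_config, env={}))
    print(resolve("agent.max_steps", file_config, env={}))
```

Passing `env` explicitly keeps the function easy to test; leaving it out falls back to the real process environment.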

Using Your Own Browser

Option 1: Persistent Profile (Recommended)

Use a dedicated Chrome profile to preserve logins and cookies:

    # Set user data directory
    mcp-server-browser-use config set -k browser.user_data_dir -v ~/.chrome-browser-use

Option 2: Connect to Existing Chrome

Connect to an existing Chrome instance (useful for advanced debugging):

    # Launch Chrome with debugging enabled
    google-chrome --remote-debugging-port=9222

    # Configure CDP connection (localhost only for security)
    mcp-server-browser-use config set -k browser.cdp_url -v http://localhost:9222
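Since the CDP connection is meant to stay on localhost, a setup script could refuse to store any other URL. The Python sketch below is one way to do that pre-check on the client side; `is_local_cdp_url` is a hypothetical helper, and whether the server applies a similar check itself is not stated here.

```python
from urllib.parse import urlparse

# Hostnames treated as "this machine" for the purpose of this check.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_cdp_url(url: str) -> bool:
    """Return True only for http:// or ws:// URLs whose host is the local machine."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "ws"} and parsed.hostname in LOCAL_HOSTS


if __name__ == "__main__":
    for candidate in ("http://localhost:9222", "http://192.168.1.20:9222"):
        verdict = "ok" if is_local_cdp_url(candidate) else "rejected"
        print(f"{candidate}: {verdict}")
```

A remote debugging port gives full control of the browser, including its logged-in sessions, which is why the check rejects anything that is not a loopback address.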

CLI Reference

Server Management

    mcp-server-browser-use server      # Start as background daemon
    mcp-server-browser-use server -f   # Start in foreground (for debugging)
    mcp-server-browser-use status      # Check if running
    mcp-server-browser-use stop        # Stop the daemon
    mcp-server-browser-use logs -f     # Tail server logs

Calling Tools

    mcp-server-browser-use tools       # List all available MCP tools
    mcp-server-browser-use call run_browser_agent task="Go to google.com"
    mcp-server-browser-use call run_deep_research topic="quantum computing"

Configuration

    mcp-server-browser-use config view   # Show all settings
    mcp-server-browser-use config set -k <key> -v <value>
    mcp-server-browser-use config path   # Show config file location

Observability

    mcp-server-browser-use tasks                    # List recent tasks
    mcp-server-browser-use tasks --status running
    mcp-server-browser-use task <id>                # Get task details
    mcp-server-browser-use task cancel <id>         # Cancel a running task
    mcp-server-browser-use health                   # Server health + stats

Skills Management

    mcp-server-browser-use call skill_list
    mcp-server-browser-use call skill_get name="my-skill"
    mcp-server-browser-use call skill_delete name="my-skill"

Tip: Skills can also be managed through the web dashboard at http://localhost:8383/dashboard for a visual interface with one-click execution and learning sessions.


MCP Tools

These tools are exposed via MCP for AI clients:

| Tool | Description | Typical Duration |
|------|-------------|------------------|
| run_browser_agent | Execute browser automation tasks | 60-120s |
| run_deep_research | Multi-search research with synthesis | 2-5 min |
| skill_list | List learned skills | <1s |
| skill_get | Get skill definition | <1s |
| skill_delete | Delete a skill | <1s |
| health_check | Server status and running tasks | <1s |
| task_list | Query task history | <1s |
| task_get | Get full task details | <1s |

run_browser_agent

The main tool. Tell it what you want in plain English:

    mcp-server-browser-use call run_browser_agent \
      task="Find the price of iPhone 16 Pro on Apple's website"

The agent launches a browser, navigates to apple.com, finds the product, and returns the price.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| task | string | What to do (required) |
| max_steps | int | Override default max steps |
| skill_name | string | Use a learned skill |
| skill_params | JSON | Parameters for the skill |
| learn | bool | Enable learning mode |
| save_skill_as | string | Name for the learned skill |
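When driving the CLI from a script, the `key=value` arguments can be assembled programmatically instead of by string concatenation. The Python sketch below builds the argument list for the `call` subcommand; `build_call_command` is a hypothetical helper, and passing `max_steps` in `key=value` form is an assumption by analogy with the `max_searches=5` usage shown for `run_deep_research`.

```python
import shlex
from typing import List


def build_call_command(tool: str, **params: object) -> List[str]:
    """Build the argv list for `mcp-server-browser-use call <tool> key=value ...`."""
    argv = ["mcp-server-browser-use", "call", tool]
    for key, value in params.items():
        # Each parameter becomes one argv element, so spaces need no shell quoting.
        argv.append(f"{key}={value}")
    return argv


if __name__ == "__main__":
    cmd = build_call_command("run_browser_agent",
                             task="Go to google.com",
                             max_steps=30)
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Because browser tasks can run for a minute or more, any timeout passed to `subprocess.run` should be generous.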

run_deep_research

Multi-step web research with automatic synthesis:

    mcp-server-browser-use call run_deep_research \
      topic="Latest developments in quantum computing" \
      max_searches=5

The agent searches multiple sources, extracts key findings, and compiles a markdown report.
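The shape of that flow (plan queries, run each search, compile a report) can be sketched in Python with the planner and the searcher injected as plain functions. Everything here is hypothetical: `deep_research`, `plan`, and `search` are illustrative names, and the report step is simple concatenation rather than the LLM-based synthesis the real tool performs.

```python
from typing import Callable, List


def deep_research(topic: str,
                  plan: Callable[[str], List[str]],
                  search: Callable[[str], str],
                  max_searches: int = 5) -> str:
    """Plan queries for a topic, run each search, and compile a markdown report."""
    # Cap the number of searches, mirroring the max_searches parameter.
    queries = plan(topic)[:max_searches]
    findings = [(query, search(query)) for query in queries]

    lines = [f"# {topic}", ""]
    for query, finding in findings:
        lines += [f"## {query}", finding, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Stub planner and searcher standing in for the LLM and the browser agent.
    def stub_plan(topic: str) -> List[str]:
        return [f"{topic} overview", f"{topic} recent news"]

    def stub_search(query: str) -> str:
        return f"(findings for: {query})"

    print(deep_research("quantum computing", stub_plan, stub_search, max_searches=2))
```

Injecting the two callables keeps the control flow testable without a browser or an API key.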


Deep Research

Deep research executes a 3-phase workflow:

┌─────────────────────────────────────────────────────────┐
│  Phase 1: PLANNING                                       │
│  LLM generates 3-5 focused search queries from topic     │
└─────────────────────────────┬───────────────────────────┘
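Five-Dimension Analysis
Clarity: 8/10
Innovation: 7/10
Practicality: 9/10
Completeness: 9/10
Maintainability: 8/10
Pros and Cons

Pros

  • Supports multiple AI clients over HTTP
  • User-friendly web UI for task management
  • Flexible configuration options
  • Real-time task monitoring

Cons

  • Requires setup of additional dependencies
  • Potential security risks around API keys
  • Relatively complex for non-technical users
  • Limited documentation for advanced features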

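Disclaimer: This content comes from an open-source GitHub project and is provided for display and rating analysis only.

Copyright belongs to the original author, Saik0s.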