Co-Pilot / Assisted

Updated 5 months ago

mcp-browser-use

Name: mcp-browser-use
Rating: 4.1 (885 reviews)
Author: Saik0s

saik0s/mcp-browser-use

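💡 Summary

This project lets AI assistants automate web browsing tasks through an MCP server.

🎯 Target Audience

AI developers who want to integrate browser automation
Data analysts who need automated web data extraction
Researchers doing multi-source web research
Product managers who want to simplify web interactions
Tech enthusiasts exploring AI capabilities

🤖 AI quip: "Looks capable, but don't let the configuration scare people off."

Security analysis: medium risk

Risk: Medium. Recommended checks: whether shell/command-line instructions are executed; whether outbound network requests are made (SSRF/data exfiltration); how API keys/tokens are obtained and stored, and how they could leak; the scope of file reads and writes and the risk of path traversal. Run with least privilege, and audit the code and dependencies before enabling in production.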

mcp-server-browser-use

MCP server that gives AI assistants the power to control a web browser.

What is this?
Installation
Web UI
Web Dashboard
Configuration
CLI Reference
MCP Tools
Deep Research
Observability
Skills System
REST API Reference
Architecture
License

What is this?

This wraps browser-use as an MCP server, letting Claude (or any MCP client) automate a real browser—navigate pages, fill forms, click buttons, extract data, and more.

Why HTTP instead of stdio?

Browser automation tasks take 30-120+ seconds. The standard MCP stdio transport has timeout issues with long-running operations—connections drop mid-task. HTTP transport solves this by running as a persistent daemon that handles requests reliably regardless of duration.

Installation

Claude Code Plugin (Recommended)

Install as a Claude Code plugin for automatic setup:

# Install the plugin
/plugin install browser-use/mcp-browser-use

The plugin automatically:

Installs Playwright browsers on first run
Starts the HTTP daemon when Claude Code starts
Registers the MCP server with Claude

Set your API key (the browser agent needs an LLM to decide actions):

# Set API key (environment variable - recommended)
export GEMINI_API_KEY=your-key-here

# Or use config file
mcp-server-browser-use config set -k llm.api_key -v your-key-here

That's it! Claude can now use browser automation tools.

Manual Installation

For other MCP clients or standalone use:

# Clone and install
git clone https://github.com/Saik0s/mcp-browser-use.git
cd mcp-browser-use
uv sync

# Install browser
uv run playwright install chromium

# Start the server
uv run mcp-server-browser-use server

Add to Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "browser-use": {
      "type": "streamable-http",
      "url": "http://localhost:8383/mcp"
    }
  }
}

For MCP clients that don't support HTTP transport, use mcp-remote as a proxy:

{
  "mcpServers": {
    "browser-use": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:8383/mcp"]
    }
  }
}
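If the client config file already lists other servers, either entry above can be merged in programmatically rather than pasted by hand. The following is a minimal sketch, not part of the project: the function names `add_browser_use_entry` and `update_config_file` are hypothetical, and the only values taken from the README are the `browser-use` key, the `streamable-http` type, the `mcp-remote` proxy command, and the URL.

```python
import json
from pathlib import Path

DEFAULT_URL = "http://localhost:8383/mcp"  # URL shown in the README examples


def add_browser_use_entry(config: dict, url: str = DEFAULT_URL, use_proxy: bool = False) -> dict:
    """Return a copy of `config` with a "browser-use" entry under "mcpServers".

    Hypothetical helper: existing servers are preserved, and only the
    "browser-use" entry is added or overwritten.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    if use_proxy:
        # For clients without HTTP transport support: go through mcp-remote.
        servers["browser-use"] = {"command": "npx", "args": ["mcp-remote", url]}
    else:
        servers["browser-use"] = {"type": "streamable-http", "url": url}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, use_proxy: bool = False) -> None:
    """Read the JSON config at `path` (if it exists), merge the entry, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_browser_use_entry(existing, use_proxy=use_proxy)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")` would target the Claude Desktop path given above; back up the file first, since the sketch overwrites it.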

Web UI

Access the task viewer at http://localhost:8383 when the daemon is running.

Features:

Real-time task list with status and progress
Task details with execution logs
Server health status and uptime
Running tasks monitoring

The web UI provides visibility into browser automation tasks without requiring CLI commands.
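Because the web UI is only served while the daemon is running, it can help to confirm that something is listening on the port before opening the page. The sketch below only tests TCP reachability. It does not call any project API, and the function name `daemon_is_listening` is hypothetical. The defaults mirror the `server.host` and `server.port` values documented in the README.

```python
import socket


def daemon_is_listening(host: str = "127.0.0.1", port: int = 8383, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    This only shows that *something* accepts connections on the port; it does
    not verify that the listener is mcp-server-browser-use.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If this returns `False`, `mcp-server-browser-use status` and `mcp-server-browser-use logs -f` are the documented next steps.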

Web Dashboard

Access the full-featured dashboard at http://localhost:8383/dashboard when the daemon is running.

Features:

Tasks Tab: Complete task history with filtering, real-time status updates, and detailed execution logs
Skills Tab: Browse, inspect, and manage learned skills with usage statistics
History Tab: Historical view of all completed tasks with filtering by status and time

Key Capabilities:

Run existing skills directly from the dashboard with custom parameters
Start learning sessions to capture new skills
Delete outdated or invalid skills
Monitor running tasks with live progress updates
View full task results and error details

The dashboard provides a comprehensive web interface for managing all aspects of browser automation without CLI commands.

Configuration

Settings are stored in ~/.config/mcp-server-browser-use/config.json.

View current config:

mcp-server-browser-use config view

Change settings:

mcp-server-browser-use config set -k llm.provider -v openai
mcp-server-browser-use config set -k llm.model_name -v gpt-4o
# Note: Set API keys via environment variables (e.g., ANTHROPIC_API_KEY) for better security
# mcp-server-browser-use config set -k llm.api_key -v sk-...
mcp-server-browser-use config set -k browser.headless -v false
mcp-server-browser-use config set -k agent.max_steps -v 30

Settings Reference

| Key | Default | Description |
|-----|---------|-------------|
| llm.provider | google | LLM provider (anthropic, openai, google, azure_openai, groq, deepseek, cerebras, ollama, bedrock, browser_use, openrouter, vercel) |
| llm.model_name | gemini-3-flash-preview | Model for the browser agent |
| llm.api_key | - | API key for the provider (prefer env vars: GEMINI_API_KEY, ANTHROPIC_API_KEY, etc.) |
| browser.headless | true | Run browser without GUI |
| browser.cdp_url | - | Connect to existing Chrome (e.g., http://localhost:9222) |
| browser.user_data_dir | - | Chrome profile directory for persistent logins/cookies |
| browser.chromium_sandbox | true | Enable Chromium sandboxing for security |
| agent.max_steps | 20 | Max steps per browser task |
| agent.use_vision | true | Enable vision capabilities for the agent |
| research.max_searches | 5 | Max searches per research task |
| research.search_timeout | - | Timeout for individual searches |
| server.host | 127.0.0.1 | Server bind address |
| server.port | 8383 | Server port |
| server.results_dir | - | Directory to save results |
| server.auth_token | - | Auth token for non-localhost connections |
| skills.enabled | false | Enable skills system (beta - disabled by default) |
| skills.directory | ~/.config/browser-skills | Skills storage location |
| skills.validate_results | true | Validate skill execution results |

Config Priority

Environment Variables > Config File > Defaults

Environment variables use prefix MCP_ + section + _ + key (e.g., MCP_LLM_PROVIDER).
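The naming rule and the priority order can be illustrated with a small sketch. It is not the project's loader: `env_var_name` and `resolve_setting` are hypothetical names, and the file config is modelled as a flat dict keyed by dotted setting names for simplicity (the real config.json layout may differ). Only the `MCP_LLM_PROVIDER` example is confirmed by the README; other names are derived from the stated rule.

```python
import os
from typing import Any, Mapping, Optional


def env_var_name(key: str) -> str:
    """Map a dotted setting key to its environment variable name.

    Follows the stated rule: MCP_ + section + _ + key, upper-cased,
    so "llm.provider" becomes "MCP_LLM_PROVIDER".
    """
    section, _, name = key.partition(".")
    return f"MCP_{section}_{name}".upper()


def resolve_setting(
    key: str,
    file_config: Mapping[str, Any],
    defaults: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> Any:
    """Resolve a setting as: environment variable > config file > default.

    Values coming from the environment are returned as strings; any type
    conversion is left out of this sketch.
    """
    environ = os.environ if environ is None else environ
    name = env_var_name(key)
    if name in environ:
        return environ[name]
    if key in file_config:
        return file_config[key]
    return defaults.get(key)
```

For example, with `MCP_LLM_PROVIDER=anthropic` exported, the sketch returns `"anthropic"` for `llm.provider` even if the config file says `openai`.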

Using Your Own Browser

Option 1: Persistent Profile (Recommended)

Use a dedicated Chrome profile to preserve logins and cookies:

# Set user data directory
mcp-server-browser-use config set -k browser.user_data_dir -v ~/.chrome-browser-use

Option 2: Connect to Existing Chrome

Connect to an existing Chrome instance (useful for advanced debugging):

# Launch Chrome with debugging enabled
google-chrome --remote-debugging-port=9222

# Configure CDP connection (localhost only for security)
mcp-server-browser-use config set -k browser.cdp_url -v http://localhost:9222
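The "localhost only" advice can be checked before a CDP URL is written to the config. The sketch below is an illustrative guard, not the server's own validation: the function name `is_loopback_cdp_url` is hypothetical, and it treats the literal hostname `localhost` as loopback without resolving it.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_cdp_url(url: str) -> bool:
    """Return True if the URL's host is "localhost" or a loopback IP address."""
    host = urlparse(url).hostname  # lower-cased, without port or IPv6 brackets
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name); treat as non-loopback.
        return False
```

A remote debugging port gives full control of the browser, including any logged-in sessions, which is why a check like this is worth running before pointing `browser.cdp_url` anywhere else.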

CLI Reference

Server Management

mcp-server-browser-use server          # Start as background daemon
mcp-server-browser-use server -f       # Start in foreground (for debugging)
mcp-server-browser-use status          # Check if running
mcp-server-browser-use stop            # Stop the daemon
mcp-server-browser-use logs -f         # Tail server logs

Calling Tools

mcp-server-browser-use tools           # List all available MCP tools
mcp-server-browser-use call run_browser_agent task="Go to google.com"
mcp-server-browser-use call run_deep_research topic="quantum computing"
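The `call` subcommand can also be driven from a script. This is a minimal sketch that assumes the `tool key=value` argument shape shown above; the helper names `build_call_command` and `call_tool` are hypothetical, and the format of the CLI's output is not documented here, so the sketch simply returns stdout as text.

```python
import subprocess
from typing import List


def build_call_command(tool: str, **params: object) -> List[str]:
    """Build the argv list for `mcp-server-browser-use call <tool> key=value ...`.

    Passing a list to subprocess avoids shell quoting issues for values
    that contain spaces.
    """
    return ["mcp-server-browser-use", "call", tool, *[f"{k}={v}" for k, v in params.items()]]


def call_tool(tool: str, **params: object) -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_call_command(tool, **params),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `call_tool("run_deep_research", topic="quantum computing")` mirrors the last command above. The daemon must already be running.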

Configuration

mcp-server-browser-use config view     # Show all settings
mcp-server-browser-use config set -k <key> -v <value>
mcp-server-browser-use config path     # Show config file location

Observability

mcp-server-browser-use tasks           # List recent tasks
mcp-server-browser-use tasks --status running
mcp-server-browser-use task <id>       # Get task details
mcp-server-browser-use task cancel <id> # Cancel a running task
mcp-server-browser-use health          # Server health + stats

Skills Management

mcp-server-browser-use call skill_list
mcp-server-browser-use call skill_get name="my-skill"
mcp-server-browser-use call skill_delete name="my-skill"

Tip: Skills can also be managed through the web dashboard at http://localhost:8383/dashboard for a visual interface with one-click execution and learning sessions.

MCP Tools

These tools are exposed via MCP for AI clients:

| Tool | Description | Typical Duration |
|------|-------------|------------------|
| run_browser_agent | Execute browser automation tasks | 60-120s |
| run_deep_research | Multi-search research with synthesis | 2-5 min |
| skill_list | List learned skills | <1s |
| skill_get | Get skill definition | <1s |
| skill_delete | Delete a skill | <1s |
| health_check | Server status and running tasks | <1s |
| task_list | Query task history | <1s |
| task_get | Get full task details | <1s |

run_browser_agent

The main tool. Tell it what you want in plain English:

mcp-server-browser-use call run_browser_agent \
  task="Find the price of iPhone 16 Pro on Apple's website"

The agent launches a browser, navigates to apple.com, finds the product, and returns the price.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| task | string | What to do (required) |
| max_steps | int | Override default max steps |
| skill_name | string | Use a learned skill |
| skill_params | JSON | Parameters for the skill |
| learn | bool | Enable learning mode |
| save_skill_as | string | Name for the learned skill |
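For an MCP client that talks to the server directly, these parameters end up in the `arguments` of a `tools/call` request. The sketch below only builds that JSON-RPC payload in the shape defined by the MCP specification; it does not perform the session handshake or send anything. The helper name `build_tool_call` is hypothetical, and the parameter whitelist is copied from the table above.

```python
from typing import Any, Dict

# Parameter names taken from the run_browser_agent table.
KNOWN_PARAMS = {"task", "max_steps", "skill_name", "skill_params", "learn", "save_skill_as"}


def build_tool_call(task: str, request_id: int = 1, **options: Any) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` payload for run_browser_agent.

    Rejects option names that are not listed in the parameter table so
    typos surface before a long-running task is started.
    """
    if not task:
        raise ValueError("task is required")
    unknown = set(options) - KNOWN_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "run_browser_agent",
            "arguments": {"task": task, **options},
        },
    }
```

A learning run might be built as `build_tool_call("Log in and export the report", learn=True, save_skill_as="export-report")`, where the task text and skill name are made-up examples.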

run_deep_research

Multi-step web research with automatic synthesis:

mcp-server-browser-use call run_deep_research \
  topic="Latest developments in quantum computing" \
  max_searches=5

The agent searches multiple sources, extracts key findings, and compiles a markdown report.

Deep Research

Deep research executes a 3-phase workflow:

┌─────────────────────────────────────────────────────────┐
│  Phase 1: PLANNING                                       │
│  LLM generates 3-5 focused search queries from topic     │
└─────────────────────────────┬───────────────────────────┘

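Five-Dimension Analysis

Clarity: 8/10

Innovation: 7/10

Practicality: 9/10

Completeness: 9/10

Maintainability: 8/10

Pros and Cons

Pros

Supports multiple AI clients over HTTP
User-friendly web UI for task management
Flexible configuration options
Real-time task monitoring

Cons

Requires setting up additional dependencies
API keys pose a potential security risk
Relatively complex for non-technical users
Limited documentation for advanced features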
