Co-Pilot
Updated 24 days ago

mcp-browser-use

Saik0s
0.9k
saik0s/mcp-browser-use
Agent Score: 82

💡 Summary

This project enables AI assistants to automate web browsing tasks through an MCP server.

🎯 Target Audience

  • AI developers looking to integrate browser automation
  • Data analysts needing automated web data extraction
  • Researchers conducting multi-source web research
  • Product managers wanting to streamline web interactions
  • Tech enthusiasts exploring AI capabilities

🤖 AI Roast: “Powerful, but the setup might scare off the impatient.”

Security Analysis: Medium Risk

Risk: Medium. Review: shell/CLI command execution; outbound network access (SSRF, data egress); API keys/tokens handling and storage; filesystem read/write scope and path traversal. Run with least privilege and audit before enabling in production.

mcp-server-browser-use

MCP server that gives AI assistants the power to control a web browser.


What is this?

This wraps browser-use as an MCP server, letting Claude (or any MCP client) automate a real browser: navigate pages, fill forms, click buttons, extract data, and more.

Why HTTP instead of stdio?

Browser automation tasks take 30-120+ seconds. The standard MCP stdio transport has timeout issues with long-running operations, and connections drop mid-task. HTTP transport solves this by running as a persistent daemon that handles requests reliably regardless of duration.


Installation

Claude Code Plugin (Recommended)

Install as a Claude Code plugin for automatic setup:

    # Install the plugin
    /plugin install browser-use/mcp-browser-use

The plugin automatically:

  • Installs Playwright browsers on first run
  • Starts the HTTP daemon when Claude Code starts
  • Registers the MCP server with Claude

Set your API key (the browser agent needs an LLM to decide actions):

    # Set API key (environment variable - recommended)
    export GEMINI_API_KEY=your-key-here

    # Or use config file
    mcp-server-browser-use config set -k llm.api_key -v your-key-here

That's it! Claude can now use browser automation tools.

Manual Installation

For other MCP clients or standalone use:

    # Clone and install (git names the checkout directory after the repository URL)
    git clone https://github.com/Saik0s/mcp-browser-use.git
    cd mcp-browser-use
    uv sync

    # Install browser
    uv run playwright install chromium

    # Start the server
    uv run mcp-server-browser-use server

Add to Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

    {
      "mcpServers": {
        "browser-use": {
          "type": "streamable-http",
          "url": "http://localhost:8383/mcp"
        }
      }
    }

For MCP clients that don't support HTTP transport, use mcp-remote as a proxy:

    {
      "mcpServers": {
        "browser-use": {
          "command": "npx",
          "args": ["mcp-remote", "http://localhost:8383/mcp"]
        }
      }
    }
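The two client entries above can also be generated rather than pasted by hand. This is a minimal Python sketch, assuming the client config is plain JSON with a top-level mcpServers object as shown. The helper name add_browser_use_server is hypothetical, and it only builds the dictionary; reading and writing the file is left to the caller.

```python
def add_browser_use_server(config, url="http://localhost:8383/mcp", use_proxy=False):
    """Return a copy of a client config dict with a "browser-use" entry added.

    Hypothetical helper that mirrors the two JSON snippets above.
    """
    if use_proxy:
        # Clients without HTTP transport go through the mcp-remote proxy.
        entry = {"command": "npx", "args": ["mcp-remote", url]}
    else:
        entry = {"type": "streamable-http", "url": url}

    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["browser-use"] = entry
    merged["mcpServers"] = servers
    return merged
```

Serializing the result with json.dumps(..., indent=2) gives text in the same shape as the snippets above.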

Web UI

Access the task viewer at http://localhost:8383 when the daemon is running.

Features:

  • Real-time task list with status and progress
  • Task details with execution logs
  • Server health status and uptime
  • Running tasks monitoring

The web UI provides visibility into browser automation tasks without requiring CLI commands.


Web Dashboard

Access the full-featured dashboard at http://localhost:8383/dashboard when the daemon is running.

Features:

  • Tasks Tab: Complete task history with filtering, real-time status updates, and detailed execution logs
  • Skills Tab: Browse, inspect, and manage learned skills with usage statistics
  • History Tab: Historical view of all completed tasks with filtering by status and time

Key Capabilities:

  • Run existing skills directly from the dashboard with custom parameters
  • Start learning sessions to capture new skills
  • Delete outdated or invalid skills
  • Monitor running tasks with live progress updates
  • View full task results and error details

The dashboard provides a comprehensive web interface for managing all aspects of browser automation without CLI commands.


Configuration

Settings are stored in ~/.config/mcp-server-browser-use/config.json.

View current config:

mcp-server-browser-use config view

Change settings:

    mcp-server-browser-use config set -k llm.provider -v openai
    mcp-server-browser-use config set -k llm.model_name -v gpt-4o
    # Note: Set API keys via environment variables (e.g., ANTHROPIC_API_KEY) for better security
    # mcp-server-browser-use config set -k llm.api_key -v sk-...
    mcp-server-browser-use config set -k browser.headless -v false
    mcp-server-browser-use config set -k agent.max_steps -v 30
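How a dotted key such as llm.model_name is stored inside config.json is not spelled out here. The Python sketch below assumes a nested layout (section first, then key). Both that layout and the helper set_dotted are assumptions for illustration, not the tool's confirmed storage format.

```python
def set_dotted(config, dotted_key, value):
    """Set config["a"]["b"] = value for dotted_key "a.b", creating sections as needed."""
    *sections, leaf = dotted_key.split(".")
    node = config
    for section in sections:
        node = node.setdefault(section, {})
    node[leaf] = value
    return config


# Mirror the `config set` commands above on an in-memory dict.
config = {}
set_dotted(config, "llm.provider", "openai")
set_dotted(config, "llm.model_name", "gpt-4o")
set_dotted(config, "browser.headless", False)
set_dotted(config, "agent.max_steps", 30)
```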

Settings Reference

| Key | Default | Description |
|-----|---------|-------------|
| llm.provider | google | LLM provider (anthropic, openai, google, azure_openai, groq, deepseek, cerebras, ollama, bedrock, browser_use, openrouter, vercel) |
| llm.model_name | gemini-3-flash-preview | Model for the browser agent |
| llm.api_key | - | API key for the provider (prefer env vars: GEMINI_API_KEY, ANTHROPIC_API_KEY, etc.) |
| browser.headless | true | Run browser without GUI |
| browser.cdp_url | - | Connect to existing Chrome (e.g., http://localhost:9222) |
| browser.user_data_dir | - | Chrome profile directory for persistent logins/cookies |
| browser.chromium_sandbox | true | Enable Chromium sandboxing for security |
| agent.max_steps | 20 | Max steps per browser task |
| agent.use_vision | true | Enable vision capabilities for the agent |
| research.max_searches | 5 | Max searches per research task |
| research.search_timeout | - | Timeout for individual searches |
| server.host | 127.0.0.1 | Server bind address |
| server.port | 8383 | Server port |
| server.results_dir | - | Directory to save results |
| server.auth_token | - | Auth token for non-localhost connections |
| skills.enabled | false | Enable skills system (beta - disabled by default) |
| skills.directory | ~/.config/browser-skills | Skills storage location |
| skills.validate_results | true | Validate skill execution results |

Config Priority

Environment Variables > Config File > Defaults

Environment variables use prefix MCP_ + section + _ + key (e.g., MCP_LLM_PROVIDER).
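The priority order and naming rule can be sketched in a few lines of Python. This is an illustration only: the functions env_name and resolve are hypothetical, the config file is modelled as a flat dict keyed by dotted names, and the mapping of multi-word keys (llm.model_name to MCP_LLM_MODEL_NAME) is extrapolated from the single MCP_LLM_PROVIDER example.

```python
import os

# A small subset of the defaults from the settings table, kept as strings.
DEFAULTS = {"llm.provider": "google", "server.host": "127.0.0.1"}


def env_name(dotted_key):
    """Map "llm.provider" to "MCP_LLM_PROVIDER" (prefix + section + "_" + key)."""
    section, key = dotted_key.split(".", 1)
    return f"MCP_{section}_{key}".upper()


def resolve(dotted_key, file_config, environ=None):
    """Return the environment value first, then the config file value, then the default."""
    environ = os.environ if environ is None else environ
    name = env_name(dotted_key)
    if name in environ:
        return environ[name]
    if dotted_key in file_config:
        return file_config[dotted_key]
    return DEFAULTS.get(dotted_key)
```

Passing environ explicitly keeps the function easy to exercise without touching the real process environment.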

Using Your Own Browser

Option 1: Persistent Profile (Recommended)

Use a dedicated Chrome profile to preserve logins and cookies:

    # Set user data directory
    mcp-server-browser-use config set -k browser.user_data_dir -v ~/.chrome-browser-use

Option 2: Connect to Existing Chrome

Connect to an existing Chrome instance (useful for advanced debugging):

    # Launch Chrome with debugging enabled
    google-chrome --remote-debugging-port=9222

    # Configure CDP connection (localhost only for security)
    mcp-server-browser-use config set -k browser.cdp_url -v http://localhost:9222
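Because the CDP connection is meant to stay on localhost, it can help to check a URL before writing it to the config. The following Python sketch shows one way to do that with the standard library. The function is_local_cdp_url is hypothetical, and whether the server performs a similar check itself is not stated here.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_cdp_url(url):
    """Return True if url is http(s) and its host is localhost or a loopback address."""
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in ("http", "https") or host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name is treated as non-local.
        return False
```

For example, is_local_cdp_url("http://localhost:9222") is accepted, while a LAN address such as http://192.168.1.5:9222 is rejected.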

CLI Reference

Server Management

    mcp-server-browser-use server        # Start as background daemon
    mcp-server-browser-use server -f     # Start in foreground (for debugging)
    mcp-server-browser-use status        # Check if running
    mcp-server-browser-use stop          # Stop the daemon
    mcp-server-browser-use logs -f       # Tail server logs

Calling Tools

    mcp-server-browser-use tools         # List all available MCP tools
    mcp-server-browser-use call run_browser_agent task="Go to google.com"
    mcp-server-browser-use call run_deep_research topic="quantum computing"

Configuration

    mcp-server-browser-use config view   # Show all settings
    mcp-server-browser-use config set -k <key> -v <value>
    mcp-server-browser-use config path   # Show config file location

Observability

    mcp-server-browser-use tasks             # List recent tasks
    mcp-server-browser-use tasks --status running
    mcp-server-browser-use task <id>         # Get task details
    mcp-server-browser-use task cancel <id>  # Cancel a running task
    mcp-server-browser-use health            # Server health + stats

Skills Management

    mcp-server-browser-use call skill_list
    mcp-server-browser-use call skill_get name="my-skill"
    mcp-server-browser-use call skill_delete name="my-skill"

Tip: Skills can also be managed through the web dashboard at http://localhost:8383/dashboard for a visual interface with one-click execution and learning sessions.


MCP Tools

These tools are exposed via MCP for AI clients:

| Tool | Description | Typical Duration |
|------|-------------|------------------|
| run_browser_agent | Execute browser automation tasks | 60-120s |
| run_deep_research | Multi-search research with synthesis | 2-5 min |
| skill_list | List learned skills | <1s |
| skill_get | Get skill definition | <1s |
| skill_delete | Delete a skill | <1s |
| health_check | Server status and running tasks | <1s |
| task_list | Query task history | <1s |
| task_get | Get full task details | <1s |
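The durations in this table matter when choosing client-side timeouts, since a timeout shorter than a typical run will cut tasks off mid-flight. The Python sketch below derives a suggested timeout from the upper bound of each range. The function client_timeout, the safety factor of 2, and the 5-second floor are arbitrary assumptions, not values recommended by the project.

```python
# Upper bounds of the "Typical Duration" column, in seconds.
TYPICAL_MAX_SECONDS = {
    "run_browser_agent": 120,
    "run_deep_research": 300,
    "skill_list": 1,
    "skill_get": 1,
    "skill_delete": 1,
    "health_check": 1,
    "task_list": 1,
    "task_get": 1,
}


def client_timeout(tool, safety_factor=2.0, floor=5.0):
    """Suggest a client-side timeout in seconds for a call to the given tool."""
    if tool not in TYPICAL_MAX_SECONDS:
        raise KeyError(f"unknown tool: {tool}")
    return max(floor, TYPICAL_MAX_SECONDS[tool] * safety_factor)
```

With these defaults, run_browser_agent gets 240 seconds and run_deep_research gets 600 seconds, while the sub-second tools fall back to the floor.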

run_browser_agent

The main tool. Tell it what you want in plain English:

    mcp-server-browser-use call run_browser_agent \
      task="Find the price of iPhone 16 Pro on Apple's website"

The agent launches a browser, navigates to apple.com, finds the product, and returns the price.

Parameters:

| Parameter | Type | Description |
|-----------|------|-------------|
| task | string | What to do (required) |
| max_steps | int | Override default max steps |
| skill_name | string | Use a learned skill |
| skill_params | JSON | Parameters for the skill |
| learn | bool | Enable learning mode |
| save_skill_as | string | Name for the learned skill |
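When calling the tool from code, the parameter table translates naturally into a small argument builder. The Python sketch below is illustrative: build_agent_args is a hypothetical helper, the validation rules are assumptions, and it assumes skill_params is passed as a JSON-serializable object (the table only says "JSON").

```python
import json


def build_agent_args(task, max_steps=None, skill_name=None, skill_params=None,
                     learn=None, save_skill_as=None):
    """Assemble an arguments dict for run_browser_agent, omitting unset options."""
    if not isinstance(task, str) or not task.strip():
        raise ValueError("task is required and must be a non-empty string")
    args = {"task": task}

    if max_steps is not None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(max_steps, bool) or not isinstance(max_steps, int) or max_steps < 1:
            raise ValueError("max_steps must be a positive integer")
        args["max_steps"] = max_steps
    if skill_name is not None:
        args["skill_name"] = skill_name
    if skill_params is not None:
        json.dumps(skill_params)  # raises TypeError if not JSON-serializable
        args["skill_params"] = skill_params
    if learn is not None:
        args["learn"] = bool(learn)
    if save_skill_as is not None:
        args["save_skill_as"] = save_skill_as
    return args
```

Leaving unset options out of the dict lets the server apply its own defaults, such as agent.max_steps.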

run_deep_research

Multi-step web research with automatic synthesis:

    mcp-server-browser-use call run_deep_research \
      topic="Latest developments in quantum computing" \
      max_searches=5

The agent searches multiple sources, extracts key findings, and compiles a markdown report.
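To launch the same call from a script, the key=value syntax shown above can be assembled as an argument list. This Python sketch only builds the command; build_call_argv is a hypothetical helper, and formatting values with str() is an assumption, since the encoding the CLI expects for booleans or JSON values is not documented here.

```python
import shlex


def build_call_argv(tool, **params):
    """Build an argv list for `mcp-server-browser-use call <tool> key=value ...`."""
    argv = ["mcp-server-browser-use", "call", tool]
    for key, value in params.items():
        argv.append(f"{key}={value}")
    return argv


argv = build_call_argv(
    "run_deep_research",
    topic="Latest developments in quantum computing",
    max_searches=5,
)
# Shell-quoted form, useful for logging or copy-paste.
command_line = shlex.join(argv)
```

Passing argv as a list to subprocess.run avoids shell quoting problems with topics that contain spaces or apostrophes.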


Deep Research

Deep research executes a 3-phase workflow:

┌──────────────────────────────────────────────────────────┐
│  Phase 1: PLANNING                                       │
│  LLM generates 3-5 focused search queries from topic     │
└─────────────────────────────┬────────────────────────────┘

5-Dim Analysis

  • Clarity: 8/10
  • Novelty: 7/10
  • Utility: 9/10
  • Completeness: 9/10
  • Maintainability: 8/10

Pros & Cons

Pros

  • Supports multiple AI clients via HTTP
  • User-friendly web UI for task management
  • Flexible configuration options
  • Real-time monitoring of tasks

Cons

  • Requires setup of additional dependencies
  • Potential security risks with API keys
  • Complex for non-technical users
  • Limited documentation on advanced features


Disclaimer: This content is sourced from GitHub open source projects for display and rating purposes only.

Copyright belongs to the original author Saik0s.