browser-use is an open-source browser-controlling agent – an LLM navigates a real Chrome instance, clicks, types, and extracts data. On our dedicated GPU hosting it is a legitimate production path for web automation, data gathering, and QA workflows.
Stack
- browser-use Python library
- Playwright + headless Chromium
- LLM via OpenAI-compatible API (vLLM)
- Optional: vision-capable LLM for screenshot-based navigation
Deployment
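pip install browser-use playwright langchain-openai
playwright install chromium

import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI

# Point the client at the local vLLM server's OpenAI-compatible endpoint.
llm = ChatOpenAI(
    model="llama-3.3-70b",
    openai_api_base="http://localhost:8000/v1",
    openai_api_key="not-needed",  # placeholder; only checked if the server enforces a key
)

async def main():
    agent = Agent(
        task="Find current GPU server pricing on gigagpu.com and return a CSV",
        llm=llm,
    )
    # agent.run() is a coroutine, so it has to be awaited inside an event loop.
    result = await agent.run()
    print(result)

asyncio.run(main())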
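Before starting an agent, it can help to confirm that the endpoint actually serves the model name passed to ChatOpenAI. The following is a minimal sketch, assuming the server from the example above listens on http://localhost:8000/v1 and exposes the standard OpenAI-style /v1/models route. The helper names model_ids and fetch_model_ids are hypothetical, not part of browser-use or vLLM.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(base_url: str = "http://localhost:8000/v1",
                    timeout: float = 5.0) -> list[str]:
    """Query the endpoint's model list. Requires the server to be running."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))


# Usage (assumed base URL and model name, taken from the example above):
# if "llama-3.3-70b" not in fetch_model_ids():
#     raise SystemExit("model not served at this endpoint")
```

Keeping the parsing separate from the network call means the check can be exercised against a saved response without a live server.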
Vision vs Text-Only
browser-use can run in two modes:
- Text-only: feeds the agent a structured DOM representation
- Vision: captures screenshots and feeds them to a vision-capable LLM
Vision mode works better on visually dense sites but requires a vision-language model (VLM) such as Qwen2-VL 7B or Llama 3.2 Vision 11B. Text-only works on more sites and runs faster.
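One way to keep the choice explicit is to derive the mode from the model you serve. This is a sketch under two assumptions: that the installed browser-use version accepts a use_vision keyword on Agent (check its documentation), and that the model names in the allow-list match whatever your own server registers. Both the allow-list and the agent_options helper are hypothetical.

```python
# Hypothetical allow-list; use the names your vLLM server actually registers.
VISION_MODELS = {"qwen2-vl-7b", "llama-3.2-11b-vision"}


def agent_options(model_name: str) -> dict:
    """Return keyword arguments for Agent(...) based on the served model."""
    return {"use_vision": model_name.lower() in VISION_MODELS}


# Usage (assumes `llm` is configured as in the Deployment section):
# agent = Agent(task="...", llm=llm, **agent_options("llama-3.3-70b"))
```

With a text-only model the helper returns use_vision=False, so the agent relies on the DOM representation alone.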
Tips
- Headless Chromium runs without a display; headed mode on a server needs a virtual one such as Xvfb
- Rate-limit your agent: an unthrottled loop can flood a site with requests
- Respect robots.txt and site terms
- Persist cookies across runs for authenticated sites
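The rate-limiting and robots.txt tips can be sketched with the Python standard library alone. This is an illustration rather than anything browser-use ships: MinIntervalLimiter and allowed_by_robots are hypothetical helpers, and how you hook them into your agent loop (for example, calling wait() before each task) is left as an assumption.

```python
import asyncio
import time
from urllib.robotparser import RobotFileParser


class MinIntervalLimiter:
    """Space out actions so they start at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._next_slot = 0.0  # monotonic time of the next free slot

    async def wait(self) -> None:
        # The slot is reserved before the first await, so concurrent callers
        # on the same event loop each get their own distinct slot.
        now = time.monotonic()
        slot = max(now, self._next_slot)
        self._next_slot = slot + self.min_interval
        await asyncio.sleep(slot - now)


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt content that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Usage (hypothetical agent name and interval):
# limiter = MinIntervalLimiter(2.0)
# await limiter.wait()   # call before launching each agent task
# allowed_by_robots(robots_body, "my-agent", "https://example.com/pricing")
```

Passing the robots.txt body in as a string keeps the check free of network access, so the fetch itself can go through the same limiter as everything else.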
Browser Automation Agent Hosting
UK dedicated GPUs with browser-use, Chromium, and LLM preconfigured.
See also: Open Interpreter.