Skill v1.0.0
currentAutomated scan100/100version: "1.0.0" name: qiaomu-opencli-browser description: Make websites accessible for AI agents. Navigate, click, type, extract, wait — using Chrome with existing login sessions. No LLM API key needed. author: joeseesun upstream: jackwener/opencli allowed-tools: Bash(opencli:*), Read, Edit, Write
OpenCLI Browser — Browser Automation for AI Agents
Control Chrome step-by-step via CLI. Reuses existing login sessions — no passwords needed.
Prerequisites
opencli doctor # Verify extension + daemon connectivity
Requires: Chrome running + OpenCLI Browser Bridge extension installed.
Critical Rules
- ALWAYS use `state` to inspect the page, NEVER use `screenshot` —
statereturns structured DOM with[N]element indices, is instant and costs zero tokens.screenshotrequires vision processing and is slow. Only usescreenshotwhen the user explicitly asks to save a visual. - ALWAYS use `click`/`type`/`select` for interaction, NEVER use `eval` to click or type —
eval "el.click()"bypasses scrollIntoView and CDP click pipeline, causing failures on off-screen elements. Usestateto find the[N]index, thenclick <N>. - Verify inputs with `get value`, not screenshots — after
type, runget value <index>to confirm. - Run `state` after every page change — after
open,click(on links),scroll, always runstateto see the new elements and their indices. Never guess indices. - Chain commands aggressively with `&&` — combine
open + state, multipletypecalls, andtype + get valueinto single&&chains. Each tool call has overhead; chaining cuts it. - `eval` is read-only — use
evalONLY for data extraction (JSON.stringify(...)), never for clicking, typing, or navigating. Always wrap in IIFE to avoid variable conflicts:eval "(function(){ const x = ...; return JSON.stringify(x); })()". - Minimize total tool calls — plan your sequence before acting. A good task completion uses 3-5 tool calls, not 15-20. Combine
open + stateas one call. Combinetype + type + clickas one call. Only runstateseparately when you need to discover new indices. - Prefer `network` to discover APIs — most sites have JSON APIs. API-based adapters are more reliable than DOM scraping.
Command Cost Guide
| Cost | Commands | When to use | |
|---|---|---|---|
| Free & instant | state, get *, eval, network, scroll, keys | Default — use these | |
| Free but changes page | open, click, type, select, back | Interaction — run state after | |
| Expensive (vision tokens) | screenshot | ONLY when user needs a saved image |
Action Chaining Rules
Commands can be chained with &&. The browser persists via daemon, so chaining is safe.
Always chain when possible — fewer tool calls = faster completion:
# GOOD: open + inspect in one call (saves 1 round trip)opencli browser open https://example.com && opencli browser state# GOOD: fill form in one call (saves 2 round trips)opencli browser type 3 "hello" && opencli browser type 4 "world" && opencli browser click 7# GOOD: type + verify in one callopencli browser type 5 "test@example.com" && opencli browser get value 5# GOOD: click + wait + state in one call (for page-changing clicks)opencli browser click 12 && opencli browser wait time 1 && opencli browser state# BAD: separate calls for each action (wasteful)opencli browser type 3 "hello" # Don't do thisopencli browser type 4 "world" # when you can chainopencli browser click 7 # all three together
Page-changing — always put last in a chain (subsequent commands see stale indices):
open <url>,back,click <link/button that navigates>
Rule: Chain when you already know the indices. Run state separately when you need to discover indices first.
Core Workflow
- Navigate:
opencli browser open <url> - Inspect:
opencli browser state→ elements with[N]indices - Interact: use indices —
click,type,select,keys - Wait (if needed):
opencli browser wait selector ".loaded"orwait text "Success" - Verify:
opencli browser stateoropencli browser get value <N> - Repeat: browser stays open between commands
- Save: write a TS adapter to
~/.opencli/clis/<site>/<command>.ts
Commands
Navigation
opencli browser open <url> # Open URL (page-changing)opencli browser back # Go back (page-changing)opencli browser scroll down # Scroll (up/down, --amount N)opencli browser scroll up --amount 1000
Inspect (free & instant)
opencli browser state # Structured DOM with [N] indices — PRIMARY toolopencli browser screenshot [path.png] # Save visual to file — ONLY for user deliverables
Get (free & instant)
opencli browser get title # Page titleopencli browser get url # Current URLopencli browser get text <index> # Element text contentopencli browser get value <index> # Input/textarea value (use to verify after type)opencli browser get html # Full page HTMLopencli browser get html --selector "h1" # Scoped HTMLopencli browser get attributes <index> # Element attributes
Interact
opencli browser click <index> # Click element [N]opencli browser type <index> "text" # Type into element [N]opencli browser select <index> "option" # Select dropdownopencli browser keys "Enter" # Press key (Enter, Escape, Tab, Control+a)
Wait
Three variants — use the right one for the situation:
opencli browser wait time 3 # Wait N seconds (fixed delay)opencli browser wait selector ".loaded" # Wait until element appears in DOMopencli browser wait selector ".spinner" --timeout 5000 # With timeout (default 30s)opencli browser wait text "Success" # Wait until text appears on page
When to wait: After open on SPAs, after click that triggers async loading, before eval on dynamically rendered content.
Extract (free & instant, read-only)
Use eval ONLY for reading data. Never use it to click, type, or navigate.
opencli browser eval "document.title"opencli browser eval "JSON.stringify([...document.querySelectorAll('h2')].map(e => e.textContent))"# IMPORTANT: wrap complex logic in IIFE to avoid "already declared" errorsopencli browser eval "(function(){ const items = [...document.querySelectorAll('.item')]; return JSON.stringify(items.map(e => e.textContent)); })()"
Selector safety: Always use fallback selectors — querySelector returns null on miss:
# BAD: crashes if selector missesopencli browser eval "document.querySelector('.title').textContent"# GOOD: fallback with || or ?.opencli browser eval "(document.querySelector('.title') || document.querySelector('h1') || {textContent:''}).textContent"opencli browser eval "document.querySelector('.title')?.textContent ?? 'not found'"
Network (API Discovery)
opencli browser network # Show captured API requests (auto-captured since open)opencli browser network --detail 3 # Show full response body of request #3opencli browser network --all # Include static resources
Sedimentation (Save as CLI)
opencli browser init hn/top # Generate adapter scaffold at ~/.opencli/clis/hn/top.tsopencli browser verify hn/top # Test the adapter (adds --limit 3 only if `limit` arg is defined)
initauto-detects the domain from the active browser session (no need to specify it)initcreates the file + populatessite,name,domain, andcolumnsfrom current pageverifyruns the adapter end-to-end and prints output; if nolimitarg exists in the adapter, it won't pass--limit 3
Session
opencli browser close # Close automation window
Example: Extract HN Stories
opencli browser open https://news.ycombinator.comopencli browser state # See [1] a "Story 1", [2] a "Story 2"...opencli browser eval "JSON.stringify([...document.querySelectorAll('.titleline a')].slice(0,5).map(a => ({title: a.textContent, url: a.href})))"opencli browser close
Example: Fill a Form
opencli browser open https://httpbin.org/forms/postopencli browser state # See [3] input "Customer Name", [4] input "Telephone"opencli browser type 3 "OpenCLI" && opencli browser type 4 "555-0100"opencli browser get value 3 # Verify: "OpenCLI"opencli browser close
Saving as Reusable CLI — Complete Workflow
Step-by-step sedimentation flow:
# 1. Explore the websiteopencli browser open https://news.ycombinator.comopencli browser state # Understand DOM structure# 2. Discover APIs (crucial for high-quality adapters)opencli browser eval "fetch('/api/...').then(r=>r.json())" # Trigger API callsopencli browser network # See captured API requestsopencli browser network --detail 0 # Inspect response body# 3. Generate scaffoldopencli browser init hn/top # Creates ~/.opencli/clis/hn/top.ts# 4. Edit the adapter (fill in func logic)# - If API found: use fetch() directly (Strategy.PUBLIC or COOKIE)# - If no API: use page.evaluate() for DOM extraction (Strategy.UI)# 5. Verifyopencli browser verify hn/top # Runs the adapter and shows output# 6. If verify fails, edit and retry# 7. Close when doneopencli browser close
Example adapter:
// ~/.opencli/clis/hn/top.tsimport { cli, Strategy } from '@jackwener/opencli/registry';cli({site: 'hn',name: 'top',description: 'Top Hacker News stories',domain: 'news.ycombinator.com',strategy: Strategy.PUBLIC,browser: false,args: [{ name: 'limit', type: 'int', default: 5 }],columns: ['rank', 'title', 'score', 'url'],func: async (_page, kwargs) => {const limit = Math.min(Math.max(1, kwargs.limit ?? 5), 50);const resp = await fetch('https://hacker-news.firebaseio.com/v0/topstories.json');const ids = await resp.json();return Promise.all(ids.slice(0, limit).map(async (id: number, i: number) => {const item = await (await fetch(`https://hacker-news.firebaseio.com/v0/item/${id}.json`)).json();return { rank: i + 1, title: item.title, score: item.score, url: item.url ?? '' };}));},});
Save to ~/.opencli/clis/<site>/<command>.ts → immediately available as opencli <site> <command>.
Strategy Guide
| Strategy | When | browser: | |
|---|---|---|---|
Strategy.PUBLIC | Public API, no auth | false | |
Strategy.COOKIE | Needs login cookies | true | |
Strategy.UI | Direct DOM interaction | true |
Always prefer API over UI — if you discovered an API during browsing, use fetch() directly.
Tips
- Always `state` first — never guess element indices, always inspect first
- Sessions persist — browser stays open between commands, no need to re-open
- Use `eval` for data extraction —
eval "JSON.stringify(...)"is faster than multiplegetcalls - Use `network` to find APIs — JSON APIs are more reliable than DOM scraping
- Alias:
opencli opis shorthand foropencli browser
Common Pitfalls
- `form.submit()` fails in automation — Don't use
form.submit()orevalto submit forms. Navigate directly to the search URL instead:
``bash # BAD: form.submit() often silently fails opencli browser eval "document.querySelector('form').submit()" # GOOD: construct the URL and navigate opencli browser open "https://github.com/search?q=opencli&type=repositories" ``
- GitHub DOM changes frequently — Prefer
data-testidattributes when available; they are more stable than class names or tag structure.
- SPA pages need `wait` before extraction — After
openorclickon single-page apps, the DOM isn't ready immediately. Alwayswait selectororwait textbeforeeval.
- Use `state` before clicking — Run
opencli browser stateto inspect available interactive elements and their indices. Never guess indices from memory.
- `evaluate` runs in browser context —
page.evaluate()in adapters executes inside the browser. Node.js APIs (fs,path,process) are NOT available. Usefetch()for network calls, DOM APIs for page data.
- Backticks in `page.evaluate` break JSON storage — When writing adapters that will be stored/transported as JSON, avoid template literals inside
page.evaluate. Use string concatenation or function-style evaluate:
``typescript // BAD: template literal backticks break when adapter is in JSON page.evaluate(document.querySelector("${selector}")) // GOOD: function-style evaluate page.evaluate((sel) => document.querySelector(sel), selector) ``
Troubleshooting
| Error | Fix | |
|---|---|---|
| "Browser not connected" | Run opencli doctor | |
| "attach failed: chrome-extension://" | Disable 1Password temporarily | |
| Element not found | opencli browser scroll down && opencli browser state | |
| Stale indices after page change | Run opencli browser state again to get fresh indices |