Skip to main content

Documentation Index

Fetch the complete documentation index at: https://actionbook.dev/docs/llms.txt

Use this file to discover all available pages before exploring further.

The actionbook browser command lets AI agents automate any Chromium-based browser. Three modes are available depending on the use case:
Normal usage: run actionbook setup once, then use actionbook browser ... commands only.

Modes

ModeHow It WorksBest For
Local (default)Launches a new Chrome with a fresh, dedicated profileAutomated tasks, CI/CD, headless, no user data needed
CloudConnects to a remote browser via a cloud providerScalable headless infra, geo-targeting, persistent profiles in the cloud
ExtensionConnects to the user’s running Chrome via extension bridgeTasks requiring existing login sessions, cookies, personal context
All three modes are designed for AI agents:
  • Local gives agents a clean, reproducible environment.
  • Cloud offloads the browser to a remote provider, suited for production workloads, parallel execution at scale, or IP geolocation control.
  • Extension lets agents operate inside the user’s real browser when the task requires existing authentication (e.g., “book my flight”, “reply to that email”).

Local Mode

Works out of the box with no setup. The CLI launches a temporary Chrome process with a dedicated profile and connects via CDP.
# If local is your default mode (or not configured)
actionbook browser start --set-session-id s1
actionbook browser goto "https://example.com" --session s1 --tab t1
actionbook browser click "button[type='submit']" --session s1 --tab t1
actionbook browser fill "#search" "actionbook" --session s1 --tab t1
actionbook browser screenshot output.png --session s1 --tab t1
When done, close the browser:
actionbook browser close --session s1

Cloud Providers

Cloud mode lets you run browser sessions on a remote provider instead of launching a local Chrome. Use the -p (or --provider) flag with browser start:
actionbook browser start -p hyperbrowser --session s1

Supported Providers

ProviderFlag ValueRequired Env VarService
driver.devdriverDRIVER_API_KEYdriver.dev
HyperbrowserhyperbrowserHYPERBROWSER_API_KEYhyperbrowser.ai
Browser UsebrowseruseBROWSER_USE_API_KEYbrowser-use.com

Quick Start

1

Export your provider API key

# Pick one provider
export DRIVER_API_KEY="your-key"
# or
export HYPERBROWSER_API_KEY="your-key"
# or
export BROWSER_USE_API_KEY="your-key"
2

Start a cloud browser session

actionbook browser start -p hyperbrowser --session cloud1
The CLI creates a remote browser session and returns the session_id and tab_id for subsequent commands.
3

Use browser commands as usual

actionbook browser open "https://example.com" --session cloud1
actionbook browser snapshot --session cloud1 --tab t1
actionbook browser click "#login" --session cloud1 --tab t1
All browser commands work the same way. The session runs on the cloud provider transparently.
4

Close when done

actionbook browser close --session cloud1
This terminates the remote browser session and releases provider resources.

How It Works

  • -p <name> automatically implies --mode cloud, so you don’t need to set both
  • -p is mutually exclusive with --cdp-endpoint and --mode local/extension
  • The CLI forwards provider env vars from your current shell to the daemon at start time (the daemon’s own env is frozen at spawn)
  • browser restart --session <id> mints a fresh remote session while preserving the same session_id, so you can reset state without losing your addressing handle
For raw CDP connections without a managed provider, use --mode cloud --cdp-endpoint wss://... instead of -p.

Extension Mode Setup

Extension mode is under active development. For most use cases, Local or Cloud mode is recommended.
If you’ve configured mode = "extension" in your config file, you only need to perform this one-time setup.
1

Install from Chrome Web Store (recommended)

  1. Open Actionbook on Chrome Web Store
  2. Click Add to Chrome
  3. Confirm Add extension
2

Verify with any browser command

actionbook browser start --set-session-id s1
actionbook browser goto "https://example.com" --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1
The extension bridge (part of the actionbook daemon) auto-starts in the background and the extension auto-connects within a few seconds.
3

Fallback: local debug install (only if Web Store install fails)

  1. Install the fallback package:
actionbook extension install
  1. Open chrome://extensions
  2. Enable Developer mode
  3. Click Load unpacked
  4. Select the local folder printed by:
actionbook extension path

Extension Commands

The actionbook extension commands manage the Chrome extension and its connection to the actionbook daemon. The recommended install method is the Chrome Web Store (current version: 0.4.0). actionbook extension install is a local fallback — after running it, you must manually load the unpacked extension in Chrome via chrome://extensions > Developer mode > Load unpacked, using the path from actionbook extension path.
actionbook extension status                   # Bridge status + extension connection state
actionbook extension ping                     # Measure bridge RTT
actionbook extension install                  # Fallback: install to ~/Actionbook/extension/ (requires manual Chrome load)
actionbook extension install --force          # Force reinstall
actionbook extension uninstall                # Remove extension from disk
actionbook extension path                     # Print install path, status, and version
The extension bridge runs inside the actionbook daemon (auto-started by browser commands). extension status shows whether the bridge is listening and the extension is connected. extension ping tests connectivity by connecting to the bridge WebSocket at ws://127.0.0.1:19222.
Extension 0.4.0: Tabs opened by Actionbook are automatically grouped into a Chrome “Actionbook” tab group (toggleable via extension popup). list-tabs in extension mode now returns only Actionbook-managed tabs — other user tabs are hidden. Extensions below 0.4.0 are rejected at handshake.

Configuration

The recommended way to set your preferred browser mode is via actionbook setup. It writes your choice to ~/.actionbook/config.toml, so you can run browser commands without extra flags.

Set Default Mode

Create or edit ~/.actionbook/config.toml:
[browser]
# Use "local" (default), "cloud", or "extension"
mode = "local"
Once configured, all browser commands will automatically use your chosen mode:
# No flags needed - uses configured mode
actionbook browser open "https://example.com"
actionbook browser click "button[type='submit']"

Browser Commands

Most commands work in all modes. All per-tab commands require --session and --tab flags to address a specific tab.

Session Lifecycle

actionbook browser start --session s1 --open-url https://google.com
actionbook browser start --session s1 --max-tracked-requests 1000   # custom network buffer size
actionbook browser list-sessions                     # list all active sessions (includes max_tracked_requests)
actionbook browser status --session s1               # show session info
actionbook browser restart --session s1              # restart, preserving session_id and max_tracked_requests
actionbook browser close --session s1                # close a session
Both --session and --set-session-id are get-or-create: they reuse a Running session with the given ID, or create one if not found.
When reusing a session, if --profile is passed and does not match the session’s bound profile, the command fails with SESSION_PROFILE_MISMATCH. Omit --profile or pass the matching value to reuse.

Tab Management

actionbook browser list-tabs --session s1                          # list all tabs
actionbook browser new-tab "https://example.com" --session s1      # open a new tab (alias: open)
actionbook browser close-tab --session s1 --tab t1                 # close a tab
For running parallel tasks across multiple tabs, use Multi-Session with the --session flag instead of manual tab switching.
actionbook browser goto "https://example.com/login" --session s1 --tab t1
actionbook browser back --session s1 --tab t1
actionbook browser forward --session s1 --tab t1
actionbook browser reload --session s1 --tab t1

Page Interaction

actionbook browser click "button[type='submit']" --session s1 --tab t1
actionbook browser fill "#username" "demo" --session s1 --tab t1
actionbook browser type "#search" "hello" --session s1 --tab t1     # append text (does not clear)
actionbook browser select "#country" "US" --session s1 --tab t1
actionbook browser hover ".menu-item" --session s1 --tab t1
actionbook browser focus "#search-input" --session s1 --tab t1
actionbook browser press "Enter" --session s1 --tab t1              # single key or combo
actionbook browser press "Control+A" --session s1 --tab t1          # keyboard shortcuts
actionbook browser scroll down 500 --session s1 --tab t1            # scroll by pixels
actionbook browser drag "#src" "#target" --session s1 --tab t1
actionbook browser mouse-move 100,200 --session s1 --tab t1
actionbook browser cursor-position --session s1 --tab t1
Use @eN refs from snapshot output directly as selectors (e.g., actionbook browser click @e5 --session s1 --tab t1).

Wait Conditions

actionbook browser wait element "#username" --session s1 --tab t1 --timeout 10000
actionbook browser wait navigation --session s1 --tab t1
actionbook browser wait network-idle --session s1 --tab t1
actionbook browser wait condition "document.readyState === 'complete'" --session s1 --tab t1
wait network-idle is edge-triggered: it only tracks fetch/XHR requests started after the command begins. Pre-existing background connections (SSE, WebSocket, analytics pings) are ignored and do not block. This is an agent-friendly settle signal, not a guarantee of global network silence.

Observation & Export

actionbook browser snapshot --session s1 --tab t1                   # accessibility tree
actionbook browser snapshot --interactive --session s1 --tab t1     # only interactive elements
actionbook browser screenshot output.png --session s1 --tab t1
actionbook browser screenshot output.png --full --session s1 --tab t1
actionbook browser pdf report.pdf --session s1 --tab t1
actionbook browser eval "document.title" --session s1 --tab t1      # JS eval (structured errors)
actionbook browser html --session s1 --tab t1
actionbook browser text --session s1 --tab t1
actionbook browser title --session s1 --tab t1
actionbook browser url --session s1 --tab t1
actionbook browser viewport --session s1 --tab t1
actionbook browser inspect-point 100,200 --session s1 --tab t1     # inspect element at coordinates
actionbook browser describe "#submit" --session s1 --tab t1        # element properties & context
actionbook browser box "#submit" --session s1 --tab t1             # element bounding box
actionbook browser attrs "#submit" --session s1 --tab t1           # all element attributes
actionbook browser state "#submit" --session s1 --tab t1           # element state
actionbook browser logs console --session s1 --tab t1              # console logs
actionbook browser logs errors --session s1 --tab t1               # error logs
actionbook browser network requests --session s1 --tab t1          # tracked network requests
actionbook browser network requests --dump --out /tmp/dump --session s1 --tab t1  # export to file
actionbook browser network har start --session s1 --tab t1         # start HAR recording
actionbook browser network har stop --session s1 --tab t1          # stop and export HAR 1.2 file
har start accepts --max-entries N to set the ring-buffer cap (default: 10000).
When har stop detects dropped entries, the envelope includes meta.truncated = true and a HAR_TRUNCATED warning in meta.warnings with the drop count and configured cap. Read data.dropped and data.max_entries to decide whether to raise --max-entries or stop recording sooner.
# Cookies (session-level, no --tab needed)
actionbook browser cookies list --session s1
actionbook browser cookies get session_id --session s1
actionbook browser cookies set demo_cookie hello --session s1
actionbook browser cookies delete demo_cookie --session s1
actionbook browser cookies clear --domain example.com --session s1

# Local Storage (window.localStorage)
actionbook browser local-storage list --session s1 --tab t1
actionbook browser local-storage get myKey --session s1 --tab t1
actionbook browser local-storage set myKey "myValue" --session s1 --tab t1
actionbook browser local-storage delete myKey --session s1 --tab t1
actionbook browser local-storage clear myKey --session s1 --tab t1

# Session Storage (window.sessionStorage)
actionbook browser session-storage list --session s1 --tab t1
actionbook browser session-storage get myKey --session s1 --tab t1
actionbook browser session-storage set myKey "myValue" --session s1 --tab t1
actionbook browser session-storage delete myKey --session s1 --tab t1
actionbook browser session-storage clear myKey --session s1 --tab t1

File Upload

actionbook browser upload "input[type=file]" photo.jpg --session s1 --tab t1
actionbook browser upload "#file-input" a.png b.png --session s1 --tab t1   # multiple files

Multi-Session (Parallel Tabs)

Available since v0.11.0. Requires local mode (default).
Multi-session lets you run multiple independent tabs in a single browser instance. Each named session is bound to its own tab, so commands sent to one session never affect another.

Basic Usage

Use the --session flag to name a session:
# Open three sites in separate sessions
actionbook browser start --session research --open-url "https://arxiv.org"
actionbook browser start --session social --open-url "https://x.com"
actionbook browser start --session shopping --open-url "https://amazon.com"

# Each session operates independently
actionbook browser snapshot --session research --tab t1   # only sees arxiv.org
actionbook browser snapshot --session social --tab t1     # only sees x.com
actionbook browser click "#search" --session shopping --tab t1  # only affects amazon.com
Without --session, commands use a default session.

Session Management

# List all active sessions
actionbook browser list-sessions

# Show session status
actionbook browser status --session research

# Close a session
actionbook browser close --session research

Common Mistakes

Don’t create multiple profiles or ports for parallel tabs. Use --session instead.
# WRONG: wasteful, launches 3 Chrome instances
actionbook browser start --profile work1
actionbook browser start --profile work2
actionbook browser start --profile work3
# CORRECT: one browser, three sessions
actionbook browser start --session research --open-url "https://arxiv.org"
actionbook browser start --session social --open-url "https://x.com"
actionbook browser start --session shopping --open-url "https://amazon.com"
# WRONG: all three snapshots see the same (last opened) tab
actionbook browser snapshot --session default --tab t1
actionbook browser snapshot --session default --tab t1
actionbook browser snapshot --session default --tab t1
# CORRECT: each snapshot targets a specific session
actionbook browser snapshot --session research --tab t1
actionbook browser snapshot --session social --tab t1
actionbook browser snapshot --session shopping --tab t1
Combine action manuals with browser automation for the best results:
# 1. Find the action manual
actionbook search "github search repos"

# 2. Get verified selectors
actionbook get "<area_id>"

# 3. Start a session and execute
actionbook browser start --set-session-id s1
actionbook browser goto "https://github.com" --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1
actionbook browser fill @e3 "actionbook" --session s1 --tab t1
actionbook browser click @e7 --session s1 --tab t1

Shutdown

actionbook browser close --session s1       # close a session
actionbook browser restart --session s1     # restart with same profile/mode
browser close is idempotent — closing an unknown or already-closed session returns ok: true with a warning in meta.warnings, not an error. Safe to call unconditionally during cleanup without checking session existence first. Read meta.warnings to distinguish a fresh close from an already-gone session.

Extension Security

Extension mode connects via a local WebSocket bridge with the following guarantees:
  • The bridge only listens on 127.0.0.1 (loopback), no external network access
  • No token or pairing is required; local origin is sufficient for authentication
  • The extension identifies itself via a WebSocket handshake with role and version fields
  • The bridge validates the connection source (origin + extension ID) before accepting commands

Eval Error Handling

browser eval accepts the expression from three mutually-exclusive sources: a positional argument, --file <path>, or stdin (-). browser eval returns structured error codes on failure. Branch on error.code instead of parsing the message string: Runtime errors (after browser execution):
CodeMeaningBest practice
EVAL_RUNTIME_ERRORJS exceptionCheck expression syntax and referenced variables
EVAL_CROSS_ORIGINCross-origin or CSP blockUse same-origin fetch or proxy server-side
EVAL_RESPONSE_NOT_JSONNon-JSON response bodyRead error.details.content_type and body_head before retrying
EVAL_RESPONSE_NOT_OKNon-2xx HTTP statusRead error.details.status and body_head to distinguish 403 / challenge / CORS
EVAL_TIMEOUTExpression exceeded --timeoutReduce work or increase timeout
CLI-layer errors (before any browser interaction):
CodeMeaningBest practice
EVAL_ARGS_CONFLICTMultiple input sources or noneProvide exactly one of: positional, --file, or stdin -
EVAL_FILE_NOT_FOUND--file path unreadableVerify the path resolves and is readable
EVAL_STDIN_TTY- but stdin is a terminalPipe the expression: echo 'expr' | actionbook browser eval -
EVAL_STDIN_EMPTYStdin produced empty inputVerify the upstream command or pipeline
For EVAL_RESPONSE_NOT_OK and EVAL_RESPONSE_NOT_JSON, inspect error.details.body_head (first ≤256 chars of the response body) to decide whether to retry — a 403 challenge page is not worth retrying blindly.

CDP Error Handling

Browser commands that interact with elements, navigate, or communicate via CDP return structured CDP_* error codes. Branch on error.code:
CodeMeaningBest practiceRetryable
CDP_NODE_NOT_FOUNDDOM node is stale or nonexistentCall snapshot to refresh node refs then retryNo
CDP_NOT_INTERACTABLEElement exists but can’t be acted onScroll into view, wait for visibility, or dismiss overlaysNo
CDP_NAV_TIMEOUTNavigation or eval timeoutIncrease --timeout or verify URL reachabilityYes
CDP_TARGET_CLOSEDTab navigated away or session torn down mid-commandStart a fresh session or re-attach to the tabYes
CDP_PROTOCOL_ERRORCDP response malformed (-32xxx codes)Inspect details.reason and details.cdp_codeNo
CDP_GENERICUnclassified CDP error (transport/parse)(no specific remediation)No
When error.code is a CDP_* code, error.details includes reason (raw CDP message) and cdp_code (upstream numeric code) when available. CDP_NAV_TIMEOUT also includes timeout_ms.
CDP_NAV_TIMEOUT and CDP_TARGET_CLOSED set error.retryable = true in the JSON envelope — orchestrators can auto-retry these. All other CDP codes require caller intervention (e.g. refreshing snapshot refs, scrolling elements into view).
Some interaction paths (cookies, screenshots, PDF, etc.) still emit the legacy CDP_ERROR code. These are being migrated to the structured CDP_* taxonomy (ACT-999).

Troubleshooting

  • Run any browser command once: actionbook browser open <url> (this auto-starts the bridge)
  • Keep Chrome open and confirm the Actionbook extension is enabled in chrome://extensions
  • First startup can take a few seconds while the bridge starts and the extension reconnects
  • This usually means the WebSocket handshake did not complete
  • Update CLI: npm update -g @actionbookdev/cli
  • Reinstall from Chrome Web Store
  • If needed, use fallback install: actionbook extension install --force, then reload it in chrome://extensions
  • Click Connect in the extension popup to retry
  • Re-run actionbook setup and choose your preferred browser mode
  • Check effective config with actionbook config show
  • Environment variables (for example ACTIONBOOK_BROWSER_MODE) can override config
  • Make sure Chrome has at least one visible, active tab
  • Run actionbook browser open <url> first to attach a tab
  • Avoid chrome:// pages; the debugger cannot attach to them
  • Browser mode currently uses a fixed bridge address: ws://127.0.0.1:19222
  • Stop the conflicting process on that port (macOS/Linux: lsof -i :19222)
  • Then retry your browser command; the bridge will auto-start again
  • Retry Web Store install: Actionbook on Chrome Web Store
  • Use local debug fallback: actionbook extension install
  • Load it manually from chrome://extensions with Load unpacked
  • If GitHub API is rate-limited during fallback, wait and retry or download from GitHub Releases