PageGuide: Browser extension to assist users in navigating a webpage and locating information
Abstract
PageGuide is a browser extension that enhances AI assistant interactions by providing visual grounding of responses in web page elements, improving verification, guidance, and focus during web browsing tasks.
Users browsing the web daily struggle to quickly locate relevant information in cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants (e.g., ChatGPT, Gemini, Claude) and browser agents (e.g., OpenAI Operator, Browser Use) can answer questions and automate actions, yet they return answers without showing where the information comes from on the page, forcing users to manually verify results or blindly trust every automated step. We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs: (a) Find – locating and highlighting relevant evidence in situ so users can instantly verify answers on the page; (b) Guide – showing step-by-step instructions (e.g., how to change a password) one at a time so users can follow along and perform actions themselves; and (c) Hide – hiding distracting content while letting users decide whether or not each element should be hidden. In a user study (N=94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points (an 86.7% relative gain) and task completion time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces manual search effort, with Ctrl+F usage falling by 80% and task time decreasing by 19%. Code and demo are at pageguide.github.io.
Community
Users browsing the web daily struggle to locate relevant information on cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants and browser agents return answers without showing where information comes from, forcing users to manually verify results and blindly trust every automated step.
We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs:
Find – locating and highlighting relevant evidence in situ so users can instantly verify answers on the page;
Guide – showing step-by-step instructions one at a time so users can follow and perform actions by themselves;
Hide – hiding distracting content with a per-element justification and a reviewable checklist.
In a within-subject controlled user study (N = 94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points and task time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces Ctrl+F usage by 80% and task time by 19%.
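The core mechanism described above, grounding a model's answer in concrete page elements via overlays, can be sketched as a small content-script helper. This is a minimal illustration, not PageGuide's actual code: the `evidence` format (`{ selector, kind }` objects returned by the assistant backend) and the function names are assumptions for the example.

```javascript
// Hypothetical sketch of DOM grounding in a browser-extension content script.
// Assumed input: the backend returns evidence items of the form
// { selector: "<css selector>", kind: "find" | "guide" | "hide" }.

// Pure helper: map a mode to the inline CSS applied to matched elements.
function overlayStyleFor(kind) {
  // "find" highlights evidence, "guide" marks the current step,
  // "hide" dims a distracting element pending user confirmation.
  const styles = {
    find: "outline: 3px solid #f5a623; background: rgba(245,166,35,.15);",
    guide: "outline: 3px solid #4a90d9;",
    hide: "opacity: .15; filter: grayscale(1);",
  };
  return styles[kind] || "";
}

// DOM side: apply overlays to every element the model pointed at and
// return how many elements were actually grounded on the page.
function applyOverlays(doc, evidence) {
  let applied = 0;
  for (const { selector, kind } of evidence) {
    for (const el of doc.querySelectorAll(selector)) {
      el.style.cssText += overlayStyleFor(kind);
      applied += 1;
    }
  }
  return applied;
}
```

Returning the count of matched elements lets the extension warn the user when a cited selector no longer resolves, e.g. after the page re-renders, instead of silently showing an answer without its on-page evidence.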
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Beyond Chat and Clicks: GUI Agents for In-Situ Assistance via Live Interface Transformation (2026)
- MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants (2026)
- WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents (2026)
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks (2026)
- IntentWeave: A Progressive Entry Ladder for Multi-Surface Browser Agents in Cloud Portals (2026)
- A Contextual Help Browser Extension to Assist Digital Illiterate Internet Users (2026)
- TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication (2026)