PageGuide: Browser extension to assist users in navigating a webpage and locating information
Abstract
PageGuide is a browser extension that enhances AI assistant interactions by providing visual grounding of responses in web page elements, improving verification, guidance, and focus during web browsing tasks.
Users browsing the web daily struggle to quickly locate relevant information in cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants (e.g., ChatGPT, Gemini, Claude) and browser agents (e.g., OpenAI Operator, Browser Use) can answer questions and automate actions, yet they return answers without showing where the information comes from on the page, forcing users to manually verify results or blindly trust every automated step. We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs: (a) Find – locating and highlighting relevant evidence in situ so users can instantly verify answers on the page; (b) Guide – showing step-by-step instructions (e.g., how to change a password) one at a time so users can follow along and perform actions themselves; and (c) Hide – hiding distracting content while letting users decide whether or not each element should be hidden. In a user study (N=94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points (an 86.7% relative gain) and task completion time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces manual search effort, with Ctrl+F usage falling by 80% and task time decreasing by 19%. Code and demo are at pageguide.github.io.
Community
Users browsing the web daily struggle to locate relevant information on cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants and browser agents return answers without showing where information comes from, forcing users to manually verify results and blindly trust every automated step.
We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs:
Find – locating and highlighting relevant evidence in situ so users can instantly verify answers on the page;
Guide – showing step-by-step instructions one at a time so users can follow and perform actions by themselves;
Hide – hiding distracting content with a per-element justification and a reviewable checklist.
In a within-subject controlled user study (N = 94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points and task time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces Ctrl+F usage by 80% and task time by 19%.
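The core mechanism described above, grounding a model's answer in concrete page elements via overlays, can be sketched as a small content-script helper. This is a minimal illustration, not PageGuide's actual code: the `evidence` format (`{ selector, kind }` objects returned by the assistant backend) and the function names are assumptions for the example.

```javascript
// Hypothetical sketch of DOM grounding in a browser-extension content script.
// Assumed input: the backend returns evidence items of the form
// { selector: "<css selector>", kind: "find" | "guide" | "hide" }.

// Pure helper: map a mode to the inline CSS applied to matched elements.
function overlayStyleFor(kind) {
  // "find" highlights evidence, "guide" marks the current step,
  // "hide" dims a distracting element pending user confirmation.
  const styles = {
    find: "outline: 3px solid #f5a623; background: rgba(245,166,35,.15);",
    guide: "outline: 3px solid #4a90d9;",
    hide: "opacity: .15; filter: grayscale(1);",
  };
  return styles[kind] || "";
}

// DOM side: apply overlays to every element the model pointed at and
// return how many elements were actually grounded on the page.
function applyOverlays(doc, evidence) {
  let applied = 0;
  for (const { selector, kind } of evidence) {
    for (const el of doc.querySelectorAll(selector)) {
      el.style.cssText += overlayStyleFor(kind);
      applied += 1;
    }
  }
  return applied;
}
```

Returning the count of matched elements lets the extension warn the user when a cited selector no longer resolves, e.g. after the page re-renders, instead of silently showing an answer without its on-page evidence.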
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Beyond Chat and Clicks: GUI Agents for In-Situ Assistance via Live Interface Transformation (2026)
- MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants (2026)
- WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents (2026)
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks (2026)
- IntentWeave: A Progressive Entry Ladder for Multi-Surface Browser Agents in Cloud Portals (2026)
- A Contextual Help Browser Extension to Assist Digital Illiterate Internet Users (2026)
- TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication (2026)