An AI-native browser operating system with multi-agent browsing and Gemini-powered video intelligence.

Smart Browser is an Electron-based browser that reimagines web browsing as an operating system. It features a multi-agent workspace where up to 4 browser agents can run simultaneously, each in its own isolated BrowserView. The browser integrates AI capabilities including natural language commands and Gemini-powered video explanations.
- 4 Independent Browser Agents: Run up to 4 browser sessions simultaneously in a 2x2 grid layout
- Round-Robin Navigation: URLs automatically distribute across available agent slots
- Fullscreen Focus Mode: Double-click any agent slot to expand it to full view
- Agent Lifecycle Management: Create, navigate, and destroy agents independently
- Real-time Status Updates: Visual indicators for loading states and navigation
- Automatic Caption Extraction: Captures YouTube captions via network interception
- AI-Generated Summaries: One-click video summaries powered by Gemini 2.0 Flash
- Detailed Explanations: Beginner-friendly and technical explanation modes
- Context-Aware Q&A: Ask follow-up questions grounded in video transcripts
- Session Management: Per-agent, per-video chat history with conversational memory
- URL-Based Blocking: Pattern-matching rules for ad network domains
- Response Inspection: Strips ad fields from YouTube player API responses
- Manifest Rewriting: Removes ad segments from DASH/HLS video manifests
- Multiple Modes: Strict, Balanced, Allowlist-only, and Off modes
- Natural Language Input: Type commands like "go to youtube.com" or just "google.com"
- AI Task Planning: Complex commands are parsed and executed by the Planner module
- DOM Interaction: Click, type, scroll, and extract data from web pages
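The round-robin navigation listed above can be illustrated with a small sketch. It is not the project's `AgentManager`: the `SlotAllocator` class and its `assign` method are hypothetical names, and the sketch assumes the four slots of the 2x2 grid are numbered 0 to 3.

```typescript
// Hypothetical sketch of round-robin slot selection; not the actual AgentManager.
const SLOT_COUNT = 4; // one slot per cell of the 2x2 grid

class SlotAllocator {
  private next = 0; // index of the slot that receives the next URL

  // Returns the slot for the next navigation and advances the cursor,
  // wrapping back to slot 0 after the last slot.
  assign(): number {
    const slot = this.next;
    this.next = (this.next + 1) % SLOT_COUNT;
    return slot;
  }
}
```

Five consecutive calls to `assign()` on a fresh allocator yield slots 0, 1, 2, 3, and then 0 again. A real implementation would probably also skip slots that are busy or destroyed, which this sketch leaves out.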
Smart Browser
├── packages/
│ ├── shell/ # Electron main process and UI
│ │ ├── main.js # Main process, AgentManager, IPC handlers
│ │ ├── preload.js # Secure bridge between renderer and main
│ │ └── ui/ # HTML, CSS, JavaScript for shell UI
│ │
│ ├── video-intelligence/ # Gemini-powered video explanations
│ │ ├── caption-extractor.ts # Network-based caption capture
│ │ ├── transcript-store.ts # Per-agent transcript storage
│ │ ├── explain-session.ts # Window-scoped Gemini sessions
│ │ └── gemini-client.ts # Gemini API integration
│ │
│ ├── adblock/ # Ad blocking engine
│ │ ├── rule-engine.ts # Trie-based URL matching
│ │ ├── filter-parser.ts # AdBlock Plus filter syntax
│ │ ├── response-inspector.ts # API response modification
│ │ └── manifest-rewriter.ts # DASH/HLS ad removal
│ │
│ ├── planner/ # AI task planning
│ │ └── index.ts # Gemini-based command parsing
│ │
│ └── kernel/ # Core abstractions
│ └── abi/ # Protocol buffer definitions
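The layout above describes `rule-engine.ts` as trie-based URL matching. The following is a minimal sketch of one way such a trie could work, keyed on reversed hostname labels so that a rule for a domain also covers its subdomains. The `DomainTrie` class, its methods, and the example hostnames are assumptions for illustration; the actual rule engine may key on different units and support full filter syntax.

```typescript
// Hypothetical sketch: a trie keyed on reversed hostname labels.
class DomainTrie {
  private children = new Map<string, DomainTrie>();
  private terminal = false; // true if a blocking rule ends at this node

  // Stores a domain, e.g. "adnetwork.example" becomes example -> adnetwork.
  add(domain: string): void {
    let node: DomainTrie = this;
    for (const label of domain.toLowerCase().split(".").reverse()) {
      let child = node.children.get(label);
      if (!child) {
        child = new DomainTrie();
        node.children.set(label, child);
      }
      node = child;
    }
    node.terminal = true;
  }

  // True if hostname equals a stored domain or is a subdomain of one.
  matches(hostname: string): boolean {
    let node: DomainTrie = this;
    for (const label of hostname.toLowerCase().split(".").reverse()) {
      const child = node.children.get(label);
      if (!child) return false;
      node = child;
      if (node.terminal) return true;
    }
    return false;
  }
}
```

After `add("adnetwork.example")`, both `adnetwork.example` and `cdn.adnetwork.example` match, while `notadnetwork.example` does not, because matching proceeds label by label rather than by string suffix.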
| Layer | Technology |
|---|---|
| Runtime | Electron 28+ with Chromium |
| Language | TypeScript, JavaScript |
| AI Model | Google Gemini 2.0 Flash |
| Build | Node.js, npm workspaces |
| Styling | CSS with custom design tokens |
| IPC | Electron contextBridge |
- Node.js 18 or higher
- npm 9 or higher
- Google Gemini API key
- Clone the repository:

      git clone https://github.com/ashutosh0x/smart-browser.git
      cd smart-browser

- Install dependencies:

      npm install

- Create environment file:

      echo "GEMINI_API_KEY=your_api_key_here" > .env

- Build packages:

      npm run build -w packages/video-intelligence
      npm run build -w packages/adblock
      npm run build -w packages/planner

- Start the browser:

      npm start -w packages/shell

| Command | Action |
|---|---|
| `go to example.com` | Navigate to URL in next available slot |
| `example.com` | Auto-detect URL and navigate |
| `new` | Create a new agent in empty slot |
| `focus 0` | Toggle fullscreen for slot 0 |
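
The command forms in the table can be recognized with a small parser. The sketch below is illustrative only: the `Command` type, the `parseCommand` function, and the choice to default bare hostnames to `https://` are assumptions, not the shell's actual implementation.

```typescript
// Hypothetical command shapes; names are illustrative, not the shell's real types.
type Command =
  | { kind: "navigate"; url: string }
  | { kind: "new" }
  | { kind: "focus"; slot: number }
  | { kind: "unknown"; input: string };

function parseCommand(input: string): Command {
  const text = input.trim();
  if (text.toLowerCase() === "new") return { kind: "new" };

  const focusMatch = /^focus\s+(\d+)$/i.exec(text);
  if (focusMatch) return { kind: "focus", slot: Number(focusMatch[1]) };

  // Accept "go to <target>" or a bare target that looks like a hostname or URL.
  const goToMatch = /^go\s+to\s+(\S+)$/i.exec(text);
  const target = goToMatch ? goToMatch[1] : text;
  if (/^(https?:\/\/)?[\w-]+(\.[\w-]+)+(\/\S*)?$/i.test(target)) {
    // Assumption: bare hostnames default to https.
    const url = /^https?:\/\//i.test(target) ? target : `https://${target}`;
    return { kind: "navigate", url };
  }
  return { kind: "unknown", input: text };
}
```

Input that matches none of these forms comes back as `unknown`; in the browser, such input would presumably be handed to the Planner module for AI task planning.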
- Navigate to a YouTube video
- Click the "Explain" button in the slot header
- View the Summary or Explained tabs
- Ask follow-up questions in the Ask tab
| Key | Action |
|---|---|
| Enter | Execute command |
| Escape | Exit fullscreen mode |
| Double-click | Toggle fullscreen on slot |
| Variable | Description |
|---|---|
| `GEMINI_API_KEY` | Google Gemini API key for video intelligence |
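
A missing key is easier to diagnose if it is caught at startup. The helper below is a hypothetical sketch of how the main process might validate the variable: the function name and error message are illustrative, and how the project actually loads `.env` into the environment is not specified here.

```typescript
// Hypothetical helper; the function name and error text are illustrative.
function readGeminiApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    // Covers both a missing variable and one that is empty or whitespace.
    throw new Error("GEMINI_API_KEY is not set; add it to .env before starting");
  }
  return key;
}
```

In the main process this would be called as `readGeminiApiKey(process.env)` once the `.env` file has been loaded.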
| Mode | Description |
|---|---|
| Strict | Block all ad-related requests |
| Balanced | Default mode with common ad networks |
| Allowlist | Only block explicitly listed domains |
| Off | Disable adblocking |
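
One possible reading of the four modes, as a sketch: each mode selects which list of hostnames is consulted. Everything here is an assumption made for illustration, including the `BlockLists` structure, the `shouldBlock` function, and the idea that modes differ only in the list they use; the adblock package may implement modes differently.

```typescript
// Hypothetical sketch of mode handling; list contents and names are illustrative.
type AdblockMode = "strict" | "balanced" | "allowlist" | "off";

interface BlockLists {
  all: Set<string>;      // every ad-related hostname known to the engine
  common: Set<string>;   // common ad networks only
  explicit: Set<string>; // hostnames listed explicitly by the user
}

function shouldBlock(mode: AdblockMode, hostname: string, lists: BlockLists): boolean {
  switch (mode) {
    case "strict":
      return lists.all.has(hostname);
    case "balanced":
      return lists.common.has(hostname);
    case "allowlist":
      return lists.explicit.has(hostname);
    case "off":
      return false;
  }
}
```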
# Build all packages
npm run build --workspaces
# Run tests
npm test -w packages/adblock
npm test -w packages/video-intelligence
# Start in development mode
npm start -w packages/shell

- Create a new package in `packages/`
- Add TypeScript configuration
- Export interfaces via `src/index.ts`
- Import in shell's `main.js`
- Context Isolation: Renderer processes use strict context isolation
- Preload Scripts: Only approved APIs exposed via contextBridge
- No Node Integration: Web content cannot access Node.js APIs
- API Key Protection: Gemini API key never exposed to renderer
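
The first three points correspond to standard Electron `webPreferences` options. The fragment below is a sketch of such a configuration, not a copy of the project's `main.js`; the constant name and the preload path are assumptions.

```typescript
// Sketch of the relevant Electron webPreferences; the preload path is an assumption.
const secureWebPreferences = {
  contextIsolation: true, // preload and page scripts run in separate JS contexts
  nodeIntegration: false, // web content gets no require() or Node.js globals
  // Assumed location; Electron expects an absolute path here at runtime.
  preload: "packages/shell/preload.js",
};

// In the main process this object would be passed when creating a view, e.g.
//   new BrowserView({ webPreferences: secureWebPreferences })
// and the preload script would expose approved functions with
//   contextBridge.exposeInMainWorld(...)
```

Keeping the Gemini API key in the main process and exposing only narrow request functions through the bridge is what keeps the key out of the renderer.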
MIT License
Ashutosh Kumar Singh
ashutoshkumarsingh0x@gmail.com