feat(mcp): simplify MCP tool architecture and unify platform tools #1507
Open
quanru wants to merge 30 commits into v1 from feat/mcp-simplify-tools
Commits
8ab4326 feat(mcp): simplify MCP tool architecture and unify platform tools (quanru)
93be11b refactor(shared): remove zod dependency and clean up README files acr… (quanru)
484fdc3 Update packages/shared/src/mcp/base-tools.ts (quanru)
24a8d4e Update packages/mcp/src/web-tools.ts (quanru)
e6ab696 fix(android): update launchUri to launch in AndroidMidsceneTools (quanru)
c1aa52a feat(android, ios): implement AI-driven app loading and update connec… (quanru)
02df3a4 fix(mcp): update type definitions path in package.json and enhance rs… (quanru)
8f5b3a9 feat(mcp): add default action spaces for Android, iOS, and Web tools;… (quanru)
bbf54e8 feat(mcp): implement temporary device creation for Android, iOS, and … (quanru)
60fe955 refactor(mcp): remove default action space methods from Android, iOS,… (quanru)
c95765a feat(mcp): add HTTP launch support for Android, iOS, and Web MCP serv… (quanru)
c4e67ae refactor(mcp): streamline command line argument parsing and launch lo… (quanru)
cb99f62 refactor(mcp): enhance type definitions and improve error handling in… (quanru)
8569fde refactor(mcp): delegate web MCP server launch to the main MCP package… (quanru)
cfa03e0 feat(mcp): refactor MCP server structure and tools (quanru)
1001e47 refactor(mcp): improve bridge mode initialization and test organization (quanru)
7da15b5 fix(mcp): address GitHub Copilot review comments from PR #1507 (quanru)
ce7e1ce refactor(mcp): improve code quality following yuyutaotao style guide (quanru)
84ecb68 refactor(mcp): centralize app loading timeout constants in shared pac… (quanru)
3fb98b4 feat(web-bridge-mcp): add initial implementation of web bridge MCP se… (quanru)
0ee904f refactor(mcp): remove web-mcp package and update dependencies (quanru)
88698d3 feat(mcp): refactor code structure for improved readability and maint… (quanru)
848fcd1 chore(mcp): update @rslib/core to version 0.18.3 across multiple pack… (quanru)
5c2fddb refactor(tests): enhance server startup handling with error catching … (quanru)
6fe50da feat(android-tools, ios-tools): add autoDismissKeyboard option for ag… (quanru)
ccd72e7 feat(mcp): add rspack configuration with BannerPlugin for multiple pa… (quanru)
34df6ec feat(mcp): update package.json and rslib.config.ts for rspack integra… (quanru)
eabb1f0 feat(shared): enhance error handling and make parameters optional in … (quanru)
3ac6a22 feat(core): replace console.log with debug logging for locateParam (quanru)
69b9d84 feat(mcp): make 'prompt' field optional in locate schema for better f… (quanru)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # Midscene MCP | ||
| | ||
| docs: https://midscenejs.com/mcp.html |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| { | ||
| "name": "@midscene/android-mcp", | ||
| "version": "1.0.0", | ||
| "description": "Midscene MCP Server for Android automation", | ||
| "bin": "dist/index.js", | ||
| "files": ["dist"], | ||
| "main": "./dist/server.js", | ||
| "types": "./dist/server.d.ts", | ||
| "exports": { | ||
| ".": { | ||
| "types": "./dist/server.d.ts", | ||
| "default": "./dist/server.js" | ||
| }, | ||
| "./server": { | ||
| "types": "./dist/server.d.ts", | ||
| "default": "./dist/server.js" | ||
| } | ||
| }, | ||
| "scripts": { | ||
| "build": "rslib build", | ||
| "dev": "npm run build:watch", | ||
| "build:watch": "rslib build --watch", | ||
| "mcp-playground": "npx @modelcontextprotocol/inspector node ./dist/index.js", | ||
| "test": "vitest run", | ||
| "inspect": "node scripts/inspect.mjs" | ||
| }, | ||
| "devDependencies": { | ||
| "@midscene/android": "workspace:*", | ||
| "@midscene/core": "workspace:*", | ||
| "@midscene/shared": "workspace:*", | ||
| "@modelcontextprotocol/inspector": "^0.16.3", | ||
| "@modelcontextprotocol/sdk": "1.10.2", | ||
| "@rslib/core": "^0.18.3", | ||
| "@rspack/core": "1.6.6", | ||
| "@types/node": "^18.0.0", | ||
| "dotenv": "^16.4.5", | ||
| "typescript": "^5.8.3", | ||
| "vitest": "3.0.5" | ||
| }, | ||
| "dependencies": { | ||
| "@silvia-odwyer/photon": "0.3.3", | ||
| "@silvia-odwyer/photon-node": "0.3.3", | ||
| "bufferutil": "4.0.9", | ||
| "sharp": "^0.34.3", | ||
| "utf-8-validate": "6.0.5" | ||
| }, | ||
| "license": "MIT" | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| import { defineConfig } from '@rslib/core'; | ||
| import { rspack } from '@rspack/core'; | ||
| import { version } from './package.json'; | ||
| | ||
| export default defineConfig({ | ||
| source: { | ||
| define: { | ||
| __VERSION__: `'${version}'`, | ||
| }, | ||
| entry: { | ||
| index: './src/index.ts', | ||
| server: './src/server.ts', | ||
| }, | ||
| }, | ||
| output: { | ||
| externals: [ | ||
| (data, cb) => { | ||
| if ( | ||
| data.context?.includes('/node_modules/ws/lib') && | ||
| ['bufferutil', 'utf-8-validate'].includes(data.request as string) | ||
| ) { | ||
| return cb(undefined, data.request); | ||
| } | ||
| cb(); | ||
| }, | ||
| '@silvia-odwyer/photon', | ||
| '@silvia-odwyer/photon-node', | ||
| '@modelcontextprotocol/sdk', | ||
| ], | ||
| }, | ||
| tools: { | ||
| rspack: { | ||
| plugins: [ | ||
| new rspack.BannerPlugin({ | ||
| banner: '#!/usr/bin/env node', | ||
| raw: true, | ||
| test: /^index\.js$/, | ||
| }), | ||
| ], | ||
| optimization: { | ||
| minimize: false, | ||
| }, | ||
| }, | ||
| }, | ||
| lib: [ | ||
| { | ||
| format: 'cjs', | ||
| syntax: 'es2021', | ||
| output: { | ||
| distPath: { | ||
| root: 'dist', | ||
| }, | ||
| }, | ||
| }, | ||
| ], | ||
| }); | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,112 @@ | ||
| import { type AndroidAgent, agentFromAdbDevice } from '@midscene/android'; | ||
| import { z } from '@midscene/core'; | ||
| import { getDebug } from '@midscene/shared/logger'; | ||
| import { | ||
| type BaseAgent, | ||
| BaseMidsceneTools, | ||
| type ToolDefinition, | ||
| defaultAppLoadingCheckIntervalMs, | ||
| defaultAppLoadingTimeoutMs, | ||
| } from '@midscene/shared/mcp'; | ||
| | ||
| const debug = getDebug('mcp:android-tools'); | ||
| | ||
| /** | ||
| * Android-specific tools manager | ||
| * Extends BaseMidsceneTools to provide Android ADB device connection tools | ||
| */ | ||
| export class AndroidMidsceneTools extends BaseMidsceneTools { | ||
| protected createTemporaryDevice() { | ||
| // Use require to avoid circular dependency with @midscene/android | ||
| const { AndroidDevice } = require('@midscene/android'); | ||
| // Create minimal temporary instance without connecting to device | ||
| // The constructor doesn't establish ADB connection | ||
| return new AndroidDevice('temp-for-action-space', {}); | ||
| } | ||
| | ||
| protected async ensureAgent(deviceId?: string): Promise<AndroidAgent> { | ||
| if (this.agent && deviceId) { | ||
| // If a specific deviceId is requested and we have an agent, | ||
| // destroy it to create a new one with the new device | ||
| try { | ||
| await this.agent.destroy?.(); | ||
| } catch (error) { | ||
| debug('Failed to destroy agent during cleanup:', error); | ||
| } | ||
| this.agent = undefined; | ||
| } | ||
| | ||
| if (this.agent) { | ||
| return this.agent as unknown as AndroidAgent; | ||
| } | ||
| | ||
| debug('Creating Android agent with deviceId:', deviceId || 'auto-detect'); | ||
| const agent = await agentFromAdbDevice(deviceId, { | ||
| autoDismissKeyboard: false, | ||
| }); | ||
| this.agent = agent as unknown as BaseAgent; | ||
| return agent; | ||
| } | ||
| | ||
| /** | ||
| * Provide Android-specific platform tools | ||
| */ | ||
| protected preparePlatformTools(): ToolDefinition[] { | ||
| return [ | ||
| { | ||
| name: 'android_connect', | ||
| description: | ||
| 'Connect to Android device and optionally launch an app. If deviceId not provided, uses the first available device.', | ||
| schema: { | ||
| deviceId: z | ||
| .string() | ||
| .optional() | ||
| .describe('Android device ID (from adb devices)'), | ||
| uri: z | ||
| .string() | ||
| .optional() | ||
| .describe( | ||
| 'Optional URI to launch app (e.g., market://details?id=com.example.app)', | ||
| ), | ||
| }, | ||
| handler: async ({ | ||
| deviceId, | ||
| uri, | ||
| }: { | ||
| deviceId?: string; | ||
| uri?: string; | ||
| }) => { | ||
| const agent = await this.ensureAgent(deviceId); | ||
| | ||
| // If URI is provided, launch the app | ||
| if (uri) { | ||
| await agent.page.launch(uri); | ||
| | ||
| // Wait for app to finish loading using AI-driven polling | ||
| await agent.aiWaitFor( | ||
| 'the app has finished loading and is ready to use', | ||
| { | ||
| timeoutMs: defaultAppLoadingTimeoutMs, | ||
| checkIntervalMs: defaultAppLoadingCheckIntervalMs, | ||
| }, | ||
| ); | ||
| } | ||
| | ||
| const screenshot = await agent.page.screenshotBase64(); | ||
| | ||
| return { | ||
| content: [ | ||
| { | ||
| type: 'text', | ||
| text: `Connected to Android device${deviceId ? `: ${deviceId}` : ' (auto-detected)'}${uri ? ` and launched: ${uri} (app ready)` : ''}`, | ||
| }, | ||
| ...this.buildScreenshotContent(screenshot), | ||
| ], | ||
| isError: false, | ||
| }; | ||
| }, | ||
| autoDestroy: false, // Keep agent alive for subsequent operations | ||
| }, | ||
| ]; | ||
| } | ||
| } |
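
The `android_connect` handler above waits for the app by polling with a timeout and a check interval. As a rough illustration of that pattern only (this is not Midscene's `aiWaitFor` implementation, and `pollUntil` and `PollOptions` are hypothetical names), a generic poll-until-ready loop might look like this:

```typescript
// Hypothetical helper: names are illustrative, not Midscene APIs.
interface PollOptions {
  timeoutMs: number; // total time budget
  checkIntervalMs: number; // pause between checks
}

async function pollUntil(
  check: () => Promise<boolean>,
  { timeoutMs, checkIntervalMs }: PollOptions,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return; // condition met, stop polling
    // Stop if the next check would land past the deadline.
    if (Date.now() + checkIntervalMs > deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, checkIntervalMs));
  }
}
```

In this sketch the caller supplies the readiness check, so the same loop could wrap a screenshot-based assertion or a plain HTTP probe.
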
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| import { parseArgs } from 'node:util'; | ||
| import { type CLIArgs, CLI_ARGS_CONFIG } from '@midscene/shared/mcp'; | ||
| import { AndroidMCPServer } from './server.js'; | ||
| | ||
| const { values } = parseArgs({ options: CLI_ARGS_CONFIG }); | ||
| const args = values as CLIArgs; | ||
| | ||
| const server = new AndroidMCPServer(); | ||
| | ||
| if (args.mode === 'http') { | ||
| server | ||
| .launchHttp({ | ||
| port: Number.parseInt(args.port || '3000', 10), | ||
| host: args.host || 'localhost', | ||
| }) | ||
| .catch(console.error); | ||
| } else { | ||
| server.launch().catch(console.error); | ||
| } |
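
The entry point above relies on `CLI_ARGS_CONFIG` and `CLIArgs` from `@midscene/shared/mcp`, whose definitions are not shown in this diff. A minimal sketch of how `parseArgs` from `node:util` could resolve the same options, assuming three string options named `mode`, `port`, and `host`. The `cliArgsConfig` object, the `resolveLaunchOptions` helper, and the `'stdio'` label for the non-HTTP branch are hypothetical stand-ins:

```typescript
import { parseArgs } from 'node:util';

// Assumed option shape; the real CLI_ARGS_CONFIG is not shown in this diff.
const cliArgsConfig = {
  mode: { type: 'string' },
  port: { type: 'string' },
  host: { type: 'string' },
} as const;

interface LaunchOptions {
  mode: 'http' | 'stdio'; // 'stdio' is an assumed name for the default branch
  port: number;
  host: string;
}

// Mirrors the fallbacks used by the entry point: port 3000, host localhost.
function resolveLaunchOptions(argv: string[]): LaunchOptions {
  const { values } = parseArgs({ args: argv, options: cliArgsConfig });
  return {
    mode: values.mode === 'http' ? 'http' : 'stdio',
    port: Number.parseInt(values.port || '3000', 10),
    host: values.host || 'localhost',
  };
}

console.log(resolveLaunchOptions(['--mode', 'http', '--port', '8080']));
```

Passing `argv` explicitly, rather than letting `parseArgs` read `process.argv`, keeps the helper easy to exercise in a unit test.
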
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| import { BaseMCPServer } from '@midscene/shared/mcp'; | ||
| import { AndroidMidsceneTools } from './android-tools.js'; | ||
| | ||
| declare const __VERSION__: string; | ||
| | ||
| /** | ||
| * Android MCP Server | ||
| * Provides MCP tools for Android automation through ADB | ||
| */ | ||
| export class AndroidMCPServer extends BaseMCPServer { | ||
| constructor() { | ||
| super({ | ||
| name: '@midscene/android-mcp', | ||
| version: __VERSION__, | ||
| description: 'Midscene MCP Server for Android automation', | ||
| }); | ||
| } | ||
| | ||
| protected createToolsManager(): AndroidMidsceneTools { | ||
| return new AndroidMidsceneTools(); | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| import { afterAll, beforeAll, describe, expect, it, vi } from 'vitest'; | ||
| import { AndroidMCPServer } from '../src/server.js'; | ||
| | ||
| describe('AndroidMCPServer HTTP mode', () => { | ||
| let server: AndroidMCPServer; | ||
| const testPort = 13580; // Use different port than web-bridge-mcp | ||
| const testHost = '127.0.0.1'; // Use IPv4 explicitly to avoid IPv6 issues in CI | ||
| | ||
| beforeAll(async () => { | ||
| server = new AndroidMCPServer(); | ||
| }); | ||
| | ||
| afterAll(async () => { | ||
| // Cleanup will be handled by process exit | ||
| }); | ||
| | ||
| it('should start HTTP server successfully', async () => { | ||
| // Mock process.exit to prevent test exit | ||
| const exitSpy = vi.spyOn(process, 'exit').mockImplementation((() => { | ||
| throw new Error('process.exit called'); | ||
| }) as any); | ||
| | ||
| try { | ||
| // Start server in background and handle potential errors | ||
| const serverPromise = server.launchHttp({ | ||
| port: testPort, | ||
| host: testHost, | ||
| }); | ||
| | ||
| // Catch any errors from server startup without blocking | ||
| serverPromise.catch((error) => { | ||
| console.error('Server startup error:', error); | ||
| }); | ||
| | ||
| // Wait for server to start with retries (up to 5 seconds) | ||
| let connected = false; | ||
| for (let i = 0; i < 10; i++) { | ||
| await new Promise((resolve) => setTimeout(resolve, 500)); | ||
| try { | ||
| const response = await fetch(`http://${testHost}:${testPort}/mcp`, { | ||
| method: 'POST', | ||
| headers: { | ||
| 'Content-Type': 'application/json', | ||
| }, | ||
| body: JSON.stringify({ | ||
| jsonrpc: '2.0', | ||
| method: 'initialize', | ||
| params: { | ||
| protocolVersion: '2024-11-05', | ||
| capabilities: {}, | ||
| clientInfo: { | ||
| name: 'test-client', | ||
| version: '1.0.0', | ||
| }, | ||
| }, | ||
| id: 1, | ||
| }), | ||
| }); | ||
| | ||
| // Server should respond (even if initialization fails without device) | ||
| expect(response.status).toBeGreaterThanOrEqual(200); | ||
| expect(response.status).toBeLessThan(600); | ||
| connected = true; | ||
| console.log( | ||
| `✓ Android MCP server started and responding (attempt ${i + 1})`, | ||
| ); | ||
| break; | ||
| } catch (error) { | ||
| if (i === 9) { | ||
| throw error; // Throw on last attempt | ||
| } | ||
| // Otherwise continue retrying | ||
| } | ||
| } | ||
| | ||
| expect(connected).toBe(true); | ||
| } finally { | ||
| exitSpy.mockRestore(); | ||
| } | ||
| }, 15000); // Increase timeout for CI | ||
| | ||
| it('should reject invalid port numbers', async () => { | ||
| const invalidServer = new AndroidMCPServer(); | ||
| | ||
| await expect( | ||
| invalidServer.launchHttp({ | ||
| port: -1, | ||
| host: testHost, | ||
| }), | ||
| ).rejects.toThrow(/Invalid port number/); | ||
| | ||
| await expect( | ||
| invalidServer.launchHttp({ | ||
| port: 99999, | ||
| host: testHost, | ||
| }), | ||
| ).rejects.toThrow(/Invalid port number/); | ||
| }); | ||
| }); |
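
The second test expects `launchHttp` to reject ports such as `-1` and `99999` with a message matching `Invalid port number`. The actual check lives in `BaseMCPServer`, which is not part of this excerpt. A minimal sketch of such a guard, with the hypothetical name `assertValidPort` and an assumed accepted range of 1 to 65535, could be:

```typescript
// Hypothetical guard; the 1-65535 range is an assumption about what the server accepts.
function assertValidPort(port: number): number {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port number: ${port}`);
  }
  return port;
}

// Example: the two values used in the test are rejected, a normal port passes.
for (const candidate of [13580, -1, 99999]) {
  try {
    assertValidPort(candidate);
    console.log(`${candidate}: accepted`);
  } catch (error) {
    console.log((error as Error).message);
  }
}
```

Validating before binding the socket lets the promise reject early with a readable message instead of surfacing a lower-level listen error.
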
The `prettier` version specifier in package.json has been downgraded from `^3.7.4` to `^3.6.2`, but the lock file shows version `3.7.4` is still being used. This mismatch could cause confusion. Either update the specifier to match the actual version used (`^3.7.4`) or ensure the lock file reflects the intended version (`^3.6.2`).
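
For context on the comment above: under npm's semver rules, a caret range such as `^3.6.2` accepts any `3.x.y` at or above `3.6.2`, so a locked `3.7.4` still satisfies the downgraded specifier. A simplified sketch of that check follows. The helper `satisfiesCaret` is hypothetical, handles only plain `major.minor.patch` versions with a major of 1 or higher, and ignores pre-release tags; a real project would more likely use the `semver` package.

```typescript
type Version = [major: number, minor: number, patch: number];

function parseVersion(text: string): Version {
  const parts = text.split('.').map((part) => Number.parseInt(part, 10));
  if (parts.length !== 3 || parts.some((n) => Number.isNaN(n))) {
    throw new Error(`Unsupported version: ${text}`);
  }
  const [major = 0, minor = 0, patch = 0] = parts;
  return [major, minor, patch];
}

// Simplified caret check: same major, and minor.patch at or above the base.
function satisfiesCaret(range: string, version: string): boolean {
  if (!range.startsWith('^')) {
    throw new Error(`Not a caret range: ${range}`);
  }
  const [baseMajor, baseMinor, basePatch] = parseVersion(range.slice(1));
  const [major, minor, patch] = parseVersion(version);
  if (baseMajor === 0) {
    throw new Error('This sketch only covers ranges with major >= 1');
  }
  if (major !== baseMajor) return false;
  if (minor !== baseMinor) return minor > baseMinor;
  return patch >= basePatch;
}

console.log(satisfiesCaret('^3.6.2', '3.7.4')); // true
console.log(satisfiesCaret('^3.7.4', '3.6.2')); // false
```
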