Skip to content

Conversation

@NotYuSheng
Copy link
Owner

No description provided.

NotYuSheng and others added 12 commits December 29, 2025 21:17
- Archive old frontend to frontend-old-20251229-203308
- Create new Vite + React + SGDS frontend
- Implement 4-step workflow: Upload -> Processing -> Transcript -> Summary
- Add visual progress indicators with color-coded steps
- Design upload/record cards with hover effects and animations
- Create processing screen with substep indicators
- Implement transcript view with sticky sidebar
- Add summary view with export options
- Fix backend PyTorch compatibility (downgrade to 2.5.1)
- Update Docker configuration for new frontend
- Remove old nginx service (frontend now handles routing)
…ar architecture

- Refactored backend with proper separation of concerns (database, models, security modules)
- Added comprehensive API documentation and design issue tracking
- Implemented secure file handling and input validation
- Switched from requests to httpx for async HTTP operations
- Added database persistence for job tracking
- Created modular frontend structure with components and services
- Enhanced Docker configuration for better security
- Added theme switcher component for UI customization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…oints

Split monolithic processing into independent RESTful steps to enable
accurate progress tracking and cleaner architecture.

## Backend Changes

- Removed monolithic `process_transcription_job()` function
- Created three independent processing functions:
  - `process_transcription_step()` - Whisper transcription
  - `process_diarization_step()` - PyAnnote speaker diarization
  - `process_alignment_step()` - Align speakers with text
- Updated upload endpoint to only create jobs in 'uploaded' state
- Added new RESTful workflow endpoints:
  - POST/GET `/api/v1/jobs/{uuid}/transcriptions`
  - POST/GET `/api/v1/jobs/{uuid}/diarizations`
  - POST `/api/v1/jobs/{uuid}/alignments`

## Database Changes

- Added workflow state management functions:
  - `update_workflow_state()` - Update job workflow state
  - `update_step_progress()` - Update per-step progress
  - `save_transcription_data()` - Store Whisper output
  - `save_diarization_data()` - Store PyAnnote output
  - `get_transcription_data()` - Retrieve transcription
  - `get_diarization_data()` - Retrieve diarization
- Updated `add_job()` to accept workflow_state parameter
- Fixed SQL queries to use new workflow columns
- Created database migration schema with workflow support

## Frontend Changes

- Fixed race condition in polling with `workflowStepsStarted` ref
- Each workflow step now triggered only once instead of repeatedly
- Updated progress tracking to use per-step progress:
  - Transcription: 0-30%
  - Diarization: 30-90%
  - Alignment: 90-100%
- Improved workflow state transitions and error handling

## Workflow States

uploaded → transcribing → transcribed → diarizing → diarized →
aligning → completed | error

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Updated frontend packages for SGDS workflow UI
- Added speaker color utilities for UI consistency
- Updated nginx configuration for both frontend and reverse proxy
- Added .env.example with environment variable template
- Updated docker-compose.yml configuration
- Updated backend requirements.txt

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…nversion

- Delete original audio files immediately after successful conversion to WAV
- Add error handling: cleanup partial WAV if conversion fails, keep original
- Add hourly cleanup task to remove orphaned non-WAV files older than 1 hour
- Add created_at timestamp to job list API response
- Fixes issue where uploaded .m4a/.mp3 files remained after conversion

This prevents disk space waste from duplicate files and handles edge cases
like conversion failures or container crashes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove "Past Meetings" button from header
- Remove progress bar from processing view (progress shown in step indicators)
- Simplify processing status message

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add TIMEZONE_OFFSET environment variable for timezone configuration
- Update backend to read timezone from env (default GMT+8)
- Replace hardcoded tz_gmt8 with configurable tz_configured
- Update all export functions (PDF/Markdown) to use configured timezone
- Add timezone configuration section to README
- Update example.env with timezone offset documentation
- Clean up old frontend directory

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add speaker dropdown to edit transcript segment modal
- Enable reassigning segments to existing speakers
- Allow creating new speakers via "+ Add New Speaker" option
- Automatically select newly added speaker in dropdown
- Update handleSaveSegmentText to persist speaker changes
- Backend already supports speaker field in transcript updates
- Meeting Info card auto-refreshes with new speakers after save

Fixes issue where diarization errors couldn't be corrected:
- Wrong speaker detected -> reassign to correct speaker
- Missing speakers -> add new speakers manually
- Speakers merged incorrectly -> split by creating new ones

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Major UI/UX improvements for a cleaner, more professional appearance:

**Color Theme Overhaul:**
- Removed bright green from workflow breadcrumb (now slate gray)
- Replaced purple gradient loading screen with clean light gray
- Updated processing checkmarks from green to slate gray
- Simplified speaker badge colors to muted professional palette
- Established CSS variables for consistent theming throughout app

**Visual Improvements:**
- Removed green (#198754) - replaced with blue/gray scale
- Primary color: Professional blue (#2563eb)
- Secondary color: Slate gray (#64748b)
- Background: Light blue-gray (#f8fafc)
- Borders: Subtle gray (#e2e8f0)

**Component Updates:**
- Loading screen: Clean white card on light gray background
- Error screen: Softer slate icon instead of red
- Workflow steps: Blue for active, gray for completed
- Processing indicators: Consistent blue theme
- Speaker badges: Blue, slate, cyan, dark slate rotation
- Export options: Primary CTA highlighted with solid blue

**Typography & Branding:**
- Changed navbar tagline from "AI Meeting Transcription" to "AI Meeting Summary"
- Consistent use of CSS custom properties for maintainability

Creates a professional, corporate-friendly aesthetic similar to Linear,
Notion, or Vercel - clean, minimal, and focused.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Major refactoring of frontend codebase to improve maintainability,
testability, and code organization.

## Changes

### App.jsx Refactoring
- Reduced from 1,510 lines to 226 lines (85% reduction)
- Extracted business logic into 7 custom hooks
- Extracted UI into 21 modular components
- Maintained all existing functionality with zero breaking changes

### Custom Hooks (Business Logic)
- useBackendHealth: Backend connection management
- useJobHistory: Recent jobs fetching and management
- useFileUpload: File upload and drag-drop handling
- useTranscriptPolling: Workflow state polling
- useTranscript: Transcript data and editing
- useSpeakerManagement: Speaker identification and editing
- useSummary: Summary generation and editing

### Components (UI Layer)
Layout:
- Header, WorkflowSteps, Footer

Common:
- LoadingScreen, ErrorAlert

Views:
- UploadView (with FileUploadCard, RecordingCard, RecentJobsList)
- ProcessingView
- TranscriptView (with TranscriptSegment, MeetingInfoSidebar)
- SummaryView (with SummaryContent, ExportSidebar, CollapsibleTranscript)

Modals:
- EditSpeakersModal, EditTextModal, EditSummaryModal

### Export Filename Improvements
- Created fileNaming.js utility module
- Fixed export filenames from ugly UUIDs to clean, descriptive names
- Added timestamps to all exports (YYYY-MM-DD_HH-MM format)
- Removed double extensions (meeting.mp3.pdf → meeting_2024-01-15_14-30.pdf)
- Distinguished transcript-only vs full exports with clear labeling

Examples:
- Before: transcript_a1b2c3d4-e5f6.md, meeting.mp3.pdf
- After: Team_Meeting_2024-01-15_14-30.md, Team_Meeting_2024-01-15_14-30.pdf

## Benefits
- Single responsibility principle for all components and hooks
- Easier to test individual features in isolation
- Better code reusability across the application
- Improved developer experience and onboarding
- Cleaner separation of concerns (logic vs presentation)
- Maintains single-page wizard architecture (no routing needed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Implement browser-based audio recording using getUserMedia
- Add protocol detection (HTTPS/localhost check) for microphone access
- Create RecordingModal component with live timer
- Add RecordingCard with disabled state for HTTP access
- Include hover tooltips explaining HTTPS requirement
- Support WebM audio format with automatic backend conversion
- Add comprehensive HTTPS deployment guide to README
- Clean up duplicate .env.example file
- Update example.env with correct LLM endpoint format

Recording works on localhost (HTTP) and requires HTTPS for remote access.
Backend supports recording via existing /api/v1/jobs endpoint.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Change HTML title from "frontend" to "MeetMemo"
- Replace Vite logo with custom purple FileText icon
- Add both outlined and filled SVG favicon variants
- Use filled purple icon (#7C3AED) to match navbar branding

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @NotYuSheng, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request represents a significant architectural overhaul of both the backend and frontend systems. The backend now leverages PostgreSQL for more reliable and scalable job management, adopting asynchronous programming paradigms and stricter API contracts. The frontend has been entirely rebuilt with modern React tooling and a workflow-driven user experience, enhancing maintainability and user interaction. These changes lay a solid foundation for future development, addressing critical security, performance, and code quality concerns.

Highlights

  • Backend Architectural Overhaul: The backend has undergone a major refactor, transitioning from a CSV-based job management system to a robust PostgreSQL database with asynchronous operations using asyncpg. This includes a new database schema, migration scripts, and dedicated modules for database interactions, Pydantic models for strict API validation, and enhanced security utilities for input sanitization.
  • Frontend Modernization and UX Improvements: The frontend has been completely rewritten using Vite, React, and Singapore Government Design System (SGDS) components, introducing a modern, responsive, and workflow-based user interface. New custom hooks manage complex state logic for audio recording, file uploads, job history, speaker management, and API polling, significantly improving code organization and reusability.
  • Enhanced API Documentation and Design: Comprehensive documentation has been added, detailing existing API design issues (security, performance, code quality) with recommended fixes and implementation priorities. A new API documentation file provides clear specifications for all backend endpoints, data models, and example workflows.
  • Dependency Updates and Build Fixes: Frontend Docker build failures related to missing development dependencies and TypeScript version conflicts have been resolved. Backend dependencies were updated to include new asynchronous libraries, and docker-compose.yml was configured to integrate PostgreSQL and manage new cache volumes. Deployment options and environment variables in README.md and example.env have also been updated.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request represents a major and impressive refactoring of both the backend and frontend. The backend is modernized by migrating from a CSV-based storage to a robust PostgreSQL database with an async architecture. The frontend is migrated from Create React App to a modern Vite-based setup, adopting the Singapore Government Design System and a well-structured component/hook architecture. The addition of detailed documentation and security modules is also a significant improvement.

My review focuses on a few key areas:

  • Consistency: Ensuring documentation accurately reflects the implemented code changes.
  • Security: Highlighting a potentially risky configuration in the Docker setup.
  • Maintainability: Identifying and suggesting the removal of unused code and configuration files.

Overall, this is a high-quality contribution that significantly improves the project's architecture, performance, and maintainability. The identified issues are minor compared to the scale of the improvements.

BACKEND_FIX.md Outdated
Comment on lines 1 to 81
# Backend Fix - PyAnnote Hugging Face Hub Compatibility

## Issue

Backend was returning `500 Internal Server Error` when uploading audio files for transcription.

**Error Message:**
```
ERROR:root:2025-12-28 13:28:14: Unexpected error during audio processing for file chiam-sharing.m4a:
hf_hub_download() got an unexpected keyword argument 'use_auth_token'
TypeError: hf_hub_download() got an unexpected keyword argument 'use_auth_token'
```

## Root Cause

The `huggingface_hub` library updated its API and **deprecated the `use_auth_token` parameter** in favor of `token`.

PyAnnote's `Pipeline.from_pretrained()` internally calls `hf_hub_download()`, which was being called with the old parameter name.

## Solution

Updated all occurrences of `use_auth_token` to `token` in the backend code.

### Files Modified

1. **`backend/main.py:1093`**
```python
# Before
pipeline = Pipeline.from_pretrained(
"pyannote/speaker-diarization-3.1",
use_auth_token=hf_token
)

# After
pipeline = Pipeline.from_pretrained(
"pyannote/speaker-diarization-3.1",
token=hf_token
)
```

2. **`backend/pyannote_whisper/cli/transcribe.py:109`**
```python
# Before
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1",
use_auth_token=os.getenv("HF_TOKEN"))

# After
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1",
token=os.getenv("HF_TOKEN"))
```

## Fix Applied

1. Updated parameter names in both files
2. Rebuilt Docker container: `docker compose build meetmemo-backend`
3. Restarted backend: `docker compose restart meetmemo-backend`

## Verification

Backend now starts successfully:
```
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```

Audio transcription and speaker diarization now work correctly without errors.

## Related Documentation

- [Hugging Face Hub Migration Guide](https://huggingface.co/docs/huggingface_hub/package_reference/overview#authentication)
- PyAnnote.audio uses `huggingface_hub` for model downloads
- Parameter renamed from `use_auth_token``token` in `huggingface_hub>=0.14.0`

---

**Fix Date:** 2025-12-28
**Issue:** Backend 500 error on audio upload
**Resolution:** Updated PyAnnote authentication parameter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There's an inconsistency between this documentation and the implemented changes. This file states the solution was to change the use_auth_token parameter to token in the code. However, the requirements.txt file pins huggingface_hub to version 0.13.4 (where use_auth_token is still correct), and the code in pyannote_whisper/cli/transcribe.py still uses use_auth_token. The actual fix implemented appears to be pinning the dependency, not changing the code. Please update this documentation to reflect the actual solution to avoid confusion.

NotYuSheng and others added 5 commits January 4, 2026 23:28
Backend improvements:
- Fix import order: standard library before third-party imports
- Convert 100+ logging f-strings to lazy % formatting for better performance
- Add docstrings to Config classes in models.py
- Fix exception chaining with raise-from in security.py
- Break long lines (>100 chars) into properly formatted multi-line statements
- Add pylint disable for acceptable Pydantic model warnings

Results: Improved pylint score from 8.09/10 to 8.97/10

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Update demo GIF reference from old demo to MeetMemo-DEMO_v2.0.0.gif
- Update sample files list to reflect v2.0.0 demo
- Keep reference to v1.0.0 demo as legacy version

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add references to new v2.0.0 transcript exports (PDF & Markdown)
- Add references to new v2.0.0 summary exports (PDF & Markdown)
- Organize sample files by version (v2.0.0 vs v1.0.0)
- Highlight v2.0.0 as latest with improved export formats

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Sample Files:
- Add v2.0.0 transcript exports (PDF & Markdown)
- Add v2.0.0 summary export (Markdown)
- Add v2.0.0 demo GIF showcasing new UI
- Rename v1.0.0 files to include version suffix for clarity

Configuration:
- Update example.env LLM_API_KEY comment for clarity

All sample files now clearly versioned to distinguish between v1.0.0 and v2.0.0 outputs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove all legacy non-versioned API endpoints (/jobs, /health, etc.)
- Update API_DOCUMENTATION.md to use /api/v1 prefix for all endpoints
- Delete outdated BACKEND_FIX.md and DOCKER_BUILD_FIX.md documentation
- Add security comment explaining TORCH_LOAD_WEIGHTS_ONLY=0 requirement
- Apply code quality improvements from pylint

Frontend already uses versioned /api/v1/* endpoints exclusively,
making legacy routes unnecessary. This cleanup reduces code
maintenance burden and prevents confusion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copy link
Owner Author

@NotYuSheng NotYuSheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed

Remove API_DESIGN_ISSUES.md as 83% of identified issues have been
resolved in the v2.0.0 refactor:
- PostgreSQL migration completed
- Modular architecture implemented
- Async operations throughout
- Proper REST API design with versioning
- Comprehensive input validation
- Pydantic models for type safety

Remaining items (auth, Redis) are intentional design decisions for
VPN-protected deployment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@NotYuSheng NotYuSheng merged commit 96afb7b into main Jan 5, 2026
0 of 2 checks passed
@NotYuSheng NotYuSheng deleted the feature/major-ui-backend-refactor branch January 5, 2026 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants