Memory inefficiency in upload for large file sets

### Problem
The Python client's upload implementation keeps per-file state for the whole file set in memory, which becomes a problem when processing tens of thousands of files. The TypeScript CLI avoids this.

### Issues

1. Duplicate check phase — stores all checksums in memory
   - Location: `immich/_internal/upload.py:155-217` in `check_duplicates()`
   - Problem: All checksums are accumulated in a list before batching:
   ```python
   checksums: list[tuple[Path, str]] = []
   for filepath in files:
       checksum = await asyncio.to_thread(compute_sha1_sync, filepath)
       checksums.append((filepath, checksum))  # All stored in memory
   ```
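   - Impact: Memory grows linearly with the number of files. For 10k files, the checksum strings alone take roughly 10k × ~100 bytes ≈ 1 MB, plus `Path` and tuple overhead, and nothing is sent to the server until every file has been hashed.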

2. Upload phase — creates all coroutines upfront
   - Location: `immich/_internal/upload.py:394` in `upload_files()`
   - Problem: `asyncio.gather(*[upload_with_semaphore(f) for f in files])` creates all coroutines at once:
   ```python
   await asyncio.gather(*[upload_with_semaphore(f) for f in files])
   ```
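   - Impact: For 10k+ files, this creates 10k+ coroutine objects (each wrapped in a task by `asyncio.gather`) before the first upload starts. The semaphore limits how many run concurrently, not how many are held in memory.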

### Comparison with TypeScript CLI
The TypeScript CLI (`immich/cli/src/commands/asset.ts`) handles this better:
- Streaming duplicate checks: batches checksums as they're computed (batches of 5000), avoiding storing all in memory
- Queue-based uploads: uses a `Queue` that processes files incrementally rather than creating all tasks upfront
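
The batching idea could be sketched in Python roughly as follows. This is a minimal sketch, not the client's actual code: `iter_checksum_batches` is a hypothetical helper, and the `compute_sha1_sync` shown here is a stand-in for the existing function of that name. The batch size of 5000 is taken from the CLI behavior described above.

```python
import asyncio
import hashlib
from collections.abc import AsyncIterator, Iterable
from pathlib import Path

BATCH_SIZE = 5000  # batch size mentioned for the TypeScript CLI


def compute_sha1_sync(filepath: Path) -> str:
    """Stand-in hasher: reads the file in 1 MiB chunks."""
    digest = hashlib.sha1()
    with filepath.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


async def iter_checksum_batches(
    files: Iterable[Path], batch_size: int = BATCH_SIZE
) -> AsyncIterator[list[tuple[Path, str]]]:
    """Yield (path, checksum) batches as soon as each batch is full.

    At most `batch_size` checksums are held at any time (hypothetical helper).
    """
    batch: list[tuple[Path, str]] = []
    for filepath in files:
        checksum = await asyncio.to_thread(compute_sha1_sync, filepath)
        batch.append((filepath, checksum))
        if len(batch) >= batch_size:
            yield batch
            batch = []  # start a fresh list so the old batch can be freed
    if batch:  # flush the final partial batch
        yield batch
```

A caller would consume it with `async for batch in iter_checksum_batches(files): ...` and send each batch to the duplicate-check endpoint before hashing the next one.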

### Recommended fixes

1. Stream duplicate checks: batch checksums as they're computed instead of storing all first
2. Use a task queue: replace `asyncio.gather` with a queue/worker pattern that processes files incrementally
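
For the second fix, one possible shape is a fixed pool of workers fed by a bounded `asyncio.Queue`. The sketch below is an assumption about how this could look, not a drop-in replacement: `upload_with_workers` and the `upload_one` callback are hypothetical names, and the default concurrency of 5 is arbitrary.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable
from pathlib import Path


async def upload_with_workers(
    files: Iterable[Path],
    upload_one: Callable[[Path], Awaitable[None]],  # hypothetical per-file upload
    concurrency: int = 5,
) -> list[tuple[Path, Exception]]:
    """Upload files using `concurrency` workers; return the failures."""
    # Bounded queue: the producer blocks once the queue is full, so only a
    # handful of paths are pending at any time.
    queue: asyncio.Queue = asyncio.Queue(maxsize=concurrency * 2)
    failures: list[tuple[Path, Exception]] = []

    async def worker() -> None:
        while True:
            filepath = await queue.get()
            if filepath is None:  # sentinel: no more work
                return
            try:
                await upload_one(filepath)
            except Exception as exc:
                # Record and continue so one bad file does not stop a worker
                # and leave the producer blocked on a full queue.
                failures.append((filepath, exc))

    workers = [asyncio.create_task(worker()) for _ in range(concurrency)]
    for filepath in files:  # `files` may be a lazy iterator
        await queue.put(filepath)
    for _ in workers:  # one sentinel per worker
        await queue.put(None)
    await asyncio.gather(*workers)
    return failures
```

With this pattern the number of live tasks equals `concurrency` regardless of how many files there are, and `files` can be a generator so the full path list never has to be materialized.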

### Expected outcome
- Lower memory usage for large uploads (10k+ files)
- Better scalability without memory spikes
- Behavior aligned with the TypeScript CLI

### Priority
Medium — functional but inefficient for very large uploads.

