HOCR - Screen OCR and Translation Tool

A simple desktop overlay app that captures text from your screen using Azure AI Vision and translates it with Azure Translator. Perfect for reading foreign text in games, websites, or any application. OCR and translation both run on Azure's hosted services, so results usually come back quickly and accurately, though actual latency depends on your network connection.

Features

Draggable selection box: Move and resize the green overlay to select text areas
Real-time OCR: Uses Azure AI Vision to extract text from screenshots
Instant translation: Translates captured text using Azure Translator
Always on top: Stays visible over other applications
Floating text display: Shows OCR results and translations as text attached to the overlay
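
To illustrate the OCR feature, here is a minimal sketch of how PNG bytes could be sent to Azure AI Vision with the azure-ai-vision-imageanalysis package and flattened into plain text. It is not taken from main.py; the function names `text_from_read_result` and `ocr_png` are hypothetical, and joining lines with newlines is an assumption about how the result might be displayed.

```python
def text_from_read_result(result):
    """Join every recognised line in an Image Analysis READ result with newlines."""
    if result.read is None:
        return ""
    return "\n".join(
        line.text for block in result.read.blocks for line in block.lines
    )


def ocr_png(png_bytes, endpoint, key):
    """Send PNG bytes to Azure AI Vision and return the recognised text."""
    # Imported lazily so text_from_read_result can be tried without the SDK installed.
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
    from azure.core.credentials import AzureKeyCredential

    client = ImageAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    result = client.analyze(
        image_data=png_bytes,
        visual_features=[VisualFeatures.READ],
    )
    return text_from_read_result(result)
```

Keeping the flattening step separate from the network call makes it possible to exercise the text handling with a stand-in result object.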

Usage

Position the box: Drag the green overlay over the text you want to translate
Resize if needed: Drag from the bottom-right corner to resize the selection area
Right-click to capture: The app will OCR the selected area and translate the text
Close when done: Middle-click or press Escape to exit
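
The right-click capture step can be read as a three-stage pipeline: grab the pixels under the overlay, run OCR on them, then translate the result. The sketch below shows that flow under stated assumptions; it is not the code in main.py. `capture_and_translate` and the three callables it takes are hypothetical names, injected so the flow can be tried without Azure credentials.

```python
from typing import Callable, Tuple

Region = Tuple[int, int, int, int]  # (left, top, width, height) in screen pixels


def capture_and_translate(
    region: Region,
    grab: Callable[[Region], bytes],
    ocr: Callable[[bytes], str],
    translate: Callable[[str], str],
) -> Tuple[str, str]:
    """Run grab -> OCR -> translate and return (original, translated)."""
    image = grab(region)
    text = ocr(image).strip()
    if not text:
        # Nothing recognised: skip the translation call entirely.
        return "", ""
    return text, translate(text)


if __name__ == "__main__":
    # Stand-ins for the real screenshot, Azure AI Vision and Azure Translator calls.
    def fake_grab(region):
        return b"png-bytes"

    def fake_ocr(image):
        return " 你好 "

    def fake_translate(text):
        return "Hello"

    print(capture_and_translate((0, 0, 200, 80), fake_grab, fake_ocr, fake_translate))
```

Skipping translation when OCR returns no text avoids spending a Translator request on an empty selection.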

Setup

Prerequisites

Python 3.7+
Azure AI Services account (Vision + Translator)

Installation

Clone or download this repository
Install dependencies:
```
pip install -r requirements.txt
```

Set up your Azure credentials in main.py:

AZURE_VISION_ENDPOINT = "your-vision-endpoint"
AZURE_VISION_KEY = "your-vision-key"
AZURE_TRANSLATOR_ENDPOINT = "your-translator-endpoint"
AZURE_TRANSLATOR_KEY = "your-translator-key"
AZURE_TRANSLATOR_REGION = "your-region"
SOURCE_LANGUAGE = "source-language"  # e.g., "zh" for Chinese; leave empty ("") for auto-detection
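
To show where the translator settings end up, here is a hedged sketch that builds a Translator v3 REST request by hand. The app itself uses the azure-ai-translation-text package rather than raw HTTP, so this is only an illustration of how the key, region, and source language are used. The function name `build_translate_request`, the target language "en", and the global endpoint in the example are assumptions, not values from main.py.

```python
import json
from urllib.parse import urlencode


def build_translate_request(endpoint, key, region, text, target="en", source=""):
    """Return (url, headers, body) for a Translator v3 /translate call."""
    params = {"api-version": "3.0", "to": target}
    if source:
        # Leaving the source language out asks the service to auto-detect it.
        params["from"] = source
    url = endpoint.rstrip("/") + "/translate?" + urlencode(params)
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    body = json.dumps([{"text": text}], ensure_ascii=False).encode("utf-8")
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = build_translate_request(
        "https://api.cognitive.microsofttranslator.com",  # assumed global endpoint
        "your-translator-key",
        "your-region",
        "你好",
        source="zh",
    )
    print(url)
```

An empty SOURCE_LANGUAGE simply drops the `from` parameter, which is how auto-detection is requested.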

Getting Azure Credentials

Create an Azure account at portal.azure.com
Create a Computer Vision resource for OCR
Create a Translator resource for translation
Copy the endpoints and keys to the config section in main.py

You can use the free tiers of both resources (F0) for limited usage, which is usually enough for personal use.
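
After copying the values into main.py, it can help to confirm that none of the template placeholders were left behind. The following is a small sketch of such a check, assuming the settings are collected into a dictionary. `unset_settings` is a hypothetical helper, and treating any value that starts with "your-" as unset is a heuristic based on the template values shown in the Installation section.

```python
PLACEHOLDER_PREFIX = "your-"  # matches the template values, e.g. "your-vision-key"

REQUIRED = (
    "AZURE_VISION_ENDPOINT",
    "AZURE_VISION_KEY",
    "AZURE_TRANSLATOR_ENDPOINT",
    "AZURE_TRANSLATOR_KEY",
    "AZURE_TRANSLATOR_REGION",
)


def unset_settings(config):
    """Return the required setting names that are empty or still placeholders."""
    missing = []
    for name in REQUIRED:
        value = str(config.get(name, "")).strip()
        if not value or value.startswith(PLACEHOLDER_PREFIX):
            missing.append(name)
    return missing


if __name__ == "__main__":
    template = {name: "your-value" for name in REQUIRED}
    print(unset_settings(template))
```

SOURCE_LANGUAGE is deliberately left out of the required list because an empty value is valid there.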

Usage

Run the application:

python main.py

Controls

Left-click + drag: Move the overlay
Left-click bottom-right corner + drag: Resize the overlay
Right-click: Capture and translate text in the selected area
Middle-click or Escape: Close the application
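
The difference between moving and resizing comes down to where the left-click lands. A minimal sketch of that decision, independent of any GUI framework, might look like the following. `hit_test` is a hypothetical name, and the 16-pixel handle size is an assumption rather than the value used in main.py.

```python
def hit_test(x, y, width, height, handle=16):
    """Classify a left-click at (x, y), relative to the overlay's top-left corner.

    Returns "resize" inside the bottom-right handle, "move" elsewhere inside
    the overlay, and None outside it.
    """
    if not (0 <= x < width and 0 <= y < height):
        return None
    if x >= width - handle and y >= height - handle:
        return "resize"
    return "move"


if __name__ == "__main__":
    print(hit_test(10, 10, 200, 100))
    print(hit_test(195, 95, 200, 100))
```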

Current Limitations

Minimum selection size is 50x50 pixels (Azure requirement)
Requires an active internet connection and valid Azure keys, since both OCR and translation run as Azure services
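
One way to respect the 50x50 minimum is to clamp the overlay size whenever it is resized. This is a sketch under that assumption; `enforce_min_size` and `MIN_SIDE` are hypothetical names, and main.py may enforce the limit differently.

```python
MIN_SIDE = 50  # minimum width and height in pixels, per the first limitation


def enforce_min_size(width, height, min_side=MIN_SIDE):
    """Return (width, height) with each side raised to at least min_side."""
    return max(width, min_side), max(height, min_side)


if __name__ == "__main__":
    print(enforce_min_size(30, 120))
```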

Potential Improvements

Support for multiple target languages (unlikely)
Offline OCR option
Offline translation option (unlikely)
Configuration file for Azure credentials
Hotkey support for capture without clicking
Text history/clipboard integration

Dependencies

PyQt6: GUI framework
mss: Fast screenshot capture
azure-ai-vision-imageanalysis: Azure Computer Vision OCR
azure-ai-translation-text: Azure Translator service
Pillow: Image processing support
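
As an illustration of how mss fits in, the sketch below captures a screen region and encodes it as PNG bytes. It is an assumption about how the capture could be written, not an excerpt from main.py; `overlay_to_monitor` and `grab_png` are hypothetical names.

```python
def overlay_to_monitor(left, top, width, height):
    """Build the region dict that mss accepts for a partial screen grab."""
    return {
        "left": int(left),
        "top": int(top),
        "width": int(width),
        "height": int(height),
    }


def grab_png(left, top, width, height):
    """Capture a screen region and return it as PNG bytes."""
    # Imported lazily so overlay_to_monitor works on machines without a display.
    import mss
    import mss.tools

    with mss.mss() as sct:
        shot = sct.grab(overlay_to_monitor(left, top, width, height))
        # to_png returns the encoded bytes when no output path is given.
        return mss.tools.to_png(shot.rgb, shot.size)
```

PNG bytes in this form can be handed to an OCR call without writing a temporary file.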

Why "HOCR"?

It just stands for "Hovering OCR".

License

This project is licensed under the MIT License. See the LICENSE file for details.
