Empowering the visually impaired to navigate, shop, and transact with confidence using AI-powered assistance.
A-eye transforms smartphones into intelligent companions for blind and visually impaired individuals, providing real-time voice guidance for everyday tasks through computer vision and natural language processing.
Real-time environmental awareness helps users commute safely and independently. A-eye continuously analyzes the surroundings through the phone's camera and provides voice guidance (sketched in code after this list) about:
- Approaching obstacles
- Crosswalks and intersections
- Storefronts and landmarks
- Path navigation
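As a minimal sketch of how a single camera frame could be turned into that kind of guidance, the snippet below uses the google-generativeai Python SDK; the model name, prompt wording, and `frame.jpg` path are illustrative placeholders, not A-eye's actual configuration.

```python
# Navigation-prompt sketch: send one camera frame to Gemini and get back a short,
# speakable description. Model name, prompt, and "frame.jpg" are illustrative only.
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

NAVIGATION_PROMPT = (
    "You are guiding a blind pedestrian. In two short sentences, describe any "
    "obstacles, crosswalks, storefronts, or landmarks in this image and where "
    "they are relative to the walker (left, right, or straight ahead)."
)

def describe_surroundings(frame_path: str) -> str:
    """Return a short, speakable description of a single camera frame."""
    frame = Image.open(frame_path)
    response = model.generate_content([NAVIGATION_PROMPT, frame])
    return response.text.strip()

if __name__ == "__main__":
    print(describe_surroundings("frame.jpg"))
```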
Point your camera at any product and A-eye becomes your personal shopping companion (a code sketch follows the list):
- Reads product information and packaging
- Checks ingredients against dietary restrictions and allergies
- Provides conversational guidance to help make purchasing decisions
- Identifies products and their details
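A rough sketch of the ingredient-check idea, again assuming the google-generativeai SDK; the prompt, allergy list, and `label.jpg` file name are hypothetical examples rather than the app's real values.

```python
# Ingredient-check sketch: read a product label with Gemini and flag anything on
# the user's allergy list. Prompt, allergies, and "label.jpg" are hypothetical.
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def check_product(label_path: str, allergies: list[str]) -> str:
    """Summarize a product photo and warn about any listed allergens."""
    prompt = (
        "Read the product name and ingredient list in this photo. "
        f"The user is allergic to: {', '.join(allergies)}. "
        "Answer in two short spoken sentences: what the product is, and whether "
        "it contains or may contain any of those allergens."
    )
    response = model.generate_content([prompt, Image.open(label_path)])
    return response.text.strip()

if __name__ == "__main__":
    print(check_product("label.jpg", ["peanuts", "gluten"]))
```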
Make cash transactions with confidence (a code sketch follows the list):
- Accurately identifies and counts currency
- Helps prevent shortchanging and fraud
- Works with various denominations
- Provides clear voice feedback
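One way the counting step could look, shown as a hedged sketch: the prompt and the JSON output contract below are assumptions for illustration, not the app's exact logic.

```python
# Currency-counting sketch: ask Gemini for a JSON list of visible banknotes and
# total them locally. The prompt and JSON contract are assumptions for illustration.
import json
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

COUNT_PROMPT = (
    "List every banknote visible in this photo as JSON in the form "
    '{"bills": [denomination, ...]} using numbers only. Respond with JSON only.'
)

def count_cash(photo_path: str) -> str:
    """Return a speakable total for the bills visible in one photo."""
    response = model.generate_content(
        [COUNT_PROMPT, Image.open(photo_path)],
        generation_config={"response_mime_type": "application/json"},
    )
    bills = json.loads(response.text).get("bills", [])
    total = sum(bills)
    return f"I can see {len(bills)} bills totaling {total} dollars."

if __name__ == "__main__":
    print(count_cash("cash.jpg"))
```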
Worldwide, over 2.2 billion people live with vision impairment, facing daily challenges in navigation, shopping, and financial transactions. A-eye was created to empower blind and visually impaired individuals to move through the world with greater confidence and autonomy, with cutting-edge AI acting as a trustworthy companion.
- Vision AI: Google Gemini for real-time environment analysis and object recognition
- Text-to-Speech: ElevenLabs for natural-sounding voice output
- Camera Processing: Real-time video feed analysis
- Voice-First Interface: Hands-free, intuitive interaction design
- Capture: The app uses the phone's camera to continuously or selectively capture the user's environment
- Analyze: Gemini's vision capabilities process the visual data to understand scenes, read text, and identify objects
- Communicate: ElevenLabs converts the analysis into natural voice output, providing clear audio guidance
- Assist: Users receive actionable information through conversational voice interaction
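The sketch below strings the four steps together, assuming OpenCV for capture, the google-generativeai SDK for analysis, and the ElevenLabs text-to-speech REST endpoint for output; the voice ID, polling interval, and prompt are placeholders, and a real build would stream from the phone camera rather than a desktop webcam.

```python
# Capture -> Analyze -> Communicate -> Assist, strung together as a desktop sketch.
# OpenCV stands in for the phone camera; the ElevenLabs voice ID, polling interval,
# and prompt are placeholders rather than A-eye's real configuration.
import os
import time

import cv2
import google.generativeai as genai
import requests

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

# Placeholder voice ID; substitute one from your ElevenLabs account.
TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/YOUR_VOICE_ID"

def analyze(jpeg_bytes: bytes) -> str:
    """Analyze: ask Gemini for a one-sentence description of the frame."""
    response = model.generate_content([
        "In one short sentence, tell a blind user what matters most in this scene.",
        {"mime_type": "image/jpeg", "data": jpeg_bytes},
    ])
    return response.text.strip()

def speak(text: str, out_path: str = "guidance.mp3") -> None:
    """Communicate: convert the description to audio via the ElevenLabs TTS endpoint."""
    resp = requests.post(
        TTS_URL,
        headers={"xi-api-key": os.environ["ELEVEN_API_KEY"]},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=30,
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)  # play the MP3 with any audio player

def main() -> None:
    cam = cv2.VideoCapture(0)  # Capture: webcam stand-in for the phone camera
    try:
        while True:
            ok, frame = cam.read()
            if not ok:
                break
            _, jpeg = cv2.imencode(".jpg", frame)
            description = analyze(jpeg.tobytes())
            print(description)
            speak(description)
            time.sleep(3)  # Assist: pace the guidance so it stays digestible
    finally:
        cam.release()

if __name__ == "__main__":
    main()
```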
- ✅ Functional multi-feature accessibility tool built during a hackathon
- ✅ Real-time computer vision integrated with natural voice interaction
- ✅ High-accuracy currency counting across different denominations
- ✅ Voice-first experience designed for true accessibility
- Accessible design means building voice-first experiences from the ground up
- Optimizing AI model performance under tight latency constraints
- Working with vision AI and speech synthesis APIs in real-time applications
- The importance of building inclusive technology that serves real community needs
- Latency Optimization: Balancing AI analysis quality with response speed for real-time navigation, where split-second awareness matters (see the sketch after this list)
- Environmental Adaptability: Handling varying lighting conditions, camera angles, and environments (indoor/outdoor, bright/dim)
- Conversational Design: Creating natural voice flows that present information concisely without overwhelming users
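One common tactic for the latency challenge is to shrink frames before upload and to skip frames when the scene has barely changed; the OpenCV sketch below illustrates that idea with arbitrarily chosen sizes and thresholds, not values taken from A-eye.

```python
# Latency sketch: downscale frames and skip near-duplicates before calling the
# vision model. MAX_WIDTH and DIFF_THRESHOLD are illustrative guesses.
import cv2
import numpy as np

MAX_WIDTH = 640          # smaller uploads -> faster round trips
DIFF_THRESHOLD = 12.0    # mean pixel difference below this = "scene hasn't changed"

def downscale(frame: np.ndarray) -> np.ndarray:
    """Resize a frame so its width is at most MAX_WIDTH, keeping aspect ratio."""
    h, w = frame.shape[:2]
    if w <= MAX_WIDTH:
        return frame
    scale = MAX_WIDTH / w
    return cv2.resize(frame, (MAX_WIDTH, int(h * scale)))

def scene_changed(prev: np.ndarray | None, curr: np.ndarray) -> bool:
    """Only send a new frame to the model when the scene differs noticeably."""
    if prev is None:
        return True
    diff = cv2.absdiff(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY))
    return float(diff.mean()) > DIFF_THRESHOLD
```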
- GPS integration for turn-by-turn directions
- Crowd-sourced accessibility information about businesses and public spaces
- Multi-currency and multi-language support
- Object finding capabilities ("Where did I put my keys?")
- Social scene description features
- Smart home device integration
- Community feedback integration and user testing with visually impaired individuals
We welcome contributions from the community! Areas where you can help include:
- Improving accessibility features
- Optimizing performance
- Adding language support
- Enhancing AI accuracy
- Documentation and testing
Please feel free to open issues and submit pull requests.
Built with ❤️ at HackKnight by Aditya Dwivedi, Aditya Jha, Arnav Deepaware, Arsh Anand
- Google Gemini for powerful vision AI capabilities
- ElevenLabs for natural text-to-speech technology
- The blind and visually impaired community for inspiring this project
Made possible by AI, designed for humanity 🌍