A-eye 👁️🤖

Empowering the visually impaired to navigate, shop, and transact with confidence using AI-powered assistance.

A-eye transforms smartphones into intelligent companions for blind and visually impaired individuals, providing real-time voice guidance for everyday tasks through computer vision and natural language processing.

🌟 Features

🚶 Smart Navigation

Real-time environmental awareness that helps users commute safely and independently. A-eye continuously analyzes surroundings through the phone's camera and provides voice guidance about:

  • Approaching obstacles
  • Crosswalks and intersections
  • Storefronts and landmarks
  • Path navigation

🛒 Intelligent Shopping Assistant

Point your camera at any product and A-eye becomes your personal shopping companion:

  • Reads product information and packaging
  • Checks ingredients against dietary restrictions and allergies
  • Provides conversational guidance to help make purchasing decisions
  • Identifies products and their details
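The app's source isn't shown in this README, so here is a minimal sketch of how the ingredient check could work, assuming the google-generativeai Python SDK; the prompt wording, model name, `check_product` helper, and restrictions list are all hypothetical, not the project's actual code:

```python
# Hypothetical sketch of the ingredient check; prompt, model name, and the
# restrictions profile are illustrative, not taken from this repository.
import PIL.Image
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

def check_product(image_path: str, restrictions: list[str]) -> str:
    """Identify a product and flag conflicts with the user's dietary profile."""
    prompt = (
        "You are a shopping assistant for a blind user. Identify this product, "
        "read its label, and say clearly whether any ingredient conflicts with "
        f"these restrictions: {', '.join(restrictions)}. "
        "Answer in two short, spoken-style sentences."
    )
    response = model.generate_content([prompt, PIL.Image.open(image_path)])
    return response.text

print(check_product("shelf_photo.jpg", ["peanuts", "gluten"]))
```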

💵 Currency Counter

Make cash transactions with confidence:

  • Accurately identifies and counts currency
  • Prevents shortchanging and fraud
  • Works with various denominations
  • Provides clear voice feedback
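A hedged sketch of one way to make the count reliable: ask Gemini for machine-readable denominations and sum them locally, so the spoken total never depends on the model's arithmetic. The `count_cash` helper and prompt are illustrative assumptions, not code from this repo:

```python
# Illustrative only: request denominations as JSON and total them in Python.
import json
import re
import PIL.Image
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

def count_cash(image_path: str) -> str:
    """Return a spoken-style summary of the cash visible in a photo."""
    prompt = (
        "List every banknote and coin visible in this photo as a JSON array "
        "of numeric values, e.g. [20, 5, 1, 0.25]. Output only the JSON array."
    )
    response = model.generate_content([prompt, PIL.Image.open(image_path)])
    match = re.search(r"\[.*?\]", response.text, re.DOTALL)
    denominations = json.loads(match.group(0)) if match else []
    total = sum(denominations)
    return f"I can see {len(denominations)} items totaling {total:.2f} dollars."

print(count_cash("cash_on_table.jpg"))
```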

🎯 Inspiration

Worldwide, over 2.2 billion people live with vision impairment, facing daily challenges in navigation, shopping, and financial transactions. A-eye was created to empower blind and visually impaired individuals to move through the world with greater confidence and autonomy, leveraging cutting-edge AI technology as a trustworthy companion.

🛠️ Technology Stack

  • Vision AI: Google Gemini for real-time environment analysis and object recognition
  • Text-to-Speech: ElevenLabs for natural-sounding voice output
  • Camera Processing: Real-time video feed analysis
  • Voice-First Interface: Hands-free, intuitive interaction design

🚀 How It Works

  1. Capture: The app uses the phone's camera to continuously or selectively capture the user's environment
  2. Analyze: Gemini's vision capabilities process the visual data to understand scenes, read text, and identify objects
  3. Communicate: ElevenLabs converts the analysis into natural voice output, providing clear audio guidance
  4. Assist: Users receive actionable information through conversational voice interaction
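The project's source isn't included here, so the following is a minimal end-to-end sketch of these four steps, assuming the google-generativeai SDK, the v1 elevenlabs Python SDK, and OpenCV for capture; the model name, voice name, and prompt are placeholders rather than the project's actual choices:

```python
# Hypothetical capture -> analyze -> communicate loop; not the project's code.
import time

import cv2
import PIL.Image
import google.generativeai as genai
from elevenlabs import play
from elevenlabs.client import ElevenLabs

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")
tts = ElevenLabs(api_key="YOUR_ELEVENLABS_API_KEY")

PROMPT = (
    "You are guiding a blind pedestrian. In one short sentence, describe "
    "obstacles, crosswalks, storefronts, or landmarks directly ahead."
)

camera = cv2.VideoCapture(0)
try:
    while True:
        ok, frame = camera.read()  # 1. Capture a frame from the camera
        if not ok:
            break
        image = PIL.Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        guidance = model.generate_content([PROMPT, image]).text  # 2. Analyze
        audio = tts.generate(text=guidance, voice="Rachel")      # 3. Communicate
        play(audio)                                              # 4. Assist
        time.sleep(2)  # crude pacing so spoken guidance doesn't pile up
finally:
    camera.release()
```

Sleeping between frames is the crudest possible pacing; the latency sketch under Challenges Overcome below shows a slightly better strategy.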

🏆 Accomplishments

  • ✅ Functional multi-feature accessibility tool built during a hackathon
  • ✅ Real-time computer vision integrated with natural voice interaction
  • ✅ High-accuracy currency counting across different denominations
  • ✅ Voice-first experience designed for true accessibility

💡 What We Learned

  • Accessible design means building voice-first experiences from the ground up
  • How to optimize AI model performance under tight latency constraints
  • How to work with vision AI and speech-synthesis APIs in real-time applications
  • Why inclusive technology must be built around real community needs

🔧 Challenges Overcome

Latency Optimization: Balancing AI analysis quality with response speed for real-time navigation where split-second awareness matters
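The README doesn't document how this trade-off was tuned, but two common tactics are sketched below under stated assumptions (the `downscale` and `maybe_analyze` helpers are hypothetical): shrink frames before upload, and drop new frames while an analysis request is still in flight rather than queueing stale work:

```python
# Illustrative only, not the project's implementation: downscale each frame
# before sending it to the vision model, and skip frames while a previous
# request is still in flight so guidance never lags behind the world.
import threading

import cv2

MAX_WIDTH = 640           # smaller uploads mean faster round trips
busy = threading.Event()  # set while an analysis request is in flight

def downscale(frame):
    """Resize a BGR frame so its width is at most MAX_WIDTH."""
    h, w = frame.shape[:2]
    if w <= MAX_WIDTH:
        return frame
    return cv2.resize(frame, (MAX_WIDTH, int(h * MAX_WIDTH / w)))

def maybe_analyze(frame, analyze):
    """Run `analyze` (e.g. a Gemini call) in the background, unless one is already running."""
    if busy.is_set():
        return  # drop this frame rather than queueing stale work
    busy.set()

    def worker():
        try:
            analyze(downscale(frame))
        finally:
            busy.clear()

    threading.Thread(target=worker, daemon=True).start()
```

Dropping frames rather than queueing them is the key design choice: for navigation, a fresh answer a moment from now beats a stale answer about where an obstacle used to be.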

Environmental Adaptability: Handling varying lighting conditions, camera angles, and environments (indoor/outdoor, bright/dim)

Conversational Design: Creating natural voice flows that present information concisely without overwhelming users

🔮 What's Next

  • GPS integration for turn-by-turn directions
  • Crowd-sourced accessibility information about businesses and public spaces
  • Multi-currency and multi-language support
  • Object finding capabilities ("Where did I put my keys?")
  • Social scene description features
  • Smart home device integration
  • Community feedback integration and user testing with visually impaired individuals

🤝 Contributing

We welcome contributions from the community, whether you're interested in:

  • Improving accessibility features
  • Optimizing performance
  • Adding language support
  • Enhancing AI accuracy
  • Documentation and testing

Please feel free to open issues and submit pull requests.

👥 Team

Built with ❤️ at HackKnight by Aditya Dwivedi, Aditya Jha, Arnav Deepaware, Arsh Anand

🙏 Acknowledgments

  • Google Gemini for powerful vision AI capabilities
  • ElevenLabs for natural text-to-speech technology
  • The blind and visually impaired community for inspiring this project

Made possible by AI, designed for humanity 🌍
