
T.O.M.M.Y.: Talk Operated Multi-Media System

Devin DeMatto · 01/27/2024
OpenAI · React · TypeScript · Chrome Extension

Inspiration

Our project grew out of a personal story: a team member's brother has a mild case of cerebral palsy and finds traditional input devices difficult to use for web navigation. T.O.M.M.Y. emerged from a desire to create a more inclusive and efficient web browsing experience, letting users with physical disabilities, as well as anyone who wants a supplementary navigation aid, operate their browser through voice commands.

Functionality

T.O.M.M.Y., short for "Talk Operated Multi-Media System", uses a trained model from OpenAI to let users navigate the web and interact with multimedia content using only their voice. Beyond basic navigation, T.O.M.M.Y. can summarize web pages, streamlining information consumption for its users.
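As a rough illustration of the summarization feature, the sketch below prepares page text for a summarization request. The request body follows the general shape of OpenAI's Chat Completions format (a `model` plus a list of `messages`); the function name `buildSummaryRequest`, the prompt wording, and the 8000-character limit are assumptions made for this example, not details taken from the project.

```typescript
// Shape of a chat-style request body: a model name plus role-tagged messages.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface SummaryRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical helper: normalizes whitespace, clips very long pages,
// and wraps the result in a summarization prompt.
// The model name is supplied by the caller; none is assumed here.
function buildSummaryRequest(
  pageText: string,
  model: string,
  maxChars = 8000 // arbitrary limit chosen for illustration
): SummaryRequest {
  // Collapse runs of whitespace so layout noise does not waste the budget.
  const cleaned = pageText.replace(/\s+/g, " ").trim();
  const clipped = cleaned.length > maxChars ? cleaned.slice(0, maxChars) : cleaned;

  return {
    model,
    messages: [
      {
        role: "system",
        content: "Summarize the following web page in three short sentences.",
      },
      { role: "user", content: clipped },
    ],
  };
}
```

The returned object would then be serialized and sent to the API by whatever part of the extension holds the API key; that step is left out here because the project does not describe it.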

Development Approach

Built with React, TypeScript, and Chrome's Manifest V3 extension platform, T.O.M.M.Y. translates speech into actionable browser commands. Despite the project's ambitious scope, our team delivered a functional prototype within a 24-hour development period, a first step toward our vision of a fully accessible web.
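To make "actionable browser commands" concrete, here is a minimal sketch of what the output of the speech-to-command step could look like. The project routes this step through an OpenAI model; the sketch substitutes a simple rule-based parser purely to show the shape of the result. The command set, the `BrowserCommand` type, and `parseTranscript` are all hypothetical.

```typescript
// Hypothetical command shapes; the project does not publish its actual command set.
type BrowserCommand =
  | { kind: "scroll"; direction: "up" | "down" }
  | { kind: "navigate"; url: string }
  | { kind: "back" }
  | { kind: "summarize" }
  | { kind: "unknown"; transcript: string };

// Rule-based stand-in for the model: maps a spoken phrase to a typed command.
function parseTranscript(transcript: string): BrowserCommand {
  // Normalize case and drop trailing punctuation added by speech recognition.
  const text = transcript.trim().toLowerCase().replace(/[.!?]+$/, "");

  const scroll = text.match(/^scroll (up|down)$/);
  if (scroll) {
    return { kind: "scroll", direction: scroll[1] as "up" | "down" };
  }

  const go = text.match(/^(?:go to|open) (\S+)$/);
  if (go) {
    const target = go[1] ?? "";
    // Assume https when the speaker gives a bare domain.
    const url = /^https?:\/\//.test(target) ? target : `https://${target}`;
    return { kind: "navigate", url };
  }

  if (text === "go back") return { kind: "back" };
  if (text === "summarize" || text === "summarize this page") {
    return { kind: "summarize" };
  }

  // Anything unrecognized is passed along so the caller can ask the user to repeat.
  return { kind: "unknown", transcript };
}
```

Returning a discriminated union rather than a raw string lets the rest of the extension handle each command with an exhaustive `switch`, so the compiler can flag a command that has no handler.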

Technical Achievements

Key achievements include successfully injecting the extension into the page's Document Object Model (DOM), enabling AI-mediated web interactions, and establishing a reliable communication channel between the frontend and the extension. This work lays the groundwork for filtering and presenting web page content in a user-friendly manner.
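The communication channel described above can be sketched as a small, typed message handler. This is an illustration under stated assumptions rather than the project's actual code: the message names (`SCROLL`, `GET_PAGE_TEXT`), the `PageActions` interface, and `handleMessage` are hypothetical, and the page is abstracted behind an interface so the logic can run outside a browser.

```typescript
// Hypothetical messages sent from the extension UI to the script running in the page.
type ExtensionMessage =
  | { type: "SCROLL"; deltaY: number }
  | { type: "GET_PAGE_TEXT" };

type ExtensionResponse =
  | { ok: true; text?: string }
  | { ok: false; error: string };

// Thin abstraction over the page so the handler can be tested without a DOM.
interface PageActions {
  scrollBy(deltaY: number): void;
  getText(): string;
}

// Runtime check: messages cross a process boundary, so they arrive untyped.
function isExtensionMessage(value: unknown): value is ExtensionMessage {
  if (typeof value !== "object" || value === null) return false;
  const msg = value as { type?: unknown; deltaY?: unknown };
  if (msg.type === "SCROLL") {
    return typeof msg.deltaY === "number" && Number.isFinite(msg.deltaY);
  }
  return msg.type === "GET_PAGE_TEXT";
}

// Validates an incoming message and performs the matching page action.
function handleMessage(message: unknown, page: PageActions): ExtensionResponse {
  if (!isExtensionMessage(message)) {
    return { ok: false, error: "Unrecognized message" };
  }
  switch (message.type) {
    case "SCROLL":
      page.scrollBy(message.deltaY);
      return { ok: true };
    case "GET_PAGE_TEXT":
      return { ok: true, text: page.getText() };
  }
}
```

In a Manifest V3 content script, a handler like this would typically be registered through `chrome.runtime.onMessage.addListener`, with `PageActions` backed by `window.scrollBy` and the document's text content. Validating the message before acting on it keeps a malformed or unexpected message from triggering page actions.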

Learning Curve

The development journey introduced the team to the unique challenges of working with Chrome extensions and required rapid acquisition and application of new technical knowledge, including the React framework and the OpenAI API. The concentrated effort to research, learn, and implement these technologies within a 24-hour window stands as a testament to the team's dedication and adaptability.


Below is our initial workflow diagram, a rough draft conceptualized after thorough research into the project's requirements.

[Figure: Initial Workflow Diagram]

Future Directions

Looking ahead, T.O.M.M.Y. is slated for further development and refinement. Immediate priorities include revisiting and optimizing the existing codebase, enhancing the system's functionality, and improving the user interface.

  • Code Optimization: Revisiting and improving the efficiency of existing code.
  • Enhanced Functionality: Expanding T.O.M.M.Y.'s capabilities to cover a broader range of voice commands.
