AnimusGlasses

Animus for Meta Ray-Ban Display glasses.

Requires Meta Ray-Ban glasses (Gen 1, Gen 2, or Display) and the Meta AI app on Android. Enable Developer Mode for the glasses in the Meta AI app before installing.

What It Does

Point your Meta Ray-Ban Display glasses at any object. Gemini Vision identifies it through the glasses camera, generates a personality, and speaks its opening line through the glasses speakers. Talk back through the companion Android app. The object responds in character.

The camera input comes from the glasses. The audio output goes to the glasses speakers via Bluetooth A2DP. The phone handles all AI processing. Built on the Meta Wearables Device Access Toolkit (DAT SDK v0.6.0), demoed on Meta Ray-Ban Display glasses.

Why This Exists

Animus started as a web app — point your phone camera at an object, it speaks to you in its own voice. The glasses version is what it was always pointing toward. The same pipeline — camera in, vision AI, voice out — but now the camera is on your face and the voice comes from your ear. No screen interaction required for the core loop.

This is the interaction model that smart glasses platforms like AirCaps, Mentra, and Brilliant Labs are building toward. AnimusGlasses is an independent implementation of that loop, built in one day on consumer hardware.

What I Built

  • Android companion app in Kotlin + Jetpack Compose — session management, UI, all AI processing
  • Meta Wearables DAT SDK integration — device registration, session creation, camera stream management
  • I420 YUV frame conversion — glasses stream raw I420 frames over Bluetooth; converted to JPEG in-app for Gemini Vision (see the repacking sketch after this list)
  • Gemini 2.5 Flash Vision — object identification, personality generation, opening line, voice selection
  • Groq Orpheus TTS — neural voice synthesis routed to glasses speakers via Bluetooth A2DP
  • Groq Whisper STT — voice input from phone mic, transcribed and sent to Gemini chat
  • SDK permission flow — Wearables.RequestPermissionContract() for glasses camera access through Meta AI app
  • Registration state management — checks existing registration before re-triggering the Meta AI authorization flow
  • All API calls routed through the Animus backend — no keys in the APK, safe to distribute
  • Full in-character conversation loop — object maintains personality across multiple exchanges
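
The I420 step is the least obvious piece. A minimal, standalone sketch of the repacking (the app's own helpers are i420ToBitmap() and i420BufferToJpeg(); this is an illustration, not their code), assuming tightly packed planes with no row-stride padding, and leaning on the fact that Android's YuvImage can JPEG-encode NV21 but not planar I420:

    import android.graphics.ImageFormat
    import android.graphics.Rect
    import android.graphics.YuvImage
    import java.io.ByteArrayOutputStream

    // Repack planar I420 (all Y, then all U, then all V) into NV21
    // (all Y, then interleaved V/U), then let YuvImage do the JPEG encode.
    fun i420ToJpeg(i420: ByteArray, width: Int, height: Int, quality: Int = 80): ByteArray {
        val ySize = width * height
        val uvSize = ySize / 4
        val nv21 = ByteArray(ySize + 2 * uvSize)
        System.arraycopy(i420, 0, nv21, 0, ySize)       // Y plane copies straight across
        val uOffset = ySize                             // I420: U plane first...
        val vOffset = ySize + uvSize                    // ...then V plane
        for (i in 0 until uvSize) {
            nv21[ySize + 2 * i] = i420[vOffset + i]     // NV21 wants V first,
            nv21[ySize + 2 * i + 1] = i420[uOffset + i] // then U
        }
        val out = ByteArrayOutputStream()
        YuvImage(nv21, ImageFormat.NV21, width, height, null)
            .compressToJpeg(Rect(0, 0, width, height), quality, out)
        return out.toByteArray()
    }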

Architecture

The phone is the processing layer. The glasses are sensors and speakers.

app/src/main/java/com/varun/animusglasses/
├── AnimusApplication.kt     # Wearables.initialize() on app start
├── MainActivity.kt          # Android + SDK permission flow, Compose UI host
├── AnimusViewModel.kt       # Session, stream, Gemini, TTS, STT, frame conversion
└── ui/
    ├── ScanScreen.kt        # Connection status + scan button
    └── ChatScreen.kt        # Object personality + conversation UI
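
The SDK bootstrap in AnimusApplication.kt is a one-liner. A sketch, where the initialize() argument is an assumption against DAT SDK v0.6.0 (this README only confirms the call happens at app start):

    import android.app.Application

    // Initialize the Wearables SDK once at process start so sessions
    // can be created later. SDK import path elided; argument assumed.
    class AnimusApplication : Application() {
        override fun onCreate() {
            super.onCreate()
            Wearables.initialize(this)
        }
    }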

Full pipeline:

Meta Ray-Ban Display glasses
    | I420 YUV frames over Bluetooth (Meta Wearables DAT SDK)
Android companion app
    | i420ToBitmap() -- JPEG -- base64
animusai.app/api/gemini (vision)
    | object_type, personality_summary, opening_line, voice, vocal_direction
animusai.app/api/speak -- Groq Orpheus TTS -- WAV audio
    | AudioTrack -- Bluetooth A2DP -- glasses speakers
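
The last hop needs no special routing: with the glasses connected over A2DP, whatever AudioTrack plays comes out of the glasses speakers. A minimal sketch, assuming the speak endpoint returns mono 16-bit PCM WAV with a standard 44-byte header; the sample rate here is also an assumption, so real code should parse it from the header:

    import android.media.AudioAttributes
    import android.media.AudioFormat
    import android.media.AudioTrack

    // Play a WAV buffer through the current system audio route; with the
    // glasses connected over A2DP, that route is the glasses speakers.
    fun playWav(wav: ByteArray, sampleRate: Int = 24_000) {
        val pcm = wav.copyOfRange(44, wav.size)  // skip assumed 44-byte WAV header
        val track = AudioTrack.Builder()
            .setAudioAttributes(
                AudioAttributes.Builder()
                    .setUsage(AudioAttributes.USAGE_MEDIA)
                    .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                    .build()
            )
            .setAudioFormat(
                AudioFormat.Builder()
                    .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                    .setSampleRate(sampleRate)
                    .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                    .build()
            )
            .setTransferMode(AudioTrack.MODE_STATIC)   // whole clip fits in memory
            .setBufferSizeInBytes(pcm.size)
            .build()
        track.write(pcm, 0, pcm.size)                  // MODE_STATIC: write, then play
        track.play()
    }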

SDK session flow:
    Android permissions (BLUETOOTH_CONNECT, RECORD_AUDIO)
    -- Wearables.RequestPermissionContract() -- SDK camera permission
    -- Check RegistrationState -- startRegistration() if needed
    -- Wearables.createSession(AutoDeviceSelector())
    -- session.addStream(StreamConfiguration(MEDIUM, 7fps))
    -- stream.videoStream frames -- i420BufferToJpeg()
    -- user taps Scan -- vision API -- personality -- TTS -- glasses speakers
    -- user speaks/types -- chat API -- TTS -- glasses speakers
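
In code, the flow above condenses to roughly the sketch below. Only the names already shown in this README (Wearables, RegistrationState, startRegistration, createSession, AutoDeviceSelector, StreamConfiguration, MEDIUM, videoStream) come from the project; every signature, enum shape, and the frame fields are assumptions against DAT SDK v0.6.0:

    // Hedged sketch of the session flow; treat each signature as an
    // assumption, not the SDK's real API surface.
    suspend fun startGlassesStream(onJpeg: (ByteArray) -> Unit) {
        // Register with the Meta AI app once; skip if already registered.
        if (Wearables.registrationState.value != RegistrationState.REGISTERED) {
            Wearables.startRegistration()
        }
        // Open a session against whichever paired glasses are reachable.
        val session = Wearables.createSession(AutoDeviceSelector())
        // Medium resolution at ~7 fps keeps the Bluetooth link comfortable.
        val stream = session.addStream(StreamConfiguration(Resolution.MEDIUM, fps = 7))
        // Repack each I420 frame to JPEG (see the conversion sketch above).
        stream.videoStream.collect { frame ->
            onJpeg(i420ToJpeg(frame.data, frame.width, frame.height))
        }
    }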

Why It Matters

The Meta Wearables DAT SDK is in public developer preview — very few independent developers have shipped anything on it. AnimusGlasses demonstrates the full camera-to-voice pipeline on real glasses hardware: I420 frame conversion, SDK session management, Bluetooth audio routing, and AI at every layer. It's not a demo app — it's a working product loop built in a day on a platform that most developers haven't touched.

The same architecture — glasses camera, vision AI, voice response through speakers — is the core loop of every smart glasses AI product being built right now. This is that, running on consumer hardware, open source.

Tech Summary

Technologies: Kotlin, Jetpack Compose, Meta Wearables DAT SDK v0.6.0, Gemini 2.5 Flash Vision API, Gemini Chat API, Groq Orpheus TTS, Groq Whisper STT, Android AudioTrack, Bluetooth A2DP, OkHttp

Hardware: Meta Ray-Ban Display glasses, Android phone (OnePlus 9 5G)