Release notes

What's new in PhotoLens

Every change shipped, since day one. Versions follow Semantic Versioning.

[1.2.0] — 2026-05-10

Theme: Multi-photo chat, rich sound design, secure vault, full accessibility polish, and a settings overhaul.

New Features

Multi-Photo Chat (URI Preview)

Attach multiple photos in a single chat session — add up to N photos from your gallery or camera directly into a conversation
Horizontal photo strip above the message list shows each attached image with a remove button
Camera capture within AskScreen — take a photo on the spot and send it immediately to the AI
Photo picker integration in AskScreen, same flow as HomeScreen's "New Photo" sheet

Sound Design

Five ambient audio cues — processing loop, reply ding, success chime, page turn, and delete sweep; all non-blocking and run off the main thread
Haptic feedback with configurable sensitivity (0–100%)
All sounds independently togglable per-event in Settings

Secure Vault (Collections)

Per-collection security: lock any album behind biometric authentication or a custom password; vault photos are physically moved to filesDir/secure_photos, not merely hidden
Vault passwords are hashed with SHA-256 using a 16-byte random salt and 10,000 iterations; the raw password is never written to disk
Constant-time password comparison to prevent timing attacks
Security progress bottom sheet with animated progress bar during vault operations
showSecureCollectionsInList setting to control vault visibility in the Collections tab
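
The hashing and comparison steps listed above could look roughly like the following. This is a minimal sketch, not the app's actual code: the function names, the constants, and the way each round chains the salt with the previous digest are assumptions, since the notes only state the algorithm, salt size, and iteration count.

```kotlin
import java.security.MessageDigest
import java.security.SecureRandom

// Values taken from the release notes; the constant names are hypothetical.
const val SALT_BYTES = 16
const val ITERATIONS = 10_000

/** Generates a fresh random salt for a new vault password. */
fun newSalt(): ByteArray = ByteArray(SALT_BYTES).also { SecureRandom().nextBytes(it) }

/**
 * Salted, iterated SHA-256. Assumption: each round hashes
 * salt + previous digest; the real chaining is not documented.
 */
fun hashPassword(password: CharArray, salt: ByteArray): ByteArray {
    val digest = MessageDigest.getInstance("SHA-256")
    var state = password.concatToString().toByteArray(Charsets.UTF_8)
    repeat(ITERATIONS) {
        digest.update(salt)
        state = digest.digest(state) // digest() also resets the instance
    }
    return state
}

/** Compares hashes with MessageDigest.isEqual, which does not exit early on a mismatch. */
fun verifyPassword(attempt: CharArray, salt: ByteArray, storedHash: ByteArray): Boolean =
    MessageDigest.isEqual(hashPassword(attempt, salt), storedHash)
```

Only the salt and the resulting hash would need to be persisted; the password itself stays in memory for the duration of the call.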

TTS Reader — Complete Rewrite

ReaderComponent embedded in every chat bubble and OCR result sheet
Reading modes: Characters, Words, Sentences, Paragraphs, Lines — switchable on the fly
Prev / Play-Pause / Next segment controls with TalkBack-labelled state
autoplayOnLoad and stopOnBackground lifecycle hooks
Non-blocking architecture: all TTS work runs on Dispatchers.IO; UI thread is never held
Voice selection, pitch, rate, and engine configurable in Settings
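
The reading modes above amount to different ways of cutting the same text into segments for the Prev / Next controls to step through. The sketch below illustrates one possible segmentation; the enum and function names and the exact splitting rules are assumptions, not the ReaderComponent implementation.

```kotlin
import java.text.BreakIterator
import java.util.Locale

// Mode names follow the release notes; the splitting rules are assumed.
enum class ReadingMode { CHARACTERS, WORDS, SENTENCES, PARAGRAPHS, LINES }

/** Splits text into the segments that Prev / Next would step through. */
fun segment(text: String, mode: ReadingMode, locale: Locale = Locale.ENGLISH): List<String> =
    when (mode) {
        ReadingMode.CHARACTERS -> text.filterNot { it.isWhitespace() }.map { it.toString() }
        ReadingMode.WORDS -> text.split(Regex("\\s+")).filter { it.isNotEmpty() }
        ReadingMode.LINES -> text.lines().map { it.trim() }.filter { it.isNotEmpty() }
        ReadingMode.PARAGRAPHS ->
            text.split(Regex("\\n\\s*\\n")).map { it.trim() }.filter { it.isNotEmpty() }
        ReadingMode.SENTENCES -> {
            // BreakIterator applies locale-aware sentence boundary rules.
            val breaker = BreakIterator.getSentenceInstance(locale)
            breaker.setText(text)
            val sentences = mutableListOf<String>()
            var start = breaker.first()
            var end = breaker.next()
            while (end != BreakIterator.DONE) {
                val sentence = text.substring(start, end).trim()
                if (sentence.isNotEmpty()) sentences += sentence
                start = end
                end = breaker.next()
            }
            sentences
        }
    }
```

Switching modes on the fly then only requires re-segmenting the current text and mapping the playback position onto the new segment list.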

Full Settings Screen

Appearance & Gallery — theme (System/Light/Dark/AMOLED), accent color (6 options + Dynamic Material You), startup screen selection, grid density (2–5 columns), keep screen awake toggle
Advanced Filters — saved per-session; min/max width & height, date range picker, MIME type chip filter
AI Tuning — temperature, top-P, top-K sliders; response length (Brief / Balanced / Detailed / Extremely Detailed); thinking budget; streaming toggle; auto-generate descriptions
Sound & Vibration — per-event toggles for all five sound events plus haptic sensitivity slider
Security — biometric vs. custom-password vault mode; change vault password; disable vault with confirmation dialog
TTS — engine picker (all installed TTS engines), language picker (50+ languages from languages.json), pitch, rate, voice selection
Secure Sharing — strip GPS and full EXIF metadata before sharing
Language — AI response language picker with human-readable names (13+ languages from languages.json)

Onboarding Flow

5-page pager onboarding on first launch, fully TalkBack-navigable
Pages: Welcome, Smart Gallery, AI Features, Accessibility, Privacy & Security
Model download prompt triggered automatically after onboarding if no model is present

ReasoningBlock Component

Expandable "chain-of-thought" block in chat bubbles and photo descriptions that shows the model's internal reasoning before the final answer
Collapsed by default; animates open/close with a chevron

Markdown Renderer

Full MarkdownContent composable with no external dependency
Supports: headings (H1–H4), bold, italic, bold-italic, inline code, code blocks, blockquotes, unordered lists, ordered lists, horizontal rules, links (tappable)

About, Help & Support, Privacy, Terms screens

All rendered via the generic MarkdownScreen composable backed by res/raw/*.md files

Improvements

Smart Collections empty state — contextual message depending on whether any AI descriptions exist; CTA button to go to Settings when descriptions haven't been generated yet
Photo detail screen — pinch-to-zoom + two-finger rotate with spring-back animation; rotation persisted per photo; zoom/rotation reset on navigation
Bulk operations — add to favorites, generate descriptions, recognize text, share, add to collection, delete — all from the selection top bar
OCR bottom sheet — TTS reader embedded; copy + share buttons; streaming progress indicator
ListScreen — collection/album detail with the same sort/filter/bulk-select as HomeScreen
Session memory optimization — "Preparing AI Environment" overlay with polite live-region announcement so TalkBack users know when chat is ready
Scroll-to-bottom FAB in AskScreen — appears when the user scrolls up; animates in/out with fade + scale
Chat export — copy all messages to clipboard or share as plain text
Message regeneration — re-run the last AI turn from the same message index
SpeechRecognizer lifecycle fix — recognizer is properly destroyed when AskScreen leaves composition, preventing a native listener leak after back-navigation

Bug Fixes

GemmaAdapter.processResponse now builds JSON with JSONObject/JSONArray instead of string interpolation — fixes crashes on model output containing quotes, backslashes, or newlines
Vault password stored as SHA-256 hash + salt rather than plaintext — security fix
LiteRtLmManager single-session constraint: a second createConversation is now rejected while one is already open; the description session re-opens when the chat session ends
TtsManager no longer blocks the UI thread: work starts on Dispatchers.IO, and all TTS calls are posted to Main via withContext(Dispatchers.Main)
Photo URI resolution handles both file:// and content:// URIs via the ParcelFileDescriptor /proc/self/fd trick
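
The single-session constraint in the LiteRtLmManager fix above can be pictured as a small gate that admits one session at a time. The following is a hedged sketch only: SingleSessionGate and its methods are hypothetical names, and the real manager holds native conversation objects rather than an enum value.

```kotlin
// Only the DESCRIPTION / CHAT modes and the one-session rule come from the
// release notes; the class and method names are hypothetical.
enum class SessionMode { DESCRIPTION, CHAT }

class SingleSessionGate {
    private val lock = Any()
    private var open: SessionMode? = null

    /** Returns true if the session was opened, false if another one is already open. */
    fun tryOpen(mode: SessionMode): Boolean = synchronized(lock) {
        if (open != null) {
            false
        } else {
            open = mode
            true
        }
    }

    /**
     * Closes the current session and returns whichever session is open afterwards.
     * Ending CHAT re-opens DESCRIPTION, as the notes describe.
     */
    fun close(): SessionMode? = synchronized(lock) {
        open = if (open == SessionMode.CHAT) SessionMode.DESCRIPTION else null
        open
    }

    /** The session that is currently open, if any. */
    fun current(): SessionMode? = synchronized(lock) { open }
}
```

A caller that wants to start a chat while a description session is open would close that session first and then call tryOpen(SessionMode.CHAT).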

[1.1.0] — 2026-04-14

Theme: Multi-model architecture with model-specific adapters for full control over tokenisation strategy.

New Features

Multi-Model Architecture

ModelAdapter interface — pluggable adapter per model that controls system instruction, content building, conversation config creation, and response post-processing
GemmaAdapter — dedicated adapter for Gemma 4 models; implements tool-call structured JSON response parsing with proper escaping
FastVlmAdapter: adapter skeleton for FastVLM-class models with a different tokenisation strategy
Models defined in assets/models.json — adapter field determines which adapter is instantiated at runtime

Model-Specific Behaviour

getSystemInstruction() — per-adapter system prompt customisation
buildAnalysisContent() — per-adapter content construction for photo analysis turns
buildAskContent() — per-adapter content construction for interactive chat turns
createConversationConfig() — per-adapter ConversationConfig including sampler config and tool registration
processResponse() — per-adapter post-processing of raw model output and tool call arguments
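
Taken together, the five hooks above suggest an interface along the following lines. The method names are from these notes, but every parameter type, return type, adapter key, and method body is a placeholder chosen for illustration; the real signatures depend on LiteRT-LM types that are not shown here.

```kotlin
// Method names come from the release notes. Parameter and return types are
// placeholders, because the notes do not give the real signatures.
interface ModelAdapter {
    fun getSystemInstruction(): String
    fun buildAnalysisContent(photoPath: String): List<String>
    fun buildAskContent(photoPath: String, question: String): List<String>
    fun createConversationConfig(): Map<String, Any>
    fun processResponse(raw: String): String
}

class GemmaAdapter : ModelAdapter {
    override fun getSystemInstruction() = "Describe photos for screen-reader users."
    override fun buildAnalysisContent(photoPath: String) =
        listOf("image:$photoPath", "Describe this photo.")
    override fun buildAskContent(photoPath: String, question: String) =
        listOf("image:$photoPath", question)
    override fun createConversationConfig(): Map<String, Any> =
        mapOf("temperature" to 0.7, "tools" to true)
    override fun processResponse(raw: String) = raw.trim()
}

class FastVlmAdapter : ModelAdapter {
    override fun getSystemInstruction() = "Answer briefly."
    override fun buildAnalysisContent(photoPath: String) =
        listOf("Describe the image.", "image:$photoPath")
    override fun buildAskContent(photoPath: String, question: String) =
        listOf("Question: $question", "image:$photoPath")
    override fun createConversationConfig(): Map<String, Any> =
        mapOf("temperature" to 0.2, "tools" to false)
    override fun processResponse(raw: String) = raw.trim()
}

/** Picks an adapter from the adapter string in models.json (the key values are assumed). */
fun adapterFor(key: String): ModelAdapter = when (key.lowercase()) {
    "gemma" -> GemmaAdapter()
    "fastvlm" -> FastVlmAdapter()
    else -> throw IllegalArgumentException("Unknown adapter: $key")
}
```

Keeping prompt construction and response parsing behind one interface means a new model family only needs a new adapter class and a models.json entry.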

JsonModel Data Class

New fields: adapter (string, selects adapter class), preferredBackend (overrides global CPU/GPU), toolCall (boolean), memoryMinRequired, memoryRecommended
ModelStatus wraps JsonModel with download state, progress bytes, speed, ETA, and error message
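
As a rough picture of the shapes described above, the two classes might be declared as follows. The new field names are from these notes; the name field, all types, units, defaults, and the DownloadState enum are assumptions made for the sketch.

```kotlin
// New field names come from the release notes; `name`, the types, units,
// and defaults are assumptions.
data class JsonModel(
    val name: String,                     // assumed pre-existing field
    val adapter: String,                  // selects the adapter class
    val preferredBackend: String? = null, // overrides the global CPU/GPU choice when set
    val toolCall: Boolean = false,
    val memoryMinRequired: Long = 0,      // unit assumed to be bytes
    val memoryRecommended: Long = 0,      // unit assumed to be bytes
)

// Hypothetical set of download states.
enum class DownloadState { NOT_DOWNLOADED, DOWNLOADING, DOWNLOADED, FAILED }

/** Wraps a model definition with its runtime download status. */
data class ModelStatus(
    val model: JsonModel,
    val state: DownloadState = DownloadState.NOT_DOWNLOADED,
    val progressBytes: Long = 0,
    val speedBytesPerSecond: Long = 0,
    val etaSeconds: Long? = null,
    val errorMessage: String? = null,
)

/** A per-model backend wins over the global setting; otherwise the global one applies. */
fun effectiveBackend(model: JsonModel, globalBackend: String): String =
    model.preferredBackend ?: globalBackend
```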

Models Manager

Redesigned ModelConfigScreen — card-per-model layout showing size, RAM requirements, download status, speed, ETA
Download/cancel/delete controls per model; active model shown with a checkmark badge
Model switch triggers full LiteRtLmManager.shutdown() + re-init to ensure the correct adapter is loaded

Improvements

LiteRtLmManager session mode enum (DESCRIPTION / CHAT) replaces boolean flag — clearer invariants and easier to extend
Speculative decoding enabled via ExperimentalFlags.enableSpeculativeDecoding = true for faster token generation
ModelDownloadService — foreground service with resume support via HTTP Range header; progress reported via SharedFlow
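
Resuming a download with the HTTP Range header, as mentioned for ModelDownloadService above, generally means asking the server for the bytes after those already on disk and appending them. The sketch below shows that idea with plain HttpURLConnection; the function names are hypothetical, and a simple callback stands in for the SharedFlow-based progress reporting.

```kotlin
import java.io.File
import java.io.FileOutputStream
import java.net.HttpURLConnection
import java.net.URL

/** Range header value requesting everything after the bytes already on disk. */
fun rangeHeader(bytesOnDisk: Long): String = "bytes=$bytesOnDisk-"

/**
 * Resumes (or starts) a download into [target] and returns the bytes on disk.
 * 206 means the server honoured the range, so data is appended; any other
 * success status means the whole file is being sent, so the partial file is
 * overwritten. A 416 (file already complete) surfaces as an IOException here.
 */
fun downloadWithResume(url: URL, target: File, onProgress: (Long) -> Unit = {}): Long {
    val existing = if (target.exists()) target.length() else 0L
    val connection = url.openConnection() as HttpURLConnection
    if (existing > 0) connection.setRequestProperty("Range", rangeHeader(existing))
    try {
        val append = connection.responseCode == HttpURLConnection.HTTP_PARTIAL
        var written = if (append) existing else 0L
        connection.inputStream.use { input ->
            FileOutputStream(target, append).use { output ->
                val buffer = ByteArray(64 * 1024)
                while (true) {
                    val read = input.read(buffer)
                    if (read < 0) break
                    output.write(buffer, 0, read)
                    written += read
                    onProgress(written)
                }
            }
        }
        return written
    } finally {
        connection.disconnect()
    }
}
```

Checking for 206 rather than assuming it matters because some servers ignore Range and reply with the full body, which would corrupt the file if appended.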

Bug Fixes

ModelAdapter.processResponse with tool calls no longer crashes on model output that embeds special JSON characters — replaced string interpolation with JSONObject.put() throughout
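
The difference between the old and new approach in the fix above can be shown in a few lines. This is an illustrative sketch rather than the adapter's code: the function names and the category and description keys are hypothetical, and it assumes the org.json classes that ship with Android (available as a separate library on a plain JVM).

```kotlin
import org.json.JSONObject

/** Unsafe: the output stops being valid JSON once text contains a quote, backslash, or newline. */
fun buildByInterpolation(category: String, text: String): String =
    "{\"category\": \"$category\", \"description\": \"$text\"}"

/** Safe: JSONObject escapes special characters when it serialises the value. */
fun buildWithJsonObject(category: String, text: String): String =
    JSONObject()
        .put("category", category)
        .put("description", text)
        .toString()
```

Parsing the output of buildWithJsonObject returns the original text unchanged, whereas the interpolated version fails to parse as soon as the model emits a quotation mark.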

[1.0.0] — 2026-04-14

Theme: Initial release — on-device AI photo gallery built for accessibility.

New Features

Core AI Integration

On-device AI inference via Google AI Edge LiteRT-LM with Gemma 4 multimodal model
Natural language photo descriptions — full sentences generated by Gemma 4 locally, no internet
Interactive Ask Mode — streaming chat interface to ask any question about a photo
Smart categorisation — photos automatically grouped into Nature, People, Food, Documents, Travel, Architecture, Pets, Sports via tool calls
OCR / text recognition — extract text from any image using the same on-device model
Thinking Mode — chain-of-thought reasoning visible before the final answer
Multilingual output — 13 languages selectable for AI response language

Gallery & Navigation

Grid and list view — togglable; grid supports 2–5 columns
Date-grouped photo timeline with sticky headers
Local Albums via Android MediaStore
Smart Collections — dynamic albums built from AI-generated categories
Favorites — star any photo; dedicated Favorites tab
Bottom navigation: Photos / Collections / Favorites
Full-screen photo detail with share, favorite, rotate, more-menu

Privacy & Security

Zero cloud processing: all AI runs on the device's GPU, CPU, or NPU
No analytics, no telemetry, no account required
Secure Vault foundation — architecture for secure collections in place

Accessibility Foundation

Full TalkBack semantic labelling on every UI element
WCAG 2.1 Level AA — high contrast, generous touch targets, predictable navigation
Live regions for progress announcements
Voice input for Ask Mode
Built-in TTS with segment-based reading controls

Model Management

Foreground download service for model files (~2.4 GB)
Progress tracking with speed and ETA
GPU / CPU backend selection

Settings

AI backend (GPU / CPU), temperature, response language
Basic gallery preferences (view mode, grid columns, sort order)

Other Screens

About, Help & Support — Markdown-rendered
Privacy Policy and Terms of Use — in-app Markdown

Technical Foundation

MVVM + Repository pattern; single source of truth in PhotoRepository
Jetpack Compose + Material 3 throughout
Hilt dependency injection
Room v9 database for photo metadata and description state
DataStore for persistent preferences
Coil for image loading with HEIC→JPEG auto-conversion

For the full feature list, see features.md.
