Sample MVP code is open-source on GitHub at https://github.com/rvgpl/photosearch.
I take photos on too many devices: a Sony and a Fujifilm as main cameras, plus an iPhone and an Android phone (my wife's).
The pipeline was a mess:
Camera photos → Lightroom on laptop
Phone photos → device
After a trip, I would dump everything into a random folder, with no index and no way to search across all of the photos.
Looking for “sunset from the boat” meant scrolling through thousands of images.
So I built a local-first photo search engine.
Drop in a folder with photos from anywhere. It indexes everything. You search in plain English.
No cloud. No APIs. No sync headaches.
Under the hood:
- YOLOv8 → objects
- SmolVLM → scene + context
- SQLite → index
- Streamlit → UI
- Apple MPS → runs locally with decent speed
Each image goes through:
- Object detection + scene tagging
- EXIF extraction (date, GPS, camera, focal length)
- Blur detection (Laplacian variance)
- Closed-eye detection (MediaPipe FaceMesh)
- Duplicate detection (perceptual hashing)
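The blur-detection step is the simplest to illustrate. Here is a minimal, dependency-free sketch of the variance-of-Laplacian idea (the real pipeline presumably uses OpenCV on full images; this function name and the plain list-of-lists input are my assumptions for illustration):

```python
def laplacian_variance(gray):
    """Variance of the 3x3 Laplacian response over a grayscale image.

    A sharp image has strong edges, so the Laplacian response varies a lot;
    a blurry image gives a flat response and a low variance.
    `gray` is a 2D list of pixel intensities (0-255).
    """
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # 4-neighbour Laplacian kernel: [[0,1,0],[1,-4,1],[0,1,0]]
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)
```

Images whose variance falls below some tuned threshold get flagged as blurry.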
Everything is indexed into SQLite.
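To make the indexing concrete, a schema along these lines would hold everything the pipeline extracts. The table and column names here are my guesses, not the project's actual schema:

```python
import sqlite3

# In-memory for the sketch; the real index would be a file on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photos (
    path         TEXT PRIMARY KEY,
    taken_at     TEXT,   -- EXIF date
    camera       TEXT,
    focal_length REAL,
    gps_lat      REAL,
    gps_lon      REAL,
    blur_score   REAL,   -- Laplacian variance
    phash        TEXT    -- perceptual hash for duplicate detection
);
CREATE TABLE tags (
    path TEXT REFERENCES photos(path),
    tag  TEXT            -- YOLO object or SmolVLM scene label
);
CREATE INDEX idx_tags_tag ON tags(tag);
""")
```

A search then joins `photos` and `tags`, filtering on label terms and metadata columns.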
Queries look like:
baby playing on the beach
night streetfood
portraits 50mm

Here is a short demo of the MVP:
It combines labels + metadata. No embeddings (yet).
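Combining labels and metadata means splitting a query like "portraits 50mm" into label terms and structured filters. A toy sketch of that split (the parsing rules and names here are my assumptions, not the project's code):

```python
import re

def parse_query(q):
    """Split a free-text query into (label terms, metadata filters).

    Tokens like '50mm' become a focal-length filter; everything else
    is matched against the object/scene tags in the index.
    """
    terms, filters = [], {}
    for token in q.lower().split():
        m = re.fullmatch(r"(\d+)mm", token)
        if m:
            filters["focal_length"] = int(m.group(1))
        else:
            terms.append(token)
    return terms, filters
```

The terms drive a tag lookup, while the filters become WHERE clauses on the metadata columns.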
~2s per image to index. Slow, but fine for now.
The real win for me is that everything is in one place, and I can actually find what I'm looking for.
Next steps: improve speed and ranking, maybe embeddings, maybe face clustering.