Davor Cukeric
Live · 2026

ContextLens Turbo

Semantic code intelligence — search by meaning, not just keywords

Multi-stage Vector Compression · ONNX Runtime · SQLite Hybrid Index · MCP Protocol · Tree-sitter · TypeScript / Node.js
Your codebase speaks in symbols. ContextLens Turbo listens for meaning.

ContextLens gave AI agents a precise map of your codebase. ContextLens Turbo gives them the ability to search it by meaning. Searching for 'user authentication flow' now surfaces the right functions even when the code is named 'validateToken' or 'checkCredentials', with no keyword overlap required. It runs entirely offline and requires no GPU, no external APIs, and no database infrastructure. Just a local index that understands what your code does.

The Problem

What needed solving

Keyword search has a ceiling. A developer searching for 'payment processing logic' won't find a function called 'handleStripeWebhook' unless they already know what to search for. AI agents face the same problem — they can find what they're told to look for, but they can't reason about intent across a codebase.

The obvious solution is semantic vector search: represent each symbol as a high-dimensional vector capturing its meaning, then find the closest matches at query time. The problem is scale. Raw embedding vectors for a mid-sized project consume hundreds of megabytes. Enterprise vector databases solve this with dedicated infrastructure, cloud APIs, and ongoing subscription costs — none of which belong in a self-hosted developer tool.

ContextLens classic solved the keyword problem well: its token overhead is a modest 1,200 tokens and indexing is fast. But it has no concept of meaning, only text.

The Solution

How I approached it

ContextLens Turbo embeds a custom multi-stage compression pipeline that reduces vector storage 6.9–7.4× compared to raw float32 representations, making semantic search practical on any developer machine with zero external dependencies.

Every symbol and documentation section gets a compressed semantic fingerprint stored alongside the existing keyword index in the same SQLite database. At query time, a hybrid ranking algorithm fuses keyword precision with semantic relevance — surfacing results that match both what you typed and what you meant. The pipeline runs entirely offline after a one-time 23MB model download. Your code never leaves the machine.
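As a rough illustration of how keyword and semantic signals can be fused, here is a minimal sketch. The exact ranking formula ContextLens Turbo uses is not shown here, so this assumes a min-max-normalised weighted sum; the `Hit` shape, `fuseScores`, and the `alpha` parameter are all hypothetical names for this example.

```typescript
// Hypothetical fusion sketch, not the actual ranking code.
interface Hit {
  symbol: string;
  keywordScore: number;  // e.g. a BM25-style rank score from SQLite FTS5
  semanticScore: number; // cosine similarity against the query embedding
}

interface FusedHit extends Hit {
  fused: number;
}

// Rescale scores to [0, 1] so the two signals are comparable.
function normalise(values: number[]): number[] {
  const min = Math.min(...values);
  const range = Math.max(...values) - min || 1; // guard against identical scores
  return values.map(v => (v - min) / range);
}

// alpha = 0 ranks purely by keyword score, alpha = 1 purely by semantic score.
function fuseScores(hits: Hit[], alpha = 0.5): FusedHit[] {
  const kw = normalise(hits.map(h => h.keywordScore));
  const sem = normalise(hits.map(h => h.semanticScore));
  return hits
    .map((h, i) => ({ ...h, fused: (1 - alpha) * kw[i] + alpha * sem[i] }))
    .sort((a, b) => b.fused - a.fused);
}
```

A weighted sum like this lets a result with zero keyword overlap still rank highly when its semantic score is strong, which is the behaviour described above.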

How It Works

Under the hood

The compression pipeline uses two sequential encoding stages. The first stage applies a recursive geometric decomposition to the high-dimensional embedding vector, converting it into a compact angle-based representation that captures the vector's directional structure with less than 4% angular error. The second stage encodes the residual error from the first stage into a small set of sign bits using a deterministic random projection — a statistical technique that provides an unbiased correction to the first stage's approximation.
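The two-stage idea can be sketched as follows. The real pipeline's decomposition, bit widths, and correction formula are not public, so everything below is an illustrative assumption: stage 1 stores the vector's direction as 8-bit quantised hyperspherical angles, and stage 2 stores sign bits of the stage-1 residual against seeded random directions, applying them as a correction at decode time.

```typescript
// Deterministic PRNG (mulberry32) so encoder and decoder derive identical
// projection directions from a shared seed.
function mulberry32(seed: number): () => number {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// m pseudo-random directions (sum of uniforms approximates a Gaussian).
function gaussianDirections(m: number, dim: number, seed: number): number[][] {
  const rand = mulberry32(seed);
  return Array.from({ length: m }, () =>
    Array.from({ length: dim }, () => rand() + rand() + rand() - 1.5)
  );
}

// Stage 1 encode: unit vector -> (n-1) hyperspherical angles, 8 bits each.
function encodeStage1(v: number[]): Uint8Array {
  const n = v.length;
  const bytes = new Uint8Array(n - 1);
  for (let i = 0; i < n - 2; i++) {
    let tail = 0;
    for (let j = i; j < n; j++) tail += v[j] * v[j];
    const theta = Math.acos(Math.max(-1, Math.min(1, v[i] / (Math.sqrt(tail) || 1))));
    bytes[i] = Math.round((theta / Math.PI) * 255);
  }
  const last = Math.atan2(v[n - 1], v[n - 2]); // final angle keeps the sign
  bytes[n - 2] = Math.round(((last + Math.PI) / (2 * Math.PI)) * 255);
  return bytes;
}

function decodeStage1(bytes: Uint8Array): number[] {
  const n = bytes.length + 1;
  const v = new Array<number>(n).fill(0);
  let sinProd = 1;
  for (let i = 0; i < n - 2; i++) {
    const theta = (bytes[i] / 255) * Math.PI;
    v[i] = sinProd * Math.cos(theta);
    sinProd *= Math.sin(theta);
  }
  const last = (bytes[n - 2] / 255) * 2 * Math.PI - Math.PI;
  v[n - 2] = sinProd * Math.cos(last);
  v[n - 1] = sinProd * Math.sin(last);
  return v;
}

// Stage 2: one sign bit per random direction for the stage-1 residual, plus
// the residual norm (which a real encoder would also quantise).
function encode(v: number[], m = 64, seed = 42) {
  const bytes = encodeStage1(v);
  const approx = decodeStage1(bytes);
  const residual = v.map((x, i) => x - approx[i]);
  const norm = Math.hypot(...residual);
  const dirs = gaussianDirections(m, v.length, seed);
  const signs = dirs.map(d => Math.sign(d.reduce((s, x, i) => s + x * residual[i], 0)) || 1);
  return { bytes, signs, norm };
}

function decode(enc: { bytes: Uint8Array; signs: number[]; norm: number }, seed = 42) {
  const v = decodeStage1(enc.bytes);
  const dirs = gaussianDirections(enc.signs.length, v.length, seed);
  // The sign-weighted average direction points, in expectation, along the
  // residual; rescale it to the stored residual norm and add it back.
  const est = v.map((_, i) =>
    dirs.reduce((s, d, j) => s + enc.signs[j] * d[i], 0) / enc.signs.length
  );
  const estNorm = Math.hypot(...est) || 1;
  return v.map((x, i) => x + (enc.norm * est[i]) / estNorm);
}
```

The appeal of this structure is that stage 1 alone already captures direction cheaply, while the sign-bit residual costs only one bit per projection and can only be decoded with the same seed, so both sides stay deterministic.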

The combined representation achieves 6.9–7.4× storage compression while maintaining ≥95% cosine similarity fidelity. The entire encode/decode cycle is pure TypeScript — no native binaries, no platform-specific builds.

The embedding model is loaded as an ONNX computation graph using int8 quantized weights (23MB, compared to 90MB for full precision). It loads in under 500ms from a local cache after first download and runs on CPU using SIMD-accelerated matrix operations — no GPU required, no cloud API calls, no data egress.

Indexing is incremental. Only files that have changed since the last run are re-processed. A full index of a 188-file, 2,600-symbol project completes in approximately 20 seconds, including all vector generation. Subsequent runs touching 3–5 changed files complete in under 2 seconds.
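The incremental step can be sketched with content hashing: re-process only files whose hash differs from the one stored at the last run. The real tool's change-detection scheme may differ (it might check mtimes first, for example), and the names here are illustrative.

```typescript
// Illustrative change-detection sketch for incremental re-indexing.
import { createHash } from "node:crypto";

type HashIndex = Map<string, string>; // file path -> content hash at last run

function contentHash(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// `files` maps path -> current contents (read from disk in a real indexer).
// Returns the paths that need re-indexing and updates the stored hashes in
// place so the next run sees them as unchanged.
function changedFiles(files: Map<string, string>, previous: HashIndex): string[] {
  const changed: string[] = [];
  for (const [path, content] of files) {
    const hash = contentHash(content);
    if (previous.get(path) !== hash) {
      changed.push(path);
      previous.set(path, hash);
    }
  }
  return changed;
}
```

Hashing contents rather than trusting timestamps means a touched-but-unmodified file is never re-embedded, which is what keeps the 3–5-file incremental runs fast.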

Performance

By the numbers

ContextLens vs ContextLens Turbo

| Metric | ContextLens | ContextLens Turbo |
| --- | --- | --- |
| Search type | Keyword (FTS5) | Hybrid (keyword + semantic) |
| Keyword search latency | ~1ms | ~1ms |
| Semantic search latency | n/a | 247–311ms |
| Model load (once per session) | n/a | ~430ms |
| Vector storage compression | n/a | 6.9–7.4× |
| GPU required | No | No |
| Offline after setup | Yes | Yes |
| Full index, 188-file project | ~2s | ~20s |
| Incremental re-index (3–5 files) | <100ms | <2s |

Benchmarks by project size

| Project | Files | Symbols | Index time | Semantic search | DB size |
| --- | --- | --- | --- | --- | --- |
| Small | 118 | 1,040 | 7.9s | 247ms | 2.5MB |
| Medium | 188 | 2,637 | 20.5s | 311ms | 5.5MB |

Impact

Results and outcomes

The core capability delta over ContextLens classic is intent-based retrieval: finding semantically related code even when there's zero keyword overlap between the query and the symbol name. A search for 'error boundary handling' surfaces the relevant React components and utility functions regardless of what they're called.

All seven original MCP tools are preserved. The search tool gains a semantic mode and a hybrid mode. The context tool gains semantic neighbour discovery — when you pull context around a symbol, it now surfaces the most semantically similar code across the entire project, not just structural children and siblings.
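Semantic neighbour discovery reduces, at its core, to a top-k cosine-similarity ranking over the project's decoded vectors. A minimal sketch, with hypothetical names (`semanticNeighbours`, the `Map` index shape, and the `k` parameter are assumptions for this example, not the tool's API):

```typescript
// Illustrative top-k cosine ranking; real vectors would first be decompressed.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rank every indexed symbol by similarity to the anchor symbol's vector.
function semanticNeighbours(
  anchor: number[],
  index: Map<string, number[]>, // symbol name -> embedding vector
  k = 5
): { symbol: string; similarity: number }[] {
  return [...index.entries()]
    .map(([symbol, vec]) => ({ symbol, similarity: cosine(anchor, vec) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k);
}
```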