The Problem vs. The Solution

[Figure: memory use plotted against number of tokens]
Standard Growing Memory
Memory is O(N) and grows without bound: the KV cache keeps expanding until the process runs out of memory (OOM)
Elliott Triangular
Memory is O(L·d) and stays flat at the window length you choose, for example L=2,048 or L=8,192

How It Works

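raw_i = Σ [1 − (i−j)/L] · v_j, summed over the last L tokens (j = i−L+1 … i)
raw_i = (1/L) × (sum_jv_i − (i − L) × sum_v_i), where sum_v_i = Σ v_j and sum_jv_i = Σ j·v_j over the same window
w_sum = m − m(m−1)/(2L),   y_i = raw_i / w_sum, where m is the number of tokens currently in the window (at most L)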

L is configurable: a smaller L uses less memory, a larger L remembers further back. Both use the same update, whose cost does not depend on how many tokens have been seen.

[Figure: circular buffer with two running sums, showing constant memory]

Memory stays the same size forever: a circular buffer (L slots) plus two running sums
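The update described above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation. It assumes tokens are indexed from 0, that each v_j is a plain list of d floats, and that the class name `TriangularMemory` and the helper `direct` are hypothetical names chosen for this example. `direct` evaluates the weighted sum the slow way so the two results can be compared.

```python
class TriangularMemory:
    """Hypothetical sketch: L-slot circular buffer plus two running sums."""

    def __init__(self, L, d):
        self.L = L
        self.d = d
        self.buf = [[0.0] * d for _ in range(L)]  # last L values, fixed size
        self.sum_v = [0.0] * d    # running sum of v_j over the window
        self.sum_jv = [0.0] * d   # running sum of j * v_j over the window
        self.t = 0                # index of the next token (0-based, assumed)

    def update(self, v):
        """Add one token value and return the normalized output y_t."""
        L, t = self.L, self.t
        slot = t % L
        if t >= L:
            # The slot still holds token t - L; remove it from both sums.
            old = self.buf[slot]
            for k in range(self.d):
                self.sum_v[k] -= old[k]
                self.sum_jv[k] -= (t - L) * old[k]
        self.buf[slot] = list(v)
        for k in range(self.d):
            self.sum_v[k] += v[k]
            self.sum_jv[k] += t * v[k]
        m = min(t + 1, L)                   # tokens currently in the window
        w_sum = m - m * (m - 1) / (2 * L)   # sum of the triangular weights
        self.t += 1
        # raw = (sum_jv - (t - L) * sum_v) / L, then normalize by w_sum.
        return [(self.sum_jv[k] - (t - L) * self.sum_v[k]) / L / w_sum
                for k in range(self.d)]


def direct(values, i, L):
    """Reference check: evaluate the weighted sum directly over the window."""
    lo = max(0, i - L + 1)
    d = len(values[0])
    weights = [1 - (i - j) / L for j in range(lo, i + 1)]
    w_sum = sum(weights)
    return [sum(w * values[j][k] for w, j in zip(weights, range(lo, i + 1))) / w_sum
            for k in range(d)]
```

Each call to `update` touches one buffer slot and the two sums, so its cost scales with d and not with the number of tokens seen. One thing to watch in a sketch like this: sum_jv scales with the token index, so very long streams may need higher precision or periodic re-basing of the index. That is an observation about this example, not a claim about the original implementation.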

Why Hosting Companies Care

Fixed Memory = Higher Density

Cap per-user memory at ~L·d. Pick L=2,048 for ~64 MB per user or L=8,192 for ~256 MB per user.

Stable Latency = O(d) per token

No quadratic scan. Token 5 or token 500k costs the same — predictable SLOs.

No OOM Crashes

Long chats, agents, and logs can't blow up memory. Eliminate midnight pages.

Built-in Privacy

Auto-forget after L tokens: older tokens drop out of the buffer on their own. A good fit for privacy modes and compliance.

Edge Ready

Runs on an L4 GPU or a CPU with a tiny footprint. No custom kernels required.

Cost Math — Example 7B model, d=4096

Metric                         | Standard Attention (32k ctx) | Elliott L=2,048                | Elliott L=8,192
Memory per user                | ~8-12 GB (grows)             | ~64 MB (flat)                  | ~256 MB (flat)
Concurrent users per 80GB H100 | ~6-8                         | ~1,000+                        | ~250+
Best use case                  | short demos                  | edge VPS, chatbots, AU hosting | premium long-chat, code assistants
Latency at token 50k           | degrades                     | constant                       | constant

Illustrative numbers; actual figures depend on precision and implementation
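The arithmetic behind the table can be roughly reproduced as follows. This is a back-of-envelope sketch under assumptions that are not stated in the source: 8 bytes per buffer element (chosen only because it matches the quoted ~64 MB and ~256 MB), about 14 GB of model weights (7B parameters at 2 bytes each), and no other overhead. The function names are hypothetical, and MB and MiB are treated as interchangeable at this level of precision.

```python
def buffer_mib(L, d, bytes_per_element):
    """Size of an L x d buffer in MiB."""
    return L * d * bytes_per_element / 2**20


def users_per_gpu(gpu_gb, weights_gb, per_user_mb):
    """Whole users that fit in the memory left after loading the weights."""
    free_mb = (gpu_gb - weights_gb) * 1000
    return int(free_mb // per_user_mb)


# Assumed: d = 4096, 8 bytes per element (not stated in the source).
for L in (2048, 8192):
    print(L, buffer_mib(L, 4096, 8))      # 64.0 and 256.0

# Assumed: 80 GB card, ~14 GB of weights, nothing else resident.
print(users_per_gpu(80, 14, 64))          # 1031
print(users_per_gpu(80, 14, 256))         # 257
```

Under these assumptions the results line up with the "~1,000+" and "~250+" rows. A different precision or per-layer layout would shift them proportionally.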

Proven Novelty

[Figure: Kernel Census 2026 listing, showing Elliott Triangular as NEW]

Listed as "ELLIOTT TRIANGULAR – NEW" in Kernel Census 2026. Not equivalent to the standard Bartlett window.