Straight answers. No marketing fluff.
Provider prompt caching only applies to identical prompt prefixes, and cached entries expire after a short time window. MEMStorage works across sessions, across users, and uses semantic matching, not exact matching.
If someone asks the same question in a different way, we still catch it. That's the layer OpenAI and Anthropic don't cover — and the fact that they price cached inputs dramatically lower tells you they know a huge share of usage is repeated queries.
We use a three-tier scoring system. A score of 75% or higher is a memory hit, returned instantly with no inference cost. A score from 28% up to 75% triggers a lightweight 20-token confirmation call. A score below 28% is a novel query and goes to full inference.
The thresholds are configurable per enterprise client based on their risk tolerance for false positives.
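As a rough illustration of the three tiers and their configurable thresholds, here is a minimal Python sketch. The names `Tier`, `Thresholds`, and `route` are hypothetical, and the score is assumed to be normalized to the range 0 to 1; this is not the production implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    MEMORY_HIT = "memory_hit"          # serve the stored answer directly
    CONFIRM = "confirm"                # lightweight confirmation call first
    FULL_INFERENCE = "full_inference"  # novel query, full model call


@dataclass(frozen=True)
class Thresholds:
    # Defaults mirror the 75% and 28% figures quoted above (assumed scale: 0..1).
    hit: float = 0.75
    confirm: float = 0.28

    def __post_init__(self) -> None:
        if not 0.0 <= self.confirm < self.hit <= 1.0:
            raise ValueError("expected 0 <= confirm < hit <= 1")


def route(score: float, thresholds: Thresholds = Thresholds()) -> Tier:
    """Map a match score in [0, 1] to one of the three tiers."""
    if score >= thresholds.hit:
        return Tier.MEMORY_HIT
    if score >= thresholds.confirm:
        return Tier.CONFIRM
    return Tier.FULL_INFERENCE
```

A client with low tolerance for false positives could pass something like `Thresholds(hit=0.90, confirm=0.50)`, which sends more traffic to the confirmation and full inference tiers.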
The confirmation tier exists exactly for this. When we're uncertain, we don't just serve memory — we validate it with a minimal model call first.
Enterprises can also set their own confidence thresholds and flag certain query types for mandatory full inference regardless of match score.
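One way to picture the mandatory full inference rule is a policy check that runs before the match score is even considered. The sketch below is illustrative only: `ClientPolicy`, its field names, and the query type labels are assumptions, not the actual configuration surface.

```python
from dataclasses import dataclass, field


@dataclass
class ClientPolicy:
    # Hypothetical per-client settings; names are illustrative.
    hit_threshold: float = 0.75
    forced_inference_types: set[str] = field(default_factory=set)


def may_serve_from_memory(score: float, query_type: str, policy: ClientPolicy) -> bool:
    """Return True only if a stored answer may be served without a model call."""
    if query_type in policy.forced_inference_types:
        return False  # flagged types always go to full inference
    return score >= policy.hit_threshold
```

With `forced_inference_types={"medical"}`, a medical query goes to full inference even at a 99% match, while other query types are still served from memory above the threshold.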
They can. Some do. But building it means solving semantic embeddings, vector storage, threshold tuning, cross-session persistence, security isolation between users, and ongoing maintenance.
We've done all of that. It's live today. And it's patent pending: our specific approach to three-tier routing is covered by the filed application.
We only store answers that came from successful full inference calls. We don't store uncertain or flagged responses.
Enterprises can also set expiry windows on stored memories so stale answers get refreshed automatically.
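The storage rule and expiry window described above could look roughly like the following sketch. `MemoryStore`, its method names, and the string keys are hypothetical stand-ins (a real system would key on embeddings), and the `now` parameter exists only to make the example testable.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    answer: str
    stored_at: float  # seconds since the epoch


class MemoryStore:
    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds  # client-configured expiry window
        self._items: dict[str, Memory] = {}

    def maybe_store(self, key: str, answer: str, *, from_full_inference: bool,
                    flagged: bool, now: Optional[float] = None) -> bool:
        """Store only answers from successful, unflagged full inference calls."""
        if not from_full_inference or flagged:
            return False
        self._items[key] = Memory(answer, time.time() if now is None else now)
        return True

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        """Return a stored answer, or None if it is missing or expired."""
        now = time.time() if now is None else now
        memory = self._items.get(key)
        if memory is None:
            return None
        if now - memory.stored_at > self.ttl_seconds:
            del self._items[key]  # stale: caller falls back to full inference
            return None
        return memory.answer
```

When `get` returns `None` for an expired entry, the next full inference call produces a fresh answer that can be stored again, which is how a stale memory would be refreshed in this sketch.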
Each enterprise client has isolated memory storage. No cross-client data sharing. Queries are stored as vector embeddings, not raw text.
We can also deploy on-premise for clients who require it.
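To make the isolation model concrete, here is a small sketch in which each client has its own collection of embeddings and lookups never cross client boundaries. It assumes cosine similarity as the match score and an in-memory list as the store; both are illustrative choices, and `IsolatedMemory` is a hypothetical name.

```python
import math
from collections import defaultdict
from typing import Optional

Vector = list[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class IsolatedMemory:
    def __init__(self) -> None:
        # One separate list of (query embedding, answer) pairs per client id.
        # Only the embedding of the query is kept, not its raw text.
        self._by_client: dict[str, list[tuple[Vector, str]]] = defaultdict(list)

    def add(self, client_id: str, embedding: Vector, answer: str) -> None:
        self._by_client[client_id].append((embedding, answer))

    def best_match(self, client_id: str, embedding: Vector) -> tuple[float, Optional[str]]:
        """Search only this client's memories; return (score, answer)."""
        best_score, best_answer = 0.0, None
        for stored, answer in self._by_client.get(client_id, []):
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_score, best_answer
```

Because `best_match` only reads the list for the given `client_id`, a query from one client can never match a memory stored for another.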
We filed the provisional patent application. A provisional establishes our priority date and gives us 12 months to file the nonprovisional.
The raise funds the nonprovisional filing among other milestones.
Enterprise: API layer for companies spending $30K+ per month on inference. Direct outreach, 30-day pilots, money-back if no measurable ROI.
Consumer: Chrome extension for individuals using ChatGPT or Claude. Bottom-up adoption that creates enterprise pull when teams start using it.
The capital deploys against four pillars: engineering (vector embeddings, Chrome extension, persistent infrastructure), go-to-market (enterprise pilots, beta launch, first sales hire), IP & legal (nonprovisional patent filing, trademark, corporate counsel), and operations (infrastructure, tooling, runway).
Round target: 3–5 paying enterprise pilots, 1,000 beta users, and Chrome Web Store launch.
Raise size and terms shared privately on request — patrick@memstorage.com.
Once we have 3–5 signed enterprise pilots and 1,000 beta users, we raise a Series A to build the full team and expand the enterprise sales motion.
The memory flywheel compounds with scale — the unit economics get better the more customers we have.
Talk to Patrick directly or see the benchmark data yourself.