Show HN: USST – A protocol to reduce LLM context redundancy by 98.5%

gist.github.com

6 months ago

I’ve been working on a primitive called User-Segmented Session Tokens (USST).

The Problem: Currently, if a teacher (or lead dev) wants 50 students (or junior devs) to use an LLM with a specific, deep context (e.g., a 50-page curriculum or a complex repo), all 50 users have to re-upload and re-tokenize that context. It’s redundant, expensive, and forces everyone to have a high-tier subscription.

The Solution: USST allows a "Sponsor" (authenticated, paid account) to run a Deep Research session once and mint a signed Context Token. Downstream users (anonymous/free tier) pass this token in their prompt. The provider loads the pre-computed KV cache/context state without re-processing the original tokens.
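To make the mint-and-redeem flow concrete, here is a minimal sketch of what a signed Context Token could look like. It is an illustration under stated assumptions, not the format in the spec: the HMAC-SHA256 signing, the claim names (`sponsor`, `context_sha256`, `expires_at`), the `PROVIDER_KEY` constant, and both function names are hypothetical. The provider is assumed to hold the signing key and to use the context hash to look up the pre-computed state.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical provider-side secret; a real deployment would not hard-code this.
PROVIDER_KEY = b"demo-secret-not-for-production"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def mint_context_token(sponsor_id, context_text, ttl_seconds=3600, now=None):
    """Sign a small claims object that points at the Sponsor's processed context."""
    now = time.time() if now is None else now
    claims = {
        "sponsor": sponsor_id,
        # The hash identifies the cached context without embedding the context itself.
        "context_sha256": hashlib.sha256(context_text.encode("utf-8")).hexdigest(),
        "expires_at": int(now) + ttl_seconds,
    }
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(PROVIDER_KEY, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def verify_context_token(token, now=None):
    """Return the claims if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        # Covers malformed structure and invalid base64.
        return None
    expected = hmac.new(PROVIDER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    if claims["expires_at"] <= now:
        return None
    return claims
```

A downstream user would send only the token string; the provider would call something like `verify_context_token` and, on success, attach the cached state keyed by `context_sha256`. The expiry claim is one simple lever against the abuse vectors mentioned below, since a leaked token stops working after its TTL.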

Decouples payment from utility: Sponsor pays the heavy compute; Users pay the inference.

Privacy: Users don't need the Sponsor's credentials, just the token.

Efficiency: Removes the "Linear Bleed" of context re-computation.
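The "Linear Bleed" can be illustrated with a back-of-the-envelope model. This is a simplified sketch, not the derivation of the headline figure: it counts only the tokens spent processing the shared context, ignores each user's own question tokens and any cache storage cost, and the 25,000-token context size is a made-up number for the example.

```python
def context_tokens_processed(context_tokens: int, users: int, shared: bool) -> int:
    """Tokens spent processing the shared context across all users."""
    # With a shared token the context is processed once; otherwise once per user.
    return context_tokens if shared else context_tokens * users


def reduction_fraction(context_tokens: int, users: int) -> float:
    """Fraction of context processing avoided by sharing the pre-computed state."""
    without = context_tokens_processed(context_tokens, users, shared=False)
    with_sharing = context_tokens_processed(context_tokens, users, shared=True)
    return (without - with_sharing) / without


if __name__ == "__main__":
    # Hypothetical classroom: 50 users, 25,000-token curriculum.
    print(context_tokens_processed(25_000, 50, shared=False))
    print(context_tokens_processed(25_000, 50, shared=True))
    print(reduction_fraction(25_000, 50))
```

In this model the saving is 1 - 1/N for N users, so it grows with the size of the group and does not depend on the context length.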

I wrote up the full architecture and the "why" here: https://medium.com/@madhusudan.gopanna/the-8-6-billion-oppor...

The Protocol Spec / Repo is the main link above.

Would love feedback on the abuse vectors and how this fits with current provider caching (like Anthropic’s prompt caching).
