Show HN: A context-aware semantic cache for reducing LLM app latency and cost | Heykuki News