As developers focused on iterating and shipping features as fast as possible, we wanted observability, but the existing solutions took too much effort to set up. So we challenged ourselves to build an observability tool that you can integrate without changing a single line of code in your project.
How it works
Run your Python application as usual with the same entry point, just with `subl` in front (e.g. `subl python xyz.py` or `subl sh xyz.sh`). Every Python subprocess spawned this way automatically patches the clients for OpenAI, Anthropic, etc., so that LLM input/output traces are logged locally. We also provide a local dashboard where you can view these traces and run some simple evals.
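To give a rough feel for what "patching a client" means, here is a minimal sketch of the general monkey-patching technique. It is not sublingual's actual implementation: `FakeChatClient`, `patch_method`, and the in-memory `TRACES` list are hypothetical names invented for illustration, and a real patch would target the SDK's own client classes and write to a local store.

```python
import functools


class FakeChatClient:
    """Hypothetical stand-in for an LLM SDK client (not a real library class)."""

    def create(self, prompt: str) -> str:
        return f"echo: {prompt}"


TRACES = []  # hypothetical in-memory stand-in for a local trace store


def patch_method(cls, name, sink):
    """Replace cls.<name> with a wrapper that records each call's input and output."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)  # the real call always runs first
        try:
            sink.append(
                {
                    "call": f"{cls.__name__}.{name}",
                    "args": args,
                    "kwargs": kwargs,
                    "output": result,
                }
            )
        except Exception:
            pass  # a logging problem should never break the underlying call
        return result

    setattr(cls, name, wrapper)


# The application code below is unchanged; only the class was patched.
patch_method(FakeChatClient, "create", TRACES)
reply = FakeChatClient().create("What is 1 + 2?")
```

The calling code never imports or references the logger, which is the property that makes a zero-code-change integration possible.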
Designing around being maximally unintrusive has, surprisingly, made a few unique features possible:
- Automatically detecting prompt templates. If you construct your prompt with a format string like `f"What is {a} + {b}?"`, we use a combination of AST analysis and runtime stack frames to log the prompt template string as well.
- Tracking server requests. If you’re serving your application with Flask, Django, or FastAPI, we hook into those server libraries to keep track of all the LLM calls made within a single request.
- Disentangling serving and inference logic. A failure in the logging server will not cause your LLM inference call to fail.
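The prompt template detection above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not sublingual's real code: `USER_SOURCE`, `SOURCES`, `template_of`, and `fake_llm_call` are hypothetical names, the user's code is held in a string so the example is self-contained, and matching is done by re-evaluating each f-string in the caller's frame and comparing it with the prompt that was actually sent.

```python
import ast
import inspect

# Hypothetical user code, kept as a string so the sketch is self-contained.
USER_SOURCE = '''
a, b = 1, 2
prompt = f"What is {a} + {b}?"
fake_llm_call(prompt)
'''

SOURCES = {"<user>": USER_SOURCE}  # filename -> source text (assumed lookup table)
captured = []  # templates detected at call time


def template_of(node: ast.JoinedStr) -> str:
    """Rebuild an f-string's template, keeping {expr} placeholders."""
    parts = []
    for value in node.values:
        if isinstance(value, ast.Constant):
            parts.append(str(value.value))
        else:  # ast.FormattedValue: keep the expression text, not its value
            parts.append("{" + ast.unparse(value.value) + "}")
    return "".join(parts)


def fake_llm_call(prompt: str) -> None:
    """Stand-in for a patched client call that inspects its caller."""
    caller = inspect.currentframe().f_back  # runtime stack frame of the user code
    tree = ast.parse(SOURCES[caller.f_code.co_filename])
    for node in ast.walk(tree):
        # Only consider f-strings that appear at or before the calling line.
        if not isinstance(node, ast.JoinedStr) or node.lineno > caller.f_lineno:
            continue
        try:
            code = compile(ast.Expression(body=node), "<template>", "eval")
            rendered = eval(code, caller.f_globals, caller.f_locals)
        except Exception:
            continue  # variables may be missing or changed; skip this candidate
        if rendered == prompt:
            captured.append(template_of(node))


# Run the user code under the filename that SOURCES knows about.
exec(compile(USER_SOURCE, "<user>", "exec"), {"fake_llm_call": fake_llm_call})
```

After this runs, `captured` holds the template with its placeholders intact rather than the rendered prompt. A production version would also need to handle prompts built across several functions, `str.format` calls, and variables reassigned between template construction and the LLM call.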
We’re excited to share sublingual with everyone. There’s still a lot more work to be done, but we would love to hear whether this ease-of-integration-first approach is helpful.