For my multi-threaded app, I needed an inline profiler.
The open-source offerings I found show wall-clock time only.
So I decided to write one that shows you how often a task got pre-empted by the scheduler. In just 200 lines of C code:
https://github.com/stolk/ThreadTracer
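
For anyone wondering where a pre-emption count can come from: on Linux, getrusage() with RUSAGE_THREAD reports per-thread context-switch counters, where ru_nivcsw counts involuntary switches (the scheduler took the CPU away) and ru_nvcsw counts voluntary ones (the thread blocked or yielded). Below is a minimal sketch of sampling those counters around a task. It assumes Linux with glibc, and the names switch_counts_t, sample_switches, measure_task and busy_task are hypothetical helpers made up for illustration, not ThreadTracer's API. See the repository for how the profiler itself is implemented.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for RUSAGE_THREAD on glibc (Linux-specific) */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical type for illustration; not part of ThreadTracer. */
typedef struct {
    long preempted; /* involuntary context switches (ru_nivcsw) */
    long yielded;   /* voluntary context switches (ru_nvcsw) */
} switch_counts_t;

/* Read the calling thread's context-switch counters.
 * Returns 0 on success, -1 if getrusage() fails. */
static int sample_switches(switch_counts_t *out)
{
    struct rusage ru;
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;
    out->preempted = ru.ru_nivcsw;
    out->yielded = ru.ru_nvcsw;
    return 0;
}

/* Run task() on the calling thread and store in *delta how many
 * context switches of each kind happened while it ran. */
static int measure_task(void (*task)(void), switch_counts_t *delta)
{
    switch_counts_t before, after;
    assert(task != NULL && delta != NULL);
    if (sample_switches(&before) != 0)
        return -1;
    task();
    if (sample_switches(&after) != 0)
        return -1;
    delta->preempted = after.preempted - before.preempted;
    delta->yielded = after.yielded - before.yielded;
    return 0;
}

/* Stand-in workload: a CPU-bound loop that never blocks on its own. */
static volatile unsigned long sink;
static void busy_task(void)
{
    for (unsigned long i = 0; i < 50000000UL; ++i)
        sink += i;
}
```

Calling measure_task(busy_task, &d) from any thread fills d with the two deltas for that thread. A CPU-bound task with a high preempted count is competing for cores, which is exactly what a wall-clock-only profiler cannot tell you. The actual numbers depend on machine load, so they will differ from run to run.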