A New Approach to Performance Debugging
Trace arbitrary compiled binaries and functions on Linux, at runtime with 0 modifications.
Low overhead dynamic instrumentation
Wachy uses the magic of eBPF to dynamically instrument binaries with minimal overhead. This also means
there is 0 overhead for untraced functions.
Deep code integration
eBPF on its own can be difficult and time-consuming to use. The goal of wachy is to make userspace
eBPF tracing 10-100x faster and easier by connecting it back to your source code.
Understand real latencies
Stack sampling profilers only provide part of the picture as they usually show the proportion of
active CPU cycles.
With wachy, you get accurate function latencies including time spent in common blocking calls like
waiting on network, IO or mutexes.
It can also gather latency histograms.
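To see why a CPU-centric view can miss blocking time, here is a minimal Python sketch that times the same call two ways. It is not wachy and not eBPF, just an illustration of wall-clock versus on-CPU time; `blocking_call` is a hypothetical stand-in for a network or mutex wait.

```python
import time

def blocking_call(seconds=0.05):
    # Hypothetical stand-in for waiting on network, IO or a mutex:
    # the thread is asleep, so it uses almost no CPU.
    time.sleep(seconds)

def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent in one call to fn."""
    wall_start = time.perf_counter()   # wall-clock: includes time spent blocked
    cpu_start = time.process_time()    # CPU time consumed by this process
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

wall, cpu = measure(blocking_call)
print(f"wall-clock: {wall * 1e3:.1f} ms, on-CPU: {cpu * 1e3:.1f} ms")
```

The wall-clock figure covers the whole wait, while the on-CPU figure stays close to zero. A profiler that only sees active CPU cycles would barely notice this call.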
Powerful runtime filtering
Add filters for conditions you want to trace. At runtime with no code changes. eBPF truly is magic.
Watch a 3-minute demo of wachy:
This is the best way to understand wachy. But if you prefer reading,
look at the guide.
What is it good for?
Frequently executed functions
The wachy interface displays the average latency of each tracepoint, or a latency histogram.
If you have something like an RPC or web server with frequent requests, this works great for
understanding latency, down to the level of individual functions.
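An average can hide a lot, which is where a histogram helps. The sketch below buckets latencies into power-of-two ranges, a layout commonly seen in eBPF tooling. Whether wachy buckets exactly this way is an assumption, and the sample values are made up.

```python
from collections import Counter

def log2_bucket(latency_us):
    """Return the [low, high) power-of-two bucket containing latency_us."""
    if latency_us < 1:
        return (0, 1)
    low = 1 << (int(latency_us).bit_length() - 1)
    return (low, low * 2)

def histogram(latencies_us):
    """Count samples per bucket, ordered from fastest to slowest bucket."""
    counts = Counter(log2_bucket(v) for v in latencies_us)
    return dict(sorted(counts.items()))

samples_us = [3, 5, 6, 7, 9, 12, 130, 900]  # made-up latencies in microseconds
hist = histogram(samples_us)
mean = sum(samples_us) / len(samples_us)

print(f"mean: {mean:.1f} us")
for (low, high), count in hist.items():
    print(f"[{low:>4}, {high:>4}) {'@' * count}")
```

With these samples the mean is 134 us, yet six of the eight calls finished in under 16 us. The histogram shows the two slow outliers that the average smears over.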
Interactive debugging with filtering
Wachy maintains a stack of functions being traced, which lends itself well to iterative exploration.
You can also specify custom filters.
Want to only see the latency of function B called from function A where A's first argument is 0? No problem.
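Conceptually, a filter like that boils down to tracking function entries and exits and checking the caller when the callee returns. Here is a rough Python model of the idea over a made-up event stream. The event format and `filtered_latencies` are hypothetical and say nothing about how wachy implements filters internally.

```python
def filtered_latencies(events, callee="B", caller="A", first_arg=0):
    """Collect latencies of `callee` when it is called directly from `caller`
    and `caller`'s first argument equals `first_arg`."""
    stack = []      # frames currently executing: (function, entry_ts, args)
    latencies = []
    for kind, func, ts, args in events:
        if kind == "entry":
            stack.append((func, ts, args))
            continue
        # kind == "exit": close the innermost frame
        name, entry_ts, _ = stack.pop()
        if name != func:
            raise ValueError("unbalanced entry/exit events")
        parent = stack[-1] if stack else None
        if (func == callee and parent is not None
                and parent[0] == caller
                and parent[2] and parent[2][0] == first_arg):
            latencies.append(ts - entry_ts)
    return latencies

# (kind, function, timestamp in microseconds, arguments) -- invented data
events = [
    ("entry", "A", 0, (0,)),
    ("entry", "B", 10, ()),
    ("exit", "B", 40, ()),
    ("exit", "A", 50, ()),
    ("entry", "A", 100, (1,)),   # first argument is 1: filtered out
    ("entry", "B", 110, ()),
    ("exit", "B", 190, ()),
    ("exit", "A", 200, ()),
    ("entry", "B", 300, ()),     # B called outside A: filtered out
    ("exit", "B", 305, ()),
]
print(filtered_latencies(events))
```

Only the first call to B passes both conditions, so a single latency of 30 us is reported.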
Understanding tail latencies
Wachy allows specifying runtime filters to understand program behavior under various conditions.
For example, where is the time spent inside a function when it takes longer than 100ms to execute?
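One subtlety with a duration filter is that you only know a call was slow once it has returned, so timings of nested calls have to be held back until then. The following Python sketch models that on made-up data. The function names `parse` and `query_db` and the record format are purely illustrative assumptions.

```python
from collections import defaultdict

def slow_call_breakdown(calls, threshold_ms=100):
    """Average child latency, using only parent calls slower than threshold_ms.

    `calls` is a list of (parent_duration_ms, {child_name: child_duration_ms}).
    """
    totals = defaultdict(float)
    slow_calls = 0
    for parent_ms, children in calls:
        # The parent's duration is only known once it returns, so child
        # timings are kept per call and counted only if the call was slow.
        if parent_ms > threshold_ms:
            slow_calls += 1
            for child, child_ms in children.items():
                totals[child] += child_ms
    if slow_calls == 0:
        return {}
    return {child: total / slow_calls for child, total in totals.items()}

calls = [  # invented measurements in milliseconds
    (12, {"parse": 2, "query_db": 8}),
    (15, {"parse": 3, "query_db": 10}),
    (140, {"parse": 2, "query_db": 130}),
    (260, {"parse": 4, "query_db": 250}),
]
print(slow_call_breakdown(calls))
```

In this toy data the slow calls spend almost all of their time in `query_db`, which is the kind of answer a tail-latency filter is meant to surface.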
Debugging in production
There's often some performance issue which only occurs in production.
And yeah, sure, you follow all the best practices, but sometimes the best way to debug it is just to get
in there and look at what's happening live.
eBPF guarantees that any tracing you do is completely safe (I'm looking at you, gdb) with the only
side effect being minor tracing overhead.
Wachy's TUI is designed with this use case in mind – there's no need to forward ports, all you need is
an SSH connection to the machine you want to debug on.
What is it not good for?
Debugging on arbitrary platforms
Necessary eBPF features are only available on Linux 4.6 or later kernels.
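If you want a quick sanity check before trying it on a machine, something like this rough Python sketch compares the running kernel against the 4.6 minimum. It only looks at the version string, so treat it as a first approximation: distribution kernels may differ in which features are enabled, and the helper names are my own.

```python
import platform
import re

MIN_KERNEL = (4, 6)  # minimum version stated above

def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def kernel_is_supported(release, system="Linux"):
    """True only on Linux with a kernel version of at least MIN_KERNEL."""
    return system == "Linux" and parse_kernel_release(release) >= MIN_KERNEL

ok = kernel_is_supported(platform.release(), platform.system())
print("kernel looks new enough" if ok else "kernel too old or not Linux")
```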
Debugging arbitrary languages
Wachy relies on eBPF uprobes and debugging symbols, which only work for compiled languages.
C++ symbol demangling for displaying human-readable function names is also supported.
Debugging extremely latency-sensitive functions
While eBPF overhead is fairly low, it is not zero: in my measurements, about 3μs per traced function call.
For functions that take less time than that and are frequently called, this may be unacceptable and
wachy's precision will not be good enough.
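To get a feel for when that overhead matters, here is some back-of-the-envelope arithmetic in Python. It assumes the roughly 3μs figure is a constant cost paid on every traced call and that all of it is CPU time on one core, which is a simplification.

```python
TRACE_OVERHEAD_US = 3.0  # assumed constant cost per traced call

def relative_overhead(function_latency_us):
    """Tracing overhead as a fraction of the function's own latency."""
    return TRACE_OVERHEAD_US / function_latency_us

def cpu_fraction(calls_per_second):
    """Fraction of one CPU core spent on tracing at a given call rate."""
    return calls_per_second * TRACE_OVERHEAD_US / 1_000_000

for latency_us in (1, 30, 1_000):
    print(f"{latency_us:>5} us function: +{relative_overhead(latency_us):.1%}")
for rate in (1_000, 100_000):
    print(f"{rate:>7} calls/s: {cpu_fraction(rate):.1%} of one core")
```

Under these assumptions a 1 ms function pays about 0.3% extra, while a 1 us function would take roughly four times as long when traced. At 100,000 traced calls per second, tracing alone would use about 30% of a core.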