The DSP Lens on Real-World Events, Part 3: Building an Honest Instrument
Introduction: Applying the Framework
In Part 1, we introduced the DSP lens and the problem of aliasing. In Part 2, we built a detailed theoretical framework, our "Rosetta Stone," connecting the concepts of sample rates, filters, and critical frequencies.
Now, in this final part, we apply that framework to a real-world scenario. We will use the specific parameters of the JitterTrap tool to perform a rigorous quantitative analysis of the system's imperfections. This will allow us to determine, with numerical confidence, exactly what we can and cannot trust from our measurements, and ultimately, how to build a more honest instrument.
The Messy Signal: Non-Stationarity and Self-Similarity
Before we cite the formal research, it's important to build an intuition for the kind of "messiness" real-world signals exhibit. Two key concepts are non-stationarity and self-similarity.
- Non-Stationary: A signal is non-stationary if its fundamental statistical properties—like the mean (average value) and variance (spikiness)—change over time. Think of the audio signal of a quiet conversation that is suddenly interrupted by a loud shout. The statistics of the "shout" section are completely different from the "quiet" section.
- Self-Similar: This is a fractal-like property where a signal has "bursts within bursts." If you zoom in on a period of high activity, the zoomed-in view reveals its own complex pattern of smaller bursts and gaps, looking qualitatively similar to the wider view.
Self-similar signals are, by their nature, non-stationary. The diagram below illustrates both concepts. The "Wide View" is non-stationary because the quiet periods have a very different mean and variance from the central burst. The "Zoomed-In View" shows that the burst itself is not a solid block of activity, but has a complex, self-similar structure.
The clean, predictable sine waves of DSP textbooks rarely exist in the wild. Foundational research by Leland et al. (1994) revealed that network traffic is self-similar, meaning it exhibits "burstiness" across a vast range of time scales. This property makes the signal non-stationary: its core statistical properties, like its mean and variance, are not stable over time. This is true of many real-world systems, from stock market data to social media activity. This non-stationarity challenges the assumptions of many classical DSP techniques and warns us to be humble: we are analyzing an artificial signal created by our filter, and its relationship to the underlying chaotic reality must be carefully considered.
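To make "non-stationary" concrete, here is a minimal sketch (assuming NumPy is available; the signal is synthetic, not real traffic) that builds a quiet series with a burst in the middle and prints per-window statistics. For a stationary signal the mean and variance would be roughly constant from window to window; here they swing wildly.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Synthetic "bursty" signal: mostly quiet, with a dense burst in the middle.
# This is an illustration only, not a model of real Ethernet traffic.
n = 10_000
signal = rng.poisson(lam=0.1, size=n).astype(float)      # quiet background
signal[4000:6000] += rng.poisson(lam=20.0, size=2000)     # a burst of activity

# Per-window statistics: a stationary signal would give similar numbers
# for every window; our bursty signal does not.
window = 1000
for start in range(0, n, window):
    chunk = signal[start:start + window]
    print(f"samples {start:5d}-{start + window - 1:5d}: "
          f"mean={chunk.mean():7.2f}  var={chunk.var():9.2f}")
```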
The Imperfect Tool: Quantifying Jitter
Just as our signals are messy, our tools are imperfect. JitterTrap runs on a general-purpose Linux system, where the operating system scheduler introduces small, random variations in timing. This scheduler jitter means our samples are not taken at perfectly regular intervals. While it's tempting to see this as a helpful "dither," the overwhelming consensus in modern signal processing is that jitter is a source of noise that degrades the signal-to-noise ratio (Analog Devices, 2009). It is a flaw to be quantified.
A critical question, now that we have the insight to ask it, is whether to treat our signal as a simple tone or something more complex. Because network traffic is self-similar and bursty, it is more accurately a broadband signal, containing energy across a wide spectrum of frequencies. The primary effect of jitter on a broadband signal is to raise the overall noise floor, which can obscure faint signals. While this is a real and important effect, for our pragmatic goal of observing human-perceptible events on a millisecond timescale, the worst-case amplitude error on specific transients (our sine-wave model) remains the more immediate and practical concern to bound.
Let's assume a worst-case, but realistic, scheduler jitter of ±100 microseconds (µs). What is its impact? The error depends entirely on what we are trying to measure.
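Whether ±100 µs is realistic for a particular machine is itself measurable. The sketch below is a minimal, standard-library-only Python experiment (not JitterTrap code): it sleeps toward a nominal 1 ms deadline in a loop and records how far each wake-up lands from the ideal time. The spread of those deviations is the scheduler jitter we are assuming.

```python
import time
import statistics

PERIOD_S = 0.001      # nominal 1 ms sampling period
N_SAMPLES = 2000

deadline = time.monotonic()
deviations_us = []

for _ in range(N_SAMPLES):
    deadline += PERIOD_S
    # Sleep until the next deadline; the scheduler decides when we actually wake.
    remaining = deadline - time.monotonic()
    if remaining > 0:
        time.sleep(remaining)
    # How late (or early) did we actually wake up, in microseconds?
    deviations_us.append((time.monotonic() - deadline) * 1e6)

print(f"mean deviation : {statistics.mean(deviations_us):8.1f} µs")
print(f"std deviation  : {statistics.stdev(deviations_us):8.1f} µs")
print(f"worst case     : {max(deviations_us, key=abs):8.1f} µs")
```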
The Path to Trust, Step 1: Narrow the Question
An overly broad goal like "measure throughput" is a recipe for confusion. A much better approach is to ask a more specific question. For JitterTrap, I pivoted from trying to precisely measure the magnitude of traffic spikes to a different goal: accurately measuring the duration of periods of zero traffic. This shift from measuring spikes to measuring silence is the key to building a trustworthy instrument, because the system's imperfections affect these two questions very differently.
The Path to Trust, Step 2: A Tale of Two Uncertainties
With a precise question in hand, we can now perform a quantitative analysis.
- Analyzing Spike Magnitude: For a fast-changing signal, the amplitude error caused by jitter is proportional to the signal's rate of change (slew rate). We can use a sine wave for a worst-case analysis. For a 200 Hz signal (representing a 5 ms event), the maximum slew rate is A * 2 * π * 200, where A is the peak amplitude. The jitter-induced error is this slew rate multiplied by the jitter time: Error ≈ (A * 1256.6) * 100 µs ≈ A * 0.126.
This calculation reveals a potential measurement error of up to 12.6% of the spike's peak amplitude.
Show the work: Calculating Jitter-Induced Amplitude Error
The error is a function of the signal's maximum rate of change (slew rate) and the timing jitter.
- Signal Model: We model the 200 Hz signal as a sine wave: V(t) = A * sin(2 * π * f * t), where A is the peak amplitude and f is the frequency (200 Hz).
- Slew Rate: The rate of change is the first derivative: dV/dt = A * 2 * π * f * cos(2 * π * f * t).
- Maximum Slew Rate: The rate of change is at its maximum when cos(...) = 1, so Max Slew Rate = A * 2 * π * f.
- Error Calculation: The amplitude error is the maximum slew rate multiplied by the jitter time (100 µs):
Error ≈ (A * 2 * π * 200) * 100e-6 s ≈ A * 1256.6 * 100e-6 ≈ A * 0.12566
This is approximately 12.6% of the peak amplitude A.
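The arithmetic is easy to check numerically. This minimal Python sketch uses the 200 Hz and ±100 µs figures from the text, evaluates the worst-case slew-rate bound, and, as a sanity check, computes the actual error of a unit-amplitude sine sampled 100 µs away from its steepest point.

```python
import math

A = 1.0            # peak amplitude (normalised)
f = 200.0          # signal frequency in Hz (a ~5 ms event)
jitter = 100e-6    # worst-case scheduler jitter in seconds

# Worst-case bound: error ≈ max slew rate * jitter
max_slew = A * 2 * math.pi * f          # ≈ 1256.6 * A per second
bound = max_slew * jitter               # ≈ 0.126 * A

# Sanity check: sample a unit sine 100 µs away from its zero crossing,
# where the slew rate (and therefore the error) is greatest.
true_value = A * math.sin(2 * math.pi * f * 0.0)
jittered = A * math.sin(2 * math.pi * f * jitter)
actual_err = abs(jittered - true_value)

print(f"worst-case bound : {bound:.4f} * A  ({bound * 100:.1f}% of peak)")
print(f"measured example : {actual_err:.4f} * A")
```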
The conclusion is clear: we cannot trust the magnitude of short-lived spikes.
- Analyzing Gap Duration: Now consider our new question: measuring a gap of silence. When the signal is a flat line at zero, its slew rate is zero, so jitter has no effect on the measured value. The error shifts to the boundaries of the measurement. The ±100 µs jitter creates an uncertainty window at the start and end of the gap, so the total uncertainty on the gap's duration is ±2 * jitter_amount. For a measured 10 ms gap, a ±200 µs uncertainty is a modest 2% error.
Show the work: Deriving Gap Duration Uncertainty
The total uncertainty is the sum of the uncertainties at the start and end boundaries of the gap.
- Start Boundary: The gap begins on the first sample that measures zero. Due to jitter, this sample's actual time has an uncertainty of ±jitter_amount.
- End Boundary: The gap ends on the first sample that measures non-zero after the silence. This sample's time also has an uncertainty of ±jitter_amount.
- Total Uncertainty: The worst-case uncertainty for the total duration is the sum of the magnitudes of the two boundary uncertainties: jitter_amount + jitter_amount = 2 * jitter_amount.
This contrast is the entire payoff of our analysis. We have quantitatively demonstrated that the same instrument, with the same flaws, is untrustworthy for one question but highly reliable for another.
Conclusion: Building an Honest Instrument
The DSP lens doesn't just show us the ghosts in our data; it gives us a framework to perform the rigorous analysis needed to build confidence. By understanding our filters, acknowledging the non-stationary nature of our signals, and—most importantly—quantifying the impact of our tool's imperfections relative to a specific, narrow question, we can transform a simple data viewer into a trustworthy scientific instrument.
The ultimate goal is not a "perfect" instrument, but an honest one that can report its own uncertainty. A final result of "Gap Duration: 42.1ms (±0.2ms, worst-case)" is infinitely more valuable than a simple "42ms." To provide a statistical confidence interval (e.g., "at 99% confidence"), we would first need to measure and characterize the probability distribution of the scheduler jitter—a valuable next step for a more advanced instrument. This represents a hard-won, quantitative understanding of the boundary between what we can know and what we can only estimate.
The Path Forward
This series provides a framework for understanding the limits of a simple measurement tool. The logical next steps on this journey involve using more advanced techniques to mitigate the issues we've identified. Future work could explore:
- Better Anti-Aliasing Filters: Replacing the simple boxcar filter with a windowing function (e.g., Hann or Hamming) to dramatically reduce sidelobe leakage.
- Techniques for Irregular Sampling: Instead of ignoring jitter, one can embrace it by timestamping every sample and using algorithms designed for non-uniform data, like the Lomb-Scargle periodogram (see the sketch after this list).
- Analysis for Self-Similar Signals: For a more statistically rigorous analysis of non-stationary traffic, one could move beyond the frequency domain entirely and explore methods like wavelet analysis or calculating the Hurst exponent.
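As a pointer for the irregular-sampling item above, here is a minimal sketch (assuming SciPy and NumPy are available, with a synthetic 200 Hz test tone rather than real traffic) of the Lomb-Scargle approach: keep the jittered timestamps and estimate spectral content directly from the non-uniform samples instead of pretending the sampling was regular.

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(seed=2)

# Irregular sample times: nominal 1 ms period plus ±100 µs scheduler jitter.
n = 2000
t = np.arange(n) * 1e-3 + rng.uniform(-100e-6, 100e-6, size=n)

# A 200 Hz test tone sampled at those jittered times, plus some noise.
signal = np.sin(2 * np.pi * 200.0 * t) + 0.3 * rng.standard_normal(n)

# Lomb-Scargle works directly on non-uniform samples.
# Note: scipy's lombscargle expects *angular* frequencies (rad/s).
freqs_hz = np.linspace(1.0, 400.0, 2000)
power = lombscargle(t, signal, 2 * np.pi * freqs_hz)

print(f"Strongest component near {freqs_hz[np.argmax(power)]:.1f} Hz")
```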
JitterTrap is a useful tool within its known limits. The true goal of this analysis is not to achieve perfection, but to foster a deep understanding of the theory and practical techniques that allow for its continual, honest improvement.
References
- Leland, W. E., Taqqu, M. S., Willinger, W., & Wilson, D. V. (1994). On the self-similar nature of Ethernet traffic. IEEE/ACM Transactions on Networking, 2(1), 1–15.
- Smith, S. W. The Scientist and Engineer's Guide to Digital Signal Processing. Available: https://www.dspguide.com/
- Analog Devices. (2009). MT-007 Tutorial: Sampled Systems and the Effects of Clock Phase Jitter and Wander. Available: https://www.analog.com/media/en/training-seminars/tutorials/MT-007.pdf