Multi-scale persistence score for outlier flags
Source: R/mt_persistence_score.R
Annotates flagged fixes with a persistence score – the
number of temporal scales at which each flag's local geometry is
independently anomalous. The score is detector-agnostic: it
operates on any move2 object that carries an is_outlier
column, regardless of which primitive flagged the fixes
(mt_clean_track, mt_flag_outliers_bridge,
mt_flag_outliers_detour, mt_flag_outliers,
mt_flag_speed_cap, etc.).
Usage
mt_persistence_score(
  x,
  scales = c(2L, 4L, 8L),
  threshold = NULL,
  n_breaks = 20L,
  silent = FALSE
)
Arguments
- x
  A move2 object with an is_outlier logical column (single- or multi-track).
- scales
  Integer vector of validation scales \(k\). Each \(k\) must satisfy \(2 \le k \le \lfloor (n-1)/2 \rfloor\) on a track with \(n\) fixes (boundary fixes near the track ends cannot be evaluated at scale \(k\)). Default c(2, 4, 8).
- threshold
  Numeric. Gap-threshold parameter passed to the internal .gap_threshold_lower for per-scale flag detection on \(-\log(\text{prob})\). Default 3.0.
- n_breaks
  Integer. Number of bins per axis for the 2-D per-scale joint-probability histogram. Default 20L.
- silent
  Logical. If TRUE, suppress per-track summary messages. Default FALSE.
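The constraint on scales can be checked before calling the function. The sketch below uses a hypothetical helper, valid_scales(), that is not part of the package; it only applies the stated bound \(2 \le k \le \lfloor (n-1)/2 \rfloor\). Whether mt_persistence_score() itself errors, warns, or silently skips an out-of-range scale is not stated here, so treat this as a pre-check under that assumption.

```r
# Hypothetical helper (not a package function): keep only the scales that
# satisfy 2 <= k <= floor((n - 1) / 2) for a track with n fixes.
valid_scales <- function(scales, n) {
  k_max <- floor((n - 1) / 2)
  scales[scales >= 2 & scales <= k_max]
}

short_track <- valid_scales(c(2L, 4L, 8L), n = 12)  # k_max = 5, so 8 is dropped
long_track  <- valid_scales(c(2L, 4L, 8L), n = 17)  # k_max = 8, so all are kept
```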
Value
The input x with added columns:
- persistence_count
  Integer. The persistence score (1 = flagged only at native resolution; up to 1 + length(scales) = flagged at every validation scale). NA_integer_ for fixes that were not flagged in is_outlier (the score is defined only for candidates).
- persistence_at_scale_<k>
  One logical column per scale \(k\) in scales, indicating whether the flagged fix's scale-\(k\) geometry was anomalous against the scale-\(k\) reference distribution. NA for non-flagged fixes.
is_outlier itself is not modified. The function is
a pure annotator – users decide whether and how to filter.
Details
Mechanism. For each flagged fix \(i\) and each scale \(k\), the function computes a "scale-\(k\) view" of \(i\)'s local geometry: the step lengths from \(x_{i-k}\) to \(x_i\) and from \(x_i\) to \(x_{i+k}\), and the turn angle at \(i\) between those two long arms. The reference distribution at scale \(k\) is built from the same scale-\(k\) geometry computed at every interior fix of the track. A 2-D histogram of \((\log\text{step}_k, \text{turn}_k)\) estimates the joint density; the flagged fix's bin density gives its scale-\(k\) probability; a gap threshold on \(-\log(\text{prob})\) then determines whether the fix is anomalous at scale \(k\).
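The scale-\(k\) view and its histogram probability can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: coordinates are treated as planar, the two arm lengths are averaged to stand in for \(\text{step}_k\) (the text does not say how they are combined), and the gap-threshold step performed by the internal .gap_threshold_lower is not reproduced. The function names scale_k_geometry() and neg_log_prob() are hypothetical.

```r
# Scale-k geometry at every interior fix of a planar track.
scale_k_geometry <- function(x, y, k) {
  n <- length(x)
  i <- seq(k + 1, n - k)                 # fixes with a neighbour k steps on each side
  dx_in  <- x[i] - x[i - k]; dy_in  <- y[i] - y[i - k]
  dx_out <- x[i + k] - x[i]; dy_out <- y[i + k] - y[i]
  turn <- atan2(dy_out, dx_out) - atan2(dy_in, dx_in)
  data.frame(
    i        = i,
    step_in  = sqrt(dx_in^2 + dy_in^2),
    step_out = sqrt(dx_out^2 + dy_out^2),
    turn     = atan2(sin(turn), cos(turn))  # wrap to [-pi, pi]
  )
}

# -log of the 2-D histogram bin probability for each fix.
# Assumption: step_k is taken as the mean of the two arm lengths.
neg_log_prob <- function(geom, n_breaks = 20L) {
  log_step <- log((geom$step_in + geom$step_out) / 2)
  bx <- cut(log_step, breaks = n_breaks)
  by <- cut(geom$turn, breaks = n_breaks)
  counts <- unclass(table(bx, by))       # n_breaks x n_breaks count matrix
  prob <- counts / sum(counts)
  -log(prob[cbind(as.integer(bx), as.integer(by))])
}

# Simulated track with one injected single-fix spike at index 100.
set.seed(1)
n <- 200
x <- cumsum(rnorm(n, mean = 1, sd = 0.2))
y <- cumsum(rnorm(n, sd = 0.2))
y[100] <- y[100] + 50

geom <- scale_k_geometry(x, y, k = 4)
geom$nlp <- neg_log_prob(geom)
geom[geom$i == 100, ]                    # the spike falls in a sparsely populated bin
```

Real move2 tracks would need geodesic distances and bearings rather than the planar arithmetic used here.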
Crucially, the scale-\(k\) view is computed for every fix, not just for fixes that would be retained in a thinned-track grid. This dissolves the index-parity issue of conventional multi-scale voting (where some fixes can never be evaluated at coarser scales because the thinning grid skips them).
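The index-parity point can be seen with a few lines of base R. The sketch assumes a thinning grid anchored at the first fix, which is one common convention and not something this page specifies; the flagged indices are made up for illustration.

```r
n <- 20
k <- 4

thinned_grid <- seq(1, n, by = k)      # fixes kept by thinning: 1, 5, 9, 13, 17
every_fix    <- seq(k + 1, n - k)      # fixes with a full scale-k view: 5 to 16

flagged <- c(6, 9, 14)                 # hypothetical flagged fixes
on_grid <- flagged %in% thinned_grid   # only fix 9 lies on the thinning grid
in_view <- flagged %in% every_fix      # all three have a scale-k view
```

Under thinning, fixes 6 and 14 could never be evaluated at scale 4; with the every-fix view, only fixes within \(k\) of a track end are excluded.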
Persistence score. A flagged fix's persistence score is
$$p(i) = 1 + \sum_{k \in \text{scales}} \mathbb{1}[i \text{ flagged at scale } k],$$
where the constant 1 accounts for scale 1 (the original detector's
flag). With the default scales = c(2, 4, 8) the score
takes integer values from 1 (flagged only at native resolution)
to 4 (flagged at every scale tested). Higher scores indicate
anomalies whose geometric extent survives temporal coarsening.
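As a small worked example of the formula, the score can be recomputed from the per-scale columns. The data frame below is made up for illustration and mimics the documented output columns; how the function treats flagged fixes that cannot be evaluated at some scale (for example near track ends) is not covered here.

```r
# Toy annotation table with the documented column names (values are invented).
flags <- data.frame(
  is_outlier             = c(TRUE, TRUE, FALSE, TRUE),
  persistence_at_scale_2 = c(TRUE, FALSE, NA, TRUE),
  persistence_at_scale_4 = c(TRUE, FALSE, NA, TRUE),
  persistence_at_scale_8 = c(TRUE, FALSE, NA, FALSE)
)

scale_cols <- grep("^persistence_at_scale_", names(flags), value = TRUE)

# p(i) = 1 + number of scales at which fix i is flagged
score <- 1L + as.integer(rowSums(flags[scale_cols]))
score[!flags$is_outlier] <- NA_integer_   # defined only for flagged fixes
score                                     # 4 1 NA 3
```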
Class-aware filtering recommendation. Empirical work on
the synthetic CPF ground truth (see vignette) shows persistence
is class-conditionally informative when the input is cascade
output (mt_clean_track):
- geometric_spike class: empirically class-pure on the synthetic data; persistence has nothing to filter.
- state_anomaly and consensus classes: TPs persist 80–95%; a filter at persistence_count >= 3 substantively improves precision on these classes.
- kinematic_confluence class: small sample on the synthetic data; signal direction is uncertain.
Recommendation: use the score as a confidence column, not
an automatic filter. Where a filter is desired, gate it on
error_class (when the input came from
mt_clean_track).
When to expect persistence to be informative. The persistence score is most useful when the underlying error type has geometric extent that survives coarsening: isolated single-fix spikes (whose 4-step or 8-step neighbourhood is still dominated by the spike), and block-boundary fixes (where the coarse-scale step crosses the spoof boundary). It is least useful for halo-style outliers (repeated wandering returns) whose anomaly averages out over wider windows.
See also
mt_clean_track for the cascade orchestrator
whose error_class column gates the filtering rule;
vignette("OUTLIER_5_persistence_score", package = "move2utils") for
the empirical class-conditional analysis on synthetic CPF data.
Examples
if (FALSE) { # \dontrun{
  ## Cascade output annotated with persistence scores
  ## (`track` is a move2 object supplied by the user)
  clean <- mt_clean_track(track, mass = 5, mode = "flying", remove = FALSE)
  annotated <- mt_persistence_score(clean)

  ## Class-aware filter (recommended pattern): un-flag low-persistence
  ## candidates, but only in the classes where persistence is informative
  is_low_confidence <-
    annotated$is_outlier &
    annotated$error_class %in% c("state_anomaly", "consensus") &
    annotated$persistence_count < 3
  annotated$is_outlier[is_low_confidence] <- FALSE
} # }