Multi-scale persistence score for outlier flags

Annotates flagged fixes with a persistence score – the number of temporal scales at which each flag's local geometry is independently anomalous. The score is detector-agnostic: it operates on any move2 object that carries an is_outlier column, regardless of which primitive flagged the fixes (mt_clean_track, mt_flag_outliers_bridge, mt_flag_outliers_detour, mt_flag_outliers, mt_flag_speed_cap, etc.).

Usage

mt_persistence_score(
  x,
  scales = c(2L, 4L, 8L),
  threshold = NULL,
  n_breaks = 20L,
  silent = FALSE
)

Arguments

x: A move2 object with an is_outlier logical column (single- or multi-track).
scales: Integer vector of validation scales $k$. Each $k$ must satisfy $2 \le k \le \lfloor (n-1)/2 \rfloor$ on a track with $n$ fixes (fixes within $k$ positions of either track end have no complete scale-$k$ neighbourhood and cannot be evaluated at that scale). Default c(2L, 4L, 8L).
threshold: Numeric. Gap-threshold parameter passed to the internal .gap_threshold_lower for per-scale flag detection on $-\log(\text{prob})$. Default 3.0.
n_breaks: Integer. Number of bins per axis for the 2-D per-scale joint-probability histogram. Default 20L.
silent: Logical. If TRUE suppress per-track summary messages. Default FALSE.
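
As a quick illustration of the bound on scales, the sketch below lists which requested scales satisfy $2 \le k \le \lfloor (n-1)/2 \rfloor$ for a track of $n$ fixes. The helper valid_scales is hypothetical and not part of the package; whether mt_persistence_score itself errors on or drops out-of-range scales is not stated here.

```r
# Hypothetical helper (not a package function): keep only the scales that
# satisfy 2 <= k <= floor((n - 1) / 2) for a track with n fixes.
valid_scales <- function(scales, n) {
  k_max <- (n - 1L) %/% 2L
  scales[scales >= 2L & scales <= k_max]
}

# A 12-fix track allows k up to floor(11 / 2) = 5, so k = 8 is out of range.
valid_scales(c(2L, 4L, 8L), n = 12L)
# [1] 2 4
```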

Value

The input x with added columns:

persistence_count: Integer. The persistence score (1 = flagged only at native resolution; up to 1 + length(scales) = flagged at every validation scale). NA_integer_ for fixes that were not flagged in is_outlier (the score is defined only for candidates).
persistence_at_scale_<k>: One logical column per scale $k$ in scales, indicating whether the flagged fix's scale-$k$ geometry was anomalous against the scale-$k$ reference distribution. NA for non-flagged fixes.

is_outlier itself is not modified. The function is a pure annotator – users decide whether and how to filter.

Details

Mechanism. For each flagged fix $i$ and each scale $k$, the function computes a "scale-$k$ view" of $i$'s local geometry: the step lengths from $x_{i-k}$ to $x_i$ and from $x_i$ to $x_{i+k}$, and the turn angle at $i$ between those two long arms. The reference distribution at scale $k$ is built from the same scale-$k$ geometry computed at every interior fix of the track. A 2-D histogram of $(\log\text{step}_k, \text{turn}_k)$ serves as the joint density estimate; the density of the bin containing the flagged fix gives its scale-$k$ probability; and a gap threshold on $-\log(\text{prob})$ determines whether the fix is anomalous at scale $k$.
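
The scale-$k$ view can be sketched in base R as follows. This is a minimal illustration that assumes planar (projected) coordinates and a signed turn angle in radians; the helper scale_k_geometry is hypothetical and is not the package's internal implementation. It covers only the geometry step, not the histogram or the gap threshold.

```r
# Hypothetical helper (not a package function): scale-k step lengths and
# turn angle at every interior fix, assuming planar coordinates x, y.
scale_k_geometry <- function(x, y, k) {
  n <- length(x)
  stopifnot(length(y) == n, k >= 1, n >= 2 * k + 1)
  i <- seq.int(k + 1L, n - k)  # fixes with k neighbours on both sides

  # Incoming arm (i - k -> i) and outgoing arm (i -> i + k)
  dx_in  <- x[i] - x[i - k]
  dy_in  <- y[i] - y[i - k]
  dx_out <- x[i + k] - x[i]
  dy_out <- y[i + k] - y[i]

  data.frame(
    index    = i,
    step_in  = sqrt(dx_in^2 + dy_in^2),
    step_out = sqrt(dx_out^2 + dy_out^2),
    # Signed angle between the two arms: atan2(cross product, dot product)
    turn     = atan2(dx_in * dy_out - dy_in * dx_out,
                     dx_in * dx_out + dy_in * dy_out)
  )
}

# Toy track: a straight line with a single spike at fix 4 (made-up values).
x <- 0:6
y <- c(0, 0, 0, 10, 0, 0, 0)
scale_k_geometry(x, y, k = 2)
```

Because the geometry is evaluated at every interior fix rather than on a thinned grid, the spike at fix 4 gets a scale-2 view even though a grid starting at fix 1 with stride 2 would skip it.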

Crucially, the scale-$k$ view is computed for every fix, not just for fixes that would be retained in a thinned-track grid. This dissolves the index-parity issue of conventional multi-scale voting (where some fixes can never be evaluated at coarser scales because the thinning grid skips them).

Persistence score. A flagged fix's persistence score is $$p(i) = 1 + \sum_{k \in \text{scales}} \mathbb{1}[i \text{ flagged at scale } k],$$ where the constant 1 accounts for scale 1 (the original detector's flag). With the default scales = c(2, 4, 8) the score takes integer values from 1 (flagged only at native resolution) to 4 (flagged at every scale tested). Higher scores indicate anomalies whose geometric extent survives temporal coarsening.
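
As a worked instance of the formula, the sketch below computes the score from per-scale flags for three flagged fixes. The flag values are made up for illustration; only the column names follow the persistence_at_scale_<k> convention described under Value.

```r
# Toy per-scale flags for three flagged fixes (made-up values).
flags <- data.frame(
  persistence_at_scale_2 = c(TRUE, TRUE, FALSE),
  persistence_at_scale_4 = c(TRUE, FALSE, FALSE),
  persistence_at_scale_8 = c(TRUE, FALSE, FALSE)
)

# p(i) = 1 + number of validation scales at which fix i is flagged;
# the leading 1 stands for the original detector's native-resolution flag.
persistence_count <- 1L + as.integer(rowSums(as.matrix(flags)))
persistence_count
# [1] 4 2 1
```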

Class-aware filtering recommendation. Empirical work on the synthetic CPF ground truth (see vignette) shows persistence is class-conditionally informative when the input is cascade output (mt_clean_track):

geometric_spike class: empirically class-pure on the synthetic data; persistence has nothing to filter.
state_anomaly and consensus classes: TPs persist at a rate of 80–95%; filtering at persistence_count >= 3 substantively improves precision on these classes.
kinematic_confluence class: small sample in the synthetic data; the direction of the signal is uncertain.

Recommendation: use the score as a confidence column, not an automatic filter. Where a filter is desired, gate it on error_class (when the input came from mt_clean_track).

When to expect persistence to be informative. The persistence score is most useful when the underlying error type has geometric extent that survives coarsening: isolated single-fix spikes (whose 4-step or 8-step neighbourhood is still dominated by the spike), and block-boundary fixes (where the coarse-scale step crosses the spoof boundary). It is least useful for halo-style outliers (repeated wandering returns) whose anomaly averages out over wider windows.

Examples

if (FALSE) { # \dontrun{
## Cascade output annotated with persistence scores
clean <- mt_clean_track(track, mass = 5, mode = "flying", remove = FALSE)
annotated <- mt_persistence_score(clean)

## Class-aware filter (recommended pattern):
is_low_confidence <-
  annotated$is_outlier &
  annotated$error_class %in% c("state_anomaly", "consensus") &
  annotated$persistence_count < 3
annotated$is_outlier[is_low_confidence] <- FALSE
} # }
