Flag outliers using path-vs-displacement detour ratio
Source: R/mt_flag_outliers_detour.R
Time-insensitive, scale-invariant point-level outlier detector. For each interior fix \(N\) and window radius \(k\), computes $$\mathrm{ratio}_k(N) = \frac{\mathrm{path\_length}(N-k, \dots, N+k)}{\mathrm{displacement}(N-k, N+k)}.$$ Legitimate animal movement has \(\mathrm{ratio}_k \approx 1\) when motion is roughly straight, and the ratio stays bounded (typically \(< 2\)) even with hard turns. An out-and-back GPS spike has displacement near zero while its path length is preserved, so the ratio explodes, regardless of the time elapsed across the window.
Usage
mt_flag_outliers_detour(
  x,
  k = 1L,
  threshold = 5,
  threshold_type = c("fixed", "auto"),
  min_leg = 0,
  pool_by = NULL,
  plot = TRUE,
  remove = FALSE,
  silent = FALSE
)
Arguments
- x
  A move2 object. Single- or multi-track. CRS: lon/lat is auto-handled via Haversine; otherwise coordinates are assumed to be in a projected CRS (metres).
- k
  Integer (single) or integer vector. Window radius in fixes. Default 1L. When a vector is supplied, the per-fix score is the maximum ratio across the supplied k values.
- threshold
  Numeric. Detour ratio above which a fix is flagged. Used only when threshold_type = "fixed" (the default). Default 5. HEURISTIC: comfortably above the legitimate-flight ceiling for any sampling rate; plausible range 3–10. Ignored when threshold_type = "auto".
- threshold_type
  Character, one of "fixed" (default, legacy) or "auto". In "fixed" mode, the user-supplied scalar threshold is used. In "auto" mode the function computes the entropy valley in the empirical \(-\log(\rho)\) distribution (outliers sit in the lower tail since they have \(\rho \gg 1\) and therefore \(-\log(\rho) \ll 0\)) and uses the back-converted ratio (\(\exp(-\text{break})\)) as the threshold. Same threshold-detection machinery as the bridge / speed-cap / probability primitives' entropy paths (.entropy_threshold_lower, sweep-validated density-ratio of 0.3 unified package-wide 2026-05-09). When no entropy valley is found, no fixes are flagged (entropy-detector's safe-on-clean contract).
- min_leg
  Numeric, non-negative metres. Default 0. When > 0, only flag fixes where both incident step lengths exceed min_leg. Useful for standalone use; leave at 0 when combined with another primitive via the conjunction rule.
- pool_by
  Optional character vector of length 1 or 2 naming column(s) in mt_track_data(x). Length 1: single column used as both fit set and operating unit (e.g. "individual_id"). Length 2: c(outer, inner) where outer names the fit-source column (the union of its events supplies the entropy-break distribution) and inner names the operating unit (within which pool-added flags are unioned). Length 2 requires strict nesting: every distinct inner value must map to exactly one outer value. Length \(> 2\) is rejected: pool_by has exactly two semantic roles, and deeper hierarchies would require hierarchical threshold estimation, which this primitive does not perform. Pool-fit flags are unioned into per-track flags, so the pool path can only add flags, never remove ones the per-track pass caught. NULL (default) preserves per-track behaviour byte-identically. NA values in the named column(s) cause those tracks to fall back to per-track processing with a warning. Only relevant for threshold_type = "auto"; "fixed" ignores pool_by (the user-supplied scalar is already the same across tracks). See ?mt_clean_track for the full semantics of the orchestrator's post-cascade pool sweep.
- plot
  Logical. Diagnostic map of flagged fixes. Default TRUE.
- remove
  Logical. If TRUE, return only kept rows. Default FALSE (return all rows with flag columns attached).
- silent
  Logical. Suppress narration. Default FALSE.
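The strict-nesting requirement for a length-2 pool_by can be checked before calling the function. The following is a minimal base-R sketch, not the package's own validation: is_strictly_nested is a hypothetical helper, the "colony" column is invented for illustration, and dropping rows with NA is an assumption made here to keep the check simple.

```r
# Hypothetical helper (not part of the package): TRUE when every distinct
# `inner` value maps to exactly one `outer` value in the track data.
is_strictly_nested <- function(track_data, outer, inner) {
  pairs <- unique(track_data[, c(outer, inner)])
  # Assumption: rows with NA in either column are ignored by this check.
  pairs <- pairs[stats::complete.cases(pairs), , drop = FALSE]
  # After de-duplicating (outer, inner) pairs, a repeated inner value
  # means that inner unit belongs to more than one outer group.
  anyDuplicated(pairs[[inner]]) == 0L
}

# Illustrative track table; "colony" is an invented column name.
td <- data.frame(
  track         = c("t1", "t2", "t3", "t4"),
  individual_id = c("a", "a", "b", "b"),
  colony        = c("north", "north", "south", "south")
)
ok <- is_strictly_nested(td, outer = "colony", inner = "individual_id")

# Individual "a" now appears under two colonies, so nesting is violated.
td_bad <- td
td_bad$colony[2] <- "south"
bad <- is_strictly_nested(td_bad, outer = "colony", inner = "individual_id")
```

With a real move2 object, the first argument would presumably be mt_track_data(x).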
Value
A move2 object with added columns:
- is_outlier
  Logical. TRUE where flagged.
- flagged_by_detour
  Logical. Same as is_outlier for this primitive; named for parity with the other primitives' output schema.
- detour_ratio
  Numeric. Per-fix maximum ratio across the supplied k values; NA at boundary fixes where the window cannot fit.
Details
Why a separate primitive. This metric is the topological
complement to the Brownian-bridge perpendicular residual produced
by mt_flag_outliers_bridge. Bridge residual is in
metres and scales with bridge length and Brownian variance; detour
is dimensionless and uses only successive coordinates. At sparse
sampling rates (e.g. 1-h GPS) the bridge \(\sigma\) broadens and
loses sensitivity to single-fix spikes whose implied step speed is
below physiological caps; detour stays robust because it only
compares spatial path to displacement. The two primitives flag
overlapping but distinct sets of fixes and combine well via the
conjunction rule in mt_clean_track.
Time-insensitivity. The detector uses only the spatial sequence of coordinates. It will produce the same flag whether your sampling is uniform 1-h, irregular bursts, or post-resampled. This is intentional and a strength when the kinematic primitives lose precision under irregular sampling.
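The ratio can be sketched in a few lines of base R to show that timestamps never enter the computation. This is an illustration only, not the package's implementation: detour_ratio_sketch is a hypothetical helper, and it assumes a single track with projected coordinates in metres and a single window radius k.

```r
# Hypothetical helper (not part of the package). Inputs are coordinates
# only; no timestamps are used anywhere.
detour_ratio_sketch <- function(x, y, k = 1L) {
  n <- length(x)
  # step[j] is the distance between fix j and fix j + 1
  step <- sqrt(diff(x)^2 + diff(y)^2)
  ratio <- rep(NA_real_, n)          # NA where the window cannot fit
  if (n < 2L * k + 1L) return(ratio)
  for (i in (k + 1L):(n - k)) {
    # Path length along fixes i - k, ..., i + k
    path <- sum(step[(i - k):(i + k - 1L)])
    # Straight-line displacement between the window end points
    disp <- sqrt((x[i + k] - x[i - k])^2 + (y[i + k] - y[i - k])^2)
    ratio[i] <- path / disp          # Inf if the end points coincide
  }
  ratio
}

# A track heading east with one out-and-back spike at fix 3.
x <- c(0, 100, 200, 300, 400)
y <- c(0,   0, 5000,  0,   0)
r <- detour_ratio_sketch(x, y, k = 1L)
# Fix 3 scores roughly 50; its neighbours stay close to 1.
```

Because the function takes only x and y, re-timing the same fixes (uniform, bursty, or resampled) leaves r unchanged.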
Window radius k. k = 1 catches single-fix spikes
(the dominant pattern at the colony in real GPS data). Larger
k catches block-shaped errors (2+ consecutive bad fixes
through which the bridge primitive can pass cleanly). Pass an
integer (single k) or a vector to compute the per-fix
maximum ratio across a range of window sizes. With a vector,
per-k threshold scaling matters: legitimate movement at
large k can produce ratio > 1.5 just from mild detours.
Prefer a single k = 1 for flagging and use larger k
as a diagnostic.
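The effect of a larger window on a block-shaped error can be sketched as follows. This is a hedged illustration, not the package's code: both helpers are hypothetical, coordinates are assumed projected (metres), and taking the maximum with NA values dropped is an assumption about how windows that do not fit are combined.

```r
# Hypothetical helper: detour ratio for one window radius k.
detour_ratio_sketch <- function(x, y, k) {
  n <- length(x)
  step <- sqrt(diff(x)^2 + diff(y)^2)
  ratio <- rep(NA_real_, n)
  if (n < 2L * k + 1L) return(ratio)
  for (i in (k + 1L):(n - k)) {
    path <- sum(step[(i - k):(i + k - 1L)])
    disp <- sqrt((x[i + k] - x[i - k])^2 + (y[i + k] - y[i - k])^2)
    ratio[i] <- path / disp
  }
  ratio
}

# Hypothetical helper: per-fix maximum across several window radii.
# Assumption: a fix is NA only if no supplied k fits around it.
max_ratio_sketch <- function(x, y, ks) {
  per_k <- lapply(ks, function(k) detour_ratio_sketch(x, y, k))
  do.call(pmax, c(per_k, na.rm = TRUE))
}

# Two consecutive bad fixes (3 and 4) displaced 5 km north of the track.
x <- c(0, 100,  200,  300, 400, 500, 600)
y <- c(0,   0, 5000, 5000,   0,   0,   0)

r1  <- max_ratio_sketch(x, y, 1L)    # k = 1 alone: the block goes unnoticed
r12 <- max_ratio_sketch(x, y, 1:2)   # adding k = 2 exposes fixes 3 and 4
```

In this toy track the k = 1 window always contains only one of the two long legs, so its ratio stays near 1; the k = 2 window spans the whole out-and-back excursion.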
Leg gate min_leg. Setting min_leg > 0 requires
both incident step lengths to exceed min_leg metres before
a fix can be flagged. Prevents the detector from firing on
small-displacement noise wiggles where the ratio can be large but
the absolute path is tiny. When the detector is used inside
mt_clean_track, the conjunction with another
primitive plays the same gating role, so min_leg = 0 is
usually fine in that context.
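The leg gate can be illustrated with a small base-R sketch for k = 1. It is not the package's implementation: flag_with_leg_gate_sketch is a hypothetical helper, coordinates are assumed projected (metres), and the example values are invented.

```r
# Hypothetical helper: k = 1 detour flag with an optional leg gate.
flag_with_leg_gate_sketch <- function(x, y, threshold = 5, min_leg = 0) {
  n <- length(x)
  stopifnot(n >= 3L)
  step    <- sqrt(diff(x)^2 + diff(y)^2)
  leg_in  <- c(NA_real_, step)   # step arriving at each fix
  leg_out <- c(step, NA_real_)   # step leaving each fix
  # Displacement between the neighbours of each interior fix
  disp <- c(NA_real_,
            sqrt((x[3:n] - x[1:(n - 2)])^2 + (y[3:n] - y[1:(n - 2)])^2),
            NA_real_)
  ratio <- (leg_in + leg_out) / disp
  flag <- !is.na(ratio) & ratio > threshold
  if (min_leg > 0) {
    # Both incident legs must exceed min_leg for the flag to stand
    flag <- flag & leg_in > min_leg & leg_out > min_leg
  }
  flag
}

# A 20 m wiggle: large ratio at fix 3, but tiny absolute path.
wx <- c(0, 1, 2, 3, 4)
wy <- c(0, 0, 20, 0, 0)
wiggle_ungated <- flag_with_leg_gate_sketch(wx, wy, min_leg = 0)
wiggle_gated   <- flag_with_leg_gate_sketch(wx, wy, min_leg = 50)

# An 8 km spike: large ratio and long legs, so the gate lets it through.
sx <- c(0, 100, 200, 300, 400)
sy <- c(0, 0, 8000, 0, 0)
spike_gated <- flag_with_leg_gate_sketch(sx, sy, min_leg = 5000)
```

The wiggle is flagged at fix 3 without the gate and not at all with min_leg = 50, while the 8 km spike remains flagged with min_leg = 5000.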
Examples
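if (FALSE) { # \dontrun{
library(move2)
# Download a study from Movebank
x <- movebank_download_study(study_id = 123)
# Pre-filter on GPS quality before outlier detection
x <- mt_filter_gps_quality(x)
# Flag single-fix spikes; both incident legs must exceed 5000 m
out <- mt_flag_outliers_detour(x, k = 1, threshold = 5,
                               min_leg = 5000)
# Count flagged vs. unflagged fixes
table(out$is_outlier)
} # }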