Time-insensitive, scale-invariant point-level outlier detector. For each interior fix \(N\) and window radius \(k\), computes $$ratio_k(N) = \frac{path\_length(N-k, \dots, N+k)}{displacement(N-k, N+k)}.$$ Legitimate animal movement has \(ratio_k \approx 1\) when motion is roughly straight, and the ratio stays bounded (typically \(< 2\)) even with hard turns. An out-and-back GPS spike has near-zero displacement while its path length stays large, so the ratio explodes, regardless of the time elapsed across the window.
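The ratio defined above can be sketched in a few lines of base R. This is a minimal illustration for planar coordinates in metres, not the package's implementation: `detour_ratio()` and the toy track are hypothetical, and returning NA at boundary fixes is modelled on the behaviour described for the `detour_ratio` output column.

```r
# Hypothetical helper (not part of the package): detour ratio for planar
# coordinates, window radius k, NA where the window does not fit.
detour_ratio <- function(x, y, k = 1L) {
  n <- length(x)
  step <- sqrt(diff(x)^2 + diff(y)^2)  # step[i] = distance from fix i to fix i + 1
  ratio <- rep(NA_real_, n)            # boundary fixes stay NA
  if (n < 2L * k + 1L) return(ratio)
  for (i in (k + 1L):(n - k)) {
    path <- sum(step[(i - k):(i + k - 1L)])  # path length through the window
    disp <- sqrt((x[i + k] - x[i - k])^2 + (y[i + k] - y[i - k])^2)
    ratio[i] <- path / disp  # Inf when displacement is zero (NaN if path is zero too)
  }
  ratio
}

# Straight track along x with a 10 km out-and-back spike at fix 4
x <- c(0, 100, 200,   300, 400, 500, 600)
y <- c(0,   0,   0, 10000,   0,   0,   0)
r <- detour_ratio(x, y, k = 1L)
round(r, 2)
```

On this toy track the spike at fix 4 scores roughly 100, while every other interior fix stays close to 1.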

Usage

mt_flag_outliers_detour(
  x,
  k = 1L,
  threshold = 5,
  threshold_type = c("fixed", "auto"),
  min_leg = 0,
  pool_by = NULL,
  plot = TRUE,
  remove = FALSE,
  silent = FALSE
)

Arguments

x

A move2 object, single- or multi-track. For lon/lat data, distances are computed automatically with the Haversine formula; otherwise coordinates are assumed to be in a projected CRS with units of metres.

k

Integer (single) or integer vector. Window radius in fixes. Default 1L. When a vector is supplied, the per-fix score is the maximum ratio across the supplied k values.

threshold

Numeric. Detour ratio above which a fix is flagged. Default 5. Used only when threshold_type = "fixed" (the default) and ignored when threshold_type = "auto". The default is a heuristic chosen to sit comfortably above the legitimate-flight ceiling for any sampling rate; the plausible range is 3–10.

threshold_type

Character, one of "fixed" (default, legacy) or "auto". In "fixed" mode, the user-supplied scalar threshold is used. In "auto" mode, the function locates the entropy valley in the empirical distribution of \(-\log(\rho)\), where \(\rho\) is the detour ratio, and uses the back-converted ratio \(\exp(-\text{break})\) as the threshold. Outliers sit in the lower tail of that distribution because \(\rho \gg 1\) implies \(-\log(\rho) \ll 0\). This is the same threshold-detection machinery as the entropy paths of the bridge, speed-cap, and probability primitives (.entropy_threshold_lower, sweep-validated density-ratio of 0.3 unified package-wide 2026-05-09). When no entropy valley is found, no fixes are flagged (the entropy detector's safe-on-clean contract).

min_leg

Numeric, non-negative metres. Default 0. When > 0, only flag fixes where both incident step lengths exceed min_leg. Useful for standalone use; leave at 0 when combined with another primitive via the conjunction rule.

pool_by

Optional character vector of length 1 or 2 naming column(s) in mt_track_data(x). Length 1: single column used as both fit set and operating unit (e.g. "individual_id"). Length 2: c(outer, inner) where outer names the fit-source column (the union of its events supplies the entropy-break distribution) and inner names the operating unit (within which pool-added flags are unioned). Length 2 requires strict nesting: every distinct inner value must map to exactly one outer value. Length \(> 2\) is rejected – pool_by has exactly two semantic roles, and deeper hierarchies would require hierarchical threshold estimation, which this primitive does not perform. Pool-fit flags are unioned into per-track flags – the pool path can only add flags, never remove ones the per-track pass caught. NULL (default) preserves per-track behaviour byte-identically. NA values in the named column(s) cause those tracks to fall back to per-track processing with a warning. Only relevant for threshold_type = "auto"; "fixed" ignores pool_by (the user-supplied scalar is already the same across tracks). See ?mt_clean_track for the full semantics of the orchestrator's post-cascade pool sweep.

plot

Logical. Diagnostic map of flagged fixes. Default TRUE.

remove

Logical. If TRUE, return only kept rows. Default FALSE (return all rows with flag columns attached).

silent

Logical. Suppress narration. Default FALSE.
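The strict-nesting requirement on a length-2 pool_by (every distinct inner value maps to exactly one outer value) can be checked before calling the function. The sketch below is a hypothetical pre-flight check on a plain data frame standing in for mt_track_data(x); `is_strictly_nested()` and the `colony` column are assumptions for illustration, NA handling is omitted, and the package's own validation may differ.

```r
# Hypothetical check (not the package's validator): TRUE when each inner
# value appears with exactly one outer value.
is_strictly_nested <- function(track_data, outer, inner) {
  pairs <- unique(track_data[, c(inner, outer)])  # distinct (inner, outer) pairs
  anyDuplicated(pairs[[inner]]) == 0L             # an inner value seen twice breaks nesting
}

# Toy track tables; the column names are illustrative only
td_ok <- data.frame(
  individual_id = c("a", "a", "b"),
  colony        = c("north", "north", "south")
)
td_bad <- data.frame(
  individual_id = c("a", "a", "b"),
  colony        = c("north", "south", "south")  # individual "a" spans two colonies
)

is_strictly_nested(td_ok,  outer = "colony", inner = "individual_id")
is_strictly_nested(td_bad, outer = "colony", inner = "individual_id")
```

The first call returns TRUE and the second FALSE, so only the first table would be a valid source for pool_by = c("colony", "individual_id") under the nesting rule.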

Value

A move2 object with added columns:

is_outlier

Logical. TRUE where flagged.

flagged_by_detour

Logical. Same as is_outlier for this primitive; named for parity with the other primitives' output schema.

detour_ratio

Numeric. Per-fix maximum ratio across the supplied k values; NA at boundary fixes where the window cannot fit.

Details

Why a separate primitive. This metric is the topological complement to the Brownian-bridge perpendicular residual produced by mt_flag_outliers_bridge. Bridge residual is in metres and scales with bridge length and Brownian variance; detour is dimensionless and uses only successive coordinates. At sparse sampling rates (e.g. 1-h GPS) the bridge \(\sigma\) broadens and loses sensitivity to single-fix spikes whose implied step speed is below physiological caps; detour stays robust because it only compares spatial path to displacement. The two primitives flag overlapping but distinct sets of fixes and combine well via the conjunction rule in mt_clean_track.

Time-insensitivity. The detector uses only the spatial sequence of coordinates. For a given sequence of fixes it produces the same flags regardless of the attached timestamps, whether the sampling is uniform 1-h, in irregular bursts, or resampled. This is intentional, and it is a strength when the kinematic primitives lose precision under irregular sampling.
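Both properties follow from the form of the ratio: timestamps never enter the computation, and any common scale factor cancels between numerator and denominator. A minimal sketch for k = 1 on planar coordinates, using a hypothetical helper `ratio_k1()` rather than the package's code:

```r
# Hypothetical k = 1 helper: takes coordinates only, so no timestamp can
# influence the result. Requires at least 3 fixes.
ratio_k1 <- function(x, y) {
  n <- length(x)
  step <- sqrt(diff(x)^2 + diff(y)^2)
  disp <- sqrt((x[3:n] - x[1:(n - 2)])^2 + (y[3:n] - y[1:(n - 2)])^2)
  c(NA, (step[1:(n - 2)] + step[2:(n - 1)]) / disp, NA)
}

x <- c(0, 100, 250, 300, 420)
y <- c(0,  30, 900,  60,  10)

r_m  <- ratio_k1(x, y)                # coordinates in metres
r_km <- ratio_k1(x / 1000, y / 1000)  # same track expressed in kilometres
all.equal(r_m, r_km)
```

The two vectors agree up to floating-point tolerance, which is what "dimensionless" means in practice for this score.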

Window radius k. k = 1 catches single-fix spikes (the dominant pattern at the colony in real GPS data). Larger k catches block-shaped errors (2+ consecutive bad fixes through which the bridge primitive can pass cleanly). Pass a single integer, or a vector to compute the per-fix maximum ratio across a range of window sizes. With a vector, per-k threshold scaling matters: legitimate movement at large k can produce a ratio > 1.5 from mild detours alone. Prefer a single k = 1 for flagging and use larger k as a diagnostic.
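To illustrate why larger k matters for block-shaped errors, the sketch below scores a toy track in which two consecutive fixes are displaced together. Both helpers are hypothetical, and one detail is an assumption rather than documented behaviour: when a vector of k is supplied, the maximum is taken over the windows that fit at a given fix, and the score is NA only when none fit.

```r
# Hypothetical helper: detour ratio for planar coordinates at one radius k.
detour_ratio <- function(x, y, k = 1L) {
  n <- length(x)
  step <- sqrt(diff(x)^2 + diff(y)^2)
  ratio <- rep(NA_real_, n)
  if (n < 2L * k + 1L) return(ratio)
  for (i in (k + 1L):(n - k)) {
    path <- sum(step[(i - k):(i + k - 1L)])
    disp <- sqrt((x[i + k] - x[i - k])^2 + (y[i + k] - y[i - k])^2)
    ratio[i] <- path / disp
  }
  ratio
}

# Hypothetical helper: per-fix maximum across several radii.
# ASSUMPTION: radii whose window does not fit at a fix are skipped.
max_detour_ratio <- function(x, y, ks) {
  per_k <- do.call(cbind, lapply(ks, function(k) detour_ratio(x, y, k)))
  apply(per_k, 1, function(r) if (all(is.na(r))) NA_real_ else max(r, na.rm = TRUE))
}

# Fixes 4 and 5 are displaced together by 5 km (a two-fix block error)
x <- seq(0, 700, by = 100)
y <- c(0, 0, 0, 5000, 5000, 0, 0, 0)

r1  <- detour_ratio(x, y, k = 1L)   # k = 1 sees nothing unusual at fixes 4 and 5
r12 <- max_detour_ratio(x, y, 1:2)  # k = 2 spans the whole block
round(rbind(k1 = r1, k1_2 = r12), 2)
```

With k = 1 the block looks like ordinary movement (ratios near 1), whereas including k = 2 raises the score at fixes 4 and 5 well above the default threshold of 5.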

Leg gate min_leg. Setting min_leg > 0 requires both incident step lengths to exceed min_leg metres before a fix can be flagged. This prevents the detector from firing on small-displacement noise wiggles, where the ratio can be large but the absolute path is tiny. When the detector is used inside mt_clean_track, the conjunction with another primitive plays the same gating role, so min_leg = 0 is usually fine in that context.
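The effect of the leg gate can be seen on a toy track that contains both a few metres of stationary jitter and a genuine multi-kilometre spike. This is a hedged sketch for k = 1 on planar coordinates; `flag_with_leg_gate()` is a hypothetical function written for illustration, not the package's internal logic.

```r
# Hypothetical k = 1 flagger with a leg gate. Requires at least 3 fixes.
flag_with_leg_gate <- function(x, y, threshold = 5, min_leg = 0) {
  n <- length(x)
  step    <- sqrt(diff(x)^2 + diff(y)^2)
  leg_in  <- c(NA, step)  # leg arriving at each fix
  leg_out <- c(step, NA)  # leg leaving each fix
  disp <- c(NA, sqrt((x[3:n] - x[1:(n - 2)])^2 + (y[3:n] - y[1:(n - 2)])^2), NA)
  ratio <- (leg_in + leg_out) / disp
  flagged <- !is.na(ratio) & ratio > threshold
  if (min_leg > 0) {
    # both incident legs must exceed min_leg for the flag to survive
    flagged <- flagged & leg_in > min_leg & leg_out > min_leg
  }
  flagged
}

# Fix 2 is a 3 m jitter wiggle; fix 6 is an 8 km out-and-back spike
x <- c(0, 3, 0.5, 1000, 2000, 3000, 4000, 5000)
y <- c(0, 0, 0,      0,    0, 8000,    0,    0)

which(flag_with_leg_gate(x, y, min_leg = 0))   # 2 6
which(flag_with_leg_gate(x, y, min_leg = 50))  # 6
```

The jitter at fix 2 has a ratio of 11 but legs of only 3 m and 2.5 m, so a 50 m gate drops it while the spike at fix 6 is still flagged.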

Examples

if (FALSE) { # \dontrun{
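library(move2)
# Download a study from Movebank (study_id = 123 stands in for a real study ID)
x <- movebank_download_study(study_id = 123)
# Pre-filter on GPS quality before outlier detection
x <- mt_filter_gps_quality(x)
# Flag single-fix spikes: ratio above 5 with both incident legs longer than 5 km
out <- mt_flag_outliers_detour(x, k = 1, threshold = 5,
                                min_leg = 5000)
# Count flagged versus kept fixes
table(out$is_outlier)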
} # }