Bridge-residual outlier detection and error-morphology classification

You can use mt_flag_outliers_bridge() on its own when you want to spot GPS fixes that geometrically don’t belong — a single point that jumps far from where its neighbours sit, then comes right back. This function is the “geometric” detector in the four-detector cascade that mt_clean_track() orchestrates; running it standalone lets you look at just the geometric signal in isolation, or build your own composition on top of it.

This vignette walks through what the detector does, the directional variant that additionally tells you what kind of geometric anomaly you found, and how to read the outputs.

If you’re new to the package, you probably want mt_clean_track() instead — it calls this function for you alongside three other detectors and a consensus rule that combines their outputs. Come back here when you want fine-grained control over just one signal or when you’re building a custom pipeline.

The companion mt_flag_outliers() (probability-based, movement-metric) is covered in vignette("OUTLIER_1_getting_started", package = "move2utils"). Both detectors share the same threshold machinery; they target different kinds of error and can be composed.

How it works — intuition first

If a GPS fix doesn’t belong, the simplest test is: given where the animal was before and where it was just after, where would you expect the middle fix to sit? If the actual fix is far from that expectation — further than the time gap on either side could plausibly account for — it’s probably wrong.

That’s the bridge construction. For each fix, we draw a “bridge” between its two temporal neighbours, weighted by how much time sits on either side, and measure how far the actual fix is from that bridge. A small distance means the fix is roughly where it should be; a large one means something’s off.

Crucially, the width of the bridge — how much wandering is plausible over a given time gap — depends only on the timestamps, never on the fix’s own position. That’s the key design choice: an outlier can’t inflate the denominator that would have flagged it. Methods that normalise by a locally-estimated variance suffer from this “leverage problem” precisely because a bad fix pollutes its own scale.
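The leverage problem is easy to reproduce in a few lines of base R. The sketch below is an illustration only, not part of move2utils: it scores one bad value against the mean and standard deviation of a five-point window that contains it. The helper name self_scaled_z and the toy windows are hypothetical.

```r
# z-score of the last value, scaled by the SD of the window it sits in.
self_scaled_z <- function(window) {
  bad <- window[length(window)]
  (bad - mean(window)) / sd(window)
}

z_small <- self_scaled_z(c(0, 0, 0, 0, 10))    # modest error
z_huge  <- self_scaled_z(c(0, 0, 0, 0, 1e6))   # enormous error

# Both scores equal (n - 1) / sqrt(n) with n = 5:
# the bad value inflates the SD in step with its own size.
c(z_small, z_huge, 4 / sqrt(5))
```

However large the error, the self-scaled score in this window cannot grow past (n - 1) / sqrt(n). A width computed from timestamps alone has no such ceiling.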

The formal picture (for the curious)

For each fix i with projected coordinates p_i and time t_i, the temporal neighbours i-1 and i+1 define a Brownian-bridge expectation

m_i = α p_{i-1} + (1 − α) p_{i+1}, α = Δt₂ / (Δt₁ + Δt₂)

where Δt₁ = t_i − t_{i-1} and Δt₂ = t_{i+1} − t_i.

The residual is r_i = p_i − m_i. The natural scaling for that residual is the bridge’s own standard-deviation width,

w_i = √(Δt₁ · Δt₂ / (Δt₁ + Δt₂)).

The bridge score is η_i = ||r_i|| / w_i. This is the key property described in plain language above: w_i depends only on timestamps, never on positions, so an outlier cannot inflate its own denominator.
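The formulas above translate directly into base R. The following is a minimal sketch under stated assumptions, not the package's implementation: bridge_score is a hypothetical helper, coordinates are assumed to be projected (metres), timestamps are plain seconds, and the first and last fixes are given NA because they lack one neighbour.

```r
# Bridge score eta_i for each interior fix of a single track.
# x, y: projected coordinates; t: strictly increasing timestamps (seconds).
bridge_score <- function(x, y, t) {
  n <- length(t)
  stopifnot(n >= 3, length(x) == n, length(y) == n, all(diff(t) > 0))
  i   <- 2:(n - 1)
  dt1 <- t[i] - t[i - 1]                 # gap to the previous fix
  dt2 <- t[i + 1] - t[i]                 # gap to the next fix
  alpha <- dt2 / (dt1 + dt2)             # weight on the previous fix
  mx <- alpha * x[i - 1] + (1 - alpha) * x[i + 1]
  my <- alpha * y[i - 1] + (1 - alpha) * y[i + 1]
  resid <- sqrt((x[i] - mx)^2 + (y[i] - my)^2)
  width <- sqrt(dt1 * dt2 / (dt1 + dt2)) # depends on timestamps only
  eta <- rep(NA_real_, n)                # endpoints stay NA (assumption)
  eta[i] <- resid / width
  eta
}

# Straight eastward track at one fix per minute; fix 3 jumps 500 m north.
t <- c(0, 60, 120, 180, 240)
x <- c(0, 10, 20, 30, 40)
y <- c(0, 0, 500, 0, 0)
eta <- bridge_score(x, y, t)
eta
```

The displaced fix scores highest, and its two neighbours pick up half its score because the bad fix is one end of their bridges.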

Three methods

method = "isotropic" uses the scalar residual magnitude. General-purpose; answers “is this fix geometrically far from where it should be?”.
method = "directional" decomposes the residual onto the local travel axis into parallel (η_para, along-track) and perpendicular (η_perp, across-track) components. Flags on η_perp alone, but the diagnostic value is in the (η_para, η_perp) pair — it classifies the kind of error at each flag.
method = "combined" (default) applies the threshold to both scores independently and flags a fix if either one trips.

The outlier bridge method here inherits the bridge-mean construction from the dynamic Brownian bridge (dBBMM) and the perpendicular decomposition from the dynamic bivariate Gaussian bridge (dBGB), but deliberately does not invoke their variance-estimation machinery. That avoids the leverage problem, where an outlier inflates the very variance used to normalise it. See mt_dbbmm_variance() / mt_dbgb_variance() for the full dynamic models.
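The parallel/perpendicular split can be sketched as a projection of the residual onto a unit vector. The code below is illustrative only. It assumes the "local travel axis" is the direction from the previous fix to the next one, which the package may define differently, and bridge_decompose is a hypothetical helper.

```r
# Split the bridge residual into along-track and across-track scores.
# Assumption: the travel axis at fix i points from fix i-1 to fix i+1.
bridge_decompose <- function(x, y, t) {
  n <- length(t)
  stopifnot(n >= 3, length(x) == n, length(y) == n, all(diff(t) > 0))
  i   <- 2:(n - 1)
  dt1 <- t[i] - t[i - 1]
  dt2 <- t[i + 1] - t[i]
  alpha <- dt2 / (dt1 + dt2)
  rx <- x[i] - (alpha * x[i - 1] + (1 - alpha) * x[i + 1])
  ry <- y[i] - (alpha * y[i - 1] + (1 - alpha) * y[i + 1])
  ax  <- x[i + 1] - x[i - 1]
  ay  <- y[i + 1] - y[i - 1]
  len <- sqrt(ax^2 + ay^2)               # zero length gives NaN scores
  ux  <- ax / len
  uy  <- ay / len
  width <- sqrt(dt1 * dt2 / (dt1 + dt2))
  data.frame(
    eta_para = c(NA, abs(rx * ux + ry * uy) / width, NA),  # dot product
    eta_perp = c(NA, abs(rx * uy - ry * ux) / width, NA)   # cross product
  )
}

# Eastward track; fix 3 is displaced 500 m north, i.e. purely across-track.
d <- bridge_decompose(x = c(0, 10, 20, 30, 40),
                      y = c(0, 0, 500, 0, 0),
                      t = c(0, 60, 120, 180, 240))
d
```

Because the axis is a unit vector, the two components recombine to the isotropic score: eta_para^2 + eta_perp^2 = eta^2.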

Requirements

The function accepts longitude/latitude or projected input; it auto-projects internally to a local AEQD for the Euclidean math and returns the result in your original CRS.
Enough context. Each fix needs a neighbour on both sides, so very short tracks leave the bridge undefined; in practice, tracks of at least a few dozen locations are needed for the threshold estimator to be stable.

A worked example on synthetic data

inst/extdata/synthetic_tracks.csv.gz contains three central-place-forager tracks with ground-truth outliers for validation. CPF_A has 23 injected outliers; CPF_B is clean; CPF_C has 4.

library(move2)
library(sf)
library(move2utils)

path <- system.file("extdata/synthetic_tracks.csv.gz",
                     package = "move2utils")
tracks <- mt_read(path)
tracks <- tracks[!st_is_empty(tracks), ]

cpfA <- tracks[mt_track_id(tracks) == "CPF_A", ]
cpfA_p <- st_transform(cpfA, mt_aeqd_crs(cpfA))
nrow(cpfA_p)
#> [1] 1748

Default run (combined method, entropy threshold)

res <- mt_flag_outliers_bridge(cpfA_p, plot = FALSE)
#> Running bridge-residual detection (method = combined) on 1748 locations...
#>   Iter 1: flagged 6 (break at eta = 1591.86 / eta_perp = 407.45).
#>   Iter 2: flagged 4 (break at eta = 1472.40 / eta_perp = 1182.96).
#>   Iter 3: flagged 7 (break at eta = 1255.72 / eta_perp = 1022.76).
#> === 17 outliers (0.97% of 1748) ===
table(res$is_outlier)
#> 
#> FALSE  TRUE 
#>  1731    17

coords <- sf::st_coordinates(res)
par(mar = c(4, 4, 3, 1))
plot(coords, type = "l", col = "grey80", asp = 1,
     xlab = "x (m)", ylab = "y (m)",
     main = sprintf("CPF_A — %d flags (combined, entropy)", sum(res$is_outlier)))
points(coords[!res$is_outlier, ], pch = 16, cex = 0.25, col = "grey50")
points(coords[res$is_outlier, ], pch = 1, cex = 1.6,
       col = "firebrick", lwd = 1.4)
legend("topright", pch = c(16, 1), col = c("grey50", "firebrick"),
       legend = c("kept", "flagged"), bty = "n")

CPF_A synthetic track in projected coordinates. Grey line traces the full track in order; firebrick circles mark the fixes mt_flag_outliers_bridge() flagged under the default combined-entropy configuration. Flagged points sit visibly off the otherwise smooth central-place-forager geometry.

The columns added to the input are

column	meaning
`bridge_residual`	scalar residual magnitude
`bridge_width`	bridge-width normalisation w_i
`bridge_eta`	score η_i = residual / width (scalar / isotropic)
`bridge_percentile`	empirical quantile of η within the track
`bridge_iteration`	iteration number on which the fix was flagged (0 = not flagged)
`is_outlier`	logical flag

The "entropy" threshold (default) is the deepest valley in the kernel-density estimate of log(η). On clean data the density is unimodal, there is no valley, and the function returns zero flags — a no-op guarantee that makes it safe to apply universally. The "gap" threshold is more sensitive: it uses a broken-stick null model plus tail-decay inflection, and will often pick up borderline points that entropy leaves alone.
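The "deepest valley" idea can be sketched with stats::density(). This is a naive illustration, not the package's estimator: deepest_valley is a hypothetical helper, the bandwidth is R's default, and "depth" is taken here to be the lower of the two flanking peaks minus the valley height. It has no safeguard against the small spurious dips a kernel density can show on clean data, so it does not reproduce the no-op guarantee described above.

```r
# Threshold at the deepest valley of the KDE of log(eta).
# Returns NA when the density has no interior valley.
deepest_valley <- function(eta) {
  z <- log(eta[is.finite(eta) & eta > 0])
  d <- stats::density(z)
  turn   <- diff(sign(diff(d$y)))
  minima <- which(turn > 0) + 1          # slope turns upward
  maxima <- which(turn < 0) + 1          # slope turns downward
  if (!length(minima) || !length(maxima)) return(NA_real_)
  depth <- vapply(minima, function(m) {
    left  <- d$y[maxima[maxima < m]]
    right <- d$y[maxima[maxima > m]]
    if (!length(left) || !length(right)) return(0)
    min(max(left), max(right)) - d$y[m]  # lower flanking peak minus valley
  }, numeric(1))
  if (max(depth) <= 0) return(NA_real_)
  exp(d$x[minima[which.max(depth)]])     # back to the eta scale
}

# Toy scores: a main mode plus a small, well-separated high-score cluster.
log_eta <- c(qnorm(ppoints(1700), mean = 0, sd = 0.5),
             qnorm(ppoints(40),   mean = 6, sd = 0.3))
eta <- exp(log_eta)
thr <- deepest_valley(eta)
sum(eta > thr)
```

On this toy input the threshold falls in the empty stretch between the two modes, so exactly the 40 high-score values exceed it.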

res_gap <- mt_flag_outliers_bridge(
  cpfA_p,
  threshold_type = "gap",
  plot = FALSE
)
#> Running bridge-residual detection (method = combined) on 1748 locations...
#>   Iter 1: flagged 17 (break at eta = 14.95 / eta_perp = 17.63).
#>   Iter 2: flagged 12 (break at eta = 14.89 / eta_perp = 12.14).
#>   Iter 3: flagged 8 (break at eta = 14.70 / eta_perp = 14703.05).
#> === 37 outliers (2.12% of 1748) ===
table(res_gap$is_outlier)
#> 
#> FALSE  TRUE 
#>  1711    37

Use "gap" when you want sensitivity and can tolerate some false positives; use "entropy" (the default) when false positives cost more than missed outliers, or when the track might be clean.

Directional variant — error-morphology classification

res_d <- mt_flag_outliers_bridge(
  cpfA_p,
  method = "directional",
  plot   = FALSE
)
#> Running bridge-residual detection (method = directional) on 1748 locations...
#>   Iter 1: flagged 5 (break at eta_perp = 407.45).
#>   Iter 2: flagged 2 (break at eta_perp = 1138.77).
#>   Iter 3: flagged 3 (break at eta_perp = 1022.82).
#> === 10 outliers (0.57% of 1748) ===
grep("bridge", names(res_d), value = TRUE)
#> [1] "bridge_residual"      "bridge_width"         "bridge_eta"          
#> [4] "bridge_eta_para"      "bridge_eta_perp"      "bridge_obs_inflation"
#> [7] "bridge_percentile"    "bridge_iteration"

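Three columns appear that are not in the table above. bridge_obs_inflation is not discussed further here; the other two carry the directional decomposition: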

bridge_eta_para — residual parallel to the local travel axis
bridge_eta_perp — residual perpendicular to it

Flags are placed on bridge_eta_perp (the across-track signal that is typically the more diagnostic of genuine measurement error). But the interesting thing is the pair of values for each flag.

Reading the (η_para, η_perp) plane

flags <- res_d[res_d$is_outlier, ]
n_flag <- nrow(flags)

plot(log10(flags$bridge_eta_para + 1),
     log10(flags$bridge_eta_perp + 1),
     xlab = expression(log[10](eta["para"] + 1)),
     ylab = expression(log[10](eta["perp"] + 1)),
     pch = 16, col = "firebrick", cex = 1.2,
     main = sprintf("error morphology on %d flagged points", n_flag))
abline(0, 1, lty = 2, col = "grey50")

Interpretation of the plane:

Points near the diagonal — η_para ≈ η_perp. Isotropic scatter; classic jitter-type error.
Points below the diagonal (high η_para, low η_perp) — along-track jumps. Ghost reports, clock glitches, GNSS fold-back.
Points above the diagonal (low η_para, high η_perp) — across-track drift. Characteristic of multipath, reflective environments, or constellation changes that bias one axis.

This is why the isotropic and directional flag sets can look very different on a real dataset even though both are built on the same residual: the isotropic score answers magnitude, the directional decomposition answers morphology. Together, they separate error classes that a single detector would merge.
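The three regions of the plane can be turned into labels with a simple ratio rule. The sketch below is a hypothetical helper, not a package function: classify_morphology and its ratio tolerance (how far from the diagonal a point must sit before it counts as directional) are assumptions chosen for illustration. It uses the same + 1 offset as the plot.

```r
# Label each flagged fix by where it sits relative to the diagonal.
# ratio: hypothetical tolerance; 2 means "one component at least double the other".
classify_morphology <- function(eta_para, eta_perp, ratio = 2) {
  log_ratio <- log((eta_perp + 1) / (eta_para + 1))
  label <- ifelse(log_ratio >  log(ratio), "across-track",
           ifelse(log_ratio < -log(ratio), "along-track", "isotropic"))
  factor(label, levels = c("along-track", "isotropic", "across-track"))
}

# Three toy flags: below the diagonal, near it, and above it.
morph <- classify_morphology(eta_para = c(100, 50, 2),
                             eta_perp = c(3, 60, 400))
table(morph)
```

With real output, the inputs would be flags$bridge_eta_para and flags$bridge_eta_perp from the directional run.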

Composing bridge with the other primitives

mt_flag_outliers_bridge() is one of four primitives in the move2utils outlier-detection framework; the others are mt_flag_outliers() (probability-based), mt_flag_outliers_detour() (geometric, time-insensitive path-vs-displacement ratio), and mt_flag_speed_cap() (step-level physiological cap). They are complementary:

mt_flag_outliers() is probability-based; it excels on multi-state, behaviourally heterogeneous tracks where the gap-aware auto-difference normalisation carries the signal.
mt_flag_outliers_bridge() is geometric and leverage-immune; it excels on long clean tracks and on detecting drift or spoofing.
mt_flag_outliers_detour() is time-insensitive and scale-invariant; it catches out-and-back GPS spikes at sparse sampling rates where the bridge primitive’s σ-scaling loses sensitivity.
mt_flag_speed_cap() is step-level; it catches the boundaries of coherent multi-fix error blocks that per-fix detectors structurally cannot reach.

The unified mt_clean_track() cleaner composes all four under class-aware flagging. A pragmatic standalone pipeline flags the union (or the intersection, depending on your cost function) of two of the four; the example below shows bridge + probability, which is the most common pairing for moderately-sampled tracks:

prob   <- mt_flag_outliers(cpfA_p)
bridge <- mt_flag_outliers_bridge(cpfA_p, method = "directional", plot = FALSE)

# Union: a fix is flagged if either detector flags it.
# Use `&` instead of `|` for the intersection.
union_flags <- prob$is_outlier | bridge$is_outlier
cat("probability:", sum(prob$is_outlier), "\n")
cat("bridge     :", sum(bridge$is_outlier), "\n")
cat("union      :", sum(union_flags), "\n")

For routine use, mt_clean_track() performs this composition for you and adds a step-level speed cap, a per-fix consensus rule (configurable via consensus =; see ?mt_flag_consensus), and a topological block-expansion pass that catches coherent multi-fix clusters which per-fix scoring structurally cannot resolve. Supply v_max when species biology gives you a physiological cap — that lets the pipeline peel spoof- or teleport-class clusters at their boundaries before the bridge and probability detectors run on the survivors.

clean <- mt_clean_track(cpfA_p)

# eagle_track stands in for your own move2 object; it is not defined in this vignette.
clean_eagle <- mt_clean_track(eagle_track, v_max = 30)  # eagle ~30 m/s ceiling

A single pass can miss an outlier whose neighbour is also an outlier, because the bad neighbour distorts the bridge it is measured against. iterations = n re-runs the detector on the currently-clean subset up to n times (default 3), stopping early on convergence. Raise it when your tracks have long runs of consecutive bad fixes.
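The re-run-on-survivors loop can be sketched in base R. This is a deliberately simplified illustration, not the package's algorithm: it uses a fixed, hypothetical score threshold instead of the entropy or gap estimators, and removes only the single worst fix per pass. The helper names eta_score and iterate_flags are assumptions.

```r
# Isotropic bridge score for interior fixes (endpoints NA).
eta_score <- function(x, y, t) {
  n <- length(t)
  i <- 2:(n - 1)
  dt1 <- t[i] - t[i - 1]
  dt2 <- t[i + 1] - t[i]
  a   <- dt2 / (dt1 + dt2)
  rx  <- x[i] - (a * x[i - 1] + (1 - a) * x[i + 1])
  ry  <- y[i] - (a * y[i - 1] + (1 - a) * y[i + 1])
  c(NA, sqrt(rx^2 + ry^2) / sqrt(dt1 * dt2 / (dt1 + dt2)), NA)
}

# Returns the pass on which each fix was flagged (0 = never flagged).
iterate_flags <- function(x, y, t, threshold, iterations = 3) {
  flagged_on <- integer(length(t))
  for (k in seq_len(iterations)) {
    keep <- which(flagged_on == 0)        # currently-clean subset
    if (length(keep) < 3) break
    eta   <- eta_score(x[keep], y[keep], t[keep])
    worst <- which.max(eta)               # NA endpoints are skipped
    if (!length(worst) || eta[worst] <= threshold) break  # converged
    flagged_on[keep[worst]] <- k
  }
  flagged_on
}

# Eastward track with two consecutive bad fixes (4 and 5).
x <- seq(0, 70, by = 10)
t <- seq(0, 420, by = 60)
y <- c(0, 0, 0, 1000, 900, 0, 0, 0)
pass <- iterate_flags(x, y, t, threshold = 20)
pass
```

Fix 4 is removed on the first pass; only then does fix 5 stand out against clean neighbours, and the third pass finds nothing left and stops.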

Disabling neighbour smearing

Near a real outlier, its two immediate neighbours will also look anomalous, because the corrupted point is one end of the bridge each of them is measured against. dedup_neighbours = TRUE (default) suppresses this neighbour-smearing artefact via a small peak-picking pass. Turn it off only if you have a reason to suspect it is masking real clustered errors.
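One way to picture a peak-picking pass is to keep a candidate flag only when its score is at least as high as any adjacent candidate. The sketch below is an assumption about the general idea, not the package's actual rule; dedup_peaks is a hypothetical helper and the scores are toy values.

```r
# Keep a flagged fix only if no adjacent flagged fix has a higher score.
dedup_peaks <- function(eta, flagged) {
  n <- length(eta)
  candidate <- !is.na(eta) & !is.na(flagged) & flagged
  s     <- ifelse(candidate, eta, -Inf)   # non-candidates never win
  left  <- c(-Inf, s[-n])                 # score of the previous fix
  right <- c(s[-1], -Inf)                 # score of the next fix
  candidate & s >= left & s >= right
}

# One real outlier (fix 3) whose neighbours inherit half its score.
eta     <- c(NA, 45.6, 91.3, 45.6, NA)
flagged <- eta > 20
kept    <- dedup_peaks(eta, flagged)
kept
```

A run of genuinely bad consecutive fixes would be thinned by the same rule, which is the situation in which turning the option off may be worth considering.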