Interpolating and thinning trajectories
Real tracks are sampled irregularly, and some downstream analyses assume either regular sampling or a specific geometry. This vignette covers:
- Interpolation: producing additional locations at a desired cadence or along a target line. Handled by move2::mt_interpolate().
- Time-thinning: downsampling a dense track to a target cadence. Two complementary helpers: move2::mt_filter_per_interval() for the bucketed case, and move2utils::mt_thin_time() for tolerance-constrained downsampling.
- Distance-thinning: downsampling based on cumulative travel distance. Provided by move2utils::mt_thin_distance().
Setting the scene
library(move2)
library(dplyr)
library(sf)
library(move2utils) # loaded here so that future helpers are visible
fishers <- mt_read(mt_example())
fishers <- fishers[!st_is_empty(fishers), ]
leroy <- filter_track_data(fishers, .track_id = "M1")
leroy <- slice(leroy, 1:300)
summary(mt_time_lags(leroy, units = "min"))
#> Min. 1st Qu. Median Mean 3rd Qu. Max. NAs
#> 13.38 14.73 15.02 26.64 15.40 779.53 1
Interpolation
mt_interpolate() supports three use patterns.
A. Regular time cadence via an interval string
## one fix per five minutes, refusing to bridge gaps longer than 1 h
leroy_5min <- mt_interpolate(
leroy,
time = "5 mins",
max_time_lag = as.difftime(1, units = "hours")
)
nrow(leroy); nrow(leroy_5min)
#> [1] 300
#> [1] 1893
summary(mt_time_lags(leroy_5min, units = "min"))
#> Min. 1st Qu. Median Mean 3rd Qu. Max. NAs
#> 0.09997 4.45000 5.00000 4.20965 5.00000 5.00000 1
The "5 mins" string is passed through lubridate::floor_date(), so you can use any interval format that function accepts ("30 secs", "2 hours", "1 day", …). max_time_lag is the policy for gaps: any original lag longer than this cutoff is not bridged, so the output track has a break there rather than a dense string of interpolated points across the gap.
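The gap policy can be illustrated without move2 at all. The following is a minimal base-R sketch on a synthetic one-dimensional track (the timestamps, coordinates, and variable names are made up for illustration); it is not how mt_interpolate() is implemented, only the idea of interpolating linearly in time while refusing to fill long gaps.

```r
# Synthetic track: fixes at 0, 15 and 30 min, a 3 h gap, then two more fixes.
t_obs <- as.POSIXct("2009-02-11 12:00:00", tz = "UTC") +
  c(0, 15, 30, 210, 225) * 60
x_obs <- c(0, 10, 20, 200, 210)  # hypothetical coordinate values

# Regular 5-minute target grid spanning the whole track.
t_new <- seq(min(t_obs), max(t_obs), by = "5 min")

# Linear interpolation of the coordinate in time.
x_new <- approx(as.numeric(t_obs), x_obs, xout = as.numeric(t_new))$y

# Gap policy: look up the pair of fixes bracketing each target time and
# drop targets whose bracketing fixes are more than max_lag apart.
# Targets that coincide with an observed fix are always kept.
max_lag <- 60 * 60  # one hour, in seconds
idx <- findInterval(as.numeric(t_new), as.numeric(t_obs),
                    rightmost.closed = TRUE)
bracket_lag <- diff(as.numeric(t_obs))[idx]
keep <- bracket_lag <= max_lag | as.numeric(t_new) %in% as.numeric(t_obs)

interp <- data.frame(time = t_new[keep], x = x_new[keep])
```

In this sketch the 3 h gap survives as a single long lag in interp, while everything on either side of it is on the 5-minute grid.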
B. Interpolate at a specific set of timestamps
t0 <- min(mt_time(leroy))
target_t <- t0 + as.difftime(c(10, 20, 30, 60, 120), units = "mins")
leroy_at <- mt_interpolate(leroy, time = target_t, omit = FALSE)
## the new locations
head(leroy_at[mt_time(leroy_at) %in% target_t, ])
#> A <move2> with `track_id_column` "individual-local-identifier" and
#> `time_column` "timestamp"
#> Containing 1 track lasting 1.83 hours in a
#> Simple feature collection with 5 features and 20 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -73.89874 ymin: 42.74363 xmax: -73.89869 ymax: 42.74369
#> Geodetic CRS: WGS 84
#> # A tibble: 5 × 21
#> `individual-local-identifier` `event-id` visible timestamp
#> * <fct> <int64> <lgl> <dttm>
#> 1 M1 NA NA 2009-02-11 12:26:45
#> 2 M1 NA NA 2009-02-11 12:36:45
#> 3 M1 NA NA 2009-02-11 12:46:45
#> 4 M1 NA NA 2009-02-11 13:16:45
#> 5 M1 NA NA 2009-02-11 14:16:45
#> # ℹ 17 more variables: `behavioural-classification` <fct>,
#> # `eobs:battery-voltage` [mV], `eobs:fix-battery-voltage` [mV],
#> # `eobs:horizontal-accuracy-estimate` [m], `eobs:key-bin-checksum` <int64>,
#> # `eobs:speed-accuracy-estimate` [m/s], `eobs:start-timestamp` <dttm>,
#> # `eobs:status` <ord>, `eobs:temperature` [°C], `eobs:type-of-fix` <fct>,
#> # `eobs:used-time-to-get-fix` [s], `ground-speed` [m/s], heading [°],
#> # `height-above-ellipsoid` [m], `manually-marked-outlier` <lgl>, …
#> Track features:
#> individual-local-identifier individual-taxon-canonical-name
#> 1 M1 Martes pennanti
#> study-name tag-local-identifier
#> 1 Martes pennanti LaPoint New York 74
Setting omit = TRUE returns only the requested timestamps and discards the originals.
C. Interpolate to where a line crosses the track
Useful for finding when an animal crossed a boundary — a road, a home-range edge, a territory line.
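The geometry behind a crossing can be sketched with plain sf on a single track segment. The coordinates, times, and boundary below are hypothetical planar values, and the timing assumes constant speed between the two fixes; this is an illustration of the idea rather than a description of what mt_interpolate() does internally.

```r
library(sf)

# Two consecutive fixes (hypothetical planar coordinates) and their times.
p1 <- c(0, 0)
p2 <- c(10, 0)
t1 <- as.POSIXct("2009-02-11 12:00:00", tz = "UTC")
t2 <- t1 + 600  # ten minutes later

# The track segment and a boundary line that cuts across it.
segment  <- st_sfc(st_linestring(rbind(p1, p2)))
boundary <- st_sfc(st_linestring(rbind(c(4, -5), c(4, 5))))

# Where the segment meets the boundary.
crossing <- st_intersection(segment, boundary)
xy <- st_coordinates(crossing)[1, c("X", "Y")]

# Fraction of the segment travelled before the crossing, then the
# crossing time under a constant-speed assumption.
frac <- sqrt(sum((xy - p1)^2)) / sqrt(sum((p2 - p1)^2))
t_cross <- t1 + frac * as.numeric(difftime(t2, t1, units = "secs"))
```

With real data the same logic would be applied per segment, in a projected CRS or with geodesic distances for lon/lat coordinates.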
## example: bounding square line (plug in your own geometry in practice)
line <- sf::st_sfc(
sf::st_linestring(cbind(c(-1, 1, 1, -1, -1), c(-1, -1, 1, 1, -1))),
crs = sf::st_crs(leroy)
)
crossings <- mt_interpolate(leroy, sf = line)
Comparison with legacy move::interpolateTime()
| move | move2 |
|---|---|
| interpolateTime(x, time = 500) (count) | mt_interpolate(x, time = seq(min(t), max(t), length.out = 500)) |
| interpolateTime(x, time = as.difftime(5, "mins")) | mt_interpolate(x, time = "5 mins", max_time_lag = ...) |
| interpolateTime(x, time = timestamp_vec) | mt_interpolate(x, time = timestamp_vec) |
The explicit max_time_lag is slightly more disciplined
than the legacy default — you decide what counts as “too long to
interpolate” rather than letting the function silently fill every
gap.
Time-thinning
The question “give me ~one fix per 45 min” has two sensible but different
answers. move2::mt_filter_per_interval() gives the
bucketed answer: divide time into 45-minute windows
(aligned to wall-clock intervals via floor_date) and pick
one fix per window.
leroy_45 <- mt_filter_per_interval(leroy, unit = "45 min",
criterion = "first")
nrow(leroy); nrow(leroy_45)
#> [1] 300
#> [1] 157
summary(mt_time_lags(leroy_45, units = "min"))
#> Min. 1st Qu. Median Mean 3rd Qu. Max. NAs
#> 14.28 15.10 37.48 51.06 45.00 779.53 1
The criterion can be "first", "last", or "random". Because the buckets are aligned to the wall clock, the actual time lag between retained fixes is not guaranteed to equal 45 minutes: if the first observed fix of one window falls at 10:55 and the first fix of the next window at 11:00, both are kept (they belong to different windows but are only 5 minutes apart in time).
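The wall-clock effect is easy to reproduce in base R. The sketch below uses made-up timestamps and hourly windows for simplicity, and mimics a "first"-per-window rule with duplicated(); it is an illustration of bucketing in general, not the code behind mt_filter_per_interval().

```r
# Hypothetical fixes: one late in the 10:00 window, the rest after 11:00.
t_obs <- as.POSIXct(
  c("2009-02-11 10:55:00", "2009-02-11 11:00:00",
    "2009-02-11 11:20:00", "2009-02-11 12:10:00"),
  tz = "UTC"
)

# Label each fix with its wall-clock hour window.
bucket <- format(t_obs, "%Y-%m-%d %H")

# Keep the first fix in every window.
kept <- t_obs[!duplicated(bucket)]

# Lags between retained fixes, in minutes.
lags_min <- as.numeric(diff(kept), units = "mins")
lags_min
```

The retained fixes are 10:55, 11:00 and 12:10, so the lags are 5 and 70 minutes even though every window contributed exactly one fix.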
Tolerance-constrained thinning with mt_thin_time()
mt_filter_per_interval() is fine for visualisation,
quick-look downsampling, or feeding a method that is tolerant of
irregular sampling.
mt_thin_time() solves a stricter problem: find the
largest subset of fixes such that the lag between every successive
retained pair falls within
[interval - tolerance, interval + tolerance]. The
answer is the longest path in a small DAG on sorted timestamps; it is
computed with a linear-time dynamic-programming sweep rather than the
memoised recursion of the legacy implementation, so it scales cleanly to
long tracks.
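To make the longest-chain formulation concrete, here is a small base-R sketch. The function name longest_chain() and the toy timestamps are hypothetical, and the double loop is a plain quadratic dynamic programme over a single run, not the linear-time sweep or the run-splitting that mt_thin_time() is described as using.

```r
# t: sorted timestamps (here in minutes); returns indices of the longest
# chain whose successive lags all lie in [interval - tol, interval + tol].
longest_chain <- function(t, interval, tolerance) {
  n <- length(t)
  best <- rep(1L, n)            # longest admissible chain ending at fix i
  prev <- rep(NA_integer_, n)   # predecessor of fix i on that chain
  for (i in seq_len(n)) {
    for (j in seq_len(i - 1L)) {
      lag <- t[i] - t[j]
      admissible <- lag >= interval - tolerance && lag <= interval + tolerance
      if (admissible && best[j] + 1L > best[i]) {
        best[i] <- best[j] + 1L
        prev[i] <- j
      }
    }
  }
  # Backtrack from the end of the longest chain.
  i <- which.max(best)
  chain <- integer(0)
  while (!is.na(i)) {
    chain <- c(i, chain)
    i <- prev[i]
  }
  chain
}

t_min <- c(0, 15, 30, 45, 58, 75, 90)
idx <- longest_chain(t_min, interval = 30, tolerance = 5)
t_min[idx]
```

On this toy input the chain is 0, 30, 58, 90, with lags of 30, 28 and 32 minutes. Ties are broken here by taking the first maximum, which is only one of several reasonable rules.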
thinned <- mt_thin_time(leroy,
interval = "30 min",
tolerance = "5 min",
remove = TRUE)
nrow(leroy); nrow(thinned)
#> [1] 300
#> [1] 153
lags <- mt_time_lags(thinned, units = "min")
summary(lags[!is.na(lags)])
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 28.35 29.78 30.08 52.30 30.47 794.25
The retained-lag distribution shows the two regimes the function produces by construction:
- Within-run lags fall inside [interval - tolerance, interval + tolerance]; these are the tolerance-admissible transitions that the DP maximised.
- Between-run lags exceed interval + tolerance; they are the original gaps that were too long to bridge. The run-splitter pre-pass decomposed the track at these gaps so that thinning never pretends to bridge them.
No retained lag is ever shorter than
interval - tolerance; that is the hard contract the
function enforces.
Tie-breaking when several chains are equally long:
- criterion = "closest" (default): minimum summed deviation from the target interval, preferring chains that hit the target most evenly.
- criterion = "first": earliest-starting chain, matching the legacy "first" behaviour.
Distance-thinning
A further common pattern — “give me one fix every 300 metres of
travel” — is handled by mt_thin_distance(). It walks along
the track, accumulates great-circle (lon/lat) or planar (projected)
distance via move2::mt_distance(), and retains a fix each
time the cumulative distance crosses a new multiple of the
distance step.
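The selection rule can be sketched in a few lines of base R. The step lengths below are invented, and the variable names are for illustration only; the sketch shows the "new multiple of the step" logic on a single track, not the internals of mt_thin_distance().

```r
# Hypothetical distances (metres) between consecutive fixes of one track.
step_dist <- c(120, 90, 150, 40, 310, 80, 200)

# Cumulative distance travelled at each fix; the first fix is at 0 m.
cum_dist <- c(0, cumsum(step_dist))

# Index of the 300 m band each fix falls in.
band <- floor(cum_dist / 300)

# Retain the first fix, plus every fix at which a new band is entered.
selected <- c(TRUE, diff(band) > 0)
which(selected)
```

Here the cumulative distances are 0, 120, 210, 360, 400, 710, 790 and 990 m, so fixes 1, 4, 6 and 8 are retained.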
leroy_300m <- mt_thin_distance(leroy, distance = 300)
nrow(leroy); sum(leroy_300m$thin_selected)
#> [1] 300
#> [1] 95
## filtered output for downstream use
leroy_300m_kept <- mt_thin_distance(leroy, distance = 300, remove = TRUE)
Multi-track input is thinned per individual, the first fix of each
track is always retained, and the output is labelled in place with a
thin_selected column (matching the
mt_thin_time() convention). distance can be a
numeric metre count or a units object
(units::set_units(0.5, "km")), so the call can read
naturally.
Summary
| Task | Function |
|---|---|
| Interpolate at regular intervals | mt_interpolate(x, time = "5 mins", max_time_lag = ...) |
| Interpolate at specific timestamps | mt_interpolate(x, time = t_vec) |
| Interpolate where a line crosses | mt_interpolate(x, sf = line) |
| Time-thin (bucketed) | mt_filter_per_interval(x, unit = "45 min", criterion = "first") |
| Time-thin (tolerance-constrained) | mt_thin_time(x, interval, tolerance) |
| Distance-thin | mt_thin_distance(x, distance) |