Skip to contents

Given a vector of log-scores where low values indicate outliers, estimate the density via KDE and search for the deepest local minimum (valley) below the main mode. A valley indicates a natural separation between an outlier regime and the bulk. If no valley meets the depth criterion, no outliers are declared – so this threshold is safe on clean data where the gap detector over-flags.

Usage

.entropy_threshold_lower(log_scores, threshold = 0.3, n_grid = 512)

Arguments

log_scores

Numeric vector. NAs are tolerated.

threshold

Numeric in (0, 1). Maximum allowed ratio of valley density to peak density. A valley whose density exceeds threshold * peak_density is not deep enough to justify splitting the distribution. Default 0.3 is the unified package- wide entropy default, validated by the 2026-05-06 Raven sensitivity sweep across 65 stratified Movebank tracks (the only level inside the strict stability window for cohort flag rate; K-W p = 0.076, i.e. cohort flag rate is statistically insensitive in the 0.3–0.7 range). Plausible range: 0.3–0.7.

n_grid

KDE grid size, default 512 (standard density-estimation grid; grid resolution rarely matters for valley detection).

Value

A list with is_outlier, break_value, method (one of "valley", "none", "insufficient"), and percentile.