⬆️ version bump to v4.5

- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
Eric Coissac
2026-04-07 08:36:50 +02:00
parent 670edc1958
commit 8c7017a99d
392 changed files with 18875 additions and 141 deletions
@@ -0,0 +1,26 @@
# `BetaKolmogorovDist` Function — Semantic Description
The `obistats.BetaKolmogorovDist` function computes a **goodness-of-fit statistic** between an empirical dataset and the *cumulative distribution function* (CDF) of a **Beta probability distribution** with specified parameters `α` and `β`. It implements an adapted version of the **Kolmogorov–Smirnov (KS) test**, tailored for Beta-distributed theoretical models.
### Key Functionalities:
- **Input**:
  - `data []float64`: Empirical sample (assumed already sorted in ascending order if `preordered = true`).
  - `alpha`, `beta float64`: Shape parameters of the target Beta distribution.
  - `preordered`: Flag indicating that `data` is already sorted, which skips the copy-and-sort step.
- **Processing**:
  - If `preordered` is false, the data is copied and sorted in ascending order.
  - For each ordered sample point `v_i`, it accumulates the running sum `s_i = Σ_{j≤i} v_j`.
  - Evaluates `|CDF_Beta(s_i; α, β) − empirical CDF_i|`, where the *empirical* cumulative probability at rank `i` is taken as `1/(i+1)`. This is neither the usual empirical CDF value `i/n` nor a standard plotting position such as the mean rank `i/(n+1)`; it decreases as `i` grows.
  - Returns the **supremum** of these absolute deviations (i.e., the maximum distance across all points).
### Interpretation:
- A **small value** indicates the empirical cumulative sums align closely with the theoretical Beta CDF.
- A **large value** suggests significant deviation, i.e. a poor fit of a Beta(α, β) distribution to the data.
- Unlike the standard KS statistic, which compares against the empirical CDF value `i/n`, this uses `1/(i+1)`, so the result should not be compared against standard KS critical values.
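For reference, the statistic described here and the classical KS statistic can be written side by side. The notation is an assumption of this sketch rather than something taken from the source: $x_{(j)}$ is the $j$-th smallest observation, $F_{\alpha,\beta}$ is the Beta(α, β) CDF, $\hat F_n$ is the empirical CDF, and the first formula uses a zero-based index $i$.

$$
s_i = \sum_{j \le i} x_{(j)}, \qquad
D_{\text{described}} = \max_{0 \le i < n} \left| F_{\alpha,\beta}(s_i) - \frac{1}{i+1} \right|
$$

$$
D_n = \sup_x \left| \hat F_n(x) - F_{\alpha,\beta}(x) \right|, \qquad
\hat F_n\!\left(x_{(i)}\right) = \frac{i}{n} \quad (i = 1, \dots, n)
$$

The two differ in both arguments: the first evaluates the CDF at running sums rather than at the observations, and compares it with a decreasing reference instead of the increasing step function $\hat F_n$.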
### Dependencies:
- Uses `gonum.org/v1/gonum/stat/distuv.Beta` for CDF computation.
- Uses `gonum.org/v1/gonum/floats.Max` to take the maximum of the computed distances.
- `sort.Float64s` ensures ordered traversal.
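> **Note**: The use of *cumulative sums* (`s`) rather than raw values is unconventional; it is possibly intended for data representing proportions or waiting times where the *integral* of observations matters. Because the Beta distribution is supported on `[0, 1]`, its CDF equals 1 for any cumulative sum of 1 or more.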