Files
obitools4/autodoc/docmd/pkg/obistats/kolmogorovbeta.md
T
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00

27 lines
1.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# `BetaKolmogorovDist` Function — Semantic Description
The `obistats.BetaKolmogorovDist` function computes a **goodness-of-fit statistic** between an empirical dataset and the *cumulative distribution* (CDF) of a **Beta probability distribution** with specified parameters `α` and `β`. It implements an adapted version of the **KolmogorovSmirnov (KS) test**, tailored for Beta-distributed theoretical models.
### Key Functionalities:
- **Input**:
- `data []float64`: Empirical sample (assumed sorted if `preordered = true`).
- `alpha`, `beta float64`: Shape parameters of the target Beta distribution.
- **Processing**:
- If not pre-sorted, data is copied and sorted ascendingly.
- For each ordered sample point `v_i`, it accumulates the sum `s = Σ_{j≤i} v_j`.
- Evaluates:
`|CDF_Beta(s; α, β) empirical CDF_i|`, where the *empirical* cumulative probability at rank `i` is approximated as `1/(i+1)` — a common Bayesian/maximum-likelihood estimator (e.g., median-rank).
- Returns the **supremum** of these absolute deviations (i.e., max distance across all points).
### Interpretation:
- A **small value** indicates the empirical cumulative sums align closely with the theoretical Beta CDF.
- A **large value** suggests significant deviation — poor fit of aBeta(α,β) to the data.
- Unlike standard KS tests (which use `i/n`), this uses `1/(i+1)` — suitable for small samples or Bayesian contexts.
### Dependencies:
- Uses `gonum.org/v1/gonum/stat/distuv.Beta` for CDF computation.
- Uses `gonum.org/v1/gonum/floats.Max` for distance extremal computation.
- `sort.Float64s` ensures ordered traversal.
> **Note**: The use of *cumulative sums* (`s`) rather than raw values is unconventional — possibly intended for data representing proportions or waiting times where the *integral* of observations matters.