feat: introduce trait-based distance aggregation and layered store
Introduces ColumnWeights, CountPartials, and BitPartials traits to compute and finalize partial distance matrices. Implements these traits for PersistentBitMatrix, PersistentCompactIntMatrix, and a new LayeredStore<S> wrapper that aggregates metrics across layers via parallel reduction. Adds ndarray for numerical aggregation and updates architecture documentation to reflect the trait-driven design and pending refactoring roadmap.
@@ -968,94 +968,33 @@
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#layereddatastore-aggregation-within-one-partition" class="md-nav__link">
|
||||
<a href="#traits-obicompactvectraits" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
LayeredDataStore — aggregation within one partition
|
||||
Traits — obicompactvec::traits
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="LayeredDataStore — aggregation within one partition">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#column-statistics" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Column statistics
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#self-contained-partials" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Self-contained partials
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#normalised-partials-require-global-sums-from-above" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Normalised partials (require global sums from above)
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#partitioneddatastore-aggregation-across-all-partitions" class="md-nav__link">
|
||||
<a href="#layeredstores-obilayeredmap" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
PartitionedDataStore — aggregation across all partitions
|
||||
LayeredStore&lt;S&gt; — obilayeredmap
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="PartitionedDataStore — aggregation across all partitions">
|
||||
<nav class="md-nav" aria-label="LayeredStore&lt;S&gt; — obilayeredmap">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#column-statistics_1" class="md-nav__link">
|
||||
<a href="#normalised-metrics-two-pass-cascade" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Column statistics
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#self-contained-metrics-single-pass" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Self-contained metrics — single pass
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#normalised-metrics-two-passes" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Normalised metrics — two passes
|
||||
Normalised metrics — two-pass cascade
|
||||
|
||||
</span>
|
||||
</a>
|
||||
@@ -1137,6 +1076,45 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="Relationship to current implementation">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#what-is-implemented" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
What is implemented
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#what-is-not-yet-implemented" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
What is not yet implemented
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#planned-refactoring" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Planned refactoring
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
@@ -1276,94 +1254,33 @@
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#layereddatastore-aggregation-within-one-partition" class="md-nav__link">
|
||||
<a href="#traits-obicompactvectraits" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
LayeredDataStore — aggregation within one partition
|
||||
Traits — obicompactvec::traits
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="LayeredDataStore — aggregation within one partition">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#column-statistics" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Column statistics
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#self-contained-partials" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Self-contained partials
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#normalised-partials-require-global-sums-from-above" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Normalised partials (require global sums from above)
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#partitioneddatastore-aggregation-across-all-partitions" class="md-nav__link">
|
||||
<a href="#layeredstores-obilayeredmap" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
PartitionedDataStore — aggregation across all partitions
|
||||
LayeredStore&lt;S&gt; — obilayeredmap
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="PartitionedDataStore — aggregation across all partitions">
|
||||
<nav class="md-nav" aria-label="LayeredStore&lt;S&gt; — obilayeredmap">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#column-statistics_1" class="md-nav__link">
|
||||
<a href="#normalised-metrics-two-pass-cascade" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Column statistics
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#self-contained-metrics-single-pass" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Self-contained metrics — single pass
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#normalised-metrics-two-passes" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Normalised metrics — two passes
|
||||
Normalised metrics — two-pass cascade
|
||||
|
||||
</span>
|
||||
</a>
|
||||
@@ -1445,6 +1362,45 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="Relationship to current implementation">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#what-is-implemented" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
What is implemented
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#what-is-not-yet-implemented" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
What is not yet implemented
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#planned-refactoring" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Planned refactoring
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
@@ -1581,99 +1537,98 @@
|
||||
<hr />
|
||||
<h2 id="progressive-aggregation-principle">Progressive aggregation principle</h2>
|
||||
<p>Aggregation is <strong>hierarchical</strong>: each level computes its contribution by aggregating from the level immediately below it. No level skips a level or collects raw data from two levels down.</p>
|
||||
<div class="highlight"><pre><span></span><code>PersistentCompactIntMatrix::sum() — column sums for one (partition, layer) matrix
|
||||
<div class="highlight"><pre><span></span><code>PersistentCompactIntMatrix::col_weights() — column sums for one (partition, layer) matrix
|
||||
↓ Σ across layers
|
||||
LayeredCompactIntMatrix::sum() — column sums for one partition
|
||||
LayeredStore&lt;PersistentCompactIntMatrix&gt;::col_weights() — column sums for one partition
|
||||
↓ Σ across partitions
|
||||
PartitionedCompactIntMatrix::sum() — global column sums
|
||||
LayeredStore&lt;LayeredStore&lt;…&gt;&gt;::col_weights() — global column sums
|
||||
</code></pre></div>
|
||||
<p>The same cascade applies to every partial computation:</p>
|
||||
<div class="highlight"><pre><span></span><code>PersistentCompactIntMatrix::partial_bray_dist_matrix() — one (partition, layer)
|
||||
<p>The same cascade applies to every partial:</p>
|
||||
<div class="highlight"><pre><span></span><code>PersistentCompactIntMatrix::partial_bray() — one (partition, layer)
|
||||
↓ element-wise Σ across layers
|
||||
LayeredCompactIntMatrix::partial_bray() — one partition
|
||||
LayeredStore&lt;PersistentCompactIntMatrix&gt;::partial_bray() — one partition
|
||||
↓ element-wise Σ across partitions
|
||||
PartitionedCompactIntMatrix::partial_bray() — global partial → final dist
|
||||
LayeredStore&lt;LayeredStore&lt;…&gt;&gt;::partial_bray() — global partial → final dist
|
||||
</code></pre></div>
|
||||
<p>This means <code>LayeredCompactIntMatrix</code> never inspects individual <code>PersistentCompactIntVec</code> columns directly, and <code>PartitionedCompactIntMatrix</code> never inspects individual layers. Each level presents a stable API surface to the level above.</p>
|
||||
<p>Each level presents a stable trait surface to the level above; no level reaches two levels down.</p>
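<p>The cascade can be sketched in a few lines of Rust. This is an illustration only, not the crate's code: <code>Vec&lt;u64&gt;</code> stands in for <code>Array1&lt;u64&gt;</code>, a sequential <code>reduce</code> stands in for the parallel reduction, and <code>CountMatrix</code> is a hypothetical leaf type. The point it shows is that one generic implementation on the wrapper covers both the layer level and the partition level, because a wrapper of wrappers satisfies the same trait.</p>

```rust
/// Column weights: one total per column.
/// Assumption: `Vec<u64>` replaces the `Array1<u64>` of the real trait.
trait ColumnWeights {
    fn col_weights(&self) -> Vec<u64>;
}

/// Hypothetical leaf store: a row-major count matrix for one (partition, layer).
struct CountMatrix {
    rows: Vec<Vec<u64>>,
    n_cols: usize,
}

impl ColumnWeights for CountMatrix {
    fn col_weights(&self) -> Vec<u64> {
        // Sum every row into a per-column accumulator.
        let mut sums = vec![0u64; self.n_cols];
        for row in &self.rows {
            for (s, v) in sums.iter_mut().zip(row) {
                *s += *v;
            }
        }
        sums
    }
}

/// Sketch of a layered wrapper: aggregates any `S: ColumnWeights`.
struct LayeredStore<S> {
    layers: Vec<S>,
}

impl<S: ColumnWeights> ColumnWeights for LayeredStore<S> {
    fn col_weights(&self) -> Vec<u64> {
        // Element-wise sum of the results from the level immediately below.
        // Sequential here; the parallel version is assumed to have the same shape.
        self.layers
            .iter()
            .map(|layer| layer.col_weights())
            .reduce(|mut acc, w| {
                for (a, b) in acc.iter_mut().zip(&w) {
                    *a += *b;
                }
                acc
            })
            // In this sketch, a store with no layers yields an empty vector.
            .unwrap_or_default()
    }
}
```

<p>With these definitions, <code>LayeredStore&lt;CountMatrix&gt;</code> plays the role of one partition and <code>LayeredStore&lt;LayeredStore&lt;CountMatrix&gt;&gt;</code> the role of the global store; neither needs to know what sits two levels below it.</p>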
|
||||
<hr />
|
||||
<h2 id="layereddatastore-aggregation-within-one-partition">LayeredDataStore — aggregation within one partition</h2>
|
||||
<p>A <code>LayeredDataStore</code> holds one <code>DataStore</code> per layer within a single partition:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">struct</span><span class="w"> </span><span class="nc">LayeredCompactIntMatrix</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">layers</span><span class="p">:</span><span class="w"> </span><span class="nb">Vec</span><span class="o"><</span><span class="n">PersistentCompactIntMatrix</span><span class="o">></span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="k">struct</span><span class="w"> </span><span class="nc">LayeredBitMatrix</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">layers</span><span class="p">:</span><span class="w"> </span><span class="nb">Vec</span><span class="o"><</span><span class="n">PersistentBitMatrix</span><span class="o">></span><span class="w"> </span><span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<h3 id="column-statistics">Column statistics</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">// LayeredCompactIntMatrix</span>
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// = layers.par_iter().map(|m| m.sum()).reduce(element-wise +)</span>
|
||||
<h2 id="traits-obicompactvectraits">Traits — <code>obicompactvec::traits</code></h2>
|
||||
<p>Three traits unify the aggregation API across all levels of the hierarchy.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">trait</span><span class="w"> </span><span class="n">ColumnWeights</span><span class="p">:</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">col_weights</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="p">}</span>
|
||||
|
||||
<span class="c1">// LayeredBitMatrix</span>
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">count_ones</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// = layers.par_iter().map(|m| m.count_ones()).reduce(element-wise +)</span>
|
||||
</code></pre></div>
|
||||
<h3 id="self-contained-partials">Self-contained partials</h3>
|
||||
<p>Each method reduces across layers by element-wise addition of per-layer matrices:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">partial_bray</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="p">(</span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">,</span><span class="w"> </span><span class="n">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_bray_dist_matrix()</span>
|
||||
<span class="k">trait</span><span class="w"> </span><span class="n">CountPartials</span><span class="p">:</span><span class="w"> </span><span class="nc">ColumnWeights</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="c1">// self-contained partials (additive, no parameter)</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_bray</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_euclidean</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_threshold_jaccard</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">threshold</span><span class="p">:</span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="p">(</span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">,</span><span class="w"> </span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">);</span>
|
||||
<span class="w"> </span><span class="c1">// normalised partials (global col_weights passed in cascade)</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_relfreq_bray</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">global</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_relfreq_euclidean</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">global</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_hellinger</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">global</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="w"> </span><span class="c1">// provided finalisation methods (default implementations)</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">bray_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">euclidean_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">threshold_jaccard_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">threshold</span><span class="p">:</span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">relfreq_bray_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">relfreq_euclidean_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">hellinger_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="p">}</span>
|
||||
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">partial_euclidean</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_euclidean_dist_matrix()</span>
|
||||
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">partial_jaccard</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="p">(</span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">,</span><span class="w"> </span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_jaccard_dist_matrix() [bit matrix]</span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_threshold_jaccard_dist_matrix() [int matrix]</span>
|
||||
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">partial_hamming</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_hamming_dist_matrix() [bit matrix]</span>
|
||||
</code></pre></div>
|
||||
<h3 id="normalised-partials-require-global-sums-from-above">Normalised partials (require global sums from above)</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">partial_relfreq_bray</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">global_sums</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_relfreq_bray_dist_matrix(global_sums)</span>
|
||||
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">partial_relfreq_euclidean</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">global_sums</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_relfreq_euclidean_dist_matrix(global_sums)</span>
|
||||
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">partial_hellinger</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">global_sums</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// Σ_l layer_l.partial_hellinger_euclidean_dist_matrix(global_sums)</span>
|
||||
</code></pre></div>
|
||||
<p><code>global_sums</code> is provided by the <code>PartitionedDataStore</code>; this level does not compute it.</p>
|
||||
<hr />
|
||||
<h2 id="partitioneddatastore-aggregation-across-all-partitions">PartitionedDataStore — aggregation across all partitions</h2>
|
||||
<p>A <code>PartitionedDataStore</code> holds one <code>LayeredDataStore</code> per partition:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">struct</span><span class="w"> </span><span class="nc">PartitionedCompactIntMatrix</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">partitions</span><span class="p">:</span><span class="w"> </span><span class="nb">Vec</span><span class="o"><</span><span class="n">LayeredCompactIntMatrix</span><span class="o">></span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="k">struct</span><span class="w"> </span><span class="nc">PartitionedBitMatrix</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">partitions</span><span class="p">:</span><span class="w"> </span><span class="nb">Vec</span><span class="o"><</span><span class="n">LayeredBitMatrix</span><span class="o">></span><span class="w"> </span><span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<h3 id="column-statistics_1">Column statistics</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array1</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="c1">// = partitions.par_iter().map(|p| p.sum()).reduce(element-wise +)</span>
|
||||
</code></pre></div>
|
||||
<p><code>p.sum()</code> is itself a reduction across layers (see above) — the cascade is preserved.</p>
|
||||
<h3 id="self-contained-metrics-single-pass">Self-contained metrics — single pass</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">bray_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">sum_min</span><span class="p">,</span><span class="w"> </span><span class="n">col_sums</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">partitions</span>
|
||||
<span class="w"> </span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span>
|
||||
<span class="w"> </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">p</span><span class="o">|</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">partial_bray</span><span class="p">())</span>
|
||||
<span class="w"> </span><span class="p">.</span><span class="n">reduce</span><span class="p">(</span><span class="n">element</span><span class="o">-</span><span class="n">wise</span><span class="w"> </span><span class="o">+</span><span class="p">);</span>
|
||||
<span class="w"> </span><span class="c1">// finalise</span>
|
||||
<span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">):</span><span class="w"> </span><span class="nc">dist</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="mi">2</span><span class="err">·</span><span class="n">sum_min</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="p">(</span><span class="n">col_sums</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">col_sums</span><span class="p">[</span><span class="n">j</span><span class="p">])</span>
|
||||
<span class="k">trait</span><span class="w"> </span><span class="n">BitPartials</span><span class="p">:</span><span class="w"> </span><span class="nc">ColumnWeights</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_jaccard</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="p">(</span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">,</span><span class="w"> </span><span class="n">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">);</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">partial_hamming</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="p">;</span>
|
||||
<span class="w"> </span><span class="c1">// provided</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">jaccard_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">hamming_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">u64</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
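<p>As a rough illustration of how a Jaccard partial might be computed and finalised, here is a minimal std-only sketch. It assumes the two matrices returned by <code>partial_jaccard</code> are pairwise intersection and union counts, which the text above does not state. <code>BitCol</code>, <code>jaccard_dist</code>, and the nested-<code>Vec</code> matrices are hypothetical stand-ins for the real bit-matrix and <code>Array2</code> types.</p>

```rust
/// Hypothetical bit column: one u64 word per 64 kmers.
pub type BitCol = Vec<u64>;

/// Pairwise (intersection, union) popcounts for one store.
/// Assumption: this pair is what a Jaccard partial accumulates.
pub fn partial_jaccard(cols: &[BitCol]) -> (Vec<Vec<u64>>, Vec<Vec<u64>>) {
    let n = cols.len();
    let mut inter = vec![vec![0u64; n]; n];
    let mut uni = vec![vec![0u64; n]; n];
    for i in 0..n {
        for j in 0..n {
            for (a, b) in cols[i].iter().zip(&cols[j]) {
                inter[i][j] += (a & b).count_ones() as u64;
                uni[i][j] += (a | b).count_ones() as u64;
            }
        }
    }
    (inter, uni)
}

/// Finalise: dist = 1 - |A ∩ B| / |A ∪ B|.
/// An empty union is mapped to distance 0.0 (a choice made for this sketch).
pub fn jaccard_dist(inter: &[Vec<u64>], uni: &[Vec<u64>]) -> Vec<Vec<f64>> {
    inter
        .iter()
        .zip(uni)
        .map(|(ri, ru)| {
            ri.iter()
                .zip(ru)
                .map(|(&i, &u)| if u == 0 { 0.0 } else { 1.0 - i as f64 / u as f64 })
                .collect()
        })
        .collect()
}
```

<p>Because both counts are plain sums over kmers, partials from several stores can be added element-wise before calling the finalisation step.</p>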
|
||||
<h3 id="normalised-metrics-two-passes">Normalised metrics — two passes</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">relfreq_bray_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="c1">// pass 1 — progressive: PartitionedDataStore::sum()</span>
|
||||
<span class="w"> </span><span class="c1">// calls LayeredDataStore::sum() per partition (parallel)</span>
|
||||
<span class="w"> </span><span class="c1">// calls PersistentCompactIntMatrix::sum() per layer (parallel)</span>
|
||||
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">global_sums</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">sum</span><span class="p">();</span>
|
||||
|
||||
<span class="w"> </span><span class="c1">// pass 2 — per-partition partial using global_sums (parallel)</span>
|
||||
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">matrix</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">partitions</span>
|
||||
<span class="w"> </span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span>
|
||||
<span class="w"> </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">p</span><span class="o">|</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">partial_relfreq_bray</span><span class="p">(</span><span class="o">&</span><span class="n">global_sums</span><span class="p">))</span>
|
||||
<span class="w"> </span><span class="p">.</span><span class="n">reduce</span><span class="p">(</span><span class="n">element</span><span class="o">-</span><span class="n">wise</span><span class="w"> </span><span class="o">+</span><span class="p">);</span>
|
||||
<span class="w"> </span><span class="c1">// finalise</span>
|
||||
<span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">):</span><span class="w"> </span><span class="nc">dist</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">matrix</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">]</span>
|
||||
<p><strong>Leaf implementors</strong> (in <code>obicompactvec</code>):</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Type</th>
|
||||
<th>Traits</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td><code>PersistentCompactIntMatrix</code></td>
|
||||
<td><code>ColumnWeights</code> (via <code>sum()</code>), <code>CountPartials</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>PersistentBitMatrix</code></td>
|
||||
<td><code>ColumnWeights</code> (via <code>count_ones()</code>), <code>BitPartials</code></td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p><code>PersistentCompactIntVec</code> and <code>PersistentBitVec</code> do <strong>not</strong> implement these traits — they are single-column primitives, not matrix-level aggregators.</p>
|
||||
<hr />
|
||||
<h2 id="layeredstores-obilayeredmap"><code>LayeredStore<S></code> — <code>obilayeredmap</code></h2>
|
||||
<p>A single generic wrapper replaces the need for named <code>LayeredDataStore</code> and <code>PartitionedDataStore</code> types:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span><span class="w"> </span><span class="nc">LayeredStore</span><span class="o"><</span><span class="n">S</span><span class="o">></span><span class="p">(</span><span class="nb">Vec</span><span class="o"><</span><span class="n">S</span><span class="o">></span><span class="p">);</span>
|
||||
</code></pre></div>
|
||||
<p>Three blanket impls propagate the traits up the hierarchy:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o"><</span><span class="n">S</span><span class="p">:</span><span class="w"> </span><span class="nc">ColumnWeights</span><span class="o">></span><span class="w"> </span><span class="n">ColumnWeights</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">LayeredStore</span><span class="o"><</span><span class="n">S</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// Σ across inner stores</span>
|
||||
<span class="k">impl</span><span class="o"><</span><span class="n">S</span><span class="p">:</span><span class="w"> </span><span class="nc">CountPartials</span><span class="o">></span><span class="w"> </span><span class="n">CountPartials</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">LayeredStore</span><span class="o"><</span><span class="n">S</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// same pattern</span>
|
||||
<span class="k">impl</span><span class="o"><</span><span class="n">S</span><span class="p">:</span><span class="w"> </span><span class="nc">BitPartials</span><span class="o">></span><span class="w"> </span><span class="n">BitPartials</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">LayeredStore</span><span class="o"><</span><span class="n">S</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// same pattern</span>
|
||||
</code></pre></div>
|
||||
<p>Because the three impls apply recursively, <strong><code>LayeredStore<LayeredStore<S>></code></strong> automatically implements every trait that <code>S</code> implements, so no separate <code>PartitionedStore</code> type is needed:</p>
|
||||
<div class="highlight"><pre><span></span><code>PersistentCompactIntMatrix implements CountPartials
|
||||
LayeredStore<PersistentCompactIntMatrix> via blanket impl (= one partition)
|
||||
LayeredStore<LayeredStore<…>> via blanket impl (= partitioned index)
|
||||
</code></pre></div>
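<p>The recursion can be sketched with a simplified, std-only version of the trait. This is an illustration under stated assumptions, not the crate's code: <code>col_weights</code> is assumed to return one <code>u64</code> per column, <code>Leaf</code> is a hypothetical stand-in for a persistent matrix, and the sequential <code>reduce</code> stands in for the parallel reduction.</p>

```rust
/// Simplified stand-in for the `ColumnWeights` trait.
/// Assumption: one u64 weight per column, returned as a Vec.
pub trait ColumnWeights {
    fn col_weights(&self) -> Vec<u64>;
}

/// Generic wrapper over a list of inner stores.
pub struct LayeredStore<S>(pub Vec<S>);

/// One generic impl: element-wise sum of the inner stores' weights.
/// It applies to `LayeredStore<Leaf>` and, through itself, to
/// `LayeredStore<LayeredStore<Leaf>>`.
impl<S: ColumnWeights> ColumnWeights for LayeredStore<S> {
    fn col_weights(&self) -> Vec<u64> {
        self.0
            .iter()
            .map(|s| s.col_weights())
            .reduce(|mut acc, w| {
                for (a, b) in acc.iter_mut().zip(w) {
                    *a += b;
                }
                acc
            })
            .unwrap_or_default()
    }
}

/// Hypothetical leaf holding precomputed per-column sums.
pub struct Leaf(pub Vec<u64>);

impl ColumnWeights for Leaf {
    fn col_weights(&self) -> Vec<u64> {
        self.0.clone()
    }
}
```

<p>Calling <code>col_weights()</code> on a nested store walks partitions, then layers, and returns the sums over every leaf.</p>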
|
||||
<h3 id="normalised-metrics-two-pass-cascade">Normalised metrics — two-pass cascade</h3>
|
||||
<p>The normalised finalisation methods call <code>col_weights()</code> first (pass 1), then the normalised partial (pass 2). Both calls go through the same blanket impl, so the cascade is automatic:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">// called on LayeredStore<LayeredStore<PersistentCompactIntMatrix>></span>
|
||||
<span class="k">fn</span><span class="w"> </span><span class="nf">relfreq_bray_dist_matrix</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">Array2</span><span class="o"><</span><span class="kt">f64</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">global</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">col_weights</span><span class="p">();</span><span class="w"> </span><span class="c1">// pass 1 — progressive sum at every level</span>
|
||||
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">partial_relfreq_bray</span><span class="p">(</span><span class="o">&</span><span class="n">global</span><span class="p">);</span><span class="w"> </span><span class="c1">// pass 2 — global passed in cascade</span>
|
||||
<span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">mapv</span><span class="p">(</span><span class="o">|</span><span class="n">v</span><span class="o">|</span><span class="w"> </span><span class="mf">1.0</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="c1">// finalise (diagonal zeroed separately)</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>global_sums</code> is exact because each kmer belongs to exactly one (partition, layer) pair — no double-counting. Pass 1 is itself fully parallel at every level of the hierarchy.</p>
|
||||
<p><code>global</code> is exact: each kmer belongs to exactly one <code>(partition, layer)</code> pair, so there is no double-counting across the hierarchy.</p>
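<p>A small std-only sketch of the two passes shows why splitting kmers across stores does not change the result. It assumes the normalised partial accumulates the sum over kmers of <code>min(x_i / S_i, x_j / S_j)</code>, which the text above does not spell out. <code>Counts</code> and the free functions are hypothetical, and column sums are assumed to be non-zero.</p>

```rust
/// Hypothetical leaf: rows are kmers, columns are samples.
pub struct Counts(pub Vec<Vec<u64>>);

/// Pass 1: per-column sums, added across all stores.
pub fn col_weights(stores: &[Counts], n_cols: usize) -> Vec<u64> {
    let mut sums = vec![0u64; n_cols];
    for store in stores {
        for row in &store.0 {
            for (s, &x) in sums.iter_mut().zip(row) {
                *s += x;
            }
        }
    }
    sums
}

/// Pass 2 for one store: sum over its kmers of min(x_i / S_i, x_j / S_j).
/// Assumption: this is the quantity the normalised partial accumulates.
/// Column sums in `global` are assumed to be non-zero.
pub fn partial_relfreq_bray(store: &Counts, global: &[u64]) -> Vec<Vec<f64>> {
    let n = global.len();
    let mut m = vec![vec![0.0; n]; n];
    for row in &store.0 {
        for i in 0..n {
            for j in 0..n {
                let fi = row[i] as f64 / global[i] as f64;
                let fj = row[j] as f64 / global[j] as f64;
                m[i][j] += fi.min(fj);
            }
        }
    }
    m
}

/// Both passes plus finalisation (1 - sum, diagonal set to zero).
pub fn relfreq_bray_dist(stores: &[Counts], n_cols: usize) -> Vec<Vec<f64>> {
    let global = col_weights(stores, n_cols);
    let mut acc = vec![vec![0.0; n_cols]; n_cols];
    for store in stores {
        let p = partial_relfreq_bray(store, &global);
        for i in 0..n_cols {
            for j in 0..n_cols {
                acc[i][j] += p[i][j];
            }
        }
    }
    for i in 0..n_cols {
        for j in 0..n_cols {
            acc[i][j] = if i == j { 0.0 } else { 1.0 - acc[i][j] };
        }
    }
    acc
}
```

<p>Running this on one store holding all rows and on the same rows split across two stores gives the same matrix, because every partial is normalised by the same global sums.</p>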
|
||||
<hr />
|
||||
<h2 id="parallelism-model">Parallelism model</h2>
|
||||
<table>
|
||||
@@ -1687,31 +1642,32 @@ PartitionedCompactIntMatrix::partial_bray() — global partial →
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>Across partitions</td>
|
||||
<td><code>LayeredDataStore</code></td>
|
||||
<td><code>LayeredStore<LayeredStore<S>></code> inner stores</td>
|
||||
<td>none — fully independent</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Across layers (self-contained)</td>
|
||||
<td><code>(partition, layer)</code> pair</td>
|
||||
<td>Across layers within a partition</td>
|
||||
<td><code>LayeredStore<S></code> inner stores</td>
|
||||
<td>none — disjoint kmer sets</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Across layers (normalised, pass 1)</td>
|
||||
<td><code>(partition, layer)</code> pair</td>
|
||||
<td>none — sums are additive</td>
|
||||
<td>Normalised pass 1 (<code>col_weights</code>)</td>
|
||||
<td>per inner store</td>
|
||||
<td>none — additive</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Across layers (normalised, pass 2)</td>
|
||||
<td><code>(partition, layer)</code> pair</td>
|
||||
<td>global_sums broadcast read-only</td>
|
||||
<td>Normalised pass 2 (partial)</td>
|
||||
<td>per inner store</td>
|
||||
<td><code>global</code> broadcast read-only</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Within a DataStore (distance matrix)</td>
|
||||
<td>Within a matrix (distance)</td>
|
||||
<td>upper-triangle pair <code>(i,j)</code></td>
|
||||
<td>none — rayon par_iter</td>
|
||||
<td>none — rayon <code>par_iter</code></td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p>All levels use rayon <code>par_iter</code> internally; <code>reduce_with</code> performs a parallel tree reduction.</p>
|
||||
<hr />
|
||||
<h2 id="query-model">Query model</h2>
|
||||
<h3 id="point-query-kmer-optionitem">Point query — <code>kmer → Option<Item></code></h3>
|
||||
@@ -1742,19 +1698,24 @@ for (p, l) in all_partition_layer_pairs().par_iter():
|
||||
<p>Other derivations: threshold a count matrix → binary presence matrix; union two presence matrices; merge two count matrices (saturating add, column-wise). All are local to one <code>(partition, layer)</code> pair.</p>
|
||||
<hr />
|
||||
<h2 id="relationship-to-current-implementation">Relationship to current implementation</h2>
|
||||
<p>The current <code>obilayeredmap</code> crate implements a subset of this architecture. Key divergences:</p>
|
||||
<h3 id="what-is-implemented">What is implemented</h3>
|
||||
<ul>
|
||||
<li><code>Layer<D: LayerData></code> fuses <code>MphfLayer</code> and one <code>DataStore</code> into a single generic type. Multiple data stores on the same MPHF are not supported.</li>
|
||||
<li><code>LayerData::open(dir)</code> embeds the path convention (<code>counts/</code>, <code>presence/</code>) inside the store type, preventing the <code>PartitionedIndex</code> from managing paths externally.</li>
|
||||
<li><code>LayeredDataStore</code> and <code>PartitionedDataStore</code> do not yet exist; <code>LayeredMap</code> is a single-partition structure without a distance matrix API.</li>
|
||||
<li>The partial distance methods exist on <code>PersistentCompactIntMatrix</code> and <code>PersistentBitMatrix</code> and are tested; they are not yet composed across layers and partitions.</li>
|
||||
<li><strong><code>obicompactvec::traits</code></strong>: <code>ColumnWeights</code>, <code>CountPartials</code>, <code>BitPartials</code> are defined and implemented on <code>PersistentCompactIntMatrix</code> and <code>PersistentBitMatrix</code>.</li>
|
||||
<li><strong><code>obilayeredmap::LayeredStore<S></code></strong>: generic wrapper with blanket impls for all three traits. <code>LayeredStore<LayeredStore<S>></code> is the partitioned level — no separate type needed. Tests confirm that splitting data across layers and across partitions gives the same distance matrices as computing on flat combined data.</li>
|
||||
</ul>
|
||||
<p>Planned refactoring:
|
||||
1. Extract <code>MphfLayer</code> from <code>Layer<D></code> as an autonomous type.
|
||||
2. Replace <code>LayerData</code> trait with <code>DataStore</code> trait (no path knowledge).
|
||||
3. Implement <code>LayeredCompactIntMatrix</code> / <code>LayeredBitMatrix</code> with the partial + full distance APIs described above.
|
||||
4. Implement <code>PartitionedCompactIntMatrix</code> / <code>PartitionedBitMatrix</code> with two-pass support for normalised metrics.
|
||||
5. Implement <code>PartitionedIndex</code> for point queries with parallel dispatch.</p>
|
||||
<h3 id="what-is-not-yet-implemented">What is not yet implemented</h3>
|
||||
<ul>
|
||||
<li><code>Layer<D: LayerData></code> still fuses <code>MphfLayer</code> and one <code>DataStore</code>. Multiple data stores on the same MPHF are not supported.</li>
|
||||
<li><code>LayeredMap</code> is a single-partition structure without a distance matrix API; it does not yet use <code>LayeredStore</code>.</li>
|
||||
<li>No <code>PartitionedIndex</code> type for point queries with parallel partition dispatch.</li>
|
||||
</ul>
|
||||
<h3 id="planned-refactoring">Planned refactoring</h3>
|
||||
<ol>
|
||||
<li>Extract <code>MphfLayer</code> from <code>Layer<D></code> as an autonomous type.</li>
|
||||
<li>Replace <code>LayerData</code> trait with the <code>DataStore</code> / <code>ColumnWeights</code> / <code>CountPartials</code> / <code>BitPartials</code> system.</li>
|
||||
<li>Rewire <code>LayeredMap</code> to hold <code>LayeredStore<PersistentCompactIntMatrix></code> (or bit variant) alongside the MPHF layers.</li>
|
||||
<li>Implement <code>PartitionedIndex</code> using <code>LayeredStore<LayeredStore<S>></code> for data and parallel dispatch for queries.</li>
|
||||
</ol>
|
||||