45d49ed501
Add comprehensive documentation for the `obilayeredmap` crate, `PersistentCompactIntVec`, `PersistentBitVec`, and the hierarchical k-mer index architecture, including sidebar navigation updates across all documentation pages. Refactor the Bray-Curtis distance computation in `obicompactvec` to decouple numerator and denominator calculations, replacing direct pairwise calls with explicit loops over precomputed sums. Update tests to verify column sum accuracy and align with the simplified API.
626 lines
20 KiB
HTML
|
|
<!DOCTYPE html>
|
|
|
|
<html class="no-js" lang="en">
|
|
<head>
|
|
<meta charset="utf-8"/>
|
|
<meta content="width=device-width,initial-scale=1" name="viewport"/>
|
|
<link href="../kmer/" rel="prev"/>
|
|
<link href="../pipeline/" rel="next"/>
|
|
<link href="../../assets/images/favicon.png" rel="icon"/>
|
|
<meta content="mkdocs-1.6.1, mkdocs-material-9.7.6" name="generator"/>
|
|
<title>Chunk reader - obikmer</title>
|
|
<link href="../../assets/stylesheets/main.484c7ddc.min.css" rel="stylesheet"/>
|
|
<link crossorigin="" href="https://fonts.gstatic.com" rel="preconnect"/>
|
|
<link href="https://fonts.googleapis.com/css?family=Roboto:300,300i,400,400i,700,700i%7CRoboto+Mono:400,400i,700,700i&display=fallback" rel="stylesheet"/>
|
|
<style>:root{--md-text-font:"Roboto";--md-code-font:"Roboto Mono"}</style>
|
|
<script>__md_scope=new URL("../..",location),__md_hash=e=>[...e].reduce(((e,_)=>(e<<5)-e+_.charCodeAt(0)),0),__md_get=(e,_=localStorage,t=__md_scope)=>JSON.parse(_.getItem(t.pathname+"."+e)),__md_set=(e,_,t=localStorage,a=__md_scope)=>{try{t.setItem(a.pathname+"."+e,JSON.stringify(_))}catch(e){}}</script>
|
|
</head>
|
|
<body dir="ltr">
|
|
<input autocomplete="off" class="md-toggle" data-md-toggle="drawer" id="__drawer" type="checkbox"/>
|
|
<input autocomplete="off" class="md-toggle" data-md-toggle="search" id="__search" type="checkbox"/>
|
|
<label class="md-overlay" for="__drawer"></label>
|
|
<div data-md-component="skip">
|
|
<a class="md-skip" href="#chunk-reader-implementation">
|
|
Skip to content
|
|
</a>
|
|
</div>
|
|
<div data-md-component="announce">
|
|
</div>
|
|
<header class="md-header md-header--shadow" data-md-component="header">
|
|
<nav aria-label="Header" class="md-header__inner md-grid">
|
|
<a aria-label="obikmer" class="md-header__button md-logo" data-md-component="logo" href="../.." title="obikmer">
|
|
<svg viewbox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54"></path></svg>
|
|
</a>
|
|
<label class="md-header__button md-icon" for="__drawer">
|
|
<svg viewbox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path d="M3 6h18v2H3zm0 5h18v2H3zm0 5h18v2H3z"></path></svg>
|
|
</label>
|
|
<div class="md-header__title" data-md-component="header-title">
|
|
<div class="md-header__ellipsis">
|
|
<div class="md-header__topic">
|
|
<span class="md-ellipsis">
|
|
obikmer
|
|
</span>
|
|
</div>
|
|
<div class="md-header__topic" data-md-component="header-topic">
|
|
<span class="md-ellipsis">
|
|
|
|
Chunk reader
|
|
|
|
</span>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<script>var palette=__md_get("__palette");if(palette&&palette.color){if("(prefers-color-scheme)"===palette.color.media){var media=matchMedia("(prefers-color-scheme: light)"),input=document.querySelector(media.matches?"[data-md-color-media='(prefers-color-scheme: light)']":"[data-md-color-media='(prefers-color-scheme: dark)']");palette.color.media=input.getAttribute("data-md-color-media"),palette.color.scheme=input.getAttribute("data-md-color-scheme"),palette.color.primary=input.getAttribute("data-md-color-primary"),palette.color.accent=input.getAttribute("data-md-color-accent")}for(var[key,value]of Object.entries(palette.color))document.body.setAttribute("data-md-color-"+key,value)}</script>
|
|
</nav>
|
|
</header>
|
|
<div class="md-container" data-md-component="container">
|
|
<main class="md-main" data-md-component="main">
|
|
<div class="md-main__inner md-grid">
|
|
<div class="md-sidebar md-sidebar--primary" data-md-component="sidebar" data-md-type="navigation">
|
|
<div class="md-sidebar__scrollwrap">
|
|
<div class="md-sidebar__inner">
|
|
<nav aria-label="Navigation" class="md-nav md-nav--primary" data-md-level="0">
|
|
<label class="md-nav__title" for="__drawer">
|
|
<a aria-label="obikmer" class="md-nav__button md-logo" data-md-component="logo" href="../.." title="obikmer">
|
|
<svg viewbox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54"></path></svg>
|
|
</a>
|
|
obikmer
|
|
</label>
|
|
<ul class="md-nav__list" data-md-scrollfix="">
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../..">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Home
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item md-nav__item--nested">
|
|
<input class="md-nav__toggle md-toggle" id="__nav_2" type="checkbox"/>
|
|
<label class="md-nav__link" for="__nav_2" id="__nav_2_label" tabindex="0">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Theory
|
|
|
|
|
|
|
|
</span>
|
|
<span class="md-nav__icon md-icon"></span>
|
|
</label>
|
|
<nav aria-expanded="false" aria-labelledby="__nav_2_label" class="md-nav" data-md-level="1">
|
|
<label class="md-nav__title" for="__nav_2">
|
|
<span class="md-nav__icon md-icon"></span>
|
|
|
|
|
|
Theory
|
|
|
|
|
|
</label>
|
|
<ul class="md-nav__list" data-md-scrollfix="">
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../kmers/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Kmers and super-kmers
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../theory/encoding/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
DNA encoding
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../theory/entropy/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Entropy filter
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../theory/minimizer/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Minimizer selection
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../theory/indexing/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Partitioning architecture
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
</li>
|
|
<li class="md-nav__item md-nav__item--active md-nav__item--nested">
|
|
<input checked="" class="md-nav__toggle md-toggle" id="__nav_3" type="checkbox"/>
|
|
<label class="md-nav__link" for="__nav_3" id="__nav_3_label" tabindex="0">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Implementation
|
|
|
|
|
|
|
|
</span>
|
|
<span class="md-nav__icon md-icon"></span>
|
|
</label>
|
|
<nav aria-expanded="true" aria-labelledby="__nav_3_label" class="md-nav" data-md-level="1">
|
|
<label class="md-nav__title" for="__nav_3">
|
|
<span class="md-nav__icon md-icon"></span>
|
|
|
|
|
|
Implementation
|
|
|
|
|
|
</label>
|
|
<ul class="md-nav__list" data-md-scrollfix="">
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../superkmer/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
SuperKmer
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../kmer/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Kmer
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item md-nav__item--active">
|
|
<input class="md-nav__toggle md-toggle" id="__toc" type="checkbox"/>
|
|
<label class="md-nav__link md-nav__link--active" for="__toc">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Chunk reader
|
|
|
|
|
|
|
|
</span>
|
|
<span class="md-nav__icon md-icon"></span>
|
|
</label>
|
|
<a class="md-nav__link md-nav__link--active" href="./">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Chunk reader
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
<nav aria-label="Table of contents" class="md-nav md-nav--secondary">
|
|
<label class="md-nav__title" for="__toc">
|
|
<span class="md-nav__icon md-icon"></span>
|
|
Table of contents
|
|
</label>
|
|
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix="">
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#output-type-rope">
|
|
<span class="md-ellipsis">
|
|
|
|
Output type: rope
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#allocation-policy">
|
|
<span class="md-ellipsis">
|
|
|
|
Allocation policy
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#seqchunkiter">
|
|
<span class="md-ellipsis">
|
|
|
|
SeqChunkIter
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#boundary-detection-fasta">
|
|
<span class="md-ellipsis">
|
|
|
|
Boundary detection — FASTA
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#boundary-detection-fastq">
|
|
<span class="md-ellipsis">
|
|
|
|
Boundary detection — FASTQ
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../pipeline/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Construction pipeline
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../obipipeline/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
obipipeline library
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../storage/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
On-disk storage
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../mphf/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
MPHF selection
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../unitig_evidence/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Unitig evidence encoding
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../obilayeredmap/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
obilayeredmap crate
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../persistent_compact_int_vec/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
PersistentCompactIntVec
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../persistent_bit_vec/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
PersistentBitVec
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
</li>
|
|
<li class="md-nav__item md-nav__item--nested">
|
|
<input class="md-nav__toggle md-toggle" id="__nav_4" type="checkbox"/>
|
|
<label class="md-nav__link" for="__nav_4" id="__nav_4_label" tabindex="0">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Architecture
|
|
|
|
|
|
|
|
</span>
|
|
<span class="md-nav__icon md-icon"></span>
|
|
</label>
|
|
<nav aria-expanded="false" aria-labelledby="__nav_4_label" class="md-nav" data-md-level="1">
|
|
<label class="md-nav__title" for="__nav_4">
|
|
<span class="md-nav__icon md-icon"></span>
|
|
|
|
|
|
Architecture
|
|
|
|
|
|
</label>
|
|
<ul class="md-nav__list" data-md-scrollfix="">
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../architecture/sequences/invariant/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Sequences
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="../../architecture/index_architecture/">
|
|
<span class="md-ellipsis">
|
|
|
|
|
|
Kmer index
|
|
|
|
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<div class="md-sidebar md-sidebar--secondary" data-md-component="sidebar" data-md-type="toc">
|
|
<div class="md-sidebar__scrollwrap">
|
|
<div class="md-sidebar__inner">
|
|
<nav aria-label="Table of contents" class="md-nav md-nav--secondary">
|
|
<label class="md-nav__title" for="__toc">
|
|
<span class="md-nav__icon md-icon"></span>
|
|
Table of contents
|
|
</label>
|
|
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix="">
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#output-type-rope">
|
|
<span class="md-ellipsis">
|
|
|
|
Output type: rope
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#allocation-policy">
|
|
<span class="md-ellipsis">
|
|
|
|
Allocation policy
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#seqchunkiter">
|
|
<span class="md-ellipsis">
|
|
|
|
SeqChunkIter
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#boundary-detection-fasta">
|
|
<span class="md-ellipsis">
|
|
|
|
Boundary detection — FASTA
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
<li class="md-nav__item">
|
|
<a class="md-nav__link" href="#boundary-detection-fastq">
|
|
<span class="md-ellipsis">
|
|
|
|
Boundary detection — FASTQ
|
|
|
|
</span>
|
|
</a>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<div class="md-content" data-md-component="content">
|
|
<article class="md-content__inner md-typeset">
<h1 id="chunk-reader-implementation">Chunk reader — implementation</h1>
<p>The <code>obiread</code> crate provides a streaming iterator that reads FASTA or FASTQ files in fixed-size blocks and yields self-contained chunks, each ending on a complete sequence record boundary. Chunks are consumed in parallel by downstream workers.</p>
<h2 id="output-type-rope">Output type: rope</h2>
<p>Each chunk is a <code>Vec&lt;Bytes&gt;</code>, that is, a <strong>rope</strong>: a list of reference-counted byte slices that are not necessarily contiguous in memory. The consumer iterates over the slices in order.</p>
<p>Because the slices are <code>bytes::Bytes</code> values, the split at the record boundary is O(1): <code>Bytes::split_to(n)</code> only adjusts the slice bounds and a reference count, and copies no data. In the common case there is no <code>memcpy</code>.</p>
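<p>To make the O(1) split concrete, here is a minimal sketch that uses only the standard library. <code>SharedSlice</code> is a hypothetical stand-in for <code>bytes::Bytes</code>, not the type the crate uses: it pairs a shared buffer with a window and mimics the semantics of <code>split_to</code>, where the returned value is the head and the receiver keeps the remainder.</p>

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for `bytes::Bytes`: a shared buffer plus a window into it.
#[derive(Clone, Debug)]
struct SharedSlice {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedSlice {
    fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        SharedSlice { buf: Arc::from(data), range: 0..len }
    }

    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// Returns the first `n` bytes; `self` keeps the rest.
    /// Only the window bounds and the reference count change.
    fn split_to(&mut self, n: usize) -> SharedSlice {
        assert!(n <= self.range.len());
        let mid = self.range.start + n;
        let head = SharedSlice {
            buf: Arc::clone(&self.buf),
            range: self.range.start..mid,
        };
        self.range.start = mid;
        head
    }
}

fn main() {
    let mut block = SharedSlice::new(b">a\nACGT\n>b\nAC".to_vec());
    // Cut at offset 8, where the second record starts.
    let head = block.split_to(8);
    println!("head: {:?}", String::from_utf8_lossy(head.as_slice()));
    println!("rest: {:?}", String::from_utf8_lossy(block.as_slice()));
}
```

<p>Both values point into the same allocation, so the cost of the split does not depend on the block size.</p>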
<h2 id="allocation-policy">Allocation policy</h2>
<table>
<thead>
<tr>
<th>Case</th>
<th>Cost</th>
</tr>
</thead>
<tbody>
<tr>
<td>Boundary found in the current block (common)</td>
<td>No extra allocation (<code>split_to</code> only)</td>
</tr>
<tr>
<td>Boundary straddles multiple blocks (sequence longer than the block size, rare)</td>
<td>One allocation to pack the rope into a flat buffer</td>
</tr>
<tr>
<td>EOF flush</td>
<td>No extra allocation</td>
</tr>
</tbody>
</table>
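<p>The rare multi-block case can be pictured as follows. This is a sketch under stated assumptions, not the crate's code: <code>pack_rope</code> is a hypothetical helper, and plain byte slices stand in for <code>Bytes</code>. The total length is computed first so that the flat buffer is allocated once.</p>

```rust
/// Hypothetical helper: concatenate the slices of a rope into one flat buffer.
/// The capacity is reserved up front, so the copy needs a single allocation.
fn pack_rope(rope: &[&[u8]]) -> Vec<u8> {
    let total: usize = rope.iter().map(|s| s.len()).sum();
    let mut flat = Vec::with_capacity(total);
    for s in rope {
        flat.extend_from_slice(s);
    }
    flat
}

fn main() {
    // One long record spread over three blocks, followed by the next header.
    let rope: [&[u8]; 3] = [b">long\nACGT", b"ACGTACGT", b"AC\n>next\n"];
    let flat = pack_rope(&rope);
    println!("{} bytes in one buffer", flat.len());
}
```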
<h2 id="seqchunkiter">SeqChunkIter</h2>
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span><span class="w"> </span><span class="nc">SeqChunkIter</span><span class="o">&lt;</span><span class="n">R</span><span class="p">:</span><span class="w"> </span><span class="nc">Read</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* private */</span><span class="w"> </span><span class="p">}</span>

<span class="k">impl</span><span class="o">&lt;</span><span class="n">R</span><span class="p">:</span><span class="w"> </span><span class="nc">Read</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SeqChunkIter</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span>
<span class="w">    </span><span class="k">type</span><span class="w"> </span><span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Bytes</span><span class="o">&gt;&gt;</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">fasta_chunks</span><span class="o">&lt;</span><span class="n">R</span><span class="p">:</span><span class="w"> </span><span class="nc">Read</span><span class="o">&gt;</span><span class="p">(</span><span class="n">source</span><span class="p">:</span><span class="w"> </span><span class="nc">R</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="nc">SeqChunkIter</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;</span>
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">fastq_chunks</span><span class="o">&lt;</span><span class="n">R</span><span class="p">:</span><span class="w"> </span><span class="nc">Read</span><span class="o">&gt;</span><span class="p">(</span><span class="n">source</span><span class="p">:</span><span class="w"> </span><span class="nc">R</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="nc">SeqChunkIter</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;</span>
</code></pre></div>
<p><code>next()</code> loop:</p>
<div class="highlight"><pre><span></span><code>1. read one block of block_size bytes → push onto rope
2. probe check: if the boundary marker ("\n>" or "\n@") is absent from the
   last block, skip the splitter (avoids a full backward scan for nothing)
3. call splitter on last block
   if found at offset n:
       head = last_block.split_to(n)          ← O(1), zero copy; last_block keeps the remainder
       return std::mem::take(&amp;mut self.rope)  ← the chunk
4. if rope.len() > 1 (multi-block accumulation):
       pack rope into one flat buffer         ← one alloc
       retry splitter on flat buffer
5. if EOF: flush remaining rope as final chunk
</code></pre></div>
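<p>The loop above can be sketched with the standard library alone. The version below is a deliberate simplification and not the crate's implementation: <code>FlatChunks</code> and <code>last_fasta_boundary</code> are hypothetical names, chunks are flat <code>Vec&lt;u8&gt;</code> buffers rather than ropes of <code>Bytes</code> (so it copies where the real reader does not), and the probe check of step 2 is left out. It only illustrates the read, split and flush cycle for FASTA input.</p>

```rust
use std::io::{self, Read};

/// Offset of the last '>' that directly follows a line break, if any.
fn last_fasta_boundary(buf: &[u8]) -> Option<usize> {
    (1..buf.len())
        .rev()
        .find(|&i| buf[i] == b'>' && matches!(buf[i - 1], b'\n' | b'\r'))
}

/// Hypothetical simplified chunker: flat buffers instead of a rope.
struct FlatChunks<R: Read> {
    source: R,
    block_size: usize,
    pending: Vec<u8>,
    eof: bool,
}

impl<R: Read> Iterator for FlatChunks<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        while !self.eof {
            // Step 1: read one block and append it to the pending bytes.
            let mut block = vec![0u8; self.block_size];
            let n = match self.source.read(&mut block) {
                Ok(n) => n,
                Err(e) => return Some(Err(e)),
            };
            if n == 0 {
                self.eof = true;
                break;
            }
            self.pending.extend_from_slice(&block[..n]);
            // Steps 3-4: cut at the last record start; keep the rest for later.
            if let Some(cut) = last_fasta_boundary(&self.pending) {
                let rest = self.pending.split_off(cut);
                let chunk = std::mem::replace(&mut self.pending, rest);
                return Some(Ok(chunk));
            }
        }
        // Step 5: at EOF, flush whatever is left as the final chunk.
        if self.pending.is_empty() {
            None
        } else {
            Some(Ok(std::mem::take(&mut self.pending)))
        }
    }
}

fn main() -> io::Result<()> {
    let data: &[u8] = b">a\nACGT\n>b\nGGCC\n>c\nTT\n";
    let chunks = FlatChunks { source: data, block_size: 8, pending: Vec::new(), eof: false };
    for chunk in chunks {
        println!("{:?}", String::from_utf8_lossy(&chunk?));
    }
    Ok(())
}
```

<p>Every chunk produced this way starts on a header line, and concatenating the chunks gives back the input.</p>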
<h2 id="boundary-detection-fasta">Boundary detection — FASTA</h2>
<p>Backward scan with a 2-state machine. Searches for <code>></code> immediately preceded by <code>\n</code> or <code>\r</code>:</p>
<pre class="mermaid"><code>stateDiagram-v2
    direction LR
    [*] --> Scanning
    Scanning --> FoundGt : '>'
    FoundGt --> Scanning : other
    FoundGt --> [*] : '\\n' / '\\r' ✓</code></pre>
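<p>Returns the byte offset of the last <code>></code> that begins a line. The chunk is cut at that offset, so every record before it is complete and the final, possibly truncated, record is left for the next chunk.</p>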
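<p>A minimal sketch of such a backward scan, written against a plain byte slice. The function name <code>last_fasta_record_start</code> is hypothetical. One detail is an assumption that goes slightly beyond the diagram: a second <code>></code> seen while in <code>FoundGt</code> becomes the new candidate instead of resetting to <code>Scanning</code>.</p>

```rust
#[derive(Clone, Copy)]
enum State {
    Scanning,
    FoundGt,
}

/// Scan right to left for a '>' that is immediately preceded by '\n' or '\r'.
/// Returns the offset of that '>', or None if the buffer holds no such boundary.
fn last_fasta_record_start(buf: &[u8]) -> Option<usize> {
    let mut state = State::Scanning;
    let mut candidate = 0;
    for i in (0..buf.len()).rev() {
        let b = buf[i];
        match state {
            // The byte to the left of the candidate '>' is a line break: accept.
            State::FoundGt if b == b'\n' || b == b'\r' => return Some(candidate),
            // Any '>' is remembered as the current candidate.
            _ if b == b'>' => {
                state = State::FoundGt;
                candidate = i;
            }
            // Anything else: keep (or go back to) scanning.
            _ => state = State::Scanning,
        }
    }
    None
}

fn main() {
    let block = b">a\nAC\n>b\nGG";
    println!("{:?}", last_fasta_record_start(block));
}
```

<p>A <code>></code> at offset 0 is never returned because no byte precedes it, which also avoids producing an empty chunk.</p>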
<h2 id="boundary-detection-fastq">Boundary detection — FASTQ</h2>
<p>FASTQ records have a rigid 4-line structure (<code>@header</code>, sequence, <code>+</code>, quality). The <code>@</code> character (ASCII 64, Phred score 31 in the Phred+33 encoding) can appear legitimately in quality lines, even as their first character, making any forward heuristic unreliable. The backward scanner verifies the full structural context before accepting a candidate <code>@</code>.</p>
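<p>A short illustration of the problem, with a hypothetical helper <code>naive_record_starts</code> that treats every <code>@</code> at the start of a line as a record start. In the sample below the first quality line begins with <code>@</code>, so the naive search reports a record start that does not exist.</p>

```rust
/// Naive heuristic (hypothetical): offsets of every '@' that follows a line feed.
fn naive_record_starts(buf: &[u8]) -> Vec<usize> {
    (1..buf.len())
        .filter(|&i| buf[i] == b'@' && buf[i - 1] == b'\n')
        .collect()
}

fn main() {
    // Phred+33 stores a score q as the byte q + 33, so score 31 is '@'.
    assert_eq!(31u8 + 33, b'@');

    // Two records; the quality line of the first one starts with '@'.
    let buf = b"@r1\nACGT\n+\n@III\n@r2\nGGCC\n+\nIIII\n";
    // Offset 11 is inside a quality line; only offset 16 starts a record.
    println!("{:?}", naive_record_starts(buf));
}
```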
<p>The scanner is a 7-state machine (a port of Go's <code>EndOfLastFastqEntry</code>) that reads the block from <strong>right to left</strong>. Each time a <code>+</code> is found in the <code>Scanning</code> state, its position is saved as <code>restart</code>; any state mismatch resets the scan to that position.</p>
<pre class="mermaid"><code>stateDiagram-v2
    direction LR

    [*] --> Scanning

    Scanning --> FoundPlus : '+' (save restart)
    FoundPlus --> AfterNlPlus : '\\n' / '\\r'
    FoundPlus --> Scanning : other → backtrack

    AfterNlPlus --> AfterNlPlus : separator
    AfterNlPlus --> InSequence : letter / - / . / [ / ]
    AfterNlPlus --> Scanning : other → backtrack

    InSequence --> AfterSequence : '\\n' / '\\r'
    InSequence --> InSequence : letter / - / . / [ / ]
    InSequence --> Scanning : other → backtrack

    AfterSequence --> AfterSequence : '\\n' / '\\r'
    AfterSequence --> InHeader : other

    InHeader --> FoundAt : '@' (save cut)
    InHeader --> Scanning : '\\n' / '\\r' → backtrack
    InHeader --> InHeader : other

    FoundAt --> [*] : '\\n' / '\\r' ✓
    FoundAt --> InHeader : other</code></pre>
<p>When a state does not receive one of its expected inputs, the scan jumps back to <code>restart</code>, returns to <code>Scanning</code>, and continues leftward from there. A <code>@</code> is accepted only in <code>InHeader</code>, which is reached only after a <code>+</code> line and a sequence line have been matched to its right, so a <code>@</code> inside a quality line cannot be accepted as a record start.</p>
<p>Returns the byte offset of the accepted <code>@</code>, that is, the start of the last record whose header, sequence and <code>+</code> line are all present in the block. The chunk is cut at that offset.</p>
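<p>The diagram can be transcribed into a small sketch over a byte slice. It follows the transitions drawn above rather than the crate's source, and the name <code>last_fastq_record_start</code> is hypothetical. Two points are assumptions: "separator" is taken to mean a line break, space or tab, and "letter" is taken to mean an ASCII letter.</p>

```rust
#[derive(Clone, Copy)]
enum State {
    Scanning,
    FoundPlus,
    AfterNlPlus,
    InSequence,
    AfterSequence,
    InHeader,
    FoundAt,
}

fn is_eol(b: u8) -> bool {
    b == b'\n' || b == b'\r'
}

/// Bytes allowed in a sequence line (assumed: ASCII letters plus - . [ ]).
fn is_seq(b: u8) -> bool {
    b.is_ascii_alphabetic() || matches!(b, b'-' | b'.' | b'[' | b']')
}

/// Right-to-left scan; returns the offset of the accepted '@', if any.
fn last_fastq_record_start(buf: &[u8]) -> Option<usize> {
    use State::*;
    let mut state = Scanning;
    let mut restart = 0usize; // position of the last '+' taken as a candidate
    let mut cut = 0usize; // position of the candidate '@'
    let mut i = buf.len();
    while i > 0 {
        i -= 1;
        let b = buf[i];
        state = match state {
            Scanning if b == b'+' => {
                restart = i;
                FoundPlus
            }
            Scanning => Scanning,
            FoundPlus if is_eol(b) => AfterNlPlus,
            AfterNlPlus if is_eol(b) || b == b' ' || b == b'\t' => AfterNlPlus,
            AfterNlPlus if is_seq(b) => InSequence,
            InSequence if is_eol(b) => AfterSequence,
            InSequence if is_seq(b) => InSequence,
            AfterSequence if is_eol(b) => AfterSequence,
            AfterSequence => InHeader,
            InHeader if b == b'@' => {
                cut = i;
                FoundAt
            }
            InHeader if !is_eol(b) => InHeader,
            FoundAt if is_eol(b) => return Some(cut),
            FoundAt => InHeader,
            // Mismatch: backtrack and resume scanning just left of the saved '+'.
            _ => {
                i = restart;
                Scanning
            }
        };
    }
    None
}

fn main() {
    // The last quality line starts with '@' and must not be taken as a header.
    let buf = b"@r1\nACGT\n+\nII@I\n@r2\nGG\n+\n@I\n";
    println!("{:?}", last_fastq_record_start(buf));
}
```

<p>In the sample, the <code>@</code> that opens the final quality line is skipped while the machine is still in <code>Scanning</code>; the accepted <code>@</code> is the one that opens the <code>@r2</code> header.</p>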
</article>
|
|
</div>
|
|
<script>var target=document.getElementById(location.hash.slice(1));target&&target.name&&(target.checked=target.name.startsWith("__tabbed_"))</script>
|
|
</div>
|
|
</main>
|
|
<footer class="md-footer">
|
|
<div class="md-footer-meta md-typeset">
|
|
<div class="md-footer-meta__inner md-grid">
|
|
<div class="md-copyright">
|
|
|
|
|
|
Made with
|
|
<a href="https://squidfunk.github.io/mkdocs-material/" rel="noopener" target="_blank">
|
|
Material for MkDocs
|
|
</a>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</footer>
|
|
</div>
|
|
<div class="md-dialog" data-md-component="dialog">
|
|
<div class="md-dialog__inner md-typeset"></div>
|
|
</div>
|
|
<script id="__config" type="application/json">{"annotate": null, "base": "../..", "features": [], "search": "../../assets/javascripts/workers/search.2c215733.min.js", "tags": null, "translations": {"clipboard.copied": "Copied to clipboard", "clipboard.copy": "Copy to clipboard", "search.result.more.one": "1 more on this page", "search.result.more.other": "# more on this page", "search.result.none": "No matching documents", "search.result.one": "1 matching document", "search.result.other": "# matching documents", "search.result.placeholder": "Type to start searching", "search.result.term.missing": "Missing", "select.version": "Select version"}, "version": null}</script>
|
|
<script src="../../assets/javascripts/bundle.79ae519e.min.js"></script>
|
|
<script src="https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js"></script>
|
|
</body>
|
|
</html> |