Files
obitools4/autodoc/docmd/pkg/obiformats/rope_scanner.md
T
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00

1.5 KiB
Raw Blame History

ropeScanner — Line-by-Line Text Scanning over a Rope Data Structure

The obiformats package provides the ropeScanner, an efficient line-oriented iterator over a Rope (a tree-based immutable string representation, implemented here as PieceOfChunk). This scanner supports streaming large texts without full materialization.

Core Functionality

  • newRopeScanner(rope *PieceOfChunk)
    Constructs a new scanner starting at the root of the rope.

  • ReadLine() []byte
    Returns the next line (without trailing \n, or \r\n) as a byte slice.

    • Returns nil when the end of the rope is reached.
    • Reuses internal buffers (carry) to handle lines spanning multiple nodes efficiently.
    • The returned slice aliases rope data and is only valid until the next call.
  • skipToNewline()
    Advances internal position to just after the next newline (\n), discarding content. Useful for skipping unwanted lines or headers.

Implementation Highlights

  • Buffered carry-over: Lines split across rope nodes are assembled incrementally in the carry buffer, which grows dynamically.
  • Cross-platform line endings: Automatically strips \r\n, leaving only the content (no trailing CR).
  • Zero-copy where possible: When a line fits entirely within one node and no carry exists, it returns a slice directly into the ropes underlying data.

Use Case

Ideal for parsing large text files or streams (e.g., OBIE/Obi formats) where memory efficiency and streaming behavior are critical—without loading the entire content into RAM.