Replaces deeply nested branching with early returns and `then_some`. Introduces a cycle-detecting `find_chain_start` method and updates `UnitigNucIter` to use step-based iteration with atomic node claiming. This eliminates nested iterators and redundant state management, improving code readability and maintainability.
Refactor `leavable` and `reachable` to eliminate duplicated graph traversal logic by mutually delegating via `WalkState`. `leavable` now returns `self.walk(graph).is_some()`, while `reachable` delegates to the inverted `direct` state's `leavable` check. This centralizes kmer extension and visited-state validation in `walk`, simplifying control flow and reducing code duplication.
Remove the `std::arch::naked_asm` import as it is no longer required. Introduce a `collect_unitigs` helper to abstract nucleotide sequence extraction from `GraphDeBruijn`, and refactor the test suite to use it, eliminating repetitive collection code and standardizing graph iteration logic.
Renames `compute_degrees` to `compute_degrees_and_mark_starts` across the De Bruijn graph and partitioner layers to consolidate degree calculation and start-node flagging. Introduces safe neighbor iteration methods and a debug validation block to verify graph consistency. Refactors unitig extraction to use sequential execution with a `Mutex` for safe error propagation. Fixes malformed and duplicated method calls, adds auto-generation of missing `meta.json` files, and ensures persistent matrix builders are explicitly closed to finalize metadata.
Switches unitig processing to a lazy, fallible `try_for_each_unitig` API across partitioner layers, reducing intermediate allocations and enabling proper error propagation. Refactors de Bruijn graph traversal into a two-pass algorithm with explicit node flags, named constants, and diagnostic logging. Introduces parallel chain processing and staged performance profiling for the unitig command, and adds a memory-efficient `FromIterator` implementation for packed nucleotide sequences.
Replace the non-atomic `set_visited` with atomic `fetch_or` bitmask operations to enable thread-safe node claiming. Introduce a two-phase extraction pipeline where `par_for_each_chain_unitig` builds chains in parallel and `for_each_remaining_unitig` sequentially handles residual cycles and junctions. Add `is_start` and `collect_from_start` to explicitly define unitig boundaries. Wrap `BufWriter` in a `Mutex` and use an `AtomicUsize` counter to ensure thread-safe concurrent FASTA output, refactoring the write logic into a shared closure for safe multi-threaded execution.
Integrate rayon to enable parallel processing of k-mer partitions and degree computation. Replace Cell with AtomicU8 to ensure thread-safe node state management, and add a merge method for combining disjoint graphs. Additionally, introduce progress tracking utilities and a test-utils feature flag for development dependencies.
Extracted core indexing logic, state tracking, and metadata management into a new `obikindex` crate. Refactored the `index` and `unitig` commands to leverage the `KmerIndex` abstraction and state-driven pipeline transitions. Removed obsolete CLI subcommands (`count`, `fasta`, `longtig`, `partition`) and their associated pipeline steps. Updated FASTA writing utilities for single-line output and deterministic identifiers, and refreshed workspace dependencies.
Centralize k-mer and minimizer configuration using a thread-safe global module, and replace manual bit-packing with a memory-efficient `PackedSeq` type. Refactor core sequence and k-mer types to use compile-time length enforcement and centralized hashing. Introduce a new De Bruijn graph implementation with compact node encoding and traversal iterators. Update I/O, partitioning, and builder modules to align with the new architecture, and add the `xxhash-rust` dependency.
- Refactored Node representation using compact bitfields for neighbor counts
and nucleotides; added count_neighbors helper to compute_degrees()
- Introduced StartIter iterator for unitig/longtigu generation with revised
traversal semantics (e.g., interior node marking)
- Added nucleotide() accessor to Kmer type for 2-bit extraction at position i
- Renamed unitig.rs → longtigs, updated CLI command and output filenames to reflect "long t ig"
- Extended extract_kmers() in scripts/compare.py with duplication statistics
```
Refactors the `start_iter` function to separate chain/cycle detection into two passes, consolidates visited marking logic internally in `next_unitig_kmer`, and streamlines the public API by removing redundant methods, updating iteration parameters to `(start_first_next)`, and eliminating external state like `at_start`.
Refactor core types to consistently use `CanonicalKMer` (lexicographically minimal of k-mer and its reverse complement) as the canonical representation, ensuring deterministic behavior in graph traversal (unitig decomposition), neighbor resolution (`unique_neighbor` with `[CanonicalKmer; 4]` input) and scatter output generation. Introduce `RoutableSuperKmer`, add `.seq_hash()` support, fix type syntax errors in unitig extraction methods and deduplication tests. Update all k-mer construction to use canonical-aware APIs, including unsafe unchecked constructors for performance-critical paths.