Adds the ability to read a gzipped tar file for the taxonomy dump

Eric Coissac
2025-01-24 11:47:59 +01:00
parent ffd67252c3
commit 3137c1f841
17 changed files with 305 additions and 64 deletions

@@ -2,6 +2,14 @@
## Latest changes
### Breaking changes
- In `obimultiplex`, the short version of the **--tag-list** option, which specifies the list
of tags and primers used for demultiplexing, has been changed from `-t` to `-s`.
- The **--taxdump** option used to specify the path to the taxdump containing the NCBI taxonomy
has been renamed to **--taxonomy**.
### Bug fixes
- In `obipairing`, correct the stats `seq_a_single` and `seq_b_single` when
@@ -13,6 +21,10 @@
### New features
- The NCBI taxonomy dump no longer needs to be uncompressed and unarchived. The
path to the gzipped tar dump file can be passed directly to the
**--taxonomy** option.
- Most of the time, obitools automatically identifies the sequence file format, but
detection sometimes fails. Two new options, **--fasta** and **--fastq**, allow
processing of the rare fasta and fastq files that are not recognized.
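
To illustrate what reading the taxonomy dump straight from a gzipped tar archive involves, here is a minimal Python sketch. It is not the obitools implementation; the function names `read_taxdump_member` and `parse_dmp_line` are hypothetical and used only for illustration. It assumes the archive holds the usual NCBI `.dmp` files such as `nodes.dmp`, in which fields are separated by a tab, a vertical bar and a tab.

```python
import tarfile


def read_taxdump_member(archive_path, member_name="nodes.dmp"):
    """Return the text lines of one file stored in a gzip-compressed tar archive.

    Hypothetical helper: the archive is read in place, without unpacking it to disk.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Members may be stored with a leading directory such as "./".
            if member.isfile() and member.name.split("/")[-1] == member_name:
                stream = archive.extractfile(member)
                return stream.read().decode("utf-8").splitlines()
    raise FileNotFoundError(f"{member_name} not found in {archive_path}")


def parse_dmp_line(line):
    """Split one NCBI .dmp row into its fields.

    Rows end with a tab and a vertical bar; fields are separated by tab, bar, tab.
    """
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")
```

Reading the archive member by member avoids the separate decompression and extraction step that the release note says is no longer required.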