mirror of
https://github.com/metabarcoding/obitools4.git
synced 2025-06-29 16:20:46 +00:00
Last commit version
118
Release-notes.md
@@ -1,19 +1,29 @@
|
|||||||
# OBITools release notes
|
# OBITools release notes
|
||||||
|
|
||||||
## Latest changes
|
## March 2nd, 2025. Release 4.3.0
|
||||||
|
|
||||||
|
A new documentation website is available at https://obitools4.metabarcoding.org.
|
||||||
|
Its development is still in progress.
|
||||||
|
|
||||||
### Breaking changes
|
### Breaking changes
|
||||||
|
|
||||||
- In `obimultiplex`, the short version of the **--tag-list** option used to specify the list
|
- In `obimultiplex`, the short version of the **--tag-list** option used to
|
||||||
of tags and primers to be used for the demultiplexing has been changed from `-t` to `-s`.
|
specify the list of tags and primers to be used for the demultiplexing has
|
||||||
|
been changed from `-t` to `-s`.
|
||||||
|
|
||||||
- The command `obifind` is now renamed `obitaxonomy`.
|
- The command `obifind` is now renamed `obitaxonomy`.
|
||||||
|
|
||||||
- The **--taxdump** option used to specify the path to the taxdump containing the NCBI taxonomy
|
- The **--taxdump** option used to specify the path to the taxdump containing
|
||||||
has been renamed to **--taxonomy**.
|
the NCBI taxonomy has been renamed to **--taxonomy**.
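Scripts written for earlier releases can be updated mechanically for the two renames listed above. The following is a minimal sketch, not an official migration tool: `migrate_obitools_script` is a hypothetical helper name introduced here for illustration. The `obimultiplex` short-option change (`-t` to `-s`) is deliberately left out, because a blind substitution could also rewrite the `-t` option of unrelated commands.

```bash
# Hypothetical helper (not part of OBITools4): rewrites the renamed command
# and option in text read from stdin or from the files given as arguments.
migrate_obitools_script() {
  sed -E \
    -e 's/(^|[^[:alnum:]_])obifind([^[:alnum:]_]|$)/\1obitaxonomy\2/g' \
    -e 's/--taxdump([^[:alnum:]_-]|$)/--taxonomy\1/g' \
    "$@"
}

# Preview the rewrite of a single old-style command line.
printf 'obifind --taxdump ncbi/ human\n' | migrate_obitools_script
```

The helper only prints the rewritten text, so the result can be reviewed before any script is overwritten.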
|
||||||
|
|
||||||
### Bug fixes
|
### Bug fixes
|
||||||
|
|
||||||
|
- Fixed a bug when using paired sequence files with the **--out** option.
|
||||||
|
|
||||||
|
- Fixed a bug in `obitag` when annotating very short sequences of
|
||||||
|
4 bases or fewer.
|
||||||
|
|
||||||
|
|
||||||
- In `obipairing`, correct the `seq_a_single` and `seq_b_single` statistics
|
- In `obipairing`, correct the `seq_a_single` and `seq_b_single` statistics
|
||||||
in right-alignment mode.
|
in right-alignment mode.
|
||||||
|
|
||||||
@@ -21,16 +31,32 @@
|
|||||||
the batch size and not reading the qualities from the fastq files as `obiuniq`
|
the batch size and not reading the qualities from the fastq files as `obiuniq`
|
||||||
is producing only fasta output without qualities.
|
is producing only fasta output without qualities.
|
||||||
|
|
||||||
|
- In `obitag`, correct the wrong assignment of the **obitag_bestmatch**
|
||||||
|
attribute.
|
||||||
|
|
||||||
|
- In `obiclean`, the **--no-progress-bar** option disables all progress bars,
|
||||||
|
not just the data.
|
||||||
|
|
||||||
|
- Several fixes in reading FASTA and FASTQ files, including some code
|
||||||
|
simplification and factorization.
|
||||||
|
|
||||||
|
- Fixed a bug in all obitools that caused the same file to be processed
|
||||||
|
multiple times, when specifying a directory name as input.
|
||||||
|
|
||||||
|
|
||||||
### New features
|
### New features
|
||||||
|
|
||||||
|
- `obigrep` adds a new **--valid-taxid** option to keep only sequences with a
|
||||||
|
valid taxid.
|
||||||
|
|
||||||
- `obiclean` adds a new **--min-sample-count** option with a default value of 1,
|
- `obiclean` adds a new **--min-sample-count** option with a default value of 1,
|
||||||
asking to filter out sequences which are not occurring in at least the
|
asking to filter out sequences which are not occurring in at least the
|
||||||
specified number of samples.
|
specified number of samples.
|
||||||
|
|
||||||
- `obitaxonomy`: a new **--dump|D** option allows for dumping a sub-taxonomy.
|
- `obitaxonomy`: a new **--dump|D** option allows for dumping a sub-taxonomy.
|
||||||
|
|
||||||
- Taxonomy dump can now be provided as a four-column CSV file to the **--taxonomy**
|
- Taxonomy dump can now be provided as a four-column CSV file to the
|
||||||
option.
|
**--taxonomy** option.
|
||||||
|
|
||||||
- NCBI Taxonomy dump does not need to be uncompressed and unarchived anymore. The
|
- NCBI Taxonomy dump does not need to be uncompressed and unarchived anymore. The
|
||||||
path of the tar and gzipped dump file can be directly specified using the
|
path of the tar and gzipped dump file can be directly specified using the
|
||||||
@@ -44,8 +70,23 @@
|
|||||||
- `md5_string()`: returning the MD5 check sum as a hexadecimal string,
|
- `md5_string()`: returning the MD5 check sum as a hexadecimal string,
|
||||||
- `subsequence(from,to)`: allows extracting a subsequence on a 0-based
|
- `subsequence(from,to)`: allows extracting a subsequence on a 0-based
|
||||||
coordinate system, upper bound excluded, as in Go.
|
coordinate system, upper bound excluded, as in Go.
|
||||||
- `reverse_complement`: returning a sequence object corresponding to the reverse complement
|
- `reverse_complement`: returning a sequence object corresponding to the
|
||||||
of the current sequence.
|
reverse complement of the current sequence.
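The 0-based, upper-bound-excluded convention of `subsequence(from,to)` can be illustrated outside the expression language with plain bash substring expansion. This is only an analogy: the shell function `subsequence` below is a hypothetical stand-in written for this note and does not call any OBITools code.

```bash
# Hypothetical stand-in mimicking the half-open [from, to) convention.
subsequence() {
  local seq=$1 from=$2 to=$3
  # ${var:offset:length} uses a 0-based offset; length is to - from,
  # so the character at index "to" is excluded.
  printf '%s\n' "${seq:$from:$((to - from))}"
}

subsequence ACGTACGT 2 5   # prints GTA (indices 2, 3 and 4)
```

With this convention the length of the result is simply `to - from`, and `subsequence(0, n)` returns the first `n` symbols.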
|
||||||
|
|
||||||
|
### Enhancement
|
||||||
|
|
||||||
|
- In every *OBITools* command, the progress bar is automatically deactivated
|
||||||
|
when the standard error output is redirected.
|
||||||
|
- Because GenBank and ENA:EMBL contain very large sequences, while
|
||||||
|
OBITools4 is optimized for short sequences, `obipcr` faces some problems
|
||||||
|
with excessive consumption of computer resources, especially memory. Several
|
||||||
|
improvements in the tuning of the default `obipcr` parameters and some new
|
||||||
|
features, currently only available for FASTA and FASTQ file readers, have
|
||||||
|
been implemented to limit the memory impact of `obipcr` without changing the
|
||||||
|
computational efficiency too much.
|
||||||
|
- The logging system, and therefore the log format, has been homogenized.
|
||||||
|
|
||||||
|
|
||||||
### Change of git repository
|
### Change of git repository
|
||||||
|
|
||||||
@@ -54,35 +95,16 @@
|
|||||||
Be sure to use the new install script to retrieve the new version.
|
Be sure to use the new install script to retrieve the new version.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -L https://raw.githubusercontent.com/metabarcoding/obitools4/master/install_obitools.sh \
|
curl -L https://metabarcoding.org/obitools4/install.sh \
|
||||||
| bash
|
| bash
|
||||||
```
|
```
|
||||||
|
|
||||||
or with options:
|
or with options:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -L https://raw.githubusercontent.com/metabarcoding/obitools4/master/install_obitools.sh \
|
curl -L https://metabarcoding.org/obitools4/install.sh \
|
||||||
| bash -s -- --install-dir test_install --obitools-prefix k
|
| bash -s -- --install-dir test_install --obitools-prefix k
|
||||||
```
|
```
|
||||||
|
|
||||||
### CPU limitation
|
|
||||||
|
|
||||||
- By default, *OBITools4* tries to use all the computing power available on
|
|
||||||
your computer. In some circumstances this can be problematic (e.g. if you
|
|
||||||
are running on a computer cluster managed by your university). You can limit
|
|
||||||
the number of CPU cores used by *OBITools4* or by using the **--max-cpu**
|
|
||||||
option or by setting the **OBIMAXCPU** environment variable. Some strange
|
|
||||||
behavior of *OBITools4* has been observed when users try to limit the
|
|
||||||
maximum number of usable CPU cores to one. This seems to be caused by the Go
|
|
||||||
language, and it is not obvious to get *OBITools4* to run correctly on a
|
|
||||||
single core in all circumstances. Therefore, if you ask to use a single
|
|
||||||
core, **OBITools4** will print a warning message and actually set this
|
|
||||||
parameter to two cores. If you really want a single core, you can use the
|
|
||||||
**--force-one-core** option. But be aware that this can lead to incorrect
|
|
||||||
calculations.
|
|
||||||
|
|
||||||
### New features
|
|
||||||
|
|
||||||
- The output of the obitools will evolve to produce results only in standard
|
- The output of the obitools will evolve to produce results only in standard
|
||||||
formats such as fasta and fastq. For non-sequential data, the output will be
|
formats such as fasta and fastq. For non-sequential data, the output will be
|
||||||
in CSV format, with the separator `,`, the decimal separator `.`, and a
|
in CSV format, with the separator `,`, the decimal separator `.`, and a
|
||||||
@@ -162,30 +184,22 @@
|
|||||||
Special data lines starting with `@param` in the first column allow configuring the algorithm. The **--template** option provides an extensively
|
Special data lines starting with `@param` in the first column allow configuring the algorithm. The **--template** option provides an extensively
|
||||||
commented example of the CSV format, including all the possible options.
|
commented example of the CSV format, including all the possible options.
|
||||||
|
|
||||||
### Enhancement
|
### CPU limitation
|
||||||
|
|
||||||
- In every *OBITools* command, the progress bar is automatically deactivated
|
- By default, *OBITools4* tries to use all the computing power available on
|
||||||
when the standard error output is redirected.
|
your computer. In some circumstances this can be problematic (e.g. if you
|
||||||
- Because Genbank and ENA:EMBL contain very large sequences, while OBITools4
|
are running on a computer cluster managed by your university). You can limit
|
||||||
are optimized As Genbank and ENA:EMBL contain very large sequences, while
|
the number of CPU cores used by *OBITools4* either by using the **--max-cpu**
|
||||||
OBITools4 is optimized for short sequences, `obipcr` faces some problems
|
option or by setting the **OBIMAXCPU** environment variable. Some strange
|
||||||
with excessive consumption of computer resources, especially memory. Several
|
behavior of *OBITools4* has been observed when users try to limit the
|
||||||
improvements in the tuning of the default `obipcr` parameters and some new
|
maximum number of usable CPU cores to one. This seems to be caused by the Go
|
||||||
features, currently only available for FASTA and FASTQ file readers, have
|
language, and it is not straightforward to get *OBITools4* to run correctly on a
|
||||||
been implemented to limit the memory impact of `obipcr` without changing the
|
single core in all circumstances. Therefore, if you ask to use a single
|
||||||
computational efficiency too much.
|
core, **OBITools4** will print a warning message and actually set this
|
||||||
- Logging system and therefore format, have been homogenized.
|
parameter to two cores. If you really want a single core, you can use the
|
||||||
|
**--force-one-core** option. But be aware that this can lead to incorrect
|
||||||
|
calculations.
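The two ways of limiting CPU usage described above might look as follows in practice. This is a sketch under assumptions: the value `4` is arbitrary, `reads.fastq` is a placeholder file name, and `obicount` is used only as an example command (any OBITools4 command should accept the same option).

```bash
# Cap every OBITools4 command started from this shell at 4 cores
# (OBIMAXCPU is the environment variable named in the release notes).
export OBIMAXCPU=4

# Or cap a single invocation with the --max-cpu option.
# The guard keeps this snippet harmless when obicount or the
# placeholder input file is not available.
if command -v obicount >/dev/null 2>&1 && [ -f reads.fastq ]; then
  obicount --max-cpu 4 reads.fastq
fi
```

As noted above, requesting a single core is raised to two cores with a warning unless **--force-one-core** is given.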
|
||||||
|
|
||||||
### Bug
|
|
||||||
|
|
||||||
- In `obitag`, correct the wrong assignment of the **obitag_bestmatch**
|
|
||||||
attribute.
|
|
||||||
- In `obiclean`, the **--no-progress-bar** option disables all progress bars,
|
|
||||||
not just the data.
|
|
||||||
- Several fixes in reading FASTA and FASTQ files, including some code
|
|
||||||
simplification and factorization.
|
|
||||||
- Fixed a bug in all obitools that caused the same file to be processed
|
|
||||||
multiple times, when specifying a directory name as input.
|
|
||||||
|
|
||||||
## April 2nd, 2024. Release 4.2.0
|
## April 2nd, 2024. Release 4.2.0
|
||||||
|
|
||||||
|