Complement to the documentation

Former-commit-id: 89952a6f3bb261a6aaec24430906e635914ffce4
2026-02-03 06:40:33 +00:00 · 2023-12-04 13:16:34 +01:00
parent eb351a7530
commit 03bef6461d
83 changed files with 65993 additions and 10547 deletions
--- a/doc/book/_freeze/formats/execute-results/epub.json
+++ b/doc/book/_freeze/formats/execute-results/epub.json
--- a/doc/book/_freeze/formats/execute-results/html.json
+++ b/doc/book/_freeze/formats/execute-results/html.json
--- a/doc/book/_freeze/formats/execute-results/tex.json
+++ b/doc/book/_freeze/formats/execute-results/tex.json
--- a/doc/book/_freeze/index/execute-results/epub.json
+++ b/doc/book/_freeze/index/execute-results/epub.json
@@ -0,0 +1,19 @@
+{
+  "hash": "0d2b5eca01fd0d516bf9dee813b291f8",
+  "result": {
+    "markdown": "```{css}\ncode.sourceCode div.cell-output-stdout {\n  font-size: 0.8em;\n}\n\ndiv.cell-output-stdout {\n  font-size: 0.8em;\n}\n```\n\n\n# Preface {.unnumbered}\n\nDevelopment of the first version of *OBITools* started in 2005. This was at the beginning of the DNA metabarcoding story at the Laboratoire d'Ecologie Alpine (LECA) in Grenoble. At that time, with Pierre Taberlet and François Pompanon, we were thinking about the potential of this new methodology under development. Pierre and François focused more on the laboratory methods, while I was thinking more about the tools for analysing the sequences produced. Two ideas were behind this development. I wanted something modular, and something easy to extend. To achieve the first goal, I decided to implement *OBITools* as a suite of Unix commands mimicking the classic Unix commands but dedicated to sequence files. The basic Unix commands are very useful for automatically manipulating, parsing and editing text files. They work as a stream, processing the input text line by line. The result is a new text file that can be used as input for the next command. Such a design makes it possible to quickly develop a text processing pipeline by chaining simple elementary operations. The *OBITools* are the exact counterpart of these basic Unix commands, but the basic information they process is a sequence (potentially spanning several lines of text), not a single line of text. Most *OBITools* consume sequence files and produce sequence files. Thus, the principles of chaining and modularity are respected. In order to be able to easily extend the *OBITools* to keep up with our evolving ideas about processing DNA metabarcoding data, it was decided to develop them using an interpreted language: Python. Python 2, the version available at the time, allowed us to develop the *OBITools* efficiently. When parts of the algorithms were computationally demanding, they were implemented in C and linked to the Python code. 
Even though Python is not the most efficient language available, and even though computers were not as powerful as they are today, the size of the data we could produce using 454 sequencers or early Solexa machines was small enough to be processed in a reasonable time.\n\nThe first public version of *OBITools* was [*OBITools2*](https://metabarcoding.org/obitools) [@Boyer2016-gq]. This was actually a cleaned-up and documented version of *OBITools* that had been running at LECA for years and had not really been distributed except to a few collaborators. This is where *OBITools* started its public life. The DNA metabarcoding spring schools provided and still provide user training every year. But *OBITools2* soon suffered from two limitations: it was developed in Python 2, which was increasingly abandoned in favour of Python 3, and the data size kept increasing with the new Illumina machines. Python's intrinsic slowness coupled with the increasing size of the datasets made *OBITools* computation times increasingly long. The abandonment of all maintenance of Python 2 by its developers also imposed the need for a new version of *OBITools*. \n\n[*OBITools3*](https://metabarcoding.org/obitools3) was the first response to this crisis. Developed and maintained by [Céline Mercier](https://www.celine-mercier.info), *OBITools3* attempted to address several limitations of *OBITools2*. It is a completely new code base, mainly developed in Python 3, with most of the lower layer code written in C for efficiency. *OBITools3* has also abandoned text files in favour of binary files for the same reason of efficiency. Text files have been replaced by a database structure that keeps track of every operation performed on the data. \n\nHere we present *OBITools4*, which can be seen as a return to the origins of *OBITools*. 
While *OBITools3* offered traceability of analyses, which is in line with the concept of open science, and faster execution, *OBITools2* was more versatile and not only usable for the analysis of DNA metabarcoding data. *OBITools4* is the third full implementation of *OBITools*. The idea behind this new version is to go back to the original design of *OBITools*, which ran on text files containing sequences, like the classic Unix commands, while running at least as fast as *OBITools3* and taking advantage of the multicore architecture of all modern laptops. For this, the idea of relying on an interpreted language was abandoned. The *OBITools4* are now fully implemented in the [Go](https://go.dev) language with the exception of a few small pieces of specific code already implemented very efficiently in C. *OBITools4* also implement a new format for the annotations inserted in the header of every sequence. Rather than relying on a format specific to *OBITools*, by default *OBITools4* use the [JSON](https://www.json.org) format. This simplifies the writing of parsers in any language, and thus allows *OBITools* to interact easily with other software.\n\n",
+    "supporting": [
+      "index_files"
+    ],
+    "filters": [],
+    "engineDependencies": {
+      "jupyter": [
+        {
+          "jsWidgets": false,
+          "jupyterWidgets": false,
+          "htmlLibraries": []
+        }
+      ]
+    }
+  }
+}
--- a/doc/book/_freeze/index/execute-results/html.json
+++ b/doc/book/_freeze/index/execute-results/html.json
@@ -0,0 +1,11 @@
+{
+  "hash": "0d2b5eca01fd0d516bf9dee813b291f8",
+  "result": {
+    "markdown": "```{css}\ncode.sourceCode div.cell-output-stdout {\n  font-size: 0.8em;\n}\n\ndiv.cell-output-stdout {\n  font-size: 0.8em;\n}\n```\n\n\n# Preface {.unnumbered}\n\nDevelopment of the first version of *OBITools* started in 2005. This was at the beginning of the DNA metabarcoding story at the Laboratoire d'Ecologie Alpine (LECA) in Grenoble. At that time, with Pierre Taberlet and François Pompanon, we were thinking about the potential of this new methodology under development. Pierre and François focused more on the laboratory methods, while I was thinking more about the tools for analysing the sequences produced. Two ideas were behind this development. I wanted something modular, and something easy to extend. To achieve the first goal, I decided to implement *OBITools* as a suite of Unix commands mimicking the classic Unix commands but dedicated to sequence files. The basic Unix commands are very useful for automatically manipulating, parsing and editing text files. They work as a stream, processing the input text line by line. The result is a new text file that can be used as input for the next command. Such a design makes it possible to quickly develop a text processing pipeline by chaining simple elementary operations. The *OBITools* are the exact counterpart of these basic Unix commands, but the basic information they process is a sequence (potentially spanning several lines of text), not a single line of text. Most *OBITools* consume sequence files and produce sequence files. Thus, the principles of chaining and modularity are respected. In order to be able to easily extend the *OBITools* to keep up with our evolving ideas about processing DNA metabarcoding data, it was decided to develop them using an interpreted language: Python. Python 2, the version available at the time, allowed us to develop the *OBITools* efficiently. When parts of the algorithms were computationally demanding, they were implemented in C and linked to the Python code. 
Even though Python is not the most efficient language available, and even though computers were not as powerful as they are today, the size of the data we could produce using 454 sequencers or early Solexa machines was small enough to be processed in a reasonable time.\n\nThe first public version of *OBITools* was [*OBITools2*](https://metabarcoding.org/obitools) [@Boyer2016-gq]. This was actually a cleaned-up and documented version of *OBITools* that had been running at LECA for years and had not really been distributed except to a few collaborators. This is where *OBITools* started its public life. The DNA metabarcoding spring schools provided and still provide user training every year. But *OBITools2* soon suffered from two limitations: it was developed in Python 2, which was increasingly abandoned in favour of Python 3, and the data size kept increasing with the new Illumina machines. Python's intrinsic slowness coupled with the increasing size of the datasets made *OBITools* computation times increasingly long. The abandonment of all maintenance of Python 2 by its developers also imposed the need for a new version of *OBITools*. \n\n[*OBITools3*](https://metabarcoding.org/obitools3) was the first response to this crisis. Developed and maintained by [Céline Mercier](https://www.celine-mercier.info), *OBITools3* attempted to address several limitations of *OBITools2*. It is a completely new code base, mainly developed in Python 3, with most of the lower layer code written in C for efficiency. *OBITools3* has also abandoned text files in favour of binary files for the same reason of efficiency. Text files have been replaced by a database structure that keeps track of every operation performed on the data. \n\nHere we present *OBITools4*, which can be seen as a return to the origins of *OBITools*. 
While *OBITools3* offered traceability of analyses, which is in line with the concept of open science, and faster execution, *OBITools2* was more versatile and not only usable for the analysis of DNA metabarcoding data. *OBITools4* is the third full implementation of *OBITools*. The idea behind this new version is to go back to the original design of *OBITools*, which ran on text files containing sequences, like the classic Unix commands, while running at least as fast as *OBITools3* and taking advantage of the multicore architecture of all modern laptops. For this, the idea of relying on an interpreted language was abandoned. The *OBITools4* are now fully implemented in the [Go](https://go.dev) language with the exception of a few small pieces of specific code already implemented very efficiently in C. *OBITools4* also implement a new format for the annotations inserted in the header of every sequence. Rather than relying on a format specific to *OBITools*, by default *OBITools4* use the [JSON](https://www.json.org) format. This simplifies the writing of parsers in any language, and thus allows *OBITools* to interact easily with other software.\n\n",
+    "supporting": [
+      "index_files/figure-html"
+    ],
+    "filters": [],
+    "includes": {}
+  }
+}
--- a/doc/book/_freeze/index/execute-results/tex.json
+++ b/doc/book/_freeze/index/execute-results/tex.json
@@ -0,0 +1,10 @@
+{
+  "hash": "6372b3136d3b9e38461c2fd369bcdefd",
+  "result": {
+    "markdown": "```{css eval=FALSE}\ncode.sourceCode div.cell-output-stdout {\n  font-size: 0.8em;\n}\n\ndiv.cell-output-stdout {\n  font-size: 0.8em;\n}\n```\n\n\n# Preface {.unnumbered}\n\nDevelopment of the first version of *OBITools* started in 2005. This was at the beginning of the DNA metabarcoding story at the Laboratoire d'Ecologie Alpine (LECA) in Grenoble. At that time, with Pierre Taberlet and François Pompanon, we were thinking about the potential of this new methodology under development. Pierre and François focused more on the laboratory methods, while I was thinking more about the tools for analysing the sequences produced. Two ideas were behind this development. I wanted something modular, and something easy to extend. To achieve the first goal, I decided to implement *OBITools* as a suite of Unix commands mimicking the classic Unix commands but dedicated to sequence files. The basic Unix commands are very useful for automatically manipulating, parsing and editing text files. They work as a stream, processing the input text line by line. The result is a new text file that can be used as input for the next command. Such a design makes it possible to quickly develop a text processing pipeline by chaining simple elementary operations. The *OBITools* are the exact counterpart of these basic Unix commands, but the basic information they process is a sequence (potentially spanning several lines of text), not a single line of text. Most *OBITools* consume sequence files and produce sequence files. Thus, the principles of chaining and modularity are respected. In order to be able to easily extend the *OBITools* to keep up with our evolving ideas about processing DNA metabarcoding data, it was decided to develop them using an interpreted language: Python. Python 2, the version available at the time, allowed us to develop the *OBITools* efficiently. When parts of the algorithms were computationally demanding, they were implemented in C and linked to the Python code. 
Even though Python is not the most efficient language available, and even though computers were not as powerful as they are today, the size of the data we could produce using 454 sequencers or early Solexa machines was small enough to be processed in a reasonable time.\n\nThe first public version of *OBITools* was [*OBITools2*](https://metabarcoding.org/obitools) [@Boyer2016-gq]. This was actually a cleaned-up and documented version of *OBITools* that had been running at LECA for years and had not really been distributed except to a few collaborators. This is where *OBITools* started its public life. The DNA metabarcoding spring schools provided and still provide user training every year. But *OBITools2* soon suffered from two limitations: it was developed in Python 2, which was increasingly abandoned in favour of Python 3, and the data size kept increasing with the new Illumina machines. Python's intrinsic slowness coupled with the increasing size of the datasets made *OBITools* computation times increasingly long. The abandonment of all maintenance of Python 2 by its developers also imposed the need for a new version of *OBITools*. \n\n[*OBITools3*](https://metabarcoding.org/obitools3) was the first response to this crisis. Developed and maintained by [Céline Mercier](https://www.celine-mercier.info), *OBITools3* attempted to address several limitations of *OBITools2*. It is a completely new code base, mainly developed in Python 3, with most of the lower layer code written in C for efficiency. *OBITools3* has also abandoned text files in favour of binary files for the same reason of efficiency. Text files have been replaced by a database structure that keeps track of every operation performed on the data. \n\nHere we present *OBITools4*, which can be seen as a return to the origins of *OBITools*. 
While *OBITools3* offered traceability of analyses, which is in line with the concept of open science, and faster execution, *OBITools2* was more versatile and not only usable for the analysis of DNA metabarcoding data. *OBITools4* is the third full implementation of *OBITools*. The idea behind this new version is to go back to the original design of *OBITools*, which ran on text files containing sequences, like the classic Unix commands, while running at least as fast as *OBITools3* and taking advantage of the multicore architecture of all modern laptops. For this, the idea of relying on an interpreted language was abandoned. The *OBITools4* are now fully implemented in the [Go](https://go.dev) language with the exception of a few small pieces of specific code already implemented very efficiently in C. *OBITools4* also implement a new format for the annotations inserted in the header of every sequence. Rather than relying on a format specific to *OBITools*, by default *OBITools4* use the [JSON](https://www.json.org) format. This simplifies the writing of parsers in any language, and thus allows *OBITools* to interact easily with other software.\n\n",
+    "supporting": [
+      "index_files/figure-pdf"
+    ],
+    "filters": []
+  }
+}
--- a/doc/book/_freeze/installation/execute-results/epub.json
+++ b/doc/book/_freeze/installation/execute-results/epub.json
@@ -0,0 +1,19 @@
+{
+  "hash": "cdc3ee1d58e10538db75a4867b10ee0e",
+  "result": {
+    "markdown": "# Installation of the *OBITools*\n\n## Availability of the *OBITools*\n\nThe *OBITools* are open source and protected by the [CeCILL 2.1 license](http://www.cecill.info/licences/Licence_CeCILL_V2.1-en.html).\n\nAll the sources of the [*OBITools4*](http://metabarcoding.org/obitools4) can be downloaded from the metabarcoding git server (https://git.metabarcoding.org).\n\n## Prerequisites\n\nThe *OBITools4* are developed using the [Go programming language](https://go.dev/). We stick to the latest version of the language, which is $1.21.4$ at the time of writing. If you want to download and compile the sources yourself, you first need to install the corresponding compiler on your system. Some parts of the software are also written in C, so a recent C compiler is also required: GCC on Linux or Windows, the Developer Tools on Mac.\n\nWhichever installation method you choose, you will have to ensure that a C compiler is available on your system.\n\n## Installation with the install script\n\nAn installation script that compiles the new *OBITools* on your Unix-like system is available online.\nThe easiest way to run it is to copy and paste the following command into your terminal:\n\n::: {.cell execution_count=1}\n``` {.bash .cell-code}\ncurl -L https://metabarcoding.org/obitools4/install.sh | bash\n```\n:::\n\n\nBy default, the script installs the *OBITools* commands and other associated files into the `/usr/local` directory.\nThe names of the commands in the new *OBITools4* are mostly identical to those in *OBITools2*.\nTherefore, installing the new *OBITools* may hide or delete the old ones. 
If you want both versions to be \navailable on your system, the installation script offers two options:\n\n\n>  -i, --install-dir       Directory where *OBITools* are installed \n>                          (as example use `/usr/local` not `/usr/local/bin`).\n> \n>  -p, --obitools-prefix   Prefix added to the *OBITools* command names if you\n>                          want to have several versions of obitools at the\n>                          same time on your system (as example `-p g` will produce \n>                          `gobigrep` command instead of `obigrep`).\n\nYou can pass these options to the installation command as follows:\n\n::: {.cell execution_count=2}\n``` {.bash .cell-code}\ncurl -L https://metabarcoding.org/obitools4/install.sh | \\\n      bash -s -- --install-dir test_install --obitools-prefix k\n```\n:::\n\n\nIn this case, the binaries will be installed in the `test_install` directory and all command names will be prefixed with the letter `k`. Thus `obigrep` will be named `kobigrep`.\n\n\n## Compilation from sources\n\n",
+    "supporting": [
+      "installation_files/figure-epub"
+    ],
+    "filters": [],
+    "engineDependencies": {
+      "jupyter": [
+        {
+          "jsWidgets": false,
+          "jupyterWidgets": false,
+          "htmlLibraries": []
+        }
+      ]
+    }
+  }
+}
--- a/doc/book/_freeze/installation/execute-results/html.json
+++ b/doc/book/_freeze/installation/execute-results/html.json
@@ -0,0 +1,11 @@
+{
+  "hash": "cdc3ee1d58e10538db75a4867b10ee0e",
+  "result": {
+    "markdown": "# Installation of the *OBITools*\n\n## Availability of the *OBITools*\n\nThe *OBITools* are open source and protected by the [CeCILL 2.1 license](http://www.cecill.info/licences/Licence_CeCILL_V2.1-en.html).\n\nAll the sources of the [*OBITools4*](http://metabarcoding.org/obitools4) can be downloaded from the metabarcoding git server (https://git.metabarcoding.org).\n\n## Prerequisites\n\nThe *OBITools4* are developed using the [Go programming language](https://go.dev/). We stick to the latest version of the language, which is $1.21.4$ at the time of writing. If you want to download and compile the sources yourself, you first need to install the corresponding compiler on your system. Some parts of the software are also written in C, so a recent C compiler is also required: GCC on Linux or Windows, the Developer Tools on Mac.\n\nWhichever installation method you choose, you will have to ensure that a C compiler is available on your system.\n\n## Installation with the install script\n\nAn installation script that compiles the new *OBITools* on your Unix-like system is available online.\nThe easiest way to run it is to copy and paste the following command into your terminal:\n\n::: {.cell execution_count=1}\n``` {.bash .cell-code}\ncurl -L https://metabarcoding.org/obitools4/install.sh | bash\n```\n:::\n\n\nBy default, the script installs the *OBITools* commands and other associated files into the `/usr/local` directory.\nThe names of the commands in the new *OBITools4* are mostly identical to those in *OBITools2*.\nTherefore, installing the new *OBITools* may hide or delete the old ones. 
If you want both versions to be \navailable on your system, the installation script offers two options:\n\n\n>  -i, --install-dir       Directory where *OBITools* are installed \n>                          (as example use `/usr/local` not `/usr/local/bin`).\n> \n>  -p, --obitools-prefix   Prefix added to the *OBITools* command names if you\n>                          want to have several versions of obitools at the\n>                          same time on your system (as example `-p g` will produce \n>                          `gobigrep` command instead of `obigrep`).\n\nYou can pass these options to the installation command as follows:\n\n::: {.cell execution_count=2}\n``` {.bash .cell-code}\ncurl -L https://metabarcoding.org/obitools4/install.sh | \\\n      bash -s -- --install-dir test_install --obitools-prefix k\n```\n:::\n\n\nIn this case, the binaries will be installed in the `test_install` directory and all command names will be prefixed with the letter `k`. Thus `obigrep` will be named `kobigrep`.\n\n\n## Compilation from sources\n\n",
+    "supporting": [
+      "installation_files/figure-html"
+    ],
+    "filters": [],
+    "includes": {}
+  }
+}
--- a/doc/book/_freeze/installation/execute-results/tex.json
+++ b/doc/book/_freeze/installation/execute-results/tex.json
@@ -0,0 +1,10 @@
+{
+  "hash": "cdc3ee1d58e10538db75a4867b10ee0e",
+  "result": {
+    "markdown": "# Installation of the *OBITools*\n\n## Availability of the *OBITools*\n\nThe *OBITools* are open source and protected by the [CeCILL 2.1 license](http://www.cecill.info/licences/Licence_CeCILL_V2.1-en.html).\n\nAll the sources of the [*OBITools4*](http://metabarcoding.org/obitools4) can be downloaded from the metabarcoding git server (https://git.metabarcoding.org).\n\n## Prerequisites\n\nThe *OBITools4* are developed using the [Go programming language](https://go.dev/). We stick to the latest version of the language, which is $1.21.4$ at the time of writing. If you want to download and compile the sources yourself, you first need to install the corresponding compiler on your system. Some parts of the software are also written in C, so a recent C compiler is also required: GCC on Linux or Windows, the Developer Tools on Mac.\n\nWhichever installation method you choose, you will have to ensure that a C compiler is available on your system.\n\n## Installation with the install script\n\nAn installation script that compiles the new *OBITools* on your Unix-like system is available online.\nThe easiest way to run it is to copy and paste the following command into your terminal:\n\n::: {.cell execution_count=1}\n``` {.bash .cell-code}\ncurl -L https://metabarcoding.org/obitools4/install.sh | bash\n```\n:::\n\n\nBy default, the script installs the *OBITools* commands and other associated files into the `/usr/local` directory.\nThe names of the commands in the new *OBITools4* are mostly identical to those in *OBITools2*.\nTherefore, installing the new *OBITools* may hide or delete the old ones. 
If you want both versions to be \navailable on your system, the installation script offers two options:\n\n\n>  -i, --install-dir       Directory where *OBITools* are installed \n>                          (as example use `/usr/local` not `/usr/local/bin`).\n> \n>  -p, --obitools-prefix   Prefix added to the *OBITools* command names if you\n>                          want to have several versions of obitools at the\n>                          same time on your system (as example `-p g` will produce \n>                          `gobigrep` command instead of `obigrep`).\n\nYou can pass these options to the installation command as follows:\n\n::: {.cell execution_count=2}\n``` {.bash .cell-code}\ncurl -L https://metabarcoding.org/obitools4/install.sh | \\\n      bash -s -- --install-dir test_install --obitools-prefix k\n```\n:::\n\n\nIn this case, the binaries will be installed in the `test_install` directory and all command names will be prefixed with the letter `k`. Thus `obigrep` will be named `kobigrep`.\n\n\n## Compilation from sources\n\n",
+    "supporting": [
+      "installation_files/figure-pdf"
+    ],
+    "filters": []
+  }
+}