remap every 100000 values

2016-03-14 17:03:24 +01:00
395 changed files with 6416 additions and 57005 deletions
--- a/518
+++ b/518
@ -1,518 +0,0 @@
  CeCILL FREE SOFTWARE LICENSE AGREEMENT
 Version 2.1 dated 2013-06-21
    Notice
 This Agreement is a Free Software license agreement that is the result
 of discussions between its authors in order to ensure compliance with
 the two main principles guiding its drafting:
  * firstly, compliance with the principles governing the distribution
    of Free Software: access to source code, broad rights granted to users,
  * secondly, the election of a governing law, French law, with which it
    is conformant, both as regards the law of torts and intellectual
    property law, and the protection that it offers to both authors and
    holders of the economic rights over software.
 The authors of the CeCILL (for Ce[a] C[nrs] I[nria] L[ogiciel] L[ibre]) 
 license are: 
 Commissariat à l'énergie atomique et aux énergies alternatives - CEA, a
 public scientific, technical and industrial research establishment,
 having its principal place of business at 25 rue Leblanc, immeuble Le
 Ponant D, 75015 Paris, France.
 Centre National de la Recherche Scientifique - CNRS, a public scientific
 and technological establishment, having its principal place of business
 at 3 rue Michel-Ange, 75794 Paris cedex 16, France.
 Institut National de Recherche en Informatique et en Automatique -
 Inria, a public scientific and technological establishment, having its
 principal place of business at Domaine de Voluceau, Rocquencourt, BP
 105, 78153 Le Chesnay cedex, France.
    Preamble
 The purpose of this Free Software license agreement is to grant users
 the right to modify and redistribute the software governed by this
 license within the framework of an open source distribution model.
 The exercising of this right is conditional upon certain obligations for
 users so as to preserve this status for all subsequent redistributions.
 In consideration of access to the source code and the rights to copy,
 modify and redistribute granted by the license, users are provided only
 with a limited warranty and the software's author, the holder of the
 economic rights, and the successive licensors only have limited liability.
 In this respect, the risks associated with loading, using, modifying
 and/or developing or reproducing the software by the user are brought to
 the user's attention, given its Free Software status, which may make it
 complicated to use, with the result that its use is reserved for
 developers and experienced professionals having in-depth computer
 knowledge. Users are therefore encouraged to load and test the
 suitability of the software as regards their requirements in conditions
 enabling the security of their systems and/or data to be ensured and,
 more generally, to use and operate it in the same conditions of
 security. This Agreement may be freely reproduced and published,
 provided it is not altered, and that no provisions are either added or
 removed herefrom.
 This Agreement may apply to any or all software for which the holder of
 the economic rights decides to submit the use thereof to its provisions.
 Frequently asked questions can be found on the official website of the
 CeCILL licenses family (http://www.cecill.info/index.en.html) for any 
 necessary clarification.
    Article 1 - DEFINITIONS
 For the purpose of this Agreement, when the following expressions
 commence with a capital letter, they shall have the following meaning:
 Agreement: means this license agreement, and its possible subsequent
 versions and annexes.
 Software: means the software in its Object Code and/or Source Code form
 and, where applicable, its documentation, "as is" when the Licensee
 accepts the Agreement.
 Initial Software: means the Software in its Source Code and possibly its
 Object Code form and, where applicable, its documentation, "as is" when
 it is first distributed under the terms and conditions of the Agreement.
 Modified Software: means the Software modified by at least one
 Contribution.
 Source Code: means all the Software's instructions and program lines to
 which access is required so as to modify the Software.
 Object Code: means the binary files originating from the compilation of
 the Source Code.
 Holder: means the holder(s) of the economic rights over the Initial
 Software.
 Licensee: means the Software user(s) having accepted the Agreement.
 Contributor: means a Licensee having made at least one Contribution.
 Licensor: means the Holder, or any other individual or legal entity, who
 distributes the Software under the Agreement.
 Contribution: means any or all modifications, corrections, translations,
 adaptations and/or new functions integrated into the Software by any or
 all Contributors, as well as any or all Internal Modules.
 Module: means a set of sources files including their documentation that
 enables supplementary functions or services in addition to those offered
 by the Software.
 External Module: means any or all Modules, not derived from the
 Software, so that this Module and the Software run in separate address
 spaces, with one calling the other when they are run.
 Internal Module: means any or all Module, connected to the Software so
 that they both execute in the same address space.
 GNU GPL: means the GNU General Public License version 2 or any
 subsequent version, as published by the Free Software Foundation Inc.
 GNU Affero GPL: means the GNU Affero General Public License version 3 or
 any subsequent version, as published by the Free Software Foundation Inc.
 EUPL: means the European Union Public License version 1.1 or any
 subsequent version, as published by the European Commission.
 Parties: mean both the Licensee and the Licensor.
 These expressions may be used both in singular and plural form.
    Article 2 - PURPOSE
 The purpose of the Agreement is the grant by the Licensor to the
 Licensee of a non-exclusive, transferable and worldwide license for the
 Software as set forth in Article 5 <#scope> hereinafter for the whole
 term of the protection granted by the rights over said Software.
    Article 3 - ACCEPTANCE
 3.1 The Licensee shall be deemed as having accepted the terms and
 conditions of this Agreement upon the occurrence of the first of the
 following events:
  * (i) loading the Software by any or all means, notably, by
    downloading from a remote server, or by loading from a physical medium;
  * (ii) the first time the Licensee exercises any of the rights granted
    hereunder.
 3.2 One copy of the Agreement, containing a notice relating to the
 characteristics of the Software, to the limited warranty, and to the
 fact that its use is restricted to experienced users has been provided
 to the Licensee prior to its acceptance as set forth in Article 3.1
 <#accepting> hereinabove, and the Licensee hereby acknowledges that it
 has read and understood it.
    Article 4 - EFFECTIVE DATE AND TERM
      4.1 EFFECTIVE DATE
 The Agreement shall become effective on the date when it is accepted by
 the Licensee as set forth in Article 3.1 <#accepting>.
      4.2 TERM
 The Agreement shall remain in force for the entire legal term of
 protection of the economic rights over the Software.
    Article 5 - SCOPE OF RIGHTS GRANTED
 The Licensor hereby grants to the Licensee, who accepts, the following
 rights over the Software for any or all use, and for the term of the
 Agreement, on the basis of the terms and conditions set forth hereinafter.
 Besides, if the Licensor owns or comes to own one or more patents
 protecting all or part of the functions of the Software or of its
 components, the Licensor undertakes not to enforce the rights granted by
 these patents against successive Licensees using, exploiting or
 modifying the Software. If these patents are transferred, the Licensor
 undertakes to have the transferees subscribe to the obligations set
 forth in this paragraph.
      5.1 RIGHT OF USE
 The Licensee is authorized to use the Software, without any limitation
 as to its fields of application, with it being hereinafter specified
 that this comprises:
 1. permanent or temporary reproduction of all or part of the Software
    by any or all means and in any or all form.
 2. loading, displaying, running, or storing the Software on any or all
    medium.
 3. entitlement to observe, study or test its operation so as to
    determine the ideas and principles behind any or all constituent
    elements of said Software. This shall apply when the Licensee
    carries out any or all loading, displaying, running, transmission or
    storage operation as regards the Software, that it is entitled to
    carry out hereunder.
      5.2 ENTITLEMENT TO MAKE CONTRIBUTIONS
 The right to make Contributions includes the right to translate, adapt,
 arrange, or make any or all modifications to the Software, and the right
 to reproduce the resulting software.
 The Licensee is authorized to make any or all Contributions to the
 Software provided that it includes an explicit notice that it is the
 author of said Contribution and indicates the date of the creation thereof.
      5.3 RIGHT OF DISTRIBUTION
 In particular, the right of distribution includes the right to publish,
 transmit and communicate the Software to the general public on any or
 all medium, and by any or all means, and the right to market, either in
 consideration of a fee, or free of charge, one or more copies of the
 Software by any means.
 The Licensee is further authorized to distribute copies of the modified
 or unmodified Software to third parties according to the terms and
 conditions set forth hereinafter.
        5.3.1 DISTRIBUTION OF SOFTWARE WITHOUT MODIFICATION
 The Licensee is authorized to distribute true copies of the Software in
 Source Code or Object Code form, provided that said distribution
 complies with all the provisions of the Agreement and is accompanied by:
 1. a copy of the Agreement,
 2. a notice relating to the limitation of both the Licensor's warranty
    and liability as set forth in Articles 8 and 9,
 and that, in the event that only the Object Code of the Software is
 redistributed, the Licensee allows effective access to the full Source
 Code of the Software for a period of at least three years from the
 distribution of the Software, it being understood that the additional
 acquisition cost of the Source Code shall not exceed the cost of the
 data transfer.
        5.3.2 DISTRIBUTION OF MODIFIED SOFTWARE
 When the Licensee makes a Contribution to the Software, the terms and
 conditions for the distribution of the resulting Modified Software
 become subject to all the provisions of this Agreement.
 The Licensee is authorized to distribute the Modified Software, in
 source code or object code form, provided that said distribution
 complies with all the provisions of the Agreement and is accompanied by:
 1. a copy of the Agreement,
 2. a notice relating to the limitation of both the Licensor's warranty
    and liability as set forth in Articles 8 and 9,
 and, in the event that only the object code of the Modified Software is
 redistributed,
 3. a note stating the conditions of effective access to the full source
    code of the Modified Software for a period of at least three years
    from the distribution of the Modified Software, it being understood
    that the additional acquisition cost of the source code shall not
    exceed the cost of the data transfer.
        5.3.3 DISTRIBUTION OF EXTERNAL MODULES
 When the Licensee has developed an External Module, the terms and
 conditions of this Agreement do not apply to said External Module, that
 may be distributed under a separate license agreement.
        5.3.4 COMPATIBILITY WITH OTHER LICENSES
 The Licensee can include a code that is subject to the provisions of one
 of the versions of the GNU GPL, GNU Affero GPL and/or EUPL in the
 Modified or unmodified Software, and distribute that entire code under
 the terms of the same version of the GNU GPL, GNU Affero GPL and/or EUPL.
 The Licensee can include the Modified or unmodified Software in a code
 that is subject to the provisions of one of the versions of the GNU GPL,
 GNU Affero GPL and/or EUPL and distribute that entire code under the
 terms of the same version of the GNU GPL, GNU Affero GPL and/or EUPL.
    Article 6 - INTELLECTUAL PROPERTY
      6.1 OVER THE INITIAL SOFTWARE
 The Holder owns the economic rights over the Initial Software. Any or
 all use of the Initial Software is subject to compliance with the terms
 and conditions under which the Holder has elected to distribute its work
 and no one shall be entitled to modify the terms and conditions for the
 distribution of said Initial Software.
 The Holder undertakes that the Initial Software will remain ruled at
 least by this Agreement, for the duration set forth in Article 4.2 <#term>.
      6.2 OVER THE CONTRIBUTIONS
 The Licensee who develops a Contribution is the owner of the
 intellectual property rights over this Contribution as defined by
 applicable law.
      6.3 OVER THE EXTERNAL MODULES
 The Licensee who develops an External Module is the owner of the
 intellectual property rights over this External Module as defined by
 applicable law and is free to choose the type of agreement that shall
 govern its distribution.
      6.4 JOINT PROVISIONS
 The Licensee expressly undertakes:
 1. not to remove, or modify, in any manner, the intellectual property
    notices attached to the Software;
 2. to reproduce said notices, in an identical manner, in the copies of
    the Software modified or not.
 The Licensee undertakes not to directly or indirectly infringe the
 intellectual property rights on the Software of the Holder and/or
 Contributors, and to take, where applicable, vis-à-vis its staff, any
 and all measures required to ensure respect of said intellectual
 property rights of the Holder and/or Contributors.
    Article 7 - RELATED SERVICES
 7.1 Under no circumstances shall the Agreement oblige the Licensor to
 provide technical assistance or maintenance services for the Software.
 However, the Licensor is entitled to offer this type of services. The
 terms and conditions of such technical assistance, and/or such
 maintenance, shall be set forth in a separate instrument. Only the
 Licensor offering said maintenance and/or technical assistance services
 shall incur liability therefor.
 7.2 Similarly, any Licensor is entitled to offer to its licensees, under
 its sole responsibility, a warranty, that shall only be binding upon
 itself, for the redistribution of the Software and/or the Modified
 Software, under terms and conditions that it is free to decide. Said
 warranty, and the financial terms and conditions of its application,
 shall be subject of a separate instrument executed between the Licensor
 and the Licensee.
    Article 8 - LIABILITY
 8.1 Subject to the provisions of Article 8.2, the Licensee shall be
 entitled to claim compensation for any direct loss it may have suffered
 from the Software as a result of a fault on the part of the relevant
 Licensor, subject to providing evidence thereof.
 8.2 The Licensor's liability is limited to the commitments made under
 this Agreement and shall not be incurred as a result of in particular:
 (i) loss due the Licensee's total or partial failure to fulfill its
 obligations, (ii) direct or consequential loss that is suffered by the
 Licensee due to the use or performance of the Software, and (iii) more
 generally, any consequential loss. In particular the Parties expressly
 agree that any or all pecuniary or business loss (i.e. loss of data,
 loss of profits, operating loss, loss of customers or orders,
 opportunity cost, any disturbance to business activities) or any or all
 legal proceedings instituted against the Licensee by a third party,
 shall constitute consequential loss and shall not provide entitlement to
 any or all compensation from the Licensor.
    Article 9 - WARRANTY
 9.1 The Licensee acknowledges that the scientific and technical
 state-of-the-art when the Software was distributed did not enable all
 possible uses to be tested and verified, nor for the presence of
 possible defects to be detected. In this respect, the Licensee's
 attention has been drawn to the risks associated with loading, using,
 modifying and/or developing and reproducing the Software which are
 reserved for experienced users.
 The Licensee shall be responsible for verifying, by any or all means,
 the suitability of the product for its requirements, its good working
 order, and for ensuring that it shall not cause damage to either persons
 or properties.
 9.2 The Licensor hereby represents, in good faith, that it is entitled
 to grant all the rights over the Software (including in particular the
 rights set forth in Article 5 <#scope>).
 9.3 The Licensee acknowledges that the Software is supplied "as is" by
 the Licensor without any other express or tacit warranty, other than
 that provided for in Article 9.2 <#good-faith> and, in particular,
 without any warranty as to its commercial value, its secured, safe,
 innovative or relevant nature.
 Specifically, the Licensor does not warrant that the Software is free
 from any error, that it will operate without interruption, that it will
 be compatible with the Licensee's own equipment and software
 configuration, nor that it will meet the Licensee's requirements.
 9.4 The Licensor does not either expressly or tacitly warrant that the
 Software does not infringe any third party intellectual property right
 relating to a patent, software or any other property right. Therefore,
 the Licensor disclaims any and all liability towards the Licensee
 arising out of any or all proceedings for infringement that may be
 instituted in respect of the use, modification and redistribution of the
 Software. Nevertheless, should such proceedings be instituted against
 the Licensee, the Licensor shall provide it with technical and legal
 expertise for its defense. Such technical and legal expertise shall be
 decided on a case-by-case basis between the relevant Licensor and the
 Licensee pursuant to a memorandum of understanding. The Licensor
 disclaims any and all liability as regards the Licensee's use of the
 name of the Software. No warranty is given as regards the existence of
 prior rights over the name of the Software or as regards the existence
 of a trademark.
    Article 10 - TERMINATION
 10.1 In the event of a breach by the Licensee of its obligations
 hereunder, the Licensor may automatically terminate this Agreement
 thirty (30) days after notice has been sent to the Licensee and has
 remained ineffective.
 10.2 A Licensee whose Agreement is terminated shall no longer be
 authorized to use, modify or distribute the Software. However, any
 licenses that it may have granted prior to termination of the Agreement
 shall remain valid subject to their having been granted in compliance
 with the terms and conditions hereof.
    Article 11 - MISCELLANEOUS
      11.1 EXCUSABLE EVENTS
 Neither Party shall be liable for any or all delay, or failure to
 perform the Agreement, that may be attributable to an event of force
 majeure, an act of God or an outside cause, such as defective
 functioning or interruptions of the electricity or telecommunications
 networks, network paralysis following a virus attack, intervention by
 government authorities, natural disasters, water damage, earthquakes,
 fire, explosions, strikes and labor unrest, war, etc.
 11.2 Any failure by either Party, on one or more occasions, to invoke
 one or more of the provisions hereof, shall under no circumstances be
 interpreted as being a waiver by the interested Party of its right to
 invoke said provision(s) subsequently.
 11.3 The Agreement cancels and replaces any or all previous agreements,
 whether written or oral, between the Parties and having the same
 purpose, and constitutes the entirety of the agreement between said
 Parties concerning said purpose. No supplement or modification to the
 terms and conditions hereof shall be effective as between the Parties
 unless it is made in writing and signed by their duly authorized
 representatives.
 11.4 In the event that one or more of the provisions hereof were to
 conflict with a current or future applicable act or legislative text,
 said act or legislative text shall prevail, and the Parties shall make
 the necessary amendments so as to comply with said act or legislative
 text. All other provisions shall remain effective. Similarly, invalidity
 of a provision of the Agreement, for any reason whatsoever, shall not
 cause the Agreement as a whole to be invalid.
      11.5 LANGUAGE
 The Agreement is drafted in both French and English and both versions
 are deemed authentic.
    Article 12 - NEW VERSIONS OF THE AGREEMENT
 12.1 Any person is authorized to duplicate and distribute copies of this
 Agreement.
 12.2 So as to ensure coherence, the wording of this Agreement is
 protected and may only be modified by the authors of the License, who
 reserve the right to periodically publish updates or new versions of the
 Agreement, each with a separate number. These subsequent versions may
 address new issues encountered by Free Software.
 12.3 Any Software distributed under a given version of the Agreement may
 only be subsequently distributed under the same version of the Agreement
 or a subsequent version, subject to the provisions of Article 5.3.4
 <#compatibility>.
    Article 13 - GOVERNING LAW AND JURISDICTION
 13.1 The Agreement is governed by French law. The Parties agree to
 endeavor to seek an amicable solution to any disagreements or disputes
 that may arise during the performance of the Agreement.
 13.2 Failing an amicable solution within two (2) months as from their
 occurrence, and unless emergency proceedings are necessary, the
 disagreements or disputes shall be referred to the Paris Courts having
 jurisdiction, by the more diligent Party.
--- a/MANIFEST.in
+++ b/MANIFEST.in
--- a/README.md
+++ b/README.md
@ -1,40 +0,0 @@
 The `OBITools3`: A package for the management of analyses and data in DNA metabarcoding   
 ---------------------------------------------
 DNA metabarcoding offers new perspectives for biodiversity research [1]. This approach to ecosystem studies relies heavily on Next-Generation Sequencing (NGS), and consequently requires the ability to process large volumes of data. The `OBITools` package satisfies this requirement with a set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context [2] - <http://metabarcoding.org/obitools>. Their capacity to filter and edit sequences while taking taxonomic annotation into account helps to set up tailor-made analysis pipelines for a broad range of DNA metabarcoding applications, including biodiversity surveys and diet analyses.   
 **The `OBITools3`.** This new version of the `OBITools` aims to significantly improve storage efficiency and data processing speed. To this end, the `OBITools3` rely on an ad hoc database system that stores all the data a DNA metabarcoding experiment must consider: the sequences, the metadata (describing for instance the samples), the database containing the reference sequences used for the taxonomic annotation, as well as the taxonomic databases. Besides the gain in efficiency, this new structure allows easier access to all the data associated with an experiment.   
 **Column-oriented storage.** An analysis pipeline is a succession of commands, each computing one step of the analysis, where the result of command *n* is used by command *n+1*. DNA metabarcoding data can easily be represented as tables, and each command can be regarded as an operation transforming one or several 'input' tables into one or several 'output' tables, which can be used by the next command. Many of the basic operations in a pipeline copy a large part of the input tables to the result tables without modification, and use only a small part of the input data for their calculations. In the original `OBITools`, those tables are kept as annotated sequence files in the FASTA or FASTQ format. This has two consequences: i) keeping the intermediate results of the analysis pipeline means using disk space for a large volume of redundant data, ii) the encoding and decoding of information that is not actually used accounts for a large part of the processing time. The new database system used by the `OBITools3` (called DMS for Data Management System) relies on column-oriented storage. The columns are immutable and can be assembled into views representing the data tables. This way, the data not modified by a command in an input table can easily be associated with the result table without duplicating any information; and the data not used at all by a command can be associated with the result table without being read. This strategy results in a gain in disk space efficiency by limiting data redundancy, as well as a gain in execution time by limiting data reading, writing and conversion operations. Finally, as a means of optimizing data access, each column is stored in a binary file directly mapped in memory for reading and writing operations.   
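
The idea of immutable columns shared between views can be illustrated with a minimal Python sketch. The `Column` and `View` classes and the column names below are hypothetical and are not the DMS API; the real DMS stores each column in a memory-mapped binary file, which this sketch does not attempt to reproduce.

```python
from types import MappingProxyType


class Column:
    """An immutable, named column of values (hypothetical stand-in for a DMS column)."""

    def __init__(self, name, values):
        self.name = name
        self._values = tuple(values)  # a tuple cannot be modified in place

    def __getitem__(self, i):
        return self._values[i]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


class View:
    """A table assembled from references to existing columns."""

    def __init__(self, columns):
        # Read-only mapping: a view never changes once it is built.
        self.columns = MappingProxyType({c.name: c for c in columns})

    def with_column(self, column):
        """Return a new view that adds or replaces one column and shares all the others."""
        merged = dict(self.columns)
        merged[column.name] = column
        return View(merged.values())

    def row(self, i):
        return {name: col[i] for name, col in self.columns.items()}


sequences = Column("sequence", ["ACGT", "TTGA", "ACGT"])
samples = Column("sample", ["soil_1", "soil_1", "soil_2"])
input_view = View([sequences, samples])

# A command that only computes sequence lengths reads one column and adds one.
lengths = Column("length", [len(s) for s in sequences])
output_view = input_view.with_column(lengths)
```

In this sketch the `sample` column is attached to `output_view` without ever being read, and the `sequence` column is shared by both views rather than copied.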
 **Storage optimization.** DNA metabarcoding data is intrinsically very redundant. For example, the same sequence corresponding to a species will be present several thousand times across all samples. In order to limit the disk space used and make comparison operations more efficient, data in the form of character strings is stored in columns using a complex indexing structure, efficient on millions of values, coupling hash functions, Bloom filters and AVL trees. Finally, DNA sequences are compressed by encoding each nucleotide on two or four bits depending on whether the sequences contain only the four nucleotides (A, C, G, T) or use the IUPAC codes.   
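
As an illustration of the two-bit versus four-bit encoding, here is a minimal Python sketch. The numeric codes, the bitmask layout for the IUPAC symbols and the `encode`/`decode` helpers are assumptions made for this example; the actual on-disk format used by the `OBITools3` (implemented in C) may differ.

```python
# Assumed 2-bit codes for sequences made only of A, C, G, T.
TWO_BIT = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

# Assumed 4-bit codes: one bit per base, ambiguity codes are unions of bits.
FOUR_BIT = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "M": 3, "R": 5, "W": 9, "S": 6, "Y": 10, "K": 12,
    "V": 7, "H": 11, "D": 13, "B": 14, "N": 15,
}


def encode(seq):
    """Pack a sequence on 2 bits per base if possible, else on 4 bits."""
    seq = seq.upper()
    if set(seq) <= set(TWO_BIT):
        table, bits = TWO_BIT, 2
    else:
        table, bits = FOUR_BIT, 4
    packed = 0
    for base in seq:
        packed = (packed << bits) | table[base]
    n_bytes = (len(seq) * bits + 7) // 8  # round up to whole bytes
    return bits, len(seq), packed.to_bytes(n_bytes, "big")


def decode(bits, length, data):
    """Inverse of encode(); the length is needed because of byte padding."""
    table = TWO_BIT if bits == 2 else FOUR_BIT
    reverse = {code: base for base, code in table.items()}
    packed = int.from_bytes(data, "big")
    mask = (1 << bits) - 1
    bases = []
    for i in range(length):
        shift = bits * (length - 1 - i)
        bases.append(reverse[(packed >> shift) & mask])
    return "".join(bases)


bits, length, data = encode("ACGTACGT")
restored = decode(bits, length, data)
```

With these assumed codes, an eight-base sequence of plain nucleotides fits in two bytes, while a sequence containing any IUPAC ambiguity code needs one byte per two bases.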
 **Saving the data processing history.** All the information used by the `OBITools3` is stored in immutable data structures in the DMS. If a command has to modify a column used as input to produce its result, a new version of that column is created, leaving the initial version intact. This storage system makes it possible to keep, at minimal cost, all the intermediate results produced by the pipeline. Storing metadata that describes all the operations that have produced a view (a result table) in the DMS makes it possible to build a directed hypergraph, where each node corresponds to a view and each arrow to an operation. By retracing the dependency relationships in this hypergraph, it is possible to rebuild *a posteriori* the entire process that has produced a result table.   
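
Retracing the history of a view can be sketched as a walk over the recorded dependencies. The `ViewRecord` structure, the `history` function and the command names below are hypothetical; they only illustrate the principle and say nothing about how the DMS actually stores this metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewRecord:
    name: str
    command: str = ""   # operation that produced this view ("" for imported data)
    inputs: tuple = ()  # names of the views the command read


def history(records, target):
    """Return the commands needed to rebuild `target`, in execution order."""
    by_name = {r.name: r for r in records}
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        record = by_name[name]
        for parent in record.inputs:
            visit(parent)  # inputs must exist before the command runs
        if record.command:
            args = ", ".join(record.inputs)
            ordered.append(f"{record.command}({args}) -> {name}")

    visit(target)
    return ordered


# Hypothetical pipeline: two imported views and two commands.
records = [
    ViewRecord("reads"),
    ViewRecord("tags"),
    ViewRecord("assigned", "demultiplex", ("reads", "tags")),
    ViewRecord("unique", "dereplicate", ("assigned",)),
]
steps = history(records, "unique")
```

Because a command may read several views at once, each operation connects a set of input nodes to its output, which is why the structure is a hypergraph rather than a simple chain.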
 **Tools.** The `OBITools3` offer the same tools as the original `OBITools`. Eventually, new versions of `ecoPrimers` (PCR primer design) [3], `ecoPCR` (*in silico* PCR) [4], as well as `Sumatra` (sequence alignment) and `Sumaclust` (sequence alignment and clustering) [5] will be added, taking advantage of the database structure developed for the `OBITools3`.    
 **Implementation and availability.** The lower layers managing the DMS as well as all the compute-intensive functions are coded in `C99` for efficiency reasons. A `Cython` (<http://www.cython.org>) object layer allows for a simple but efficient implementation of the `OBITools3` commands in `Python 3.5`. The `OBITools3` are still in development, and the first functional versions are expected for autumn 2016.    
 **References.**
 1. Taberlet P, Coissac E, Hajibabaei M, Rieseberg LH: Environmental DNA. Mol Ecol 2012:1789–1793.
 2. Boyer F, Mercier C, Bonin A, Le Bras Y, Taberlet P, Coissac E: OBITools: a Unix-inspired software package for DNA metabarcoding. Mol Ecol Resour 2015:n/a–n/a.
 3. Riaz T, Shehzad W, Viari A, Pompanon F, Taberlet P, Coissac E: ecoPrimers: inference of new DNA barcode markers from whole genome sequence analysis. Nucleic Acids Res 2011, 39:e145.
 4. Ficetola GF, Coissac E, Zundel S, Riaz T, Shehzad W, Bessière J, Taberlet P, Pompanon F: An in silico approach for the evaluation of DNA barcodes. BMC Genomics 2010, 11:434.
 5. Mercier C, Boyer F, Bonin A, Coissac E (2013) SUMATRA and SUMACLUST: fast and exact comparison and clustering of sequences. Available: <http://metabarcoding.org/sumatra> and <http://metabarcoding.org/sumaclust>
--- a/c-sandbox/obicount/Makefile
+++ b/c-sandbox/obicount/Makefile
--- a/c-sandbox/obicount/obicount.c
+++ b/c-sandbox/obicount/obicount.c
--- a/distutils.ext/obidistutils/command/build_exe.py
+++ b/distutils.ext/obidistutils/command/build_exe.py
@ -6,28 +6,12 @@ Created on 20 oct. 2012
 import os
 from distutils import sysconfig
 from distutils.core import Command
-from distutils.sysconfig import customize_compiler as customize_compiler_ori
+from distutils.sysconfig import customize_compiler
 from distutils.errors import DistutilsSetupError
 from distutils import log
 from distutils.ccompiler import show_compilers
 def customize_compiler(compiler):
    customize_compiler_ori(compiler)
    compilername = compiler.compiler[0]
    if ("gcc" in compilername or "g++" in compilername):
        cc_cmd = ' '.join(compiler.compiler + ['-fopenmp'])
        ccshared= ' '.join(x for x in sysconfig.get_config_vars("ccshared") if x is not None)
        compiler.set_executables(
            compiler=cc_cmd,
            compiler_so=cc_cmd + ' ' + ccshared
            )
 class build_exe(Command):
    description = "build an executable -- Abstract command "
@ -96,7 +80,6 @@ class build_exe(Command):
            else:
                self.extra_compile_args.append('-m%s' % self.sse)
        # XXX same as for build_ext -- what about 'self.define' and
        # 'self.undef' ?
--- a/distutils.ext/obidistutils/command/build_ext.py
+++ b/distutils.ext/obidistutils/command/build_ext.py
@ -7,32 +7,16 @@ Created on 13 fevr. 2014
 from distutils import log
 import os
-from distutils import sysconfig
+from Cython.Distutils import build_ext  as ori_build_ext  # @UnresolvedImport
 from Cython.Compiler import Options as cython_options  # @UnresolvedImport
 from distutils.errors import DistutilsSetupError
 def _customize_compiler(compiler):
    compilername = compiler.compiler[0]
    if ("gcc" in compilername or "g++" in compilername):
        cc_cmd  = ' '.join(compiler.compiler + ['-fopenmp'])
        ccshared= ' '.join(x for x in sysconfig.get_config_vars("ccshared") if x is not None)
        compiler.set_executables(
            compiler=cc_cmd,
            compiler_so=cc_cmd + ' ' + ccshared
            )
 try:
    from Cython.Distutils import build_ext  as ori_build_ext  # @UnresolvedImport
    from Cython.Compiler import Options as cython_options  # @UnresolvedImport
 class build_ext(ori_build_ext):
    def modifyDocScripts(self):
            try:
                os.mkdir("doc/sphinx")
            except:
                pass
        build_dir_file=open("doc/sphinx/build_dir.txt","w")
        print(self.build_lib,file=build_dir_file)
        build_dir_file.close()
@ -44,8 +28,7 @@ try:
    def finalize_options(self):
-            super(build_ext, self).finalize_options()
+        ori_build_ext.finalize_options(self)  # @UndefinedVariable
        self.set_undefined_options('littlebigman',
                                   ('littlebigman',  'littlebigman'))
@ -97,23 +80,11 @@ try:
        self.check_extensions_list(self.extensions)
            print("pouic")
            print(ext.sources)
            print("pouac")
        for ext in self.extensions:
            log.info("%s :-> %s",ext.name,ext.sources)
            ext.sources = self.cython_sources(ext.sources, ext)
            self.build_extension(ext)
        def build_extensions(self):  # TODO what?? double? is it supposed to be build_extension?
            if hasattr(self, 'compiler'):
                _customize_compiler(self.compiler)
            if hasattr(self, 'shlib_compiler'):
                _customize_compiler(self.shlib_compiler)
            ori_build_ext.build_extensions(self)
    def run(self):
        self.modifyDocScripts()
@ -133,10 +104,9 @@ try:
    sub_commands = [('build_files',has_files),
                    ('build_cexe', has_executables)
-                        ] + ori_build_ext.sub_commands 
+                    ] + \
-
+                   ori_build_ext.sub_commands
-except ImportError:
+
    from distutils.command import build_ext  # @UnusedImport
--- a/distutils.ext/obidistutils/core.py
+++ b/distutils.ext/obidistutils/core.py
@ -9,13 +9,13 @@ import os.path
 import glob
 import sys
-try:
+# try:
-    from setuptools.extension import Extension
+#     from setuptools.extension import Extension
-except ImportError:
+# except ImportError:
    from distutils.extension import Extension
 #     from distutils.extension import Extension
 from distutils.extension import Extension
 from obidistutils.serenity.checkpackage import  install_requirements,\
                                                check_requirements, \
                                                RequirementError
@ -47,16 +47,10 @@ def findCython(root,base=None,pyrexs=None):
    for module in (path.basename(path.dirname(x)) 
                   for x in glob.glob(path.join(root,'*','__init__.py'))):
        for pyrex in glob.glob(path.join(root,module,'*.pyx')):
            libabspath = os.path.abspath('obi_libdir')
            obiabspath = os.path.abspath('.')
            pyrexs.append(Extension('.'.join(base+[module,path.splitext(path.basename(pyrex))[0]]),
-                                    [pyrex],
+                                    [pyrex]
                                    library_dirs=[libabspath],
                                    include_dirs=[libabspath],
                                    libraries=["obi3"],
                                    runtime_library_dirs=[libabspath],
                                    extra_link_args=["-Wl,-rpath,"+libabspath, "-L"+libabspath]
                                    )
                          )
            try:
@ -69,15 +63,13 @@ def findCython(root,base=None,pyrexs=None):
                log.info("Cython module : %s",cfiles)   
                incdir = set(os.path.dirname(x) for x in cfiles if x[-2:]==".h")
-                #cfiles = [x for x in cfiles if x[-2:]==".c"]                
+                cfiles = [x for x in cfiles if x[-2:]==".c"]                
-                #pyrexs[-1].sources.extend(cfiles)
+                pyrexs[-1].sources.extend(cfiles)
                pyrexs[-1].include_dirs.extend(incdir)
                pyrexs[-1].extra_compile_args.extend(['-msse2',
                                                      '-Wno-unused-function',
                                                      '-Wmissing-braces',
-                                                      '-Wchar-subscripts',
+                                                      '-Wchar-subscripts'])
                                                      '-fPIC'
                                                      ])
            except IOError:
                pass
@ -143,7 +135,7 @@ def setup(**attrs):
    log.set_threshold(log.INFO)
-    minversion      = attrs.get("pythonmin",'3.7')
+    minversion      = attrs.get("pythonmin",'3.4')
    maxversion      = attrs.get('pythonmax',None)    
    fork            = attrs.get('fork',False)
    requirementfile = attrs.get('requirements','requirements.txt')
@ -231,4 +223,4 @@ def setup(**attrs):
    from distutils.core import setup as ori_setup
-    return ori_setup(**attrs)
+    ori_setup(**attrs)
--- a/distutils.ext/obidistutils/dist.py
+++ b/distutils.ext/obidistutils/dist.py
@ -4,13 +4,13 @@ Created on 20 oct. 2012
@author: coissac
 '''
-try:
+# try:
-    from setuptools.dist import Distribution as ori_Distribution
+#     from setuptools.dist import Distribution as ori_Distribution
-except ImportError:
+# except ImportError:
    from distutils.dist import Distribution as ori_Distribution
 #     from distutils.dist import Distribution as ori_Distribution
 from distutils.dist import Distribution as ori_Distribution
 class Distribution(ori_Distribution):
    def __init__(self,attrs=None):
--- a/distutils.ext/obidistutils/serenity/init.py
+++ b/distutils.ext/obidistutils/serenity/init.py
@ -81,15 +81,9 @@ def serenity_mode(package,version):
    argparser.add_argument('--serenity',
                           dest='serenity', 
                           action='store_true',
-                           default=True, 
+                           default=False, 
                           help='Switch the installer to serenity mode. Everything is installed in a virtualenv')
    argparser.add_argument('--no-serenity',
                           dest='serenity', 
                           action='store_false',
                           default=True, 
                           help='Switch the installer in the no serenity mode.')
    argparser.add_argument('--virtualenv',
                           dest='virtual', 
                           type=str,
--- a/distutils.ext/obidistutils/serenity/bootstrappip.py
+++ b/distutils.ext/obidistutils/serenity/bootstrappip.py
@ -1,36 +0,0 @@
 '''
 Created on 22 janv. 2016
@author: coissac
 '''
 import sys
 from urllib import request
 import os.path
 from obidistutils.serenity.util import get_serenity_dir
 from obidistutils.serenity.rerun import rerun_with_anothe_python
 from obidistutils.serenity.checkpython import is_a_virtualenv_python
 getpipurl="https://bootstrap.pypa.io/get-pip.py"
 def bootstrap():
    getpipfile=os.path.join(get_serenity_dir(),"get-pip.py")
    with request.urlopen(getpipurl) as getpip:
        with open(getpipfile,"wb") as out:
            for l in getpip:
                out.write(l)
    python = sys.executable
    if is_a_virtualenv_python():
        command= "%s %s" % (python,getpipfile)        
    else:
        command= "%s %s --user" % (python,getpipfile)
    os.system(command)
    rerun_with_anothe_python(python)
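
Note on the `bootstrap()` hunk above: it builds the get-pip command as a single string and hands it to `os.system`, which breaks if the interpreter path contains spaces. A minimal sketch of the same decision (add `--user` only outside a virtualenv) using an argument list follows. The helper names `in_virtualenv`, `getpip_command` and `run_getpip` are assumptions made for this illustration, and `in_virtualenv` is only a stand-in for the project's `is_a_virtualenv_python`, whose implementation is not shown in this diff.

```python
import subprocess
import sys


def in_virtualenv():
    """True when running inside a venv (sys.prefix differs from base_prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def getpip_command(python, getpipfile, virtualenv):
    """Build the argument list that runs get-pip.py; --user only outside a venv."""
    command = [python, getpipfile]
    if not virtualenv:
        command.append("--user")
    return command


def run_getpip(getpipfile):
    """Run get-pip.py with the current interpreter and return its exit status."""
    command = getpip_command(sys.executable, getpipfile, in_virtualenv())
    return subprocess.run(command).returncode
```

Passing a list to `subprocess.run` avoids shell quoting entirely, and the exit status lets the caller stop before re-running the installer if pip could not be installed.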
--- a/distutils.ext/obidistutils/serenity/checkpackage.py
+++ b/distutils.ext/obidistutils/serenity/checkpackage.py
@ -5,35 +5,27 @@ Created on 2 oct. 2014
 '''
 import re
 import os
 import pip                                              # @UnresolvedImport
 from pip.utils import get_installed_distributions       # @UnresolvedImport
 from distutils.version import StrictVersion             # @UnusedImport
 from distutils.errors import DistutilsError
 from distutils import log
 import os.path
 import sys
 import subprocess
 class RequirementError(Exception):
    pass
 def is_installed(requirement):
    pipcommand = os.path.join(os.path.dirname(sys.executable),'pip')
    pipjson    = subprocess.run([pipcommand,"list","--format=json"], 
                                 capture_output=True).stdout
    packages = eval(pipjson) 
    requirement_project,requirement_relation,requirement_version = parse_package_requirement(requirement)
-    package = [x for x in packages if x["name"]==requirement_project]
+    package = [x for x in get_installed_distributions() if x.project_name==requirement_project]
    if len(package)==1:
-        if (     requirement_version is not None 
+        if requirement_version is not None and requirement_relation is not None:    
-             and requirement_relation is not None):    
+            rep = (len(package)==1) and eval("StrictVersion('%s') %s StrictVersion('%s')" % (package[0].version,
            rep = (len(package)==1) and eval("StrictVersion('%s') %s StrictVersion('%s')" % (package[0]["version"],
                                                                                           requirement_relation,
                                                                                           requirement_version)
                                             )
@ -47,23 +39,20 @@ def is_installed(requirement):
            log.info("Look for package %s (%s%s) : ok version %s installed" % (requirement_project,
                                                                               requirement_relation,
                                                                               requirement_version,
-                                                                               package[0]["version"]))
+                                                                               package[0].version))
        else:
            log.info("Look for package %s : ok version %s installed" % (requirement_project,
-                                                                        package[0]["version"]))
+                                                                        package[0].version))
    else:
        if len(package)!=1:
            if requirement_version is not None and requirement_relation is not None: 
            log.info("Look for package %s (%s%s) : not installed" % (requirement_project,
                                                                     requirement_relation,
                                                                     requirement_version))
            else:
                log.info("Look for package %s : not installed" % requirement_project)                
        else:
            log.info("Look for package %s (%s%s) : failed only version %s installed" % (requirement_project,
                                                                                        requirement_relation,
                                                                                        requirement_version,
-                                                                                        package[0]["version"]))
+                                                                                        package[0].version))
    return rep
@ -92,7 +81,7 @@ def install_requirements(requirementfile='requirements.txt'):
        ok = is_installed(x)
        if not ok:
            log.info("  Installing requirement : %s" % x)
-            pip_install_package(x,requirement=requirementfile)
+            pip_install_package(x)
            install_something=True
            if x[0:3]=='pip':
                return True
@ -145,9 +134,8 @@ def get_package_requirement(package,requirementfile='requirements.txt'):
        return None
-def pip_install_package(package,directory=None,requirement=None):
+def pip_install_package(package,directory=None,upgrade=True):
    pipcommand = os.path.join(os.path.dirname(sys.executable),'pip')
    if directory is not None:
        log.info('    installing %s in directory %s' % (package,str(directory)))
@ -157,9 +145,8 @@ def pip_install_package(package,directory=None,requirement=None):
    args = ['install']
-    if requirement:
+    if upgrade:
-        args.append('--requirement')
+        args.append('--upgrade')
        args.append(requirement)
    if 'https_proxy' in os.environ:
        args.append('--proxy=%s' % os.environ['https_proxy'])
@ -169,7 +156,5 @@ def pip_install_package(package,directory=None,requirement=None):
    args.append(package)
-    pip = subprocess.run([pipcommand] + args)
+    return pip.main(args)
    return pip
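
Note on the `is_installed` hunks above: one side reads `pip list --format=json` and passes the output to `eval`, and both sides compare versions by building a `StrictVersion` expression for `eval`. As a hedged sketch, not the project's code, the JSON can be parsed with `json.loads` and the relation looked up in a table of `operator` functions. The names `installed_versions` and `satisfies`, and the naive dotted-integer version parsing, are assumptions for illustration only.

```python
import json
import operator

# Map requirement relations to comparison functions instead of using eval().
_RELATIONS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def _version_tuple(version):
    """Naive parser: '1.10.2' -> (1, 10, 2). Assumes purely numeric parts."""
    return tuple(int(part) for part in version.split("."))


def installed_versions(pip_json):
    """Turn the text printed by `pip list --format=json` into {name: version}."""
    return {entry["name"]: entry["version"] for entry in json.loads(pip_json)}


def satisfies(installed, project, relation=None, version=None):
    """Return True if `project` is installed and meets the optional constraint."""
    if project not in installed:
        return False
    if relation is None or version is None:
        return True
    compare = _RELATIONS[relation]
    return compare(_version_tuple(installed[project]), _version_tuple(version))
```

The input text could come from `subprocess.run([sys.executable, "-m", "pip", "list", "--format=json"], capture_output=True, text=True).stdout`. Versions with pre-release or local suffixes would need a real version parser rather than the tuple split used here.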
--- a/doc/.gitignore
+++ b/doc/.gitignore
--- a/doc/Doxyfile
+++ b/doc/Doxyfile
--- a/doc/Makefile
+++ b/doc/Makefile
--- a/doc/conf.py
+++ b/doc/conf.py
@ -33,10 +33,10 @@ extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.todo',
    'sphinx.ext.coverage',
-    'sphinx.ext.imgmath',
+    'sphinx.ext.pngmath',
    'sphinx.ext.ifconfig',
    'sphinx.ext.viewcode',
-    'breathe',
+#    'breathe',
 ]
 # Add any paths that contain templates here, relative to this directory.
@ -295,6 +295,4 @@ texinfo_documents = [
 sys.path.append( "breathe/" )
 breathe_projects = { "OBITools3": "doxygen/xml/" }
 breathe_default_project = "OBITools3"
-#breathe_projects_source = {
+
 #     "auto" : ( "../src", ["obidms.h", "obiavl.h"] )
 #     }
--- a/doc/source/DMS.rst
+++ b/doc/source/DMS.rst
--- a/doc/source/UML/OBIDMS_UML.png
+++ b/doc/source/UML/OBIDMS_UML.png
--- a/doc/source/UML/OBITypes_UML.class.violet.html
+++ b/doc/source/UML/OBITypes_UML.class.violet.html
--- a/doc/source/UML/OBITypes_UML.png
+++ b/doc/source/UML/OBITypes_UML.png
--- a/doc/source/UML/ObiDMS_UML.class.violet.html
+++ b/doc/source/UML/ObiDMS_UML.class.violet.html
--- a/doc/source/containers.rst
+++ b/doc/source/containers.rst
--- a/doc/source/data.rst
+++ b/doc/source/data.rst
--- a/doc/source/elementary.rst
+++ b/doc/source/elementary.rst
--- a/doc/source/guidelines.rst
+++ b/doc/source/guidelines.rst
--- a/doc/source/images/history.png
+++ b/doc/source/images/history.png
--- a/doc/source/images/version_control.png
+++ b/doc/source/images/version_control.png
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@ -11,7 +11,7 @@ OBITools3 documentation
   Programming guidelines <guidelines>
   Data structures <data>
-   Code documentation <code_doc/codedoc>
+
 Indices and tables
 ------------------
--- a/doc/source/specialvalues.rst
+++ b/doc/source/specialvalues.rst
--- a/doc/source/types.rst
+++ b/doc/source/types.rst
@ -4,7 +4,6 @@ OBITypes
 .. image:: ./UML/OBITypes_UML.png
 :download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
--- a/doc/sphinx/build_dir.txt
+++ b/doc/sphinx/build_dir.txt
@ -0,0 +1 @@
 build/lib.macosx-10.6-intel-3.5
--- a/python/obi.py
+++ b/python/obi.py
@ -1,71 +0,0 @@
 #!/usr/local/bin/python3.4
 '''
 obi -- shortdesc
 obi is a description
 It defines classes_and_methods
@author:     user_name
@copyright:  2014 organization_name. All rights reserved.
@license:    license
@contact:    user_email
@deffield    updated: Updated
 '''
 default_config = { 'software'       : "The OBITools",
                   'log'            : False,
                   'loglevel'       : 'INFO',
                   'progress'       : True,
                   'inputURI'       : None,
                   'outputURI'      : None,
                   'defaultdms'     : None,
                   'inputview'      : None,
                   'outputview'     : None,
                   'skip'           : 0,
                   'only'           : None,
                   'fileformat'     : None,
                   'skiperror'      : True,
                   'qualityformat'  : b'sanger',
                   'offset'         : -1,
                   'noquality'      : False,
                   'seqtype'        : b'nuc',
                   "header"         : False,
                   "sep"            : None,
                   "quote"          : [b"'",b'"'],
                   "dec"            : b".",
                   "nastring"       : b"NA",
                   "stripwhite"     : True,
                   "blanklineskip"  : True,
                   "commentchar"    : b"#",
                   "nocreatedms"    : False
                  }
 root_config_name='obi'
 from obitools3.apps.config import getConfiguration     # @UnresolvedImport
 from obitools3.version import version
 __all__     = []
 __version__ = version
 __date__    = '2014-09-28'
 __updated__ = '2014-09-28'
 DEBUG = 1
 TESTRUN = 0
 PROFILE = 0
 if __name__ =="__main__":
    config = getConfiguration(root_config_name,
                              default_config)    
    config[root_config_name]['module'].run(config)
--- a/python/obitools3/init.py
+++ b/python/obitools3/init.py
--- a/python/obitools3/init.pyc
+++ b/python/obitools3/init.pyc
--- a/python/obitools3/apps/arguments.cfiles
+++ b/python/obitools3/apps/arguments.cfiles
@ -1,110 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/apps/arguments.pxd
+++ b/python/obitools3/apps/arguments.pxd
@ -1,3 +0,0 @@
 #cython: language_level=3
 cpdef buildArgumentParser(str configname, str softname)
--- a/python/obitools3/apps/arguments.pyx
+++ b/python/obitools3/apps/arguments.pyx
@ -1,62 +0,0 @@
 #cython: language_level=3
 '''
 Created on 27 mars 2016
@author: coissac
 '''
 import argparse
 import sys
 from .command import getCommandsList
 class ObiParser(argparse.ArgumentParser): 
    def error(self, message):
        sys.stderr.write('error: %s\n' % message)
        self.print_help()
        sys.exit(2)
 cpdef buildArgumentParser(str configname, 
                          str softname):
    parser = ObiParser()
    parser.add_argument('--version',   dest='%s:version' % configname, 
                                       action='store_true', 
                                       default=False, 
                        help='Print the version of %s' % softname)
    parser.add_argument('--log',       dest='%s:log' % configname, 
                                       action='store',
                                       type=str,
                                       default=None, 
                        help='Create a logfile')
    parser.add_argument('--no-progress', dest='%s:progress' % configname, 
                                       action='store_false', 
                                       default=None, 
                        help='Do not print the progress bar during analyses')
    subparsers = parser.add_subparsers(title='subcommands',
                                       description='valid subcommands',
                                       help='additional help')
    commands = getCommandsList()
    for c in commands:
        module = commands[c]
        if hasattr(module, "run"):
            if hasattr(module, "__title__"):
                sub = subparsers.add_parser(c,help=module.__title__)
            else:
                sub = subparsers.add_parser(c)
            if hasattr(module, "addOptions"):
                module.addOptions(sub)
            sub.set_defaults(**{'%s:module'  % configname : module})
            sub.set_defaults(**{'%s:modulename'  % configname : c})
    return parser
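
Note on `buildArgumentParser` above: every option is stored under a `'<configname>:<key>'` destination, and the selected command module is attached through `set_defaults` on its sub-parser. A minimal pure-Python sketch of that pattern follows. The `commands` dict stands in for `getCommandsList()`, and the `count` command is a made-up module used only for this illustration.

```python
import argparse
import types


def build_parser(configname, commands):
    """Build a parser whose destinations are namespaced as 'configname:key'."""
    parser = argparse.ArgumentParser(prog=configname)
    parser.add_argument("--no-progress",
                        dest="%s:progress" % configname,
                        action="store_false",
                        default=None,
                        help="Do not print the progress bar")
    subparsers = parser.add_subparsers(title="subcommands")
    for name, module in commands.items():
        if hasattr(module, "run"):
            sub = subparsers.add_parser(name)
            # Record which module implements the selected sub-command.
            sub.set_defaults(**{"%s:module" % configname: module,
                                "%s:modulename" % configname: name})
    return parser


# Hypothetical command module, used only for this illustration.
count = types.SimpleNamespace(run=lambda config: "counted")

parser = build_parser("obi", {"count": count})
options = vars(parser.parse_args(["--no-progress", "count"]))
```

Because the destinations contain a colon they are not valid attribute names, so the result is read through `vars(...)` as a dict, which is also what `getConfiguration` does before splitting each key on `':'`.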
--- a/python/obitools3/apps/command.cfiles
+++ b/python/obitools3/apps/command.cfiles
@ -1,110 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/apps/command.pxd
+++ b/python/obitools3/apps/command.pxd
@ -1,3 +0,0 @@
 #cython: language_level=3
 cdef object loadCommand(str name,loader)
--- a/python/obitools3/apps/command.pyx
+++ b/python/obitools3/apps/command.pyx
@ -1,44 +0,0 @@
 #cython: language_level=3
 '''
 Created on 27 mars 2016
@author: coissac
 '''
 import pkgutil
 from obitools3 import commands
 cdef object loadCommand(str name,loader):
    '''
    Load a command module from its name and an ImpLoader
    This function is for internal use
    @param name:   name of the module
    @type name: str 
    @param loader: the module loader
    @type loader: ImpLoader
    @return the loaded module
    @rtype: module 
    '''
    module = loader.find_module(name).load_module(name)
    return module
 def getCommandsList():
    '''
    Returns the list of sub-commands available to the main `obi` command
    @return: a dict instance with key corresponding to each command and
             value corresponding to the module
    @rtype: dict
    '''
    cdef dict cmds = dict((x[1],loadCommand(x[1],x[0])) 
                           for x in pkgutil.iter_modules(commands.__path__) 
                           if not x[2])
    return cmds
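
Note on `getCommandsList` above: it walks the package `__path__` with `pkgutil.iter_modules` and loads each non-package entry through `loader.find_module(name).load_module(name)`, loader methods that are deprecated in current Python releases. A hedged sketch of the same discovery step using `importlib.import_module` follows. The function name `list_submodules` is an assumption, and the standard `logging` package is used here only as a stand-in for `obitools3.commands`.

```python
import importlib
import logging
import pkgutil


def list_submodules(package):
    """Return {name: module} for every plain module directly inside `package`."""
    found = {}
    for _finder, name, ispkg in pkgutil.iter_modules(package.__path__):
        if not ispkg:  # skip sub-packages, as the original filter does
            found[name] = importlib.import_module(
                "%s.%s" % (package.__name__, name))
    return found


# Stand-in for `obitools3.commands`: the standard `logging` package.
modules = list_submodules(logging)
```

Importing by fully qualified name registers each module in `sys.modules` under its package path, so relative imports inside the command modules keep working.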
--- a/python/obitools3/apps/config.cfiles
+++ b/python/obitools3/apps/config.cfiles
@ -1,110 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/apps/config.pxd
+++ b/python/obitools3/apps/config.pxd
@ -1,10 +0,0 @@
 #cython: language_level=3
 cpdef str setRootConfigName(str rootname)
 cpdef str getRootConfigName()
 cdef dict buildDefaultConfiguration(str root_config_name,
                                    dict  config)
 cpdef dict getConfiguration(str root_config_name=?,
                                    dict  config=?)
--- a/python/obitools3/apps/config.pyx
+++ b/python/obitools3/apps/config.pyx
@ -1,114 +0,0 @@
 #cython: language_level=3
 '''
 Created on 27 mars 2016
@author: coissac
 '''
 import sys
 from .command   import  getCommandsList
 from .logging   cimport getLogger
 from .arguments cimport buildArgumentParser
 from ..version import version
 from _curses import version
 cdef dict __default_config__ = {}
 cpdef str setRootConfigName(str rootname):
    global __default_config__
    if '__root_config__' in __default_config__:
        if __default_config__["__root_config__"] in __default_config__:
            __default_config__[rootname]=__default_config__[__default_config__["__root_config__"]]
            del __default_config__[__default_config__["__root_config__"]]
    __default_config__['__root_config__']=rootname
    return rootname
 cpdef str getRootConfigName():
    global __default_config__
    return __default_config__.get('__root_config__',None)
 cdef dict buildDefaultConfiguration(str root_config_name,
                                    dict  config):
    global __default_config__
    __default_config__.clear()
    setRootConfigName(root_config_name)    
    __default_config__[root_config_name]=config
    config['version']=version
    commands = getCommandsList()
    for c in commands:
        module = commands[c]
        assert hasattr(module, "run")
        if hasattr(module, 'default_config'):
            __default_config__[c]=module.default_config
        else:
            __default_config__[c]={}
    return __default_config__
 cpdef dict getConfiguration(str root_config_name="__default__",
                            dict  config={}):
    global __default_config__
    if '__done__' in __default_config__:
        return __default_config__
    if root_config_name=="__default__":
        raise RuntimeError("No root_config_name specified")
    if not config:
        raise RuntimeError("Base configuration is empty")
    config =  buildDefaultConfiguration(root_config_name,
                                        config)
    parser = buildArgumentParser(root_config_name,
                                 config[root_config_name]['software'])
    options = vars(parser.parse_args())
    if options['%s:version' % root_config_name]:
        print("%s - Version %s" % (config[root_config_name]['software'],
                                   config[root_config_name]['version']))
        sys.exit(0)
    for k in options:
        section,key = k.split(':')
        s = config[section]
        if options[k] is not None:
            s[key]=options[k]
    if 'module' not in config[root_config_name]:
        print('\nError: No command specified',file=sys.stderr)
        parser.print_help()
        sys.exit(2)
    getLogger(config)
    config['__done__']=True
    return config
 def logger(level, *messages):
    try:
        config=getConfiguration()
        root = config["__root_config__"]
        l = config[root]['logger']
        if config[root]['verbose']:
            getattr(l, level)(*messages)
    except Exception:
        print(*messages,file=sys.stderr)
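
The option-folding loop in getConfiguration relies on every argparse `dest` being written as `section:key`. A minimal plain-Python sketch of that convention follows; `fold_options` and the `demo` parser are hypothetical names used only for illustration, and `setdefault` is a convenience added here (the original indexes `config[section]` directly).

```python
import argparse


def fold_options(config, options):
    """Copy every non-None 'section:key' option into config[section][key]."""
    for dest, value in options.items():
        section, key = dest.split(':')
        if value is not None:
            # setdefault is an addition of this sketch, not of the original
            config.setdefault(section, {})[key] = value
    return config


parser = argparse.ArgumentParser(prog="demo")  # hypothetical program name
parser.add_argument('--skip', dest='obi:skip', type=int, default=None)
parser.add_argument('--only', dest='obi:only', type=int, default=None)

# vars() turns the Namespace into a dict keyed by the 'section:key' dests
options = vars(parser.parse_args(['--skip', '10']))
config = fold_options({'obi': {'only': 5}}, options)
print(config)
```

Options left at `None` on the command line do not overwrite values already present in the configuration, which is why `only` keeps its preset value here.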
--- a/python/obitools3/apps/logging.cfiles
+++ b/python/obitools3/apps/logging.cfiles
@ -1,110 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/apps/logging.pxd
+++ b/python/obitools3/apps/logging.pxd
@ -1,3 +0,0 @@
 #cython: language_level=3
 cpdef getLogger(dict config)
--- a/python/obitools3/apps/logging.pyx
+++ b/python/obitools3/apps/logging.pyx
@ -1,48 +0,0 @@
 #cython: language_level=3
 '''
 Created on 27 March 2016
@author: coissac
 '''
 import logging
 import sys
 cpdef getLogger(dict config):
    '''
    Returns the logger as defined by the command line option
    or by the config file
    :param config:
    '''
    root = config["__root_config__"]
    level  = config[root]['loglevel'] 
    logfile= config[root]['log'] 
    rootlogger   = logging.getLogger()
    logFormatter = logging.Formatter("%%(asctime)s [%s : %%(levelname)-5.5s]  %%(message)s" % config[root]['modulename'])
    stderrHandler = logging.StreamHandler(sys.stderr)
    stderrHandler.setFormatter(logFormatter)
    rootlogger.addHandler(stderrHandler)
    if logfile:
        fileHandler = logging.FileHandler(logfile)
        fileHandler.setFormatter(logFormatter)
        rootlogger.addHandler(fileHandler)
    try:
        loglevel = getattr(logging, level) 
    except (AttributeError, TypeError):
        loglevel = logging.INFO
    rootlogger.setLevel(loglevel)
    config[root]['logger']=rootlogger
    config[root]['verbose']=True
    return rootlogger
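
getLogger turns the configured level name into a numeric `logging` level and falls back to INFO when the name is unknown. A small sketch of that lookup, written as a standalone helper; `resolve_level` is a hypothetical name and not part of obitools3.

```python
import logging


def resolve_level(name, default=logging.INFO):
    """Map a level name such as 'DEBUG' to its numeric value, or fall back."""
    level = getattr(logging, name, None) if isinstance(name, str) else None
    # Only plain integers count: logging.info, for instance, is a function
    return level if type(level) is int else default
```

Checking the type of the attribute avoids handing a function or a boolean to `setLevel` when the configured name happens to match another attribute of the `logging` module.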
--- a/python/obitools3/apps/optiongroups/init.py
+++ b/python/obitools3/apps/optiongroups/init.py
@ -1,272 +0,0 @@
 def __addInputOption(optionManager):
    optionManager.add_argument(
                    dest='obi:inputURI',  
                    metavar='INPUT', 
                    help='Data source URI')
    group = optionManager.add_argument_group("Input restriction options",
                    "Limit the analysis to a sub-part of the input")
    group.add_argument('--skip',
                     action="store", dest="obi:skip",
                     metavar='<N>',
                     default=None,
                     type=int,
                     help="Skip the first N sequences")
    group.add_argument('--only',
                     action="store", dest="obi:only",
                     metavar='<N>',
                     default=None,
                     type=int,
                     help="Process only N sequences")
 def __addImportInputOption(optionManager):
    group = optionManager.add_argument_group("Input format options for imported files")
    group.add_argument('--fasta-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'fasta',
                     help="Input file is in sanger fasta format")
    group.add_argument('--fastq-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'fastq',
                     help="Input file is in fastq format")
    group.add_argument('--embl-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'embl',
                     help="Input file is in embl nucleic format")
    group.add_argument('--genbank-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'genbank',
                     help="Input file is in genbank nucleic format")
    group.add_argument('--ngsfilter-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'ngsfilter',
                     help="Input file is an ngsfilter file")
    group.add_argument('--ecopcr-result-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'ecopcr',
                     help="Input file is the result of an ecoPCR (version 2)")
    group.add_argument('--ecoprimers-result-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'ecoprimers',
                     help="Input file is the result of ecoPrimers")
    group.add_argument('--tabular-input',
                     action="store_const", dest="obi:inputformat",
                     default=None,
                     const=b'tabular',
                     help="Input file is a tabular file")
    group.add_argument('--no-skip-on-error',
                     action="store_false", dest="obi:skiperror",
                     default=True,
                     help="Don't skip sequence entries with parsing errors (default: they are skipped)")
    group.add_argument('--no-quality',
                     action="store_true", dest="obi:noquality",
                     default=False,
                     help="Do not import fastQ quality")
    group.add_argument('--quality-sanger',
                     action="store_const", dest="obi:qualityformat",
                     default=None,
                     const=b'sanger',
                     help="Fastq quality is encoded following sanger format (standard fastq)")
    group.add_argument('--quality-solexa',
                     action="store_const", dest="obi:qualityformat",
                     default=None,
                     const=b'solexa',
                     help="Fastq quality is encoded following solexa sequencer format")
    group.add_argument('--nuc',
                     action="store_const", dest="obi:moltype",
                     default=None,
                     const=b'nuc',
                     help="Input file contains nucleic sequences")
    group.add_argument('--prot',
                     action="store_const", dest="obi:moltype",
                     default=None,
                     const=b'pep',
                     help="Input file contains protein sequences")
    group.add_argument('--input-na-string',
                     action="store", dest="obi:inputnastring",
                     default="NA",
                     type=str,
                     help="String associated with Non Available (NA) values in the input")
 def __addTabularInputOption(optionManager):
    group = optionManager.add_argument_group("Input format options for tabular files")
    group.add_argument('--header',
                     action="store_true", dest="obi:header",
                     default=False,
                     help="First line of tabular file contains column names")
    group.add_argument('--sep',
                     action="store", dest="obi:sep",
                     default=None,
                     type=str,
                     help="Column separator")
    group.add_argument('--dec',
                     action="store", dest="obi:dec",
                     default=".",
                     type=str,
                     help="Decimal separator")
    group.add_argument('--strip-white',
                     action="store_false", dest="obi:stripwhite",
                     default=True,
                     help="Remove white chars at the beginning and the end of values")
    group.add_argument('--blank-line-skip',
                     action="store_false", dest="obi:blanklineskip",
                     default=True,
                     help="Skip empty lines")
    group.add_argument('--comment-char',
                     action="store", dest="obi:commentchar",
                     default="#",
                     type=str,
                     help="Lines starting with this character are treated as comments")
 def __addTaxdumpInputOption(optionManager):  # TODO maybe not the best way to do it
    group = optionManager.add_argument_group("Input format options for taxdump")
    group.add_argument('--taxdump',
                     action="store_true", dest="obi:taxdump",
                     default=False,
                     help="Whether the input is a taxdump")
 def __addTaxonomyOption(optionManager):
    group = optionManager.add_argument_group("Input format options for taxonomy")
    group.add_argument('--taxonomy',
                     action="store", dest="obi:taxoURI",
                     default=None,
                     help="Taxonomy URI")
    #TODO option bool to download taxo if URI doesn't exist
 def addMinimalInputOption(optionManager):
    __addInputOption(optionManager)
 def addImportInputOption(optionManager):
    __addInputOption(optionManager)
    __addImportInputOption(optionManager)
 def addTabularInputOption(optionManager):
    __addTabularInputOption(optionManager)
 def addTaxonomyOption(optionManager):
    __addTaxonomyOption(optionManager)
 def addTaxdumpInputOption(optionManager):
    __addTaxdumpInputOption(optionManager)
 def addAllInputOption(optionManager):
    __addInputOption(optionManager)
    __addImportInputOption(optionManager)
    __addTabularInputOption(optionManager)
    __addTaxonomyOption(optionManager)
    __addTaxdumpInputOption(optionManager)    
 def __addOutputOption(optionManager):
    optionManager.add_argument(
                    dest='obi:outputURI',  
                    metavar='OUTPUT', 
                    help='Data destination URI')
 def __addDMSOutputOption(optionManager):
    group = optionManager.add_argument_group("Output options for DMS data")
    group.add_argument('--no-create-dms',
                 action="store_true", dest="obi:nocreatedms",
                 default=False,
                 help="Don't create the output DMS if it does not exist")
    group.add_argument('--max-elts',
                 action="store", dest="obi:maxelts",
                 metavar='<N>',
                 default=1000,
                 type=int,
                 help="Maximum number of elements per line in a column "
                      "(e.g. the number of different keys in a dictionary-type "
                      "key from sequence headers). If the number of different keys "
                      "is greater than N, the values are stored as character strings")
 def __addExportOutputOption(optionManager):
    group = optionManager.add_argument_group("Output format options for exported files")
    group.add_argument('--fasta-output',
                     action="store_const", dest="obi:outputformat",
                     default=None,
                     const=b'fasta',
                     help="Output file is in sanger fasta format")
    group.add_argument('--fastq-output',
                     action="store_const", dest="obi:outputformat",
                     default=None,
                     const=b'fastq',
                     help="Output file is in fastq format")
    group.add_argument('--print-na',
                     action="store_true", dest="obi:printna",
                     default=False,
                     help="Print Non Available (NA) values in the output")
    group.add_argument('--output-na-string',
                     action="store", dest="obi:outputnastring",
                     default="NA",
                     type=str,
                     help="String associated with Non Available (NA) values in the output")
 def addMinimalOutputOption(optionManager):
    __addOutputOption(optionManager)
    __addDMSOutputOption(optionManager)
 def addExportOutputOption(optionManager):
    __addOutputOption(optionManager)
    __addExportOutputOption(optionManager)
 def addAllOutputOption(optionManager):
    __addOutputOption(optionManager)
    __addDMSOutputOption(optionManager)
    __addExportOutputOption(optionManager)
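
The format flags above all use `action="store_const"` with one shared `dest`, so whichever flag appears last on the command line decides the value. A reduced sketch with two formats only; the `demo` parser and `chosen_format` are hypothetical names for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")  # hypothetical program name
group = parser.add_argument_group("Input format options")
for fmt in (b'fasta', b'fastq'):
    # Every flag writes its own constant into the same destination
    group.add_argument('--%s-input' % fmt.decode(),
                       action='store_const', dest='obi:inputformat',
                       default=None, const=fmt)


def chosen_format(argv):
    """Return the format selected by argv, or None when no flag is given."""
    return vars(parser.parse_args(argv))['obi:inputformat']
```

With no flag the destination stays `None`, which lets the calling code tell "format not specified" apart from an explicit choice.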
--- a/python/obitools3/apps/progress.cfiles
+++ b/python/obitools3/apps/progress.cfiles
@ -1,110 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/apps/progress.pxd
+++ b/python/obitools3/apps/progress.pxd
@ -1,65 +0,0 @@
 #cython: language_level=3
 cdef extern from "stdio.h":
    struct FILE
    int fprintf(FILE *stream, char *format, ...)
    int fputs(char *string, FILE *stream)
    FILE* stderr
    ctypedef unsigned int off_t "unsigned long long"
 cdef extern from "unistd.h":
    int fsync(int fd);
 cdef extern from "time.h":
    struct tm :
        int tm_yday 
        int tm_hour
        int tm_min
        int tm_sec
    enum: CLOCKS_PER_SEC
    ctypedef int time_t
    ctypedef int clock_t
    ctypedef int suseconds_t
    struct timeval:
        time_t      tv_sec     # seconds
        suseconds_t tv_usec    # microseconds
    struct timezone :
        int tz_minuteswest;    # minutes west of Greenwich
        int tz_dsttime;        # type of DST correction 
    int gettimeofday(timeval *tv, timezone *tz)
    tm *gmtime_r(time_t *clock, tm *result)
    time_t time(time_t *tloc)
    clock_t clock()
 cdef class ProgressBar:
    cdef off_t   maxi
    cdef clock_t starttime
    cdef clock_t lasttime
    cdef clock_t tickcount
    cdef int freq
    cdef int cycle
    cdef int arrow
    cdef int lastlog
    cdef bint ontty
    cdef int fd
    cdef bint cut
    cdef bytes _head
    cdef char *chead
    cdef object logger
    cdef char  *wheel
    cdef char  *spaces
    cdef char*  diese 
    cdef clock_t clock(self)
--- a/python/obitools3/apps/progress.pyx
+++ b/python/obitools3/apps/progress.pyx
@ -1,157 +0,0 @@
 #cython: language_level=3
 '''
 Created on 27 March 2016
@author: coissac
 '''
 from ..utils cimport str2bytes, bytes2str
 from .config cimport getConfiguration 
 import sys
 cdef class ProgressBar:
    cdef clock_t clock(self):
        cdef clock_t t
        cdef timeval tp
        cdef clock_t s
        <void> gettimeofday(&tp,NULL)
        s = <clock_t> (<double> tp.tv_usec * 1.e-6 * <double> CLOCKS_PER_SEC)
        t = tp.tv_sec * CLOCKS_PER_SEC + s 
        return t
    def __init__(self,
                 off_t maxi,
                 dict  config={},
                 str head="",
                 double seconde=0.1,
                 cut=False):
        self.starttime = self.clock()
        self.lasttime  = self.starttime
        self.tickcount = <clock_t> (seconde * CLOCKS_PER_SEC)
        self.freq      = 1
        self.cycle     = 0
        self.arrow     = 0
        self.lastlog   = 0
        if not config:
            config=getConfiguration()
        self.ontty = sys.stderr.isatty()
        if (maxi<=0):
            maxi=1
        self.maxi  = maxi
        self.head  = head
        self.chead = self._head 
        self.cut   = cut
        self.logger=config[config["__root_config__"]]["logger"]
        self.wheel =  '|/-\\'
        self.spaces='          ' \
                    '          ' \
                    '          ' \
                    '          ' \
                    '          '
        self.diese ='##########' \
                    '##########' \
                    '##########' \
                    '##########' \
                    '##########'  
    def __call__(self, object pos, bint force=False):
        cdef off_t    ipos
        cdef clock_t  elapsed
        cdef clock_t  newtime
        cdef clock_t  delta
        cdef clock_t  more 
        cdef double   percent 
        cdef tm remain
        cdef int days,hour,minu,sec
        cdef off_t fraction
        cdef int twentyth
        self.cycle+=1
        if self.cycle % self.freq == 0 or force:
            self.cycle=1
            newtime  = self.clock()
            delta         = newtime - self.lasttime
            self.lasttime = newtime
            elapsed       = newtime - self.starttime
 #            print(" ",delta,elapsed,elapsed/CLOCKS_PER_SEC,self.tickcount)
            if   delta < self.tickcount / 5 :
                self.freq*=2
            elif delta > self.tickcount * 5 and self.freq>1:
                self.freq//=2
            if callable(pos):
                ipos=pos()
            else:
                ipos=pos
            if ipos==0:
                ipos=1                
            percent = <double>ipos/<double>self.maxi
            more = <time_t>((<double>elapsed / percent * (1. - percent))/CLOCKS_PER_SEC)
            <void>gmtime_r(&more, &remain)
            days  = remain.tm_yday 
            hour  = remain.tm_hour
            minu  = remain.tm_min
            sec   = remain.tm_sec
            if self.ontty:
                fraction=<int>(percent * 50.)
                self.arrow=(self.arrow+1) % 4
                if days:
                    <void>fprintf(stderr,b'\r%s %5.1f %% |%.*s%c%.*s] remain : %d days %02d:%02d:%02d\033[K',
                                    self.chead,
                                    percent*100,
                                    fraction,self.diese,
                                    self.wheel[self.arrow],
                                    50-fraction,self.spaces,
                                    days,hour,minu,sec)
                else:
                    <void>fprintf(stderr,b'\r%s %5.1f %% |%.*s%c%.*s] remain : %02d:%02d:%02d\033[K',
                                    self.chead,
                                    percent*100.,
                                    fraction,self.diese,
                                    self.wheel[self.arrow],
                                    50-fraction,self.spaces,
                                    hour,minu,sec)
            if self.cut:
                tenth = int(percent * 10)
                if tenth != self.lastlog:
                    if self.ontty:
                        <void>fputs(b'\n',stderr)
                    self.logger.info('%s %5.1f %% remain : %02d:%02d:%02d\033[K' % (
                                            bytes2str(self._head),
                                            percent*100.,
                                            hour,minu,sec))
                    self.lastlog=tenth
        else:
            self.cycle+=1
    property head:    
        def __get__(self):
            return self._head
        def __set__(self,str value):
            self._head=str2bytes(value)
            self.chead=self._head
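
The remaining time shown by ProgressBar is extrapolated from the elapsed time and the fraction already processed: remaining = elapsed / percent * (1 - percent). A plain-Python sketch of that arithmetic follows; both function names are hypothetical, and `divmod` replaces the `gmtime_r` call used in the Cython code to split the duration.

```python
def remaining_seconds(elapsed, pos, maxi):
    """Estimate the seconds left from the time spent and the fraction done."""
    maxi = max(maxi, 1)          # same guard as ProgressBar.__init__
    pos = max(pos, 1)            # avoids a division by zero at position 0
    percent = pos / maxi
    return int(elapsed / percent * (1.0 - percent))


def format_remaining(seconds):
    """Render a duration as '[D days ]HH:MM:SS'."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    clock = "%02d:%02d:%02d" % (hours, minutes, secs)
    return "%d days %s" % (days, clock) if days else clock
```

For example, 30 seconds spent on 25 of 100 items gives an estimate of 90 seconds, rendered as `00:01:30`.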
--- a/python/obitools3/apps/temp.cfiles
+++ b/python/obitools3/apps/temp.cfiles
@ -1,110 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/apps/temp.pxd
+++ b/python/obitools3/apps/temp.pxd
@ -1,8 +0,0 @@
 #cython: language_level=3
 '''
 Created on 28 July 2017
@author: coissac
 '''
--- a/python/obitools3/apps/temp.pyx
+++ b/python/obitools3/apps/temp.pyx
@ -1,99 +0,0 @@
 #cython: language_level=3
 '''
 Created on 28 July 2017
@author: coissac
 '''
 from os import environb,getpid
 from os.path import join, isdir
 from tempfile import TemporaryDirectory, _get_candidate_names
 from shutil import rmtree
 from atexit import register
 from obitools3.dms.dms import DMS
 from obitools3.apps.config import getConfiguration
 from obitools3.apps.config import logger
 from obitools3.dms.dms cimport DMS
 from obitools3.utils cimport tobytes,tostr
 cpdef get_temp_dir():
    """
    Returns the name of a temporary directory specific to this instance of obitools.
    This is an application function. It cannot be called outside of an obi command.
    It requires a valid configuration.
    If the function is called several times from the same obi session, the same
    directory is returned.
    If the OBITMP environment variable exists, the temporary directory is created
    inside that directory.
    The directory is automatically destroyed at the end of the process.
        @return: the name (path) of the temporary directory.
    """
    cdef bytes tmpdirname
    cdef dict config = getConfiguration()
    root = config["__root_config__"]
    try:
        return config[root]["tempdir"].name
    except KeyError:
        pass
    try:
        basedir=environb[b'OBITMP']
    except KeyError:
        basedir=None
    tmp = TemporaryDirectory(dir=basedir)
    config[root]["tempdir"]=tmp
    return tmp.name
 cpdef get_temp_dir_name():
    """
    Returns the name of the temporary directory
    specific to this instance of obitools.
        @return: the name of the temporary directory.
        @see get_temp_dir
    """
    return get_temp_dir()
 cpdef get_temp_dms():
    cdef bytes tmpdirname                   # @DuplicatedSignature
    cdef dict config = getConfiguration()   # @DuplicatedSignature
    cdef DMS tmpdms
    root = config["__root_config__"]
    try:
        return config[root]["tempdms"]
    except KeyError:
        pass
    tmpdirname=tobytes(get_temp_dir())   # the name is a str when OBITMP is unset
    tempname = join(tmpdirname,
                    b"obi.%d.%s" % (getpid(),
                                    tobytes(next(_get_candidate_names())))
                   )
    tmpdms = DMS.new(tempname)
    config[root]["tempdms"]=tmpdms
    return tmpdms
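
get_temp_dir creates the temporary directory once, keeps the object in the configuration, and returns the same name on later calls. A plain-Python sketch of that caching pattern, assuming the configuration is passed in explicitly; the `config` parameter and the `"obi"` default root are assumptions made for illustration, and `os.environ` is used here instead of `os.environb`.

```python
import os
from tempfile import TemporaryDirectory


def get_temp_dir(config, root="obi"):
    """Return the name of a per-session temporary directory, creating it once."""
    section = config.setdefault(root, {})
    if "tempdir" not in section:
        # OBITMP, when set, selects the parent directory; None means the default
        section["tempdir"] = TemporaryDirectory(dir=os.environ.get("OBITMP"))
    # Keeping the object in the config keeps the directory alive for the session
    return section["tempdir"].name
```

Storing the `TemporaryDirectory` object rather than only its name matters: the directory is removed when the object is cleaned up, so dropping the reference early would delete it.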
--- a/python/obitools3/commands/align.cfiles
+++ b/python/obitools3/commands/align.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/align.pxd
+++ b/python/obitools3/commands/align.pxd
@ -1,18 +0,0 @@
 #cython: language_level=3
 cpdef align_columns(bytes dms_n, 
                    bytes input_view_1_n, 
                    bytes output_view_n,
                    bytes input_view_2_n=*,
                    bytes input_column_1_n=*, 
                    bytes input_column_2_n=*,
                    bytes input_elt_1_n=*, 
                    bytes input_elt_2_n=*,
                    bytes id_column_1_n=*, 
                    bytes id_column_2_n=*,
                    double threshold=*, bint normalize=*, 
                    int reference=*, bint similarity_mode=*,
                    bint print_seq=*, bint print_count=*,
                    bytes comments=*,
                    int thread_count=*)
--- a/python/obitools3/commands/align.pyx
+++ b/python/obitools3/commands/align.pyx
@ -1,274 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.dms.view import RollbackException
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes, str2bytes
 from obitools3.dms.capi.obilcsalign cimport obi_lcs_align_one_column, \
                                            obi_lcs_align_two_columns
 import time
 import sys
 __title__="Aligns one sequence column with itself or two sequence columns"
 def addOptions(parser):
   addMinimalInputOption(parser)
   addMinimalOutputOption(parser)
   group=parser.add_argument_group('obi align specific options')
   group.add_argument('--input-2', '-I',
                      action="store", dest="align:inputuri2",
                      metavar='<INPUT URI>',
                      default="",
                      type=str,
                      help="Optionally, the URI of a second input to align with the first one.")
   group.add_argument('--threshold','-t',
                      action="store", dest="align:threshold",
                      metavar='<THRESHOLD>',
                      default=0.0,
                      type=float,
                      help="Score threshold. If the score is normalized and expressed in similarity (default),"
                           " it is an identity, e.g. 0.95 for an identity of 95%%. If the score is normalized"
                           " and expressed in distance, it is (1.0 - identity), e.g. 0.05 for an identity of 95%%."
                           " If the score is not normalized and expressed in similarity, it is the length of the"
                           " Longest Common Subsequence. If the score is not normalized and expressed in distance,"
                           " it is (reference length - LCS length)."
                           " Only sequence pairs with a similarity above <THRESHOLD> are printed. Default: 0.00"
                           " (no threshold).")
   group.add_argument('--longest-length','-L',
                      action="store_const", dest="align:reflength",
                      default=0,
                      const=1,
                      help="The reference length is the length of the longest sequence."
                           " Default: the reference length is the length of the alignment.")
   group.add_argument('--shortest-length','-l',
                      action="store_const", dest="align:reflength",
                      default=0,
                      const=2,
                      help="The reference length is the length of the shortest sequence."
                           " Default: the reference length is the length of the alignment.")
   group.add_argument('--raw','-r',
                      action="store_false", dest="align:normalize",
                      default=True,
                      help="Raw score, not normalized. Default: score is normalized with the reference sequence length.")
   group.add_argument('--distance','-D',
                      action="store_false", dest="align:similarity",
                      default=True,
                      help="Score is expressed in distance. Default: score is expressed in similarity.")
   group.add_argument('--print-seq','-s',
                      action="store_true", dest="align:printseq",
                      default=False,
                      help="The nucleotide sequences are written in the output view. Default: they are not written.")
   group.add_argument('--print-count','-n',
                      action="store_true", dest="align:printcount",
                      default=False,
                      help="Sequence counts are written in the output view. Default: they are not written.")
   group.add_argument('--thread-count','-p',   # TODO should probably be in a specific option group
                      action="store", dest="align:threadcount",
                      metavar='<THREAD COUNT>',
                      default=1,
                      type=int,
                      help="Number of threads to use for the computation. Default: one.")
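
The four score conventions described in the `--threshold` help text can be sketched as follows. `lcs_score` is a hypothetical helper written for illustration, not the function used by the C library; it assumes the LCS length and the reference length are already known.

```python
def lcs_score(lcs_length, reference_length, normalize=True, similarity=True):
    """Hypothetical helper: express an LCS length in one of the four score modes."""
    if similarity:
        # Similarity: the length of the Longest Common Subsequence
        raw = lcs_length
    else:
        # Distance: what is missing from the reference length
        raw = reference_length - lcs_length
    if normalize:
        # Normalized scores are divided by the reference length
        return raw / reference_length
    return raw


if __name__ == "__main__":
    print(lcs_score(95, 100))                    # normalized similarity (identity)
    print(lcs_score(95, 100, similarity=False))  # normalized distance
    print(lcs_score(95, 100, normalize=False))   # raw similarity
```

Under this reading, a threshold of 0.95 in the default mode and a threshold of 0.05 with `--distance` describe the same 95% identity.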
 cpdef align_columns(bytes dms_n, 
                    bytes input_view_1_n, 
                    bytes output_view_n,
                    bytes input_view_2_n=b"",
                    bytes input_column_1_n=b"", 
                    bytes input_column_2_n=b"",
                    bytes input_elt_1_n=b"", 
                    bytes input_elt_2_n=b"",
                    bytes id_column_1_n=b"", 
                    bytes id_column_2_n=b"",
                    double threshold=0.0, bint normalize=True, 
                    int reference=0, bint similarity_mode=True,
                    bint print_seq=False, bint print_count=False,
                    bytes comments=b"{}",
                    int thread_count=1) : 
    if input_view_2_n == b"" and input_column_2_n == b"" :
        if obi_lcs_align_one_column(dms_n, \
                                    input_view_1_n, \
                                    input_column_1_n, \
                                    input_elt_1_n, \
                                    id_column_1_n, \
                                    output_view_n, \
                                    comments, \
                                    print_seq, \
                                    print_count, \
                                    threshold, normalize, reference, similarity_mode,
                                    thread_count) < 0 :
            raise Exception("Error aligning sequences")        
    else:
        if obi_lcs_align_two_columns(dms_n, \
                                     input_view_1_n, \
                                     input_view_2_n, \
                                     input_column_1_n, \
                                     input_column_2_n, \
                                     input_elt_1_n, \
                                     input_elt_2_n, \
                                     id_column_1_n, \
                                     id_column_2_n, \
                                     output_view_n, \
                                     comments, \
                                     print_seq, \
                                     print_count, \
                                     threshold, normalize, reference, similarity_mode) < 0 :
            raise Exception("Error aligning sequences")        
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi align")
    # Open the input: only the DMS
    input = open_uri(config['obi']['inputURI'],
                     dms_only=True)
    if input is None:
        raise Exception("Could not read input")
    i_dms = input[0]
    i_dms_name = input[0].name
    i_uri = input[1]
    i_view_name = i_uri.split(b"/")[0]
    i_column_name = b""
    i_element_name = b""
    if len(i_uri.split(b"/")) >= 2:
        i_column_name = i_uri.split(b"/")[1]
    if len(i_uri.split(b"/")) == 3:
        i_element_name = i_uri.split(b"/")[2]
    if len(i_uri.split(b"/")) > 3:
        raise Exception("Input URI contains too many elements:", config['obi']['inputURI'])
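
The URI handling above can be sketched as a single helper. `split_view_uri` is a hypothetical name, and the sketch assumes that a three-part URI names a view, a column, and an element, in that order.

```python
def split_view_uri(uri):
    """Hypothetical helper: split b"view[/column[/element]]" into three parts."""
    parts = uri.split(b"/")
    if len(parts) > 3:
        raise ValueError("Input URI contains too many elements: %r" % uri)
    # Pad with empty byte strings so that missing parts default to b""
    view, column, element = parts + [b""] * (3 - len(parts))
    return view, column, element


if __name__ == "__main__":
    print(split_view_uri(b"my_view"))
    print(split_view_uri(b"my_view/my_column/my_element"))
```

Splitting once and padding avoids repeating `split(b"/")` for every length test.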
    # Open the second input if there is one
    i_dms_2 = None
    i_dms_name_2 = b""
    original_i_view_name_2 = b""
    i_view_name_2 = b""
    i_column_name_2 = b""
    i_element_name_2 = b""
    if config['align']['inputuri2']:
        input_2 = open_uri(config['align']['inputuri2'],
                           dms_only=True)
        if input_2 is None:
            raise Exception("Could not read second input")
        i_dms_2 = input_2[0]
        i_dms_name_2 = i_dms_2.name
        i_uri_2 = input_2[1]
        original_i_view_name_2 = i_uri_2.split(b"/")[0]
        if len(i_uri_2.split(b"/")) >= 2:
            i_column_name_2 = i_uri_2.split(b"/")[1]
        if len(i_uri_2.split(b"/")) == 3:
            i_element_name_2 = i_uri_2.split(b"/")[2]
        if len(i_uri_2.split(b"/")) > 3:
            raise Exception("Input URI contains too many elements:", config['align']['inputuri2'])
        # If the 2 input DMS are not the same, temporarily import 2nd input view in first input DMS
        if i_dms != i_dms_2:
            temp_i_view_name_2 = original_i_view_name_2
            i=0
            while temp_i_view_name_2 in i_dms:  # Making sure view name is unique in input DMS
                temp_i_view_name_2 = original_i_view_name_2+b"_"+str2bytes(str(i))
                i+=1
            i_view_name_2 = temp_i_view_name_2
            View.import_view(i_dms_2.full_path[:-7], i_dms.full_path[:-7], original_i_view_name_2, i_view_name_2)
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output")
    o_dms = output[0]
    o_dms_name = o_dms.name 
    final_o_view_name = output[1]
    # If the input and output DMS are not the same, align into a temporary view in the input DMS; that view is
    # then exported to the output DMS and deleted from the input DMS afterwards.
    if i_dms != o_dms:
        temporary_view_name = final_o_view_name
        i=0
        while temporary_view_name in i_dms:  # Making sure view name is unique in input DMS
            temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
            i+=1
        o_view_name = temporary_view_name
    else:
        o_view_name = final_o_view_name
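
The suffix loop used above to find a free view name can be sketched on its own. `unique_name` is a hypothetical helper, and a plain `set` of byte strings stands in for the DMS membership test (`name in i_dms`).

```python
def unique_name(base, existing):
    """Hypothetical helper: return base, or base + b"_<i>" for the first free i."""
    candidate = base
    i = 0
    while candidate in existing:
        # Try base_0, base_1, ... until a name is not already taken
        candidate = base + b"_" + str(i).encode("ascii")
        i += 1
    return candidate


if __name__ == "__main__":
    print(unique_name(b"aligned", {b"aligned", b"aligned_0"}))
```

The original name is kept when it is free, so a suffix only appears when there is a collision.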
    # Save command config in View comments
    command_line = " ".join(sys.argv[1:])
    i_dms_list = [i_dms_name]
    if i_dms_name_2:
        i_dms_list.append(i_dms_name_2)
    i_view_list = [i_view_name]
    if original_i_view_name_2:
        i_view_list.append(original_i_view_name_2)
    comments = View.print_config(config, "align", command_line, input_dms_name=i_dms_list, input_view_name=i_view_list)
    # Call cython alignment function
      # Using default ID columns of the view. TODO discuss adding option
    align_columns(i_dms_name,  \
                  i_view_name,  \
                  o_view_name,  \
                  input_view_2_n   = i_view_name_2,  \
                  input_column_1_n = i_column_name,  \
                  input_column_2_n = i_column_name_2, \
                  input_elt_1_n    = i_element_name,  \
                  input_elt_2_n    = i_element_name_2, \
                  id_column_1_n    = b"",  \
                  id_column_2_n    = b"", \
                  threshold        = config['align']['threshold'], \
                  normalize        = config['align']['normalize'],  \
                  reference        = config['align']['reflength'],  \
                  similarity_mode  = config['align']['similarity'],  \
                  print_seq        = config['align']['printseq'],  \
                  print_count      = config['align']['printcount'], \
                  comments         = comments, \
                  thread_count     = config['align']['threadcount'])
    # If the input and output DMS are not the same, export result view to output DMS
    if i_dms != o_dms:
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
    # Save command config in output DMS comments
    o_dms.record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_dms[final_o_view_name]), file=sys.stderr)
    # If the two input DMS are different, delete the temporary input view in the first input DMS
    if i_dms_2 and i_dms != i_dms_2:
        View.delete_view(i_dms, i_view_name_2)
        i_dms_2.close()
    # If the input and the output DMS are different, delete the temporary result view in the input DMS
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/alignpairedend.cfiles
+++ b/python/obitools3/commands/alignpairedend.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/alignpairedend.pxd
+++ b/python/obitools3/commands/alignpairedend.pxd
@ -1,4 +0,0 @@
 #cython: language_level=3
 cdef object buildAlignment(object direct, object reverse)
--- a/python/obitools3/commands/alignpairedend.pyx
+++ b/python/obitools3/commands/alignpairedend.pyx
@ -1,249 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view import RollbackException
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 from obitools3.dms.column.column cimport Column
 from obitools3.dms.capi.obiview cimport QUALITY_COLUMN
 from obitools3.dms.capi.obitypes cimport OBI_QUAL
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.libalign._qsassemble import QSolexaReverseAssemble
 from obitools3.libalign._qsrassemble import QSolexaRightReverseAssemble
 from obitools3.libalign._solexapairend import buildConsensus, buildJoinedSequence
 from obitools3.dms.obiseq cimport Nuc_Seq
 from obitools3.libalign.shifted_ali cimport Kmer_similarity, Ali_shifted
 from obitools3.commands.ngsfilter import REVERSE_SEQ_COLUMN_NAME, REVERSE_QUALITY_COLUMN_NAME
 import sys
 import os
 __title__="Aligns paired-end reads"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi alignpairedend specific options')
    group.add_argument('-R', '--reverse-reads',
                     action="store", dest="alignpairedend:reverse",
                     metavar="<URI>",
                     default=None,
                     type=str,
                     help="URI to the reverse reads if they are in a different view than the forward reads")
    group.add_argument('--score-min',
                     action="store", dest="alignpairedend:smin",
                     metavar="#.###",
                     default=None,
                     type=float,
                     help="Minimum score for keeping alignments")
    group.add_argument('-A', '--true-ali',
                       action="store_true", dest="alignpairedend:trueali",
                       default=False,
                       help="Performs gap free end alignment of sequences instead of using kmers to compute alignments (slower).")
    group.add_argument('-k', '--kmer-size',
                       action="store", dest="alignpairedend:kmersize",
                       metavar="#",
                       default=3,
                       type=int,
                       help="K-mer size for kmer comparisons, between 1 and 4 (not when using -A option; default: 3)")
 la = QSolexaReverseAssemble()
 ra = QSolexaRightReverseAssemble()
 cdef object buildAlignment(object direct, object reverse):
    if len(direct)==0 or len(reverse)==0:
        return None
    la.seqA = direct
    la.seqB = reverse
    ali=la()
    ali.direction='left'
    ra.seqA = direct
    ra.seqB = reverse
    rali=ra()
    rali.direction='right'
    if ali.score < rali.score:
        ali = rali
    return ali
 def alignmentIterator(entries, aligner): 
    if type(entries) == list:
        two_views = True
        forward = entries[0]
        reverse = entries[1]
        entries_len = len(forward)
    else:
        two_views = False
        entries_len = len(entries)
    for i in range(entries_len):
        if two_views:
            seqF = forward[i]
            seqR = reverse[i]
        else:
            seqF = Nuc_Seq.new_from_stored(entries[i])
            seqR = Nuc_Seq(seqF.id, seqF[REVERSE_SEQ_COLUMN_NAME], quality=seqF[REVERSE_QUALITY_COLUMN_NAME])
            seqR.index = i
        ali = aligner(seqF, seqR)
        if ali is None:
            continue
        yield ali
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi alignpairedend")
    # Open the input
    two_views = False
    forward = None
    reverse = None
    input = None
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not open input reads")
    if input[2] != View_NUC_SEQS:
        raise NotImplementedError('obi alignpairedend only works on NUC_SEQS views')    
    if "reverse" in config["alignpairedend"]:
        two_views = True
        forward = input[1]        
        rinput = open_uri(config["alignpairedend"]["reverse"])
        if rinput is None:
            raise Exception("Could not open reverse reads")
        if rinput[2] != View_NUC_SEQS:
            raise NotImplementedError('obi alignpairedend only works on NUC_SEQS views')
        reverse = rinput[1]
        if len(forward) != len(reverse):
            raise Exception("Error: the numbers of forward and reverse reads are different")
        entries = [forward, reverse]
        input_dms_name = [forward.dms.name, reverse.dms.name]
        input_view_name = [forward.name, reverse.name]
    else:
        entries = input[1]
        input_dms_name = [entries.dms.name]
        input_view_name = [entries.name]
    if two_views:
        entries_len = len(forward)
    else:
        entries_len = len(entries)
    # Open the output
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      newviewtype=View_NUC_SEQS)
    if output is None:
        raise Exception("Could not create output view")
    view = output[1]
    Column.new_column(view, QUALITY_COLUMN, OBI_QUAL)   #TODO output URI quality option?
    if 'smin' in config['alignpairedend']:
        smin = config['alignpairedend']['smin']
    else:
        smin = 0
    # Initialize the progress bar
    pb = ProgressBar(entries_len, config, seconde=5)
    if config['alignpairedend']['trueali']:
        kmer_ali = False
        aligner = buildAlignment
    else :
        kmer_ali = True
        if type(entries) == list:
            forward = entries[0]
            reverse = entries[1]
            aligner = Kmer_similarity(forward, view2=reverse, kmer_size=config['alignpairedend']['kmersize'])
        else:
            aligner = Kmer_similarity(entries, column2=entries[REVERSE_SEQ_COLUMN_NAME], qual_column2=entries[REVERSE_QUALITY_COLUMN_NAME], kmer_size=config['alignpairedend']['kmersize'])
    ba = alignmentIterator(entries, aligner)
    i = 0
    for ali in ba:
        pb(i)
        consensus = view[i]
        if not two_views:
            seqF = entries[i]
        else:
            seqF = forward[i]
        if smin > 0:
            if (ali.score > smin) :
                buildConsensus(ali, consensus, seqF)
            else:
                if not two_views:
                    seqR = Nuc_Seq(seqF.id, seqF[REVERSE_SEQ_COLUMN_NAME], quality = seqF[REVERSE_QUALITY_COLUMN_NAME])
                else:
                    seqR = reverse[i]
                buildJoinedSequence(ali, seqR, consensus, forward=seqF)
            consensus[b"smin"] = smin
        else:
            buildConsensus(ali, consensus, seqF)
        if kmer_ali :
            ali.free()
        i+=1
    pb(i, force=True)
    print("", file=sys.stderr)
    if kmer_ali :
        aligner.free()
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    view.write_config(config, "alignpairedend", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    output[0].record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(view), file=sys.stderr)
    input[0].close()
    if two_views:
        rinput[0].close()
    output[0].close()
    logger("info", "Done.")
--- a/python/obitools3/commands/annotate.cfiles
+++ b/python/obitools3/commands/annotate.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/annotate.pyx
+++ b/python/obitools3/commands/annotate.pyx
@ -1,382 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View, Line_selection
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
 from obitools3.dms.view import RollbackException
 from functools import reduce
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes, str2bytes
 from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, \
                                        ID_COLUMN, \
                                        DEFINITION_COLUMN, \
                                        QUALITY_COLUMN, \
                                        COUNT_COLUMN
 import time
 import math 
 import sys
 __title__="Annotate views with new tags and edit existing annotations"
 SPECIAL_COLUMNS = [NUC_SEQUENCE_COLUMN, ID_COLUMN, DEFINITION_COLUMN, QUALITY_COLUMN]
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    addMinimalOutputOption(parser)
    group=parser.add_argument_group('obi annotate specific options')
    group.add_argument('--seq-rank',   # TODO seq/elt/line???
                       action="store_true", 
                       dest="annotate:add_rank",
                       default=False,
                       help="Add a rank attribute to the sequence "
                            "indicating the sequence position in the data.")
    group.add_argument('-R', '--rename-tag',
                       action="append", 
                       dest="annotate:rename_tags",
                       metavar="<OLD_NAME:NEW_NAME>",
                       type=str,
                       default=[],
                       help="Change tag name from OLD_NAME to NEW_NAME.")
    group.add_argument('-D', '--delete-tag',
                       action="append", 
                       dest="annotate:delete_tags",
                       metavar="<TAG_NAME>",
                       type=str,
                       default=[],
                       help="Delete tag TAG_NAME.")
    group.add_argument('-S', '--set-tag',
                       action="append", 
                       dest="annotate:set_tags",
                       metavar="<TAG_NAME:PYTHON_EXPRESSION>",
                       type=str,
                       default=[],
                       help="Add a new tag named TAG_NAME with "
                            "a value computed from PYTHON_EXPRESSION.")
    group.add_argument('--set-identifier',
                       action="store", 
                       dest="annotate:set_identifier",
                       metavar="<PYTHON_EXPRESSION>",
                       type=str,
                       default=None,
                       help="Set sequence identifier with "
                            "a value computed from PYTHON_EXPRESSION.")
    group.add_argument('--set-sequence',
                       action="store", 
                       dest="annotate:set_sequence",
                       metavar="<PYTHON_EXPRESSION>",
                       type=str,
                       default=None,
                       help="Change the sequence itself with "
                            "a value computed from PYTHON_EXPRESSION.")
    group.add_argument('--set-definition',
                       action="store", 
                       dest="annotate:set_definition",
                       metavar="<PYTHON_EXPRESSION>",
                       type=str,
                       default=None,
                       help="Set sequence definition with "
                            "a value computed from PYTHON_EXPRESSION.")
    group.add_argument('--run',
                       action="store", 
                       dest="annotate:run",
                       metavar="<PYTHON_EXPRESSION>",
                       type=str,
                       default=None,
                       help="Run a Python expression on each element.")
    group.add_argument('-C', '--clear',
                       action="store_true", 
                       dest="annotate:clear",
                       default=False,
                       help="Clear all tags except the obligatory ones.")
    group.add_argument('-k','--keep',
                       action='append',
                       dest="annotate:keep",
                       metavar="<TAG>",
                       default=[],
                       type=str,
                       help="Only keep this tag. (Can be specified several times.)")
    group.add_argument('--length',
                       action="store_true", 
                       dest="annotate:length",
                       default=False,
                       help="Add 'seq_length' tag with sequence length.")
    group.add_argument('--with-taxon-at-rank',
                       action='append',
                       dest="annotate:taxon_at_rank",
                       metavar="<RANK_NAME>",
                       default=[],
                       type=str,
                       help="Add taxonomy annotation at the specified rank level RANK_NAME.")
 def sequenceTaggerGenerator(config, taxo=None):
    toSet=[]   # empty default: the loops below iterate over toSet even when no --set-tag is given
    newId=None
    newDef=None
    newSeq=None
    length=None
    add_rank=None
    run=None
    if 'set_tags' in config['annotate']:   # TODO default option problem, to fix
        toSet = [x.split(':',1) for x in config['annotate']['set_tags'] if len(x.split(':',1))==2]
    if 'set_identifier' in config['annotate']:
        newId = config['annotate']['set_identifier']
    if 'set_definition' in config['annotate']:
        newDef = config['annotate']['set_definition']
    if 'set_sequence' in config['annotate']:
        newSeq = config['annotate']['set_sequence']
    if 'length' in config['annotate']:
        length = config['annotate']['length']
    if 'add_rank' in config["annotate"]:
        add_rank = config["annotate"]["add_rank"]
    if 'run' in config['annotate']:
        run = config['annotate']['run']
    counter = [0]
    for i in range(len(toSet)):
        for j in range(len(toSet[i])):
            toSet[i][j] = tobytes(toSet[i][j])
    annoteRank=[]
    if config['annotate']['taxon_at_rank']:
        if taxo is not None:
            annoteRank = config['annotate']['taxon_at_rank']
        else:
            raise Exception("A taxonomy must be provided to annotate taxon ranks")
    def sequenceTagger(seq):
        if counter[0]>=0:
            counter[0]+=1
        for rank in annoteRank:
            if 'taxid' in seq:
                taxid = seq['taxid']
                if taxid is not None:
                    rtaxid = taxo.get_taxon_at_rank(taxid, rank)
                    if rtaxid is not None:
                        scn = taxo.get_scientific_name(rtaxid)
                    else:
                        scn=None
                    seq[rank]=rtaxid
                    seq["%s_name"%rank]=scn
        if add_rank:
            seq['seq_rank']=counter[0]
        for i,v in toSet:
            #try:
            if taxo is not None:
                environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
            else:
                environ = {'sequence':seq, 'counter':counter[0], 'math':math}
            val = eval(v, environ, seq)
            #except Exception,e:       # TODO discuss usefulness of this
            #    if options.onlyValid:
            #        raise e
            #    val = v
            seq[i]=val
        if length:
            seq['seq_length']=len(seq)
        if newId is not None:
 #            try:
            if taxo is not None:
                environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
            else:
                environ = {'sequence':seq, 'counter':counter[0], 'math':math}     
            val = eval(newId, environ, seq)
 #            except Exception,e:
 #                if options.onlyValid:
 #                    raise e
 #                val = newId
            seq.id=val
        if newDef is not None:
 #            try:
            if taxo is not None:
                environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
            else:
                environ = {'sequence':seq, 'counter':counter[0], 'math':math}     
            val = eval(newDef, environ, seq)
 #            except Exception,e:
 #                if options.onlyValid:
 #                    raise e
 #                val = newDef
            seq.definition=val
 #             
        if newSeq is not None:
 #            try:
            if taxo is not None:
                environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
            else:
                environ = {'sequence':seq, 'counter':counter[0], 'math':math}     
            val = eval(newSeq, environ, seq)
 #            except Exception,e:
 #                if options.onlyValid:
 #                    raise e
 #                val = newSeq
            seq.seq=val
            if 'seq_length' in seq:
                seq['seq_length']=len(seq)
            # Delete quality since it must match the sequence.
            # TODO discuss deleting for each sequence separately
            if QUALITY_COLUMN in seq:
                seq.view.delete_column(QUALITY_COLUMN)
        if run is not None:
 #            try:
            if taxo is not None:
                environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
            else:
                environ = {'sequence':seq, 'counter':counter[0], 'math':math}     
            eval(run, environ, seq)
 #            except Exception,e:
 #                if options.onlyValid:
 #                    raise e
    return sequenceTagger
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi annotate")
    # Open the input
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not read input view")
    i_dms = input[0]
    i_view = input[1]
    i_view_name = input[1].name
    # Open the output: only the DMS, as the output view is going to be created by cloning the input view
    # (could eventually be done via an open_uri() argument)
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output view")
    o_dms = output[0]
    o_view_name = output[1]
    # If the input and output DMS are not the same, import the input view in the output DMS before cloning it to modify it
    # (could be the other way around: clone and modify in the input DMS then import the new view in the output DMS)
    if i_dms != o_dms:
        imported_view_name = i_view_name
        i=0
        while imported_view_name in o_dms:  # Making sure view name is unique in output DMS
            imported_view_name = i_view_name+b"_"+str2bytes(str(i))
            i+=1
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], i_view_name, imported_view_name)
        i_view = o_dms[imported_view_name]
    # Clone output view from input view
    o_view = i_view.clone(o_view_name)
    if o_view is None:
        raise Exception("Couldn't create output view")
    i_view.close()
    # Open taxonomy if there is one
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        taxo_uri = open_uri(config['obi']['taxoURI'])
        if taxo_uri is None:
            raise Exception("Couldn't open taxonomy")
        taxo = taxo_uri[1]
    else :
        taxo = None
    # Initialize the progress bar
    pb = ProgressBar(len(o_view), config, seconde=5)
    try:
        # Apply editions
        # Editions at view level
        if 'delete_tags' in config['annotate']:
            toDelete = config['annotate']['delete_tags'][:]
        if 'rename_tags' in config['annotate']:
            toRename = [x.split(':',1) for x in config['annotate']['rename_tags'] if len(x.split(':',1))==2]
        if 'clear' in config['annotate']:
            clear = config['annotate']['clear']
        if 'keep' in config['annotate']:
            keep = config['annotate']['keep']
        for i in range(len(toDelete)):
            toDelete[i] = tobytes(toDelete[i])
        for i in range(len(toRename)):
            for j in range(len(toRename[i])):
                toRename[i][j] = tobytes(toRename[i][j])
        for i in range(len(keep)):
            keep[i] = tobytes(keep[i])
        keep = set(keep)
        if clear or keep:
            keys = [k for k in o_view.keys()]
            for k in keys:
                if k not in keep and k not in SPECIAL_COLUMNS:
                    o_view.delete_column(k)
        else:
            for k in toDelete:
                o_view.delete_column(k)
            for old_name, new_name in toRename:
                if old_name in o_view:
                    o_view.rename_column(old_name, new_name)
        # Editions at line level
        sequenceTagger = sequenceTaggerGenerator(config, taxo=taxo)
        for i in range(len(o_view)):
            pb(i)
            sequenceTagger(o_view[i])
    except Exception as e:
        raise RollbackException("obi annotate error, rolling back view: "+str(e), o_view)
    pb(i, force=True)
    print("", file=sys.stderr)
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[input[0].name]
    input_view_name=[i_view_name]
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
        input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
    o_view.write_config(config, "annotate", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    output[0].record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_view), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary imported view used to create the final view
    if i_dms != o_dms:
        View.delete_view(o_dms, imported_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/build_ref_db.cfiles
+++ b/python/obitools3/commands/build_ref_db.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/build_ref_db.pyx
+++ b/python/obitools3/commands/build_ref_db.pyx
@ -1,105 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms.dms cimport DMS
 from obitools3.dms.view import RollbackException
 from obitools3.dms.capi.build_reference_db cimport build_reference_db
 from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes, str2bytes
 from obitools3.dms.view.view cimport View
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 import sys
 __title__="Build a reference database"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi build_ref_db specific options')
    group.add_argument('--threshold','-t',
                      action="store", dest="build_ref_db:threshold",
                      metavar='<THRESHOLD>',
                      default=0.0,
                      type=float,
                      help="Score threshold as a normalized identity, e.g. 0.95 for an identity of 95%%. Default: 0.00"
                           " (no threshold).")
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi build_ref_db")
    # Open the input: only the DMS
    input = open_uri(config['obi']['inputURI'],
                     dms_only=True)
    if input is None:
        raise Exception("Could not read input")
    i_dms = input[0]
    i_dms_name = input[0].name
    i_view_name = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output")
    o_dms = output[0]
    final_o_view_name = output[1]
    # If the input and output DMS are not the same, build the database creating a temporary view that will be exported to 
    # the right DMS and deleted in the other afterwards.
    if i_dms != o_dms:
        temporary_view_name = final_o_view_name
        i=0
        while temporary_view_name in i_dms:  # Making sure view name is unique in input DMS
            temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
            i+=1
        o_view_name = temporary_view_name
    else:
        o_view_name = final_o_view_name
    # Read taxonomy name
    taxonomy_name = config['obi']['taxoURI'].split("/")[-1]   # Robust in theory
    # Save command config in View comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[i_dms_name]
    input_view_name= [i_view_name]
    input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
    input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
    comments = View.print_config(config, "build_ref_db", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    if build_reference_db(tobytes(i_dms_name), tobytes(i_view_name), tobytes(taxonomy_name), tobytes(o_view_name), comments, config['build_ref_db']['threshold']) < 0:
        raise Exception("Error building a reference database")
    # If the input and output DMS are not the same, export result view to output DMS
    if i_dms != o_dms:
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
    # Save command config in DMS comments
    o_dms.record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_dms[final_o_view_name]), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary result view in the input DMS
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/clean.cfiles
+++ b/python/obitools3/commands/clean.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/clean.pyx
+++ b/python/obitools3/commands/clean.pyx
@ -1,124 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms.dms cimport DMS
 from obitools3.dms.view import RollbackException
 from obitools3.dms.capi.obiclean cimport obi_clean
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes, str2bytes
 from obitools3.dms.view.view cimport View
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 import sys
 __title__="Tag a set of sequences for PCR and sequencing errors identification"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi clean specific options')
    group.add_argument('--distance', '-d',
                       action="store", dest="clean:distance",
                       metavar='<DISTANCE>',
                       default=1.0,
                       type=float,
                       help="Maximum number of errors between two variant sequences. Default: 1.")
    group.add_argument('--sample-tag', '-s',
                       action="store", 
                       dest="clean:sample-tag-name",
                       metavar="<SAMPLE TAG NAME>",
                       type=str,
                       default="merged_sample",
                       help="Name of the tag where sample counts are kept.")
    group.add_argument('--ratio', '-r',
                       action="store", dest="clean:ratio",
                       metavar='<RATIO>',
                       default=0.5,
                       type=float,
                       help="Maximum ratio between the counts of two sequences so that the less abundant one can be considered"
                            " a variant of the more abundant one. Default: 0.5.")
    group.add_argument('--heads-only', '-H',
                       action="store_true", 
                       dest="clean:heads-only",
                       default=False,
                       help="Only sequences labeled as heads are kept in the output. Default: False")
    group.add_argument('--cluster-tags', '-C',
                       action="store_true", 
                       dest="clean:cluster-tags",
                       default=False,
                       help="Adds tags for each sequence giving its cluster's head and weight for each sample.")
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi clean")
    # Open the input: only the DMS
    input = open_uri(config['obi']['inputURI'],
                     dms_only=True)
    if input is None:
        raise Exception("Could not read input")
    i_dms = input[0]
    i_dms_name = input[0].name
    i_view_name = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output")
    o_dms = output[0]
    final_o_view_name = output[1]
    # If the input and output DMS are not the same, run obiclean creating a temporary view that will be exported to 
    # the right DMS and deleted in the other afterwards.
    if i_dms != o_dms:
        temporary_view_name = final_o_view_name
        i=0
        while temporary_view_name in i_dms:  # Making sure view name is unique in input DMS
            temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
            i+=1
        o_view_name = temporary_view_name
    else:
        o_view_name = final_o_view_name
    # Save command config in View comments
    command_line = " ".join(sys.argv[1:])
    comments = View.print_config(config, "clean", command_line, input_dms_name=[i_dms_name], input_view_name=[i_view_name])
    if obi_clean(tobytes(i_dms_name), tobytes(i_view_name), tobytes(config['clean']['sample-tag-name']), tobytes(o_view_name), comments, \
              config['clean']['distance'], config['clean']['ratio'], config['clean']['heads-only'], 1) < 0:
        raise Exception("Error running obiclean")
    # If the input and output DMS are not the same, export result view to output DMS
    if i_dms != o_dms:
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
    # Save command config in DMS comments
    o_dms.record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_dms[final_o_view_name]), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary result view in the input DMS
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/count.cfiles
+++ b/python/obitools3/commands/count.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/count.pyx
+++ b/python/obitools3/commands/count.pyx
@ -1,55 +0,0 @@
 #cython: language_level=3
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.dms import DMS
 from obitools3.apps.optiongroups import addMinimalInputOption
 from obitools3.dms.capi.obiview cimport COUNT_COLUMN
 __title__="Counts sequence records"
 def addOptions(parser):
    addMinimalInputOption(parser)
    group = parser.add_argument_group('obi count specific options')
    group.add_argument('-s','--sequence',
                        action="store_true", dest="count:sequence",
                        default=False,
                        help="Prints only the number of sequence records.")
    group.add_argument('-a','--all',
                        action="store_true", dest="count:all",
                        default=False,
                        help="Prints only the total count of sequence records (if a sequence has no `count` attribute, its default count is 1) (default: False).")
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi count")
    # Open the input
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not read input")
    entries = input[1]
    count1 = len(entries)
    count2 = 0
    if COUNT_COLUMN in entries and ((config['count']['sequence'] == config['count']['all']) or (config['count']['all'])) :
        for e in entries:
            count2+=e[COUNT_COLUMN]
    if COUNT_COLUMN in entries and (config['count']['sequence'] == config['count']['all']):
        print(count1,count2)
    elif COUNT_COLUMN in entries and config['count']['all']:
        print(count2)
    else:
        print(count1)
--- a/python/obitools3/commands/ecopcr.cfiles
+++ b/python/obitools3/commands/ecopcr.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/ecopcr.pyx
+++ b/python/obitools3/commands/ecopcr.pyx
@ -1,202 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms.dms cimport DMS
 from obitools3.dms.capi.obidms cimport OBIDMS_p
 from obitools3.dms.view import RollbackException
 from obitools3.dms.capi.obiecopcr cimport obi_ecopcr
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addTaxonomyOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 from obitools3.dms.view import View
 from libc.stdlib  cimport malloc, free
 from libc.stdint  cimport int32_t
 import sys
 __title__="in silico PCR"
 # TODO: add option to output unique ids
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi ecopcr specific options')
    group.add_argument('--primer1', '-F',
                       action="store", dest="ecopcr:primer1",
                       metavar='<PRIMER>',
                       type=str,
                       help="Forward primer.")
    group.add_argument('--primer2', '-R',
                       action="store", dest="ecopcr:primer2",
                       metavar='<PRIMER>',
                       type=str,
                       help="Reverse primer.")
    group.add_argument('--error', '-e',
                       action="store", dest="ecopcr:error",
                       metavar='<ERROR>',
                       default=0,
                       type=int,
                       help="Maximum number of errors (mismatches) allowed per primer. Default: 0.")
    group.add_argument('--min-length', '-l',
                       action="store", 
                       dest="ecopcr:min-length",
                       metavar="<MINIMUM LENGTH>",
                       type=int,
                       default=0,
                       help="Minimum length of the in silico amplified DNA fragment, excluding primers.")
    group.add_argument('--max-length', '-L',
                       action="store", 
                       dest="ecopcr:max-length",
                       metavar="<MAXIMUM LENGTH>",
                       type=int,
                       default=0,
                       help="Maximum length of the in silico amplified DNA fragment, excluding primers.")
    group.add_argument('--restrict-to-taxid', '-r',
                       action="append", 
                       dest="ecopcr:restrict-to-taxid",
                       metavar="<TAXID>",
                       type=int,
                       default=[],
                       help="Only the sequence records corresponding to the taxonomic group identified "
                            "by TAXID are considered for the in silico PCR. The TAXID is an integer "
                            "that can be found in the NCBI taxonomic database.")
    group.add_argument('--ignore-taxid', '-i',
                       action="append", 
                       dest="ecopcr:ignore-taxid",
                       metavar="<TAXID>",
                       type=int,
                       default=[],
                       help="The sequences of the taxonomic group identified by TAXID are not considered for the in silico PCR.")
    group.add_argument('--circular', '-c',
                       action="store_true", 
                       dest="ecopcr:circular",
                       default=False,
                       help="Considers that the input sequences are circular (e.g. mitochondrial or chloroplastic DNA).")
    group.add_argument('--salt-concentration', '-a',
                       action="store", 
                       dest="ecopcr:salt-concentration",
                       metavar="<FLOAT>",
                       type=float,
                       default=0.05,
                       help="Salt concentration used for estimating the Tm. Default: 0.05.")
    group.add_argument('--salt-correction-method', '-m',
                       action="store", 
                       dest="ecopcr:salt-correction-method",
                       metavar="<1|2>",
                       type=int,
                       default=1,
                       help="Defines the method used for estimating the Tm (melting temperature) between the primers and their corresponding "
                            "target sequences. SANTALUCIA: 1, or OWCZARZY: 2. Default: 1.")
    group.add_argument('--keep-nucs', '-D',
                       action="store", 
                       dest="ecopcr:keep-nucs",
                       metavar="<INTEGER>",
                       type=int,
                       default=0,
                       help="Keeps the specified number of nucleotides on each side of the in silico amplified sequences "
                            "(already including the amplified DNA fragment plus the two target sequences of the primers).")
    group.add_argument('--kingdom-mode', '-k',
                       action="store_true", 
                       dest="ecopcr:kingdom-mode",
                       default=False,
                       help="Print in the output the kingdom of the in silico amplified sequences (default: print the superkingdom).")
 def run(config):
    cdef int32_t* restrict_to_taxids_p = NULL
    cdef int32_t* ignore_taxids_p = NULL
    restrict_to_taxids_len = len(config['ecopcr']['restrict-to-taxid'])
    restrict_to_taxids_p = <int32_t*> malloc((restrict_to_taxids_len + 1) * sizeof(int32_t))   # +1 for the -1 flagging the end of the array
    for i in range(restrict_to_taxids_len) :
        restrict_to_taxids_p[i] = config['ecopcr']['restrict-to-taxid'][i]
    restrict_to_taxids_p[restrict_to_taxids_len] = -1
    ignore_taxids_len = len(config['ecopcr']['ignore-taxid'])
    ignore_taxids_p = <int32_t*> malloc((ignore_taxids_len + 1) * sizeof(int32_t))   # +1 for the -1 flagging the end of the array
    for i in range(ignore_taxids_len) :
        ignore_taxids_p[i] = config['ecopcr']['ignore-taxid'][i]
    ignore_taxids_p[ignore_taxids_len] = -1
    DMS.obi_atexit()
    logger("info", "obi ecopcr")
    # Open the input: only the DMS
    input = open_uri(config['obi']['inputURI'],
                     dms_only=True)
    if input is None:
        raise Exception("Could not read input")
    i_dms = input[0]
    i_dms_name = input[0].name
    i_view_name = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output")
    o_dms = output[0]
    o_dms_name = output[0].name
    o_view_name = output[1]
    # Read taxonomy name    
    taxonomy_name = config['obi']['taxoURI'].split("/")[-1]   # Robust in theory
    # Save command config in View comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[i_dms_name]
    input_view_name= [i_view_name]
    input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
    input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
    comments = View.print_config(config, "ecopcr", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    # TODO: primers in comments?
    if obi_ecopcr(tobytes(i_dms_name), tobytes(i_view_name), tobytes(taxonomy_name), \
                  tobytes(o_dms_name), tobytes(o_view_name), comments, \
                  tobytes(config['ecopcr']['primer1']), tobytes(config['ecopcr']['primer2']), \
                  config['ecopcr']['error'], \
                  config['ecopcr']['min-length'], config['ecopcr']['max-length'], \
                  restrict_to_taxids_p, ignore_taxids_p, \
                  config['ecopcr']['circular'], config['ecopcr']['salt-concentration'], config['ecopcr']['salt-correction-method'], \
                  config['ecopcr']['keep-nucs'], config['ecopcr']['kingdom-mode']) < 0:
        raise Exception("Error running ecopcr")
    # Save command config in DMS comments
    o_dms.record_command_line(command_line)
    free(restrict_to_taxids_p)
    free(ignore_taxids_p)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_dms[o_view_name]), file=sys.stderr)
    o_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/ecotag.cfiles
+++ b/python/obitools3/commands/ecotag.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/ecotag.pyx
+++ b/python/obitools3/commands/ecotag.pyx
@ -1,129 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms.dms cimport DMS
 from obitools3.dms.view import RollbackException
 from obitools3.dms.capi.obiecotag cimport obi_ecotag
 from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes, str2bytes
 from obitools3.dms.view.view cimport View
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 import sys
 __title__="Taxonomic assignment of sequences"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi ecotag specific options')
    group.add_argument('--ref-database','-R',
                      action="store", dest="ecotag:ref_view",
                      metavar='<REF_VIEW>',
                      type=str,
                      help="URI of the view containing the reference database as built by the build_ref_db command.")
    group.add_argument('--minimum-identity','-m',
                      action="store", dest="ecotag:threshold",
                      metavar='<THRESHOLD>',
                      default=0.0,
                      type=float,
                      help="Minimum identity to consider for assignment, as a normalized identity, e.g. 0.95 for an identity of 95%%. "
                           "Default: 0.00 (no threshold).")
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi ecotag")
    # Open the query view: only the DMS
    input = open_uri(config['obi']['inputURI'],
                     dms_only=True)
    if input is None:
        raise Exception("Could not read input")
    i_dms = input[0]
    i_dms_name = input[0].name
    i_view_name = input[1]
    # Open the reference view: only the DMS
    ref = open_uri(config['ecotag']['ref_view'],
                     dms_only=True)
    if ref is None:
        raise Exception("Could not read reference view URI")
    ref_dms = ref[0]
    ref_dms_name = ref[0].name
    ref_view_name = ref[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output")
    o_dms = output[0]
    final_o_view_name = output[1]
    # If the input and output DMS are not the same, run ecotag creating a temporary view that will be exported to 
    # the right DMS and deleted in the other afterwards.
    if i_dms != o_dms:
        temporary_view_name = final_o_view_name
        i=0
        while temporary_view_name in i_dms:  # Making sure view name is unique in input DMS
            temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
            i+=1
        o_view_name = temporary_view_name
    else:
        o_view_name = final_o_view_name
    # Read taxonomy DMS and name
    taxo = open_uri(config['obi']['taxoURI'],
                    dms_only=True)
    taxo_dms_name = taxo[0].name
    taxo_dms = taxo[0]
    taxonomy_name = config['obi']['taxoURI'].split("/")[-1]   # Robust in theory
    # Save command config in View comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[i_dms_name]
    input_view_name= [i_view_name]
    input_dms_name.append(ref_dms_name)
    input_view_name.append(ref_view_name)
    input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
    input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
    comments = View.print_config(config, "ecotag", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    if obi_ecotag(tobytes(i_dms_name), tobytes(i_view_name), \
                  tobytes(ref_dms_name), tobytes(ref_view_name), \
                  tobytes(taxo_dms_name), tobytes(taxonomy_name), \
                  tobytes(o_view_name), comments, 
                  config['ecotag']['threshold']) < 0:
        raise Exception("Error running ecotag")
    # If the input and output DMS are not the same, export result view to output DMS
    if i_dms != o_dms:
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
    # Save command config in DMS comments
    o_dms.record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_dms[final_o_view_name]), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary result view in the input DMS
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/export.cfiles
+++ b/python/obitools3/commands/export.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/export.pyx
+++ b/python/obitools3/commands/export.pyx
@ -1,69 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.dms import DMS
 from obitools3.dms.obiseq import Nuc_Seq
 from obitools3.apps.optiongroups import addMinimalInputOption, \
                                        addExportOutputOption
 import sys
 __title__="Export a view to a different file format"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addExportOutputOption(parser)
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi export : exports a view to a different file format")
    # Open the input
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not read input")
    iview = input[1]
    # Open the output
    output = open_uri(config['obi']['outputURI'],
                      input=False)
    if output is None:
        raise Exception("Could not open output URI")
    output_object = output[0]
    writer = output[1]
    # Check that the input view has the type NUC_SEQS if needed    # TODO discuss, maybe bool property
    if (output[2] == Nuc_Seq) and (iview.type != b"NUC_SEQS_VIEW") :  # Nuc_Seq_Stored? TODO
        raise Exception("Error: the view to export in fasta or fastq format is not a NUC_SEQS view")
    # Initialize the progress bar
    pb = ProgressBar(len(iview), config, seconde=5)
    i=0
    for seq in iview :
        pb(i)
        try:
            writer(seq)
        except StopIteration:
            break
        i+=1
    pb(i, force=True)
    print("", file=sys.stderr)
    # TODO save command in input dms?
    output_object.close()
    iview.close()
    input[0].close()
    logger("info", "Done.")
--- a/python/obitools3/commands/grep.cfiles
+++ b/python/obitools3/commands/grep.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/grep.pyx
+++ b/python/obitools3/commands/grep.pyx
@ -1,352 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View, Line_selection
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
 from obitools3.dms.view import RollbackException
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes, str2bytes
 from functools import reduce
 import time
 import re
 import sys
 __title__="Grep view lines that match the given predicates"
 # TODO should sequences that have a grepped attribute at None be grepped or not? (in obi1 they are but....)
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    addMinimalOutputOption(parser)
    group=parser.add_argument_group("obi grep specific options")
    group.add_argument("--predicate", "-p",
                       action="append", dest="grep:grep_predicates",
                       metavar="<PREDICATE>",
                       default=None,
                       type=str,
                       help="Python boolean expression to be evaluated in the "
                            "sequence/line context. The attribute name can be "
                            "used in the expression as a variable name. "
                            "An extra variable named 'sequence' or 'line' refers "
                            "to the sequence or line object itself. "
                            "Several -p options can be used on the same "
                            "command line.")
    group.add_argument("-S", "--sequence",
                       action="store", dest="grep:seq_pattern",
                       metavar="<REGULAR_PATTERN>",
                       type=str,
                       help="Regular expression pattern used to select "
                            "the sequence. The pattern is case insensitive.")
    group.add_argument("-D", "--definition",
                       action="store", dest="grep:def_pattern",
                       metavar="<REGULAR_PATTERN>",
                       type=str,
                       help="Regular expression pattern used to select "
                            "the definition of the sequence. The pattern is case insensitive.")
    group.add_argument("-I", "--identifier",
                       action="store", dest="grep:id_pattern",
                       metavar="<REGULAR_PATTERN>",
                       type=str,
                       help="Regular expression pattern used to select "
                            "the identifier of the sequence. The pattern is case insensitive.")
    group.add_argument("--id-list",
                       action="store", dest="grep:id_list",
                       metavar="<FILE_NAME>",
                       type=str,
                       help="File containing the identifiers of the sequences to select.")
    group.add_argument("-a", "--attribute",
                       action="append", dest="grep:attribute_patterns",
                       type=str,
                       default=[],
                       metavar="<ATTRIBUTE_NAME>:<REGULAR_PATTERN>",
                       help="Regular expression pattern matched against "
                            "the attributes of the sequence. "
                            "The pattern is case sensitive. "
                            "Several -a options can be used on the same "
                            "command line.")
    group.add_argument("-A", "--has-attribute",
                       action="append", dest="grep:attributes",
                       type=str,
                       default=[],
                       metavar="<ATTRIBUTE_NAME>",
                       help="Select records with the attribute <ATTRIBUTE_NAME> "
                            "defined (not set to NA value). "
                            "Several -A options can be used on the same "
                            "command line.")
    group.add_argument("-L", "--lmax",
                       action="store", dest="grep:lmax",
                       metavar="<MAX_LENGTH>",
                       type=int,
                       help="Keep sequences whose length is at most MAX_LENGTH.")
    group.add_argument("-l", "--lmin",
                       action="store", dest="grep:lmin",
                       metavar="<MIN_LENGTH>",
                       type=int,
                       help="Keep sequences whose length is at least MIN_LENGTH.")
    group.add_argument("-v", "--invert-selection",
                       action="store_true", dest="grep:invert_selection",
                       default=False,
                       help="Invert the selection.")
    group=parser.add_argument_group("Taxonomy filtering specific options")  #TODO put somewhere else? not in grep
    group.add_argument('--require-rank',
                       action="append", dest="grep:required_ranks",
                       metavar="<RANK_NAME>",
                       type=str,
                       default=[],
                       help="Select sequences with a taxid that is or has "
                            "a parent of rank <RANK_NAME>.")
    group.add_argument('-r', '--required',
                       action="append", dest="grep:required_taxids",
                       metavar="<TAXID>",
                       type=int,
                       default=[],
                       help="Select the sequences having the ancestor of taxid <TAXID>. "
                            "If several ancestors are specified (with \n'-r taxid1 -r taxid2'), "
                            "the sequences having at least one of them are selected.")
    # TODO useless option equivalent to -r -v?
    group.add_argument('-i','--ignore',
                     action="append", dest="grep:ignored_taxids",
                     metavar="<TAXID>",
                     type=int,
                     default=[],
                     help="Ignore the sequences having the ancestor of taxid <TAXID>. "
                          "If several ancestors are specified (with \n'-i taxid1 -i taxid2'), "
                          "the sequences having at least one of them are ignored.")
 def Filter_generator(options, tax_filter):
    #taxfilter = taxonomyFilterGenerator(options)
    # Initialize conditions
    predicates = None
    if "grep_predicates" in options:  # key matches dest="grep:grep_predicates"
        predicates = options["grep_predicates"]
    attributes = None
    if "attributes" in options:
        attributes = options["attributes"]
    lmax = None
    if "lmax" in options:
        lmax = options["lmax"]
    lmin = None
    if "lmin" in options:
        lmin = options["lmin"]
    invert_selection = options["invert_selection"]
    id_set = None
    if "id_list" in options:
        id_set = set(x.strip() for x in open(options["id_list"], 'rb'))  # bytes, to compare with line.id
    # Initialize the regular expression patterns
    seq_pattern = None
    if "seq_pattern" in options:
        seq_pattern = re.compile(tobytes(options["seq_pattern"]), re.I)
    id_pattern = None
    if "id_pattern" in options:
        id_pattern = re.compile(tobytes(options["id_pattern"]))
    def_pattern = None
    if "def_pattern" in options:
        def_pattern = re.compile(tobytes(options["def_pattern"]))
    attribute_patterns={}
    if "attribute_patterns" in options:
        for p in options["attribute_patterns"]:
            attribute, pattern = p.split(":", 1)
            attribute_patterns[tobytes(attribute)] = re.compile(tobytes(pattern))
    def filter(line, loc_env):
        cdef bint good = True
        if seq_pattern and hasattr(line, "seq"):
            good = <bint>(seq_pattern.search(line.seq))
        if good and id_pattern and hasattr(line, "id"):
            good = <bint>(id_pattern.search(line.id))
        if good and id_set is not None and hasattr(line, "id"):
            good = line.id in id_set
        if good and def_pattern and hasattr(line, "definition"):
            good = <bint>(def_pattern.search(line.definition))
        if good and attributes:  # TODO discuss that we test not None
            good = reduce(lambda bint x, bint y: x and y,
                           (line[attribute] is not None for attribute in attributes),
                           True)
        if good and attribute_patterns:
            good = (reduce(lambda bint x, bint y : x and y, 
                        (line[attribute] is not None for attribute in attribute_patterns),
                        True)
                    and
                    reduce(lambda bint x, bint y: x and y,
                        (<bint>(attribute_patterns[attribute].search(tobytes(str(line[attribute]))))
                        for attribute in attribute_patterns), 
                        True)
                   )
        if good and predicates:
            good = (reduce(lambda bint x, bint y: x and y,
                    (bool(eval(p, loc_env, line))
                    for p in predicates), True))
        if good and lmin:
            good = len(line) >= lmin
        if good and lmax:
            good = len(line) <= lmax
        if good:
            good = tax_filter(line)
        if invert_selection :
            good = not good
        return good
    return filter
 def Taxonomy_filter_generator(taxo, options):
    if taxo is not None:
        def tax_filter(seq):
            good = True
            if b'TAXID' in seq and seq[b'TAXID'] is not None:   # TODO use macro
                taxid = seq[b'TAXID']
                if "required_ranks" in options and options["required_ranks"]:
                    taxon_at_rank = reduce(lambda x,y: x and y,
                                           (taxo.get_taxon_at_rank(seq[b'TAXID'], rank) is not None
                                            for rank in options["required_ranks"]),
                                           True)
                    good = good and taxon_at_rank 
                if "required_taxids" in options and options["required_taxids"]:
                    good = good and reduce(lambda x,y: x or y,
                                           (taxo.is_ancestor(r, taxid) 
                                            for r in options["required_taxids"]),
                                           False)
                if "ignored_taxids" in options and options["ignored_taxids"]:
                    good = good and not reduce(lambda x,y: x or y,
                                               (taxo.is_ancestor(r,taxid) 
                                                for r in options["ignored_taxids"]),
                                               False)
            return good
    else:
        def tax_filter(seq):
            return True
    return tax_filter
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi grep")
    # Open the input
    input = open_uri(config["obi"]["inputURI"])
    if input is None:
        raise Exception("Could not read input view")
    i_dms = input[0]
    i_view = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output view")
    o_dms = output[0]
    o_view_name_final = output[1]
    o_view_name = o_view_name_final
    # If the input and output DMS are not the same, create output view in input DMS first, then export it
    # to output DMS, making sure the temporary view name is unique in the input DMS 
    if i_dms != o_dms:
        i=0
        while o_view_name in i_dms:
            o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
            i+=1        
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        taxo_uri = open_uri(config["obi"]["taxoURI"])
        if taxo_uri is None:
            raise Exception("Couldn't open taxonomy")
        taxo = taxo_uri[1]
    else :
        taxo = None
    # Initialize the progress bar
    pb = ProgressBar(len(i_view), config, seconde=5)
    # Apply filter
    tax_filter = Taxonomy_filter_generator(taxo, config["grep"])
    filter = Filter_generator(config["grep"], tax_filter)
    selection = Line_selection(i_view)
    for i in range(len(i_view)):
        pb(i)
        line = i_view[i]
        loc_env = {"sequence": line, "line": line, "taxonomy": taxo}
        good = filter(line, loc_env)
        if good :
            selection.append(i)
    pb(i, force=True)
    print("", file=sys.stderr)
    # Create output view with the line selection
    try:
        o_view = selection.materialize(o_view_name)
    except Exception as e:
        raise RollbackException("obi grep error, rolling back view: "+str(e), o_view)
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[input[0].name]
    input_view_name=[input[1].name]
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
        input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
    o_view.write_config(config, "grep", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    o_dms.record_command_line(command_line)
    # If input and output DMS are not the same, export the temporary view to the output DMS
    # and delete the temporary view in the input DMS
    if i_dms != o_dms:
        o_view.close()
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
        o_view = o_dms[o_view_name_final]
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_view), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary imported view used to create the final view
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
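The filter built by `Filter_generator` above chains independent tests and stops at the first failure. A minimal plain-Python sketch of the same idea follows. It is not the module's implementation: `make_filter`, the dict-shaped records, and the use of `len(record["seq"])` as the record length are assumptions made for illustration, and text strings stand in for the bytes values used in the Cython code.

```python
import re


def make_filter(options):
    """Build a keep/reject predicate from a grep-style options dict (hypothetical helper)."""
    id_pattern = re.compile(options["id_pattern"]) if options.get("id_pattern") else None

    attribute_patterns = {}
    for spec in options.get("attribute_patterns", []):
        # Split on the first ':' only, so the pattern itself may contain colons.
        name, pattern = spec.split(":", 1)
        attribute_patterns[name] = re.compile(pattern)

    lmin = options.get("lmin")
    lmax = options.get("lmax")
    invert = options.get("invert_selection", False)

    def keep(record):
        good = True
        if id_pattern is not None:
            good = bool(id_pattern.search(record["id"]))
        if good and attribute_patterns:
            # Every named attribute must be present and match its pattern.
            good = all(
                record.get(name) is not None and pat.search(str(record[name]))
                for name, pat in attribute_patterns.items()
            )
        if good and lmin is not None:
            good = len(record["seq"]) >= lmin
        if good and lmax is not None:
            good = len(record["seq"]) <= lmax
        # Inversion is applied last, to the combined result.
        return (not good) if invert else good

    return keep
```

Using `all(...)` over a generator short-circuits on the first failing attribute, which the `reduce(lambda x, y: x and y, ...)` form does not.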
--- a/python/obitools3/commands/head.cfiles
+++ b/python/obitools3/commands/head.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/head.pyx
+++ b/python/obitools3/commands/head.pyx
@ -1,106 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View, Line_selection
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.dms.view import RollbackException
 from obitools3.apps.config import logger
 from obitools3.utils cimport str2bytes
 import time
 import sys
 __title__="Keep the N first lines of a view."
 def addOptions(parser):
    addMinimalInputOption(parser)
    addMinimalOutputOption(parser)
    group=parser.add_argument_group('obi head specific options')
    group.add_argument('-n', '--sequence-count',
                       action="store", dest="head:count",
                       metavar='<N>',
                       default=10,
                       type=int,
                       help="Number of first records to keep.")
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi head")
    # Open the input
    input = open_uri(config["obi"]["inputURI"])
    if input is None:
        raise Exception("Could not read input view")
    i_dms = input[0]
    i_view = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output view")
    o_dms = output[0]
    o_view_name_final = output[1]
    o_view_name = o_view_name_final
    # If the input and output DMS are not the same, create output view in input DMS first, then export it
    # to output DMS, making sure the temporary view name is unique in the input DMS 
    if i_dms != o_dms:
        i=0
        while o_view_name in i_dms:
            o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
            i+=1        
    n = min(config['head']['count'], len(i_view))
    # Initialize the progress bar
    pb = ProgressBar(n, config, seconde=5)
    selection = Line_selection(i_view)
    for i in range(n):
        pb(i)
        selection.append(i)
    pb(i, force=True)
    print("", file=sys.stderr)
    # Create output view with the line selection
    try:
        o_view = selection.materialize(o_view_name)
    except Exception as e:
        raise RollbackException("obi head error, rolling back view: "+str(e), o_view)
    # Save command config in DMS comments
    command_line = " ".join(sys.argv[1:])
    o_view.write_config(config, "head", command_line, input_dms_name=[i_dms.name], input_view_name=[i_view.name])
    o_dms.record_command_line(command_line)
    # If input and output DMS are not the same, export the temporary view to the output DMS
    # and delete the temporary view in the input DMS
    if i_dms != o_dms:
        o_view.close()
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
        o_view = o_dms[o_view_name_final]
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(view), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary imported view used to create the final view
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
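Both `run` functions above pick a temporary view name that does not collide with anything already in the input DMS, by appending `_0`, `_1`, ... to the requested name. The loop can be sketched in isolation as follows; `unique_view_name` and the plain set standing in for the DMS are hypothetical, introduced only for this example.

```python
def unique_view_name(final_name, existing_names):
    """Return final_name if free, else final_name + b"_<i>" for the first free i."""
    name = final_name
    i = 0
    while name in existing_names:
        name = final_name + b"_" + str(i).encode("ascii")
        i += 1
    return name
```

For instance, `unique_view_name(b"out", {b"out", b"out_0"})` yields `b"out_1"`, while a name that is not taken is returned unchanged.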
--- a/python/obitools3/commands/history.cfiles
+++ b/python/obitools3/commands/history.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/history.pyx
+++ b/python/obitools3/commands/history.pyx
@ -1,57 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.optiongroups import addMinimalInputOption
 from obitools3.uri.decode import open_uri
 from obitools3.dms import DMS
 from obitools3.dms.view import View
 from obitools3.utils cimport bytes2str
 __title__="Command line histories and view history graphs"
 def addOptions(parser):
    addMinimalInputOption(parser)
    group=parser.add_argument_group('obi history specific options')
    group.add_argument('--bash', '-b',
                     action="store_const", dest="history:format",
                     default="bash",
                     const="bash",
                     help="Print history in bash format")
    group.add_argument('--dot', '-d',
                     action="store_const", dest="history:format",
                     default="bash",
                     const="dot",
                     help="Print history in DOT format (default: bash format)")
    group.add_argument('--ascii', '-a',
                     action="store_const", dest="history:format",
                     default="bash",
                     const="ascii",
                     help="Print history in ASCII format (only for views; default: bash format)")
 def run(config):
    cdef object entries
    DMS.obi_atexit()
    input = open_uri(config['obi']['inputURI'])
    entries = input[1]
    if config['history']['format'] == "bash" :
        print(bytes2str(entries.bash_history))
    elif config['history']['format'] == "dot" :
        print(bytes2str(entries.dot_history_graph))
    elif config['history']['format'] == "ascii" :
        if isinstance(entries, View):
            print(bytes2str(entries.ascii_history_graph))
        else:
            raise Exception("ASCII history only available for views")
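The three format flags of `obi history` all write to one destination through `action="store_const"`, so the last flag given wins and the shared default applies when none is given. A minimal standard-library sketch of that pattern follows; the program name and the plain `format` destination are assumptions (the original uses `history:format`, which the obitools3 configuration layer interprets).

```python
import argparse


def build_parser():
    """Parser with mutually overriding format flags sharing one destination."""
    parser = argparse.ArgumentParser(prog="history-sketch")
    group = parser.add_argument_group("format options")
    for long_flag, short_flag, fmt in (("--bash", "-b", "bash"),
                                       ("--dot", "-d", "dot"),
                                       ("--ascii", "-a", "ascii")):
        group.add_argument(long_flag, short_flag,
                           action="store_const", dest="format",
                           const=fmt, default="bash",
                           help="Print history in %s format" % fmt)
    return parser
```

If the flags should be rejected when combined rather than silently overriding one another, `parser.add_mutually_exclusive_group()` is the standard alternative.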
--- a/python/obitools3/commands/import.cfiles
+++ b/python/obitools3/commands/import.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/import.pyx
+++ b/python/obitools3/commands/import.pyx
@ -1,301 +0,0 @@
 #cython: language_level=3
 import sys
 import os
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms.view.view cimport View
 from obitools3.dms.view import RollbackException
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 from obitools3.dms.column.column cimport Column
 from obitools3.dms.obiseq cimport Nuc_Seq
 from obitools3.dms import DMS
 from obitools3.dms.taxo.taxo cimport Taxonomy
 from obitools3.utils cimport tobytes, \
                             get_obitype, \
                             update_obitype
 from obitools3.dms.capi.obiview cimport VIEW_TYPE_NUC_SEQS, \
                                        NUC_SEQUENCE_COLUMN, \
                                        ID_COLUMN, \
                                        DEFINITION_COLUMN, \
                                        QUALITY_COLUMN, \
                                        COUNT_COLUMN, \
                                        TAXID_COLUMN
 from obitools3.dms.capi.obitypes cimport obitype_t, \
                                         OBI_VOID, \
                                         OBI_QUAL
 from obitools3.dms.capi.obierrno cimport obi_errno
 from obitools3.apps.optiongroups import addImportInputOption, \
                                        addTabularInputOption, \
                                        addTaxdumpInputOption, \
                                        addMinimalOutputOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 __title__="Imports sequences from different formats into a DMS"
 default_config = {   'destview'     : None,
                     'skip'         : 0,
                     'only'         : None,
                     'skiperror'    : False,
                     'seqinformat'  : None,
                     'moltype'      : 'nuc',
                     'source'     : None
                 }
 def addOptions(parser):
    addImportInputOption(parser)
    addTabularInputOption(parser)
    addTaxdumpInputOption(parser)
    addMinimalOutputOption(parser)
 def run(config):
    cdef   tuple       input
    cdef   tuple       output 
    cdef   int         i
    cdef   type        value_type
    cdef   obitype_t   value_obitype
    cdef   obitype_t   old_type
    cdef   obitype_t   new_type
    cdef   bint        get_quality
    cdef   bint        NUC_SEQS_view
    cdef   int         nb_elts
    cdef   object      d
    cdef   View        view
    cdef   object      entries
    cdef   object      entry
    cdef   Column      id_col
    cdef   Column      def_col
    cdef   Column      seq_col
    cdef   Column      qual_col
    cdef   Column      old_column
    cdef   bint        rewrite
    cdef   dict        dcols
    cdef   int         skipping
    cdef   bytes       tag
    cdef   object      value
    cdef   list        elt_names
    cdef   int         old_nb_elements_per_line
    cdef   int         new_nb_elements_per_line
    cdef   list        old_elements_names
    cdef   list        new_elements_names
    cdef   ProgressBar pb
    global             obi_errno
    DMS.obi_atexit()
    logger("info", "obi import: imports an object (file(s), obiview, taxonomy...) into a DMS")
    entry_count = -1
    if not config['obi']['taxdump']:
        input = open_uri(config['obi']['inputURI'])
        if input is None:  # TODO check for bytes instead now?
            raise Exception("Could not open input URI")
        entry_count = input[4]
        logger("info", "Importing %d entries", entry_count)
        # TODO a bit dirty?
        if input[2]==Nuc_Seq:
            v = View_NUC_SEQS
        else:
            v = View 
    else:
        v = None
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      newviewtype=v)
    if output is None:
        raise Exception("Could not create output view")
    # Read taxdump
    if config['obi']['taxdump']:  # The input is a taxdump to import in a DMS
        taxo = Taxonomy.open_taxdump(output[0], config['obi']['inputURI'])
        taxo.write(output[1])
        taxo.close()
        output[0].record_command_line(" ".join(sys.argv[1:]))
        output[0].close()
        return
    if entry_count >= 0:
        pb = ProgressBar(entry_count, config, seconde=5)
    else:
        pb = None
    entries = input[1]
    NUC_SEQS_view = False
    if isinstance(output[1], View) :
        view = output[1]
        if output[2] == View_NUC_SEQS :
            NUC_SEQS_view = True
    else: 
        raise NotImplementedError()
    # Save basic columns in variables for optimization
    if NUC_SEQS_view :
        id_col = view[ID_COLUMN]
        def_col = view[DEFINITION_COLUMN]
        seq_col = view[NUC_SEQUENCE_COLUMN]
    dcols = {}
    i = 0
    for entry in entries :
        if entry is None:  # error or exception handled at lower level, not raised because Python generators can't resume after any exception is raised
            if config['obi']['skiperror']:
                i-=1
                continue
            else:
                raise RollbackException("obi import error, rollbacking view", view)
        if pb is not None:
            pb(i)
        if NUC_SEQS_view: 
            id_col[i] = entry.id
            def_col[i] = entry.definition
            seq_col[i] = entry.seq
            # Check if there is a sequencing quality associated by checking the first entry    # TODO haven't found a more robust solution yet
            if i == 0:
                get_quality = QUALITY_COLUMN in entry
                if get_quality:
                    Column.new_column(view, QUALITY_COLUMN, OBI_QUAL)
                    qual_col = view[QUALITY_COLUMN]
            if get_quality:
                qual_col[i] = entry.quality
        for tag in entry :
            if tag != ID_COLUMN and tag != DEFINITION_COLUMN and tag != NUC_SEQUENCE_COLUMN and tag != QUALITY_COLUMN :  # TODO dirty 
                value = entry[tag]
                if tag == b"taxid":
                    tag = TAXID_COLUMN
                if tag == b"count":
                    tag = COUNT_COLUMN
                if tag not in dcols :
                    value_type = type(value)
                    nb_elts = 1
                    value_obitype = OBI_VOID
                    if value_type == dict or value_type == list :
                        nb_elts = len(value)
                        elt_names = list(value)
                    else :
                        nb_elts = 1
                        elt_names = None
                    value_obitype = get_obitype(value)
                    if value_obitype != OBI_VOID :
                        dcols[tag] = (Column.new_column(view, tag, value_obitype, nb_elements_per_line=nb_elts, elements_names=elt_names), value_obitype)
                        # Fill value
                        dcols[tag][0][i] = value
                    # TODO else log error?
                else :
                    rewrite = False
                    # Check type compatibility
                    old_type = dcols[tag][1]
                    new_type = OBI_VOID
                    new_type = update_obitype(old_type, value)
                    if old_type != new_type :
                        rewrite = True
                    try:
                        # Fill value
                        dcols[tag][0][i] = value
                    except IndexError :
                        value_type = type(value)
                        old_column = dcols[tag][0]
                        old_nb_elements_per_line = old_column.nb_elements_per_line
                        new_nb_elements_per_line = 0
                        old_elements_names = old_column.elements_names
                        new_elements_names = None
                        #####################################################################
                        # Check the length and keys of column lines if needed
                        if value_type == dict :    # Check dictionary keys
                            for k in value :
                                if k not in old_elements_names :
                                    new_elements_names = list(set(old_elements_names+[tobytes(k) for k in value]))
                                    rewrite = True
                                    break
                        elif value_type == list or value_type == tuple :  # Check vector length
                            if old_nb_elements_per_line < len(value) :
                                new_nb_elements_per_line = len(value)
                                rewrite = True
                        #####################################################################
                        if rewrite :
                            if new_nb_elements_per_line == 0 and new_elements_names is not None :
                                new_nb_elements_per_line = len(new_elements_names)
                            # Reset obierrno 
                            obi_errno = 0
                            dcols[tag] = (view.rewrite_column_with_diff_attributes(old_column.name, 
                                                                                   new_data_type=new_type, 
                                                                                   new_nb_elements_per_line=new_nb_elements_per_line,
                                                                                   new_elements_names=new_elements_names,
                                                                                   rewrite_last_line=False), 
                                          new_type)
                            # Update the dictionary:
                            for t in dcols :
                                dcols[t] = (view[t], dcols[t][1])
                            # Fill value
                            dcols[tag][0][i] = value
        i+=1
    if pb is not None:
        pb(i, force=True)
        print("", file=sys.stderr)
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    view.write_config(config, "import", command_line, input_str=[os.path.abspath(config['obi']['inputURI'])])
    output[0].record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(view), file=sys.stderr)
    try:
        input[0].close()
    except AttributeError:
        pass
    try:
        output[0].close()
    except AttributeError:
        pass
    logger("info", "Done.")
--- a/python/obitools3/commands/less.cfiles
+++ b/python/obitools3/commands/less.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/less.pyx
+++ b/python/obitools3/commands/less.pyx
@ -1,44 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.optiongroups import addMinimalInputOption
 from obitools3.uri.decode import open_uri
 from obitools3.dms import DMS
 __title__="Less equivalent"
 def addOptions(parser):
    addMinimalInputOption(parser)
    group=parser.add_argument_group('obi less specific options')
    group.add_argument('--print', '-n',
                     action="store", dest="less:print",
                     metavar='<N>',
                     default=10,
                     type=int,
                     help="Print N entries (default: 10)")
 def run(config):
    cdef object entries
    cdef int    n
    DMS.obi_atexit()
    input = open_uri(config['obi']['inputURI'])
    entries = input[1]
    if config['less']['print'] > len(entries) :
        n = len(entries)
    else :
        n = config['less']['print']
    # Print
    for i in range(n) :
        print(repr(entries[i]))
--- a/python/obitools3/commands/ls.cfiles
+++ b/python/obitools3/commands/ls.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/ls.pyx
+++ b/python/obitools3/commands/ls.pyx
@ -1,28 +0,0 @@
 #cython: language_level=3
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.dms import DMS
 from obitools3.apps.optiongroups import addMinimalInputOption
 __title__="Print a preview of a DMS, view, column..."
 def addOptions(parser):    
    addMinimalInputOption(parser)
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi ls")
    # Open the input
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not read input")
    print(repr(input[1]))
--- a/python/obitools3/commands/ngsfilter.cfiles
+++ b/python/obitools3/commands/ngsfilter.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/ngsfilter.pyx
+++ b/python/obitools3/commands/ngsfilter.pyx
@ -1,604 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view import RollbackException
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 from obitools3.dms.column.column cimport Column, Column_line
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.libalign._freeendgapfm import FreeEndGapFullMatch
 from obitools3.libalign.apat_pattern import Primer_search
 from obitools3.dms.obiseq cimport Nuc_Seq
 from obitools3.dms.capi.obitypes cimport OBI_SEQ, OBI_QUAL
 from obitools3.dms.capi.apat cimport MAX_PATTERN
 from obitools3.utils cimport tobytes
 from libc.stdint cimport INT32_MAX
 from functools import reduce
 import math
 import sys
 REVERSE_SEQ_COLUMN_NAME = b"REVERSE_SEQUENCE"      # used by alignpairedend tool
 REVERSE_QUALITY_COLUMN_NAME = b"REVERSE_QUALITY"   # used by alignpairedend tool
 __title__="Assigns sequence records to the corresponding experiment/sample based on DNA tags and primers"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi ngsfilter specific options')
    group.add_argument('-t','--info-view',
                     action="store", dest="ngsfilter:info_view",
                     metavar="<URI>",
                     type=str,
                     default=None,
                     help="URI to the view containing the samples definition (with tags, primers, sample names,...)")
    group.add_argument('-R', '--reverse-reads',
                     action="store", dest="ngsfilter:reverse",
                     metavar="<URI>",
                     default=None,
                     type=str,
                     help="URI to the reverse reads if the paired-end reads haven't been aligned yet")
    group.add_argument('-u','--unidentified',
                     action="store", dest="ngsfilter:unidentified",
                     metavar="<URI>",
                     type=str,
                     default=None,
                     help="URI to the view used to store the sequences unassigned to any sample")
    group.add_argument('-e','--error',
                     action="store", dest="ngsfilter:error",
                     metavar="###",
                     type=int,
                     default=2,
                     help="Number of errors allowed for matching primers [default = 2]")
 class Primer:
    collection={}
    def __init__(self, sequence, taglength, forward=True, max_errors=2, verbose=False, primer_pair_idx=0, primer_idx=0):
        '''
        @param sequence: the primer sequence
        @type sequence:
        @param forward: True for a forward (direct) primer, False for a reverse primer
        @type forward: bool
        '''
        assert sequence not in Primer.collection        \
            or Primer.collection[sequence]==taglength,  \
            "Primer %s must always be used with tags of the same length" % sequence
        Primer.collection[sequence]=taglength
        self.primer_pair_idx = primer_pair_idx
        self.primer_idx = primer_idx
        self.is_revcomp = False
        self.revcomp = None
        self.raw=sequence
        self.sequence = Nuc_Seq(b"primer", sequence)        
        self.lseq = len(self.sequence)
        self.max_errors=max_errors
        self.taglength=taglength
        self.forward = forward
        self.verbose=verbose
    def reverse_complement(self):
        p = Primer(self.raw,
                   self.taglength,
                   not self.forward,
                   verbose=self.verbose,
                   max_errors=self.max_errors,
                   primer_pair_idx=self.primer_pair_idx,
                   primer_idx=self.primer_idx)
        p.sequence=p.sequence.reverse_complement
        p.is_revcomp = True
        p.revcomp = None
        return p
    def __hash__(self):
        return hash(str(self.raw))
    def __eq__(self,primer):
        return self.raw==primer.raw 
    def __call__(self, sequence, same_sequence=False, pattern=0, begin=0):
        if len(sequence) <= self.lseq:
            return None
        ali = self.aligner.search_one_primer(sequence.seq, 
                                             self.primer_pair_idx, 
                                             self.primer_idx, 
                                             reverse_comp=self.is_revcomp, 
                                             same_sequence=same_sequence,
                                             pattern_ref=pattern,
                                             begin=begin)
        if ali is None:  # no match
            return None 
        errors, start = ali.first_encountered()
        if errors <= self.max_errors:
            end = start + self.lseq
            if self.taglength is not None:
                if self.sequence.is_revcomp:
                    if (len(sequence)-end) >= self.taglength:
                        tag_start = len(sequence) - end - self.taglength
                        tag = sequence.reverse_complement[tag_start:tag_start+self.taglength].seq
                    else:
                        tag=None
                else:
                    if start >= self.taglength:
                        tag = tobytes((sequence[start - self.taglength:start].seq).lower())  # turn back to lowercase because apat turned to uppercase
                    else:
                        tag=None
            else:
                tag=None
            return errors,start,end,tag
        return None 
    def __str__(self):
        return "%s: %s" % ({True:'D',False:'R'}[self.forward],self.raw)
    __repr__=__str__
 cdef read_info_view(info_view, max_errors=2, verbose=False, not_aligned=False):
    infos = {}
    primer_list = []
    i=0
    for p in info_view:
        forward=Primer(p[b'forward_primer'],
                       len(p[b'forward_tag']) if p[b'forward_tag']!=b'-' else None,
                       True,
                       max_errors=max_errors,
                       verbose=verbose,
                       primer_pair_idx=i,
                       primer_idx=0)
        fp = infos.get(forward,{})
        infos[forward]=fp
        reverse=Primer(p[b'reverse_primer'],
                       len(p[b'reverse_tag']) if p[b'reverse_tag']!=b'-' else None,
                       False,
                       max_errors=max_errors,
                       verbose=verbose,
                       primer_pair_idx=i,
                       primer_idx=1)
        primer_list.append((p[b'forward_primer'], p[b'reverse_primer']))
        rp = infos.get(reverse,{})
        infos[reverse]=rp
        if not_aligned:
            cf=forward
            cr=reverse
            cf.revcomp = forward.reverse_complement()
            cr.revcomp = reverse.reverse_complement()
            dpp=fp.get(cr,{})
            fp[cr]=dpp
            rpp=rp.get(cf,{})
            rp[cf]=rpp
        else:
            cf=forward.reverse_complement()
            cr=reverse.reverse_complement()
            dpp=fp.get(cr,{})
            fp[cr]=dpp
            rpp=rp.get(cf,{})
            rp[cf]=rpp
        tags = (p[b'forward_tag'] if p[b'forward_tag']!=b'-' else None,
                p[b'reverse_tag'] if p[b'reverse_tag']!=b'-' else None)
        assert tags not in dpp, \
               "Tag pair %s is already used with primer pairs: (%s,%s)" % (str(tags),forward,reverse)
        # Save additional data
        special_keys = [b'forward_primer', b'reverse_primer', b'forward_tag', b'reverse_tag']
        data={}
        for key in p:
            if key not in special_keys:
                data[key] = p[key]
        dpp[tags] = data
        rpp[tags] = data
        i+=1
    return infos, primer_list
 cdef tuple annotate(sequences, infos, verbose=False):
    def sortMatch(match):
        if match[1] is None:
            return INT32_MAX
        else:
            return match[1][1]
    def sortReverseMatch(match):
        if match[1] is None:
            return -1
        else:
            return match[1][1]
    not_aligned = len(sequences) > 1
    sequenceF = sequences[0]
    sequenceR = None
    if not not_aligned: 
        final_sequence = sequenceF
    else:
        final_sequence = sequenceF.clone()   # TODO maybe not cloning and then deleting quality tags is more efficient
    if not_aligned:
        sequenceR = sequences[1]
        final_sequence[REVERSE_SEQ_COLUMN_NAME] = sequenceR.seq             # used by alignpairedend tool
        final_sequence[REVERSE_QUALITY_COLUMN_NAME] = sequenceR.quality     # used by alignpairedend tool
    for seq in sequences:
        if hasattr(seq, "quality_array"): 
            q = -reduce(lambda x,y:x+y,(math.log10(z) for z in seq.quality_array),0)/len(seq.quality_array)*10
            seq[b'avg_quality']=q
            q = -reduce(lambda x,y:x+y,(math.log10(z) for z in seq.quality_array[0:10]),0)
            seq[b'head_quality']=q
            if len(seq.quality_array[10:-10]) :
                q = -reduce(lambda x,y:x+y,(math.log10(z) for z in seq.quality_array[10:-10]),0)/len(seq.quality_array[10:-10])*10
                seq[b'mid_quality']=q
            q = -reduce(lambda x,y:x+y,(math.log10(z) for z in seq.quality_array[-10:]),0)
            seq[b'tail_quality']=q
    # Try direct matching:
    directmatch = []
    first_matched_seq = None
    second_matched_seq = None
    for seq in sequences:
        new_seq = True
        pattern = 0
        for p in infos:
            if pattern == MAX_PATTERN:
                new_seq = True
                pattern = 0
            directmatch.append((p, p(seq, same_sequence=not new_seq, pattern=pattern), seq))
            new_seq = False
            pattern+=1
    # Choose the match closest to the start of (one of the) sequence(s)
    directmatch = sorted(directmatch, key=sortMatch)
    all_direct_matches = directmatch
    directmatch = directmatch[0] if directmatch[0][1] is not None else None
    if directmatch is None:
        final_sequence[b'error']=b'No primer match'
        return False, final_sequence
    first_matched_seq = directmatch[2]
    if id(first_matched_seq) == id(sequenceF) and not_aligned:
        second_matched_seq = sequenceR
    else:
        second_matched_seq = sequenceF
    match = first_matched_seq[directmatch[1][1]:directmatch[1][2]]
    if not not_aligned:
        final_sequence[b'seq_length_ori']=len(final_sequence)
    if not not_aligned or id(first_matched_seq) == id(sequenceF):
        final_sequence = final_sequence[directmatch[1][2]:]
    else:
        cut_seq = sequenceR[directmatch[1][2]:]
        final_sequence[REVERSE_SEQ_COLUMN_NAME] = cut_seq.seq           # used by alignpairedend tool
        final_sequence[REVERSE_QUALITY_COLUMN_NAME] = cut_seq.quality   # used by alignpairedend tool
    if directmatch[0].forward:
        final_sequence[b'direction']=b'forward'
        final_sequence[b'forward_errors']=directmatch[1][0]
        final_sequence[b'forward_primer']=directmatch[0].raw
        final_sequence[b'forward_match']=match.seq
    else:
        final_sequence[b'direction']=b'reverse'
        final_sequence[b'reverse_errors']=directmatch[1][0]
        final_sequence[b'reverse_primer']=directmatch[0].raw
        final_sequence[b'reverse_match']=match.seq
    # Keep only paired reverse primer
    infos = infos[directmatch[0]]    
    # If not aligned, look for other match in already computed match (choose the one that makes the biggest amplicon)
    if not_aligned:
        i=1
        while i < len(all_direct_matches) and all_direct_matches[i][1] is None and all_direct_matches[i][0].forward:  # bounds check first to avoid an IndexError
            i+=1
        if i < len(all_direct_matches):
            reversematch = all_direct_matches[i]
        else:
            reversematch = None
    # Look for the other primer in the other direction on the sequence, or,
    # if the sequences are not already aligned and the reverse primer was not found in the most likely sequence (the one without the forward primer), try matching on the same sequence as the first match (primer in the other direction)
    if not not_aligned or reversematch is None or reversematch[1] is None:
        if not not_aligned:
            sequence_to_match = second_matched_seq
        else:
            sequence_to_match = first_matched_seq
        reversematch = []
        # Compute begin
        begin=directmatch[1][2]+1  # end of match + 1 on the same sequence
        # Try reverse matching on the other sequence:
        new_seq = True
        pattern = 0
        for p in infos:
            if pattern == MAX_PATTERN:
                new_seq = True
                pattern = 0
            if not_aligned:
                primer=p.revcomp
            else:
                primer=p
            reversematch.append((primer, primer(sequence_to_match, same_sequence=not new_seq, pattern=pattern, begin=begin)))
            new_seq = False
            pattern+=1
        # Choose the match closest to the end of the sequence
        reversematch = sorted(reversematch, key=sortReverseMatch, reverse=True)
        all_reverse_matches = reversematch
        reversematch = reversematch[0] if reversematch[0][1] is not None else None
    if reversematch is None and None not in infos:
        if directmatch[0].forward:
            message = b'No reverse primer match'
        else:
            message = b'No direct primer match'
        final_sequence[b'error']=message
        return False, final_sequence
    if reversematch is None:
        final_sequence[b'status']=b'partial'
        if directmatch[0].forward:
            tags=(directmatch[1][3],None)
        else:
            tags=(None,directmatch[1][3])
        samples = infos[None]
    else:
        final_sequence[b'status']=b'full'
        match = second_matched_seq[reversematch[1][1]:reversematch[1][2]]
        match = match.reverse_complement
        if not not_aligned or id(second_matched_seq) == id(sequenceF):
            final_sequence = final_sequence[0:reversematch[1][1]]
        else:
            cut_seq = sequenceR[reversematch[1][2]:]
            final_sequence[REVERSE_SEQ_COLUMN_NAME] = cut_seq.seq           # used by alignpairedend tool
            final_sequence[REVERSE_QUALITY_COLUMN_NAME] = cut_seq.quality   # used by alignpairedend tool
        if directmatch[0].forward:
            tags=(directmatch[1][3], reversematch[1][3])
            final_sequence[b'reverse_errors'] = reversematch[1][0]
            final_sequence[b'reverse_primer'] = reversematch[0].raw
            final_sequence[b'reverse_match'] = match.seq
        else:
            tags=(reversematch[1][3], directmatch[1][3])
            final_sequence[b'forward_errors'] = reversematch[1][0]
            final_sequence[b'forward_primer'] = reversematch[0].raw
            final_sequence[b'forward_match'] = match.seq
        if tags[0] is not None:
            final_sequence[b'forward_tag'] = tags[0]
        if tags[1] is not None:
            final_sequence[b'reverse_tag'] = tags[1]
        samples = infos[reversematch[0]]
    if not directmatch[0].forward and not not_aligned:   # don't reverse complement if not_aligned
        final_sequence = final_sequence.reverse_complement
    sample=None
    if tags[0] is not None:                                    # Direct  tag known
        if tags[1] is not None:                                # Reverse tag known
            sample = samples.get(tags, None)             
        else:                                                   # Only direct tag known
            s=[samples[x] for x in samples if x[0]==tags[0]]
            if len(s)==1:
                sample=s[0]
            elif len(s)>1:
                final_sequence[b'error']=b'multiple samples match tags'
                return False, final_sequence
            else:
                sample=None
    else: 
        if tags[1] is not None:                                 # Only reverse tag known
            s=[samples[x] for x in samples if x[1]==tags[1]]
            if len(s)==1:
                sample=s[0]
            elif len(s)>1:
                final_sequence[b'error']=b'multiple samples match tags'
                return False, final_sequence
            else:
                sample=None
    if sample is None:
        final_sequence[b'error']=b"Cannot assign sequence to a sample"
        return False, final_sequence
    final_sequence.update(sample)
    if not not_aligned:
        final_sequence[b'seq_length']=len(final_sequence)
    return True, final_sequence
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi ngsfilter")
    assert config['ngsfilter']['info_view'] is not None, "Option -t must be specified"
    # Open the input
    forward = None
    reverse = None
    input = None
    not_aligned = False
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not open input reads")
    if input[2] != View_NUC_SEQS:
        raise NotImplementedError('obi ngsfilter only works on NUC_SEQS views')
    if "reverse" in config["ngsfilter"]:
        forward = input[1]        
        rinput = open_uri(config["ngsfilter"]["reverse"])
        if rinput is None:
            raise Exception("Could not open reverse reads")
        if rinput[2] != View_NUC_SEQS:
            raise NotImplementedError('obi ngsfilter only works on NUC_SEQS views')
        reverse = rinput[1]
        if len(forward) != len(reverse):
            raise Exception("Error: the numbers of forward and reverse reads differ")
        entries = [forward, reverse]
        not_aligned = True
        input_dms_name = [forward.dms.name, reverse.dms.name]
        input_view_name = [forward.name, reverse.name]
    else:
        entries = input[1]
        input_dms_name = [entries.dms.name]
        input_view_name = [entries.name]
    if not_aligned:
        entries_len = len(forward)
    else:
        entries_len = len(entries)
    # Open the output
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      newviewtype=View_NUC_SEQS)
    if output is None:
        raise Exception("Could not create output view")
    o_view = output[1]
    # Open the view containing the information about the tags and the primers
    info_input = open_uri(config['ngsfilter']['info_view'])
    if info_input is None:
        raise Exception("Could not read the view containing the information about the tags and the primers")
    info_view = info_input[1]
    input_dms_name.append(info_input[0].name)   
    input_view_name.append(info_input[1].name)
    # Open the unidentified view
    if 'unidentified' in config['ngsfilter'] and config['ngsfilter']['unidentified'] is not None:  # TODO keyError if undefined problem
        unidentified_input = open_uri(config['ngsfilter']['unidentified'],
                                      input=False,
                                      newviewtype=View_NUC_SEQS)
        if unidentified_input is None:
            raise Exception("Could not open the view containing the unidentified reads")
        unidentified = unidentified_input[1]
    else:
        unidentified = None
    # Initialize the progress bar
    pb = ProgressBar(entries_len, config, seconde=5)
    # Check and store primers and tags
    infos, primer_list = read_info_view(info_view, max_errors=config['ngsfilter']['error'], verbose=False, not_aligned=not_aligned)   # TODO obi verbose option
    aligner = Primer_search(primer_list, config['ngsfilter']['error'])
    for p in infos:
        p.aligner = aligner
        for paired_p in infos[p]:
            paired_p.aligner = aligner
            if paired_p.revcomp is not None:
                paired_p.revcomp.aligner = aligner
    if not_aligned:   # create columns used by alignpairedend tool
        Column.new_column(o_view, REVERSE_SEQ_COLUMN_NAME, OBI_SEQ)
        Column.new_column(o_view, REVERSE_QUALITY_COLUMN_NAME, OBI_QUAL, associated_column_name=REVERSE_SEQ_COLUMN_NAME, associated_column_version=o_view[REVERSE_SEQ_COLUMN_NAME].version)
        if unidentified is not None:   # the unidentified view is optional
            Column.new_column(unidentified, REVERSE_SEQ_COLUMN_NAME, OBI_SEQ)
            Column.new_column(unidentified, REVERSE_QUALITY_COLUMN_NAME, OBI_QUAL, associated_column_name=REVERSE_SEQ_COLUMN_NAME, associated_column_version=unidentified[REVERSE_SEQ_COLUMN_NAME].version)
    g = 0
    u = 0
    try:
        for i in range(entries_len):
            pb(i)
            if not_aligned:
                modseq = [Nuc_Seq.new_from_stored(forward[i]), Nuc_Seq.new_from_stored(reverse[i])]
            else:
                modseq = [Nuc_Seq.new_from_stored(entries[i])]
            good, oseq = annotate(modseq, infos)
            if good:
                o_view[g].set(oseq.id, oseq.seq, definition=oseq.definition, quality=oseq.quality, tags=oseq)
                g+=1
            elif unidentified is not None:
                unidentified[u].set(oseq.id, oseq.seq, definition=oseq.definition, quality=oseq.quality, tags=oseq)
                u+=1
    except Exception as e:
        raise RollbackException("obi ngsfilter error, rolling back views: "+str(e), o_view, unidentified)
    pb(i, force=True)
    print("", file=sys.stderr)
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    o_view.write_config(config, "ngsfilter", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    if unidentified is not None:
        unidentified.write_config(config, "ngsfilter", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
        # Add comment about unidentified seqs
        unidentified.comments["info"] = "View containing sequences categorized as unidentified by the ngsfilter command"
    output[0].record_command_line(command_line)
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_view), file=sys.stderr)
    input[0].close()
    output[0].close()
    info_input[0].close()
    if unidentified is not None:
        unidentified_input[0].close()
    aligner.free()
    logger("info", "Done.")
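
The tag-to-sample assignment at the end of the annotation step above can be sketched in plain Python. This is a minimal illustration, not the obitools3 implementation: it assumes `samples` is an ordinary dict keyed by `(forward_tag, reverse_tag)` tuples, and the helper name `assign_sample` and the sample records are hypothetical.

```python
def assign_sample(samples, tags):
    """Return (sample, error) for a (forward_tag, reverse_tag) pair.

    `samples` maps (forward_tag, reverse_tag) tuples to sample records.
    Either element of `tags` may be None when only one primer matched.
    """
    forward_tag, reverse_tag = tags
    if forward_tag is not None and reverse_tag is not None:
        # Both tags known: exact lookup, None if the pair is not declared
        return samples.get(tags), None
    if forward_tag is None and reverse_tag is None:
        return None, None
    # Only one tag known: keep the samples whose corresponding tag matches
    idx = 0 if forward_tag is not None else 1
    candidates = [s for key, s in samples.items() if key[idx] == tags[idx]]
    if len(candidates) == 1:
        return candidates[0], None
    if len(candidates) > 1:
        return None, "multiple samples match tags"
    return None, None


# Hypothetical tag pairs and sample records, for illustration only
samples = {
    (b"aacc", b"ggtt"): {"sample": "S1"},
    (b"aacc", b"ttaa"): {"sample": "S2"},
    (b"ccgg", b"ggtt"): {"sample": "S3"},
}
full = assign_sample(samples, (b"aacc", b"ttaa"))
partial_unique = assign_sample(samples, (b"ccgg", None))
partial_ambiguous = assign_sample(samples, (b"aacc", None))
```

With a single known tag, a sample is assigned only if exactly one declared tag pair is compatible; an ambiguous match is reported as an error rather than resolved arbitrarily.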
--- a/python/obitools3/commands/sort.cfiles
+++ b/python/obitools3/commands/sort.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/sort.pyx
+++ b/python/obitools3/commands/sort.pyx
@ -1,144 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View, Line_selection
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.dms.view import RollbackException
 from obitools3.apps.config import logger
 from obitools3.utils cimport str2bytes
 from obitools3.dms.capi.obitypes cimport OBI_BOOL, \
                                         OBI_CHAR, \
                                         OBI_FLOAT, \
                                         OBI_INT, \
                                         OBI_QUAL, \
                                         OBI_SEQ, \
                                         OBI_STR, \
                                         OBIBool_NA, \
                                         OBIChar_NA, \
                                         OBIFloat_NA, \
                                         OBIInt_NA
 import time
 import sys
 NULL_VALUE = {OBI_BOOL: OBIBool_NA, 
              OBI_CHAR: OBIChar_NA, 
              OBI_FLOAT: OBIFloat_NA,
              OBI_INT: OBIInt_NA,
              OBI_QUAL: [],
              OBI_SEQ: b"",
              OBI_STR: b""}
 __title__="Sort view lines according to the value of a given attribute."
 def addOptions(parser):
    addMinimalInputOption(parser)
    addMinimalOutputOption(parser)
    group=parser.add_argument_group('obi sort specific options')
    group.add_argument('--key', '-k',
                       action="append", dest="sort:keys",
                       metavar='<TAG NAME>',
                       default=[],
                       type=str,
                       help="Attribute used to sort the sequence records.")
    group.add_argument('--reverse', '-r',
                       action="store_true", dest="sort:reverse",
                       default=False,
                       help="Sort in reverse order.")
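 def line_cmp(line, key, pb):
    # pb is unused here: the caller passes pb(line_idx) so that the progress
    # bar is updated each time a sort key is computed.
    # Missing values are replaced by the NA value of the column type so that
    # they remain comparable with the other values.
    if line[key] is None:
        return NULL_VALUE[line.view[key].data_type_int]
    else:
        return line[key]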
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi sort")
    # Open the input
    input = open_uri(config["obi"]["inputURI"])
    if input is None:
        raise Exception("Could not read input view")
    i_dms = input[0]
    i_view = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output view")
    o_dms = output[0]
    o_view_name_final = output[1]
    o_view_name = o_view_name_final
    # If the input and output DMS are not the same, create output view in input DMS first, then export it
    # to output DMS, making sure the temporary view name is unique in the input DMS 
    if i_dms != o_dms:
        i=0
        while o_view_name in i_dms:
            o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
            i+=1        
    # Initialize the progress bar
    pb = ProgressBar(len(i_view), config, seconde=5)
    keys = config['sort']['keys']
    selection = Line_selection(i_view)
    for i in range(len(i_view)):  # TODO special function?
        selection.append(i)
    for k in keys:  # TODO order?
        selection.sort(key=lambda line_idx: line_cmp(i_view[line_idx], k, pb(line_idx)), reverse=config['sort']['reverse'])
    pb(len(i_view), force=True)
    print("", file=sys.stderr)
    # Create output view with the sorted line selection
    try:
        o_view = selection.materialize(o_view_name)
    except Exception as e:
        raise RollbackException("obi sort error, rolling back view: "+str(e), o_view)
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[input[0].name]
    input_view_name=[input[1].name]
    o_view.write_config(config, "sort", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    o_dms.record_command_line(command_line)
    # If input and output DMS are not the same, export the temporary view to the output DMS
    # and delete the temporary view in the input DMS
    if i_dms != o_dms:
        o_view.close()
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
        o_view = o_dms[o_view_name_final]
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_view), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary imported view used to create the final view
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
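
The sort loop above applies one sort per key. Python's `list.sort` is stable, so with plain lists the key sorted last becomes the primary criterion. The sketch below illustrates this with ordinary dicts; whether `Line_selection.sort` behaves the same way is an assumption, and the records, `NULL_DEFAULT` and `multi_key_sort` are hypothetical names used only for illustration.

```python
records = [
    {"id": "a", "count": 2, "length": 10},
    {"id": "b", "count": None, "length": 12},
    {"id": "c", "count": 2, "length": 8},
    {"id": "d", "count": 1, "length": 12},
]

# Hypothetical stand-ins for the NA values used when an attribute is missing
NULL_DEFAULT = {"count": 0, "length": 0}


def key_value(record, key):
    # Replace None so that all values stay mutually comparable
    value = record[key]
    return NULL_DEFAULT[key] if value is None else value


def multi_key_sort(records, keys, reverse=False):
    ordered = list(records)
    # Sort by the least significant key first: because list.sort is stable,
    # the sort applied last (on the first key) is the primary criterion.
    for key in reversed(keys):
        ordered.sort(key=lambda r: key_value(r, key), reverse=reverse)
    return ordered


by_count_then_length = [r["id"] for r in multi_key_sort(records, ["count", "length"])]
```

Iterating over the keys in reverse makes the first key given the most significant one; iterating in the given order makes the last key the most significant.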
--- a/python/obitools3/commands/stats.cfiles
+++ b/python/obitools3/commands/stats.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/stats.pyx
+++ b/python/obitools3/commands/stats.pyx
@ -1,265 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption
 from obitools3.dms.view import RollbackException
 from obitools3.apps.config import logger
 from obitools3.dms.capi.obiview cimport COUNT_COLUMN
 from functools import reduce
 import math
 import time
 import sys
 __title__="Compute basic statistics for attribute values."
 '''
 `obi stats` computes basic statistics for attribute values of sequence records.
 The sequence records can be categorized or not using one or several ``-c`` options.
 By default, only the number of sequence records and the total count are computed for each category. 
 Additional statistics can be computed for attribute values in each category, such as:
    - minimum value (``-m`` option) 
    - maximum value (``-M`` option) 
    - mean value (``-a`` option) 
    - variance (``-v`` option) 
    - standard deviation (``-s`` option)
 The result is a contingency table with the different categories in rows, and the 
 computed statistics in columns. 
 '''
 # TODO: when is the taxonomy possibly used?
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    group=parser.add_argument_group('obi stats specific options')
    group.add_argument('-c','--category-attribute',
                       action="append", dest="stats:categories",
                       metavar="<Attribute Name>",
                       default=[],
                       help="Attribute used to categorize the records.")
    group.add_argument('-m','--min',
                       action="append", dest="stats:minimum",
                       metavar="<Attribute Name>",
                       default=[],
                       help="Compute the minimum value of attribute for each category.")
    group.add_argument('-M','--max',
                       action="append", dest="stats:maximum",
                       metavar="<Attribute Name>",
                       default=[],
                       help="Compute the maximum value of attribute for each category.")
    group.add_argument('-a','--mean',
                       action="append", dest="stats:mean",
                       metavar="<Attribute Name>",
                       default=[],
                       help="Compute the mean value of attribute for each category.")
    group.add_argument('-v','--variance',
                       action="append", dest="stats:var",
                       metavar="<Attribute Name>",
                       default=[],
                       help="Compute the variance of attribute for each category.")
    group.add_argument('-s','--std-dev',
                       action="append", dest="stats:sd",
                       metavar="<Attribute Name>",
                       default=[],
                       help="Compute the standard deviation of attribute for each category.")
 def statistics(values, attributes, func):
    stat={}
    lstat={}
    for var in attributes:
        if var in values:
            stat[var]={}
            lstat[var]=0
            for c in values[var]:
                v = values[var][c]
                m = func(v)
                stat[var][c]=m
                lm=len(str(m))
                if lm > lstat[var]:
                    lstat[var]=lm
    return stat, lstat
 def minimum(values, options):
    return statistics(values, options['minimum'], min)
 def maximum(values, options):
    return statistics(values, options['maximum'], max)
 def mean(values, options):
    def average(v):
        s = reduce(lambda x,y:x+y,v,0)
        return float(s)/len(v)
    return statistics(values, options['mean'], average)
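 def variance(v):
    # Unbiased sample variance (n-1 denominator) computed from the sum and the
    # sum of squares of the values; a single value has a variance of 0.
    if len(v)==1:
        return 0
    s = reduce(lambda x,y:(x[0]+y,x[1]+y**2),v,(0.,0.))
    return s[1]/(len(v)-1) - s[0]**2/len(v)/(len(v)-1)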
 def varpop(values, options):
    return statistics(values, options['var'], variance)
 def sd(values, options):
    def stddev(v):
        return math.sqrt(variance(v))
    return statistics(values, options['sd'], stddev)
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi stats")
    # Open the input
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not read input view")
    i_view = input[1]
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        taxo_uri = open_uri(config['obi']['taxoURI'])
        if taxo_uri is None:
            raise Exception("Couldn't open taxonomy")
        taxo = taxo_uri[1]
    else:
        taxo = None
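    # Collect values for every attribute requested by any statistic, including variance and standard deviation
    statistics = set(config['stats']['minimum']) | set(config['stats']['maximum']) | set(config['stats']['mean']) | \
                 set(config['stats']['var']) | set(config['stats']['sd'])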
    total = 0
    catcount={}
    totcount={}
    values={}
    lcat=0
    # Initialize the progress bar
    pb = ProgressBar(len(i_view), config, seconde=5)
    for i in range(len(i_view)):
        pb(i)
        line = i_view[i]
        category = []
        for c in config['stats']['categories']:
            try:
                if taxo is not None:
                    loc_env = {'sequence': line, 'line': line, 'taxonomy': taxo}
                else:
                    loc_env = {'sequence': line, 'line': line}
                v = eval(c, loc_env, line)
                lv=len(str(v))
                if lv > lcat:
                    lcat=lv
                category.append(v)
            except Exception:
                category.append(None)
                if 4 > lcat:
                    lcat=4
        category=tuple(category)
        catcount[category]=catcount.get(category,0)+1
        try: 
            totcount[category]=totcount.get(category,0)+line[COUNT_COLUMN]
        except KeyError:
            totcount[category]=totcount.get(category,0)+1
        for var in statistics:
            if var in line:
                v = line[var]
                if var not in values:
                    values[var]={}
                if category not in values[var]:
                    values[var][category]=[]
                values[var][category].append(v)    
    pb(len(i_view), force=True)
    print("", file=sys.stderr)
    mini, lmini = minimum(values, config['stats'])
    maxi, lmaxi = maximum(values, config['stats'])
    avg, lavg = mean(values, config['stats'])
    varp, lvarp = varpop(values, config['stats'])
    sigma, lsigma = sd(values, config['stats'])
    pcat = "%%-%ds" % lcat
    if config['stats']['minimum']:
        minvar= "min_%%-%ds" % max(len(x) for x in config['stats']['minimum'])
    else:
        minvar= "%s"
    if config['stats']['maximum']:
        maxvar= "max_%%-%ds" % max(len(x) for x in config['stats']['maximum'])
    else:
        maxvar= "%s"
    if config['stats']['mean']:
        meanvar= "mean_%%-%ds" % max(len(x) for x in config['stats']['mean'])
    else:
        meanvar= "%s"
    if config['stats']['var']:
        varvar= "var_%%-%ds" % max(len(x) for x in config['stats']['var'])
    else:
        varvar= "%s"
    if config['stats']['sd']:
        sdvar= "sd_%%-%ds" % max(len(x) for x in config['stats']['sd'])
    else:
        sdvar= "%s"
    hcat = "\t".join([pcat % x for x in config['stats']['categories']]) + "\t" +\
           "\t".join([minvar % x for x in config['stats']['minimum']])  + "\t" +\
           "\t".join([maxvar % x for x in config['stats']['maximum']])  + "\t" +\
           "\t".join([meanvar % x for x in config['stats']['mean']])  + "\t" +\
           "\t".join([varvar % x for x in config['stats']['var']])  + "\t" +\
           "\t".join([sdvar % x for x in config['stats']['sd']]) + \
           "\t   count" + \
           "\t   total" 
    print(hcat)
    for c in catcount:
        for v in c:
            print(pcat % str(v)+"\t", end="")
        for m in config['stats']['minimum']:
            print((("%%%dd" % lmini[m]) % mini[m][c])+"\t", end="")
        for m in config['stats']['maximum']:
            print((("%%%dd" % lmaxi[m]) % maxi[m][c])+"\t", end="")
        for m in config['stats']['mean']:
            print((("%%%df" % lavg[m]) % avg[m][c])+"\t", end="")
        for m in config['stats']['var']:
            print((("%%%df" % lvarp[m]) % varp[m][c])+"\t", end="")
        for m in config['stats']['sd']:
            print((("%%%df" % lsigma[m]) % sigma[m][c])+"\t", end="")
        print("%7d" %catcount[c], end="")
        print("%9d" %totcount[c])
    input[0].close()
    logger("info", "Done.")
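
The `variance` function above uses the sum and the sum of squares of the values with an n-1 denominator, which is the unbiased sample variance. The sketch below restates that formula in standalone Python and compares it with `statistics.variance` from the standard library; the function name `sample_variance` and the data are illustrative assumptions.

```python
import math
import statistics
from functools import reduce


def sample_variance(values):
    # A single value has no spread
    if len(values) == 1:
        return 0
    n = len(values)
    # Accumulate the sum and the sum of squares in one pass
    total, total_sq = reduce(
        lambda acc, x: (acc[0] + x, acc[1] + x ** 2), values, (0.0, 0.0)
    )
    # (sum(x^2) - sum(x)^2 / n) / (n - 1)
    return total_sq / (n - 1) - total ** 2 / n / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]
var = sample_variance(data)
std_dev = math.sqrt(var)
agrees_with_stdlib = math.isclose(var, statistics.variance(data))
```

The sum-of-squares form needs only one pass over the data, but it can lose precision when the values are large compared with their spread; a two-pass or Welford-style update is a common alternative in that case.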
--- a/python/obitools3/commands/tail.cfiles
+++ b/python/obitools3/commands/tail.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/tail.pyx
+++ b/python/obitools3/commands/tail.pyx
@ -1,110 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View, Line_selection
 from obitools3.uri.decode import open_uri
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
 from obitools3.dms.view import RollbackException
 from obitools3.apps.config import logger
 from obitools3.utils cimport str2bytes
 import time
 import sys
 __title__="Keep the N last lines of a view."
 def addOptions(parser):
    addMinimalInputOption(parser)
    addMinimalOutputOption(parser)
    group=parser.add_argument_group('obi tail specific options')
    group.add_argument('-n', '--sequence-count',
                       action="store", dest="tail:count",
                       metavar='<N>',
                       default=10,
                       type=int,
                       help="Number of last records to keep.")
 def run(config):
    DMS.obi_atexit()
    logger("info", "obi tail")
    # Open the input
    input = open_uri(config["obi"]["inputURI"])
    if input is None:
        raise Exception("Could not read input view")
    i_dms = input[0]
    i_view = input[1]
    # Open the output: only the DMS
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      dms_only=True)
    if output is None:
        raise Exception("Could not create output view")
    o_dms = output[0]
    o_view_name_final = output[1]
    o_view_name = o_view_name_final
    # If the input and output DMS are not the same, create output view in input DMS first, then export it
    # to output DMS, making sure the temporary view name is unique in the input DMS 
    if i_dms != o_dms:
        i=0
        while o_view_name in i_dms:
            o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
            i+=1        
    start = max(len(i_view) - config['tail']['count'], 0)
    # Initialize the progress bar
    pb = ProgressBar(len(i_view) - start, config, seconde=5)
    selection = Line_selection(i_view)
    for i in range(start, len(i_view)):
        pb(i)
        selection.append(i)
    pb(i, force=True)
    print("", file=sys.stderr)
    # Save command config in View comments
    command_line = " ".join(sys.argv[1:])
    comments = View.get_config_dict(config, "tail", command_line, input_dms_name=[i_dms.name], input_view_name=[i_view.name])
    # Create output view with the line selection
    try:
        o_view = selection.materialize(o_view_name)
    except Exception as e:
        raise RollbackException("obi tail error, rolling back view: "+str(e), o_view)
    # Save command config in DMS comments
    command_line = " ".join(sys.argv[1:])
    o_view.write_config(config, "tail", command_line, input_dms_name=[i_dms.name], input_view_name=[i_view.name])
    o_dms.record_command_line(command_line)
    # If input and output DMS are not the same, export the temporary view to the output DMS
    # and delete the temporary view in the input DMS
    if i_dms != o_dms:
        o_view.close()
        View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
        o_view = o_dms[o_view_name_final]
    #print("\n\nOutput view:\n````````````", file=sys.stderr)
    #print(repr(o_view), file=sys.stderr)
    # If the input and the output DMS are different, delete the temporary imported view used to create the final view
    if i_dms != o_dms:
        View.delete_view(i_dms, o_view_name)
        o_dms.close()
    i_dms.close()
    logger("info", "Done.")
--- a/python/obitools3/commands/test.cfiles
+++ b/python/obitools3/commands/test.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/test.pyx
+++ b/python/obitools3/commands/test.pyx
@ -1,531 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # TODO I absolutely don't understand why it doesn't work without that line
 from obitools3.dms.view import View, Line_selection
 from obitools3.dms.view.typed_view.view_NUC_SEQS import View_NUC_SEQS
 from obitools3.dms import DMS
 from obitools3.dms.column import Column
 from obitools3.dms.taxo import Taxonomy
 from obitools3.utils cimport str2bytes
 from obitools3.dms.capi.obitypes cimport OBI_INT, \
                                         OBI_FLOAT, \
                                         OBI_BOOL, \
                                         OBI_CHAR, \
                                         OBI_STR, \
                                         OBI_SEQ
 from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, \
                                        ID_COLUMN, \
                                        DEFINITION_COLUMN, \
                                        QUALITY_COLUMN, \
                                        COUNT_COLUMN
 import shutil
 import string
 import random
 VIEW_TYPES = [b"", b"NUC_SEQS_VIEW"]
 COL_TYPES = [OBI_INT, OBI_FLOAT, OBI_BOOL, OBI_CHAR, OBI_STR, OBI_SEQ]
 SPECIAL_COLUMNS = [NUC_SEQUENCE_COLUMN, ID_COLUMN, DEFINITION_COLUMN, QUALITY_COLUMN]
 #TAXDUMP = "" TODO path=?
 TAXTEST = b"taxtest"
 NAME_MAX_LEN = 200
 COL_COMMENTS_MAX_LEN = 2048
 MAX_INT = 2147483647    # used to generate random float values
 __title__="Tests if the obitools are working properly"
 default_config = {
                 }
 def test_taxo(config, infos):
    tax1 = Taxonomy.open_taxdump(infos['dms'], config['obi']['taxo'])
    tax1.write(TAXTEST)
    tax2 = Taxonomy.open(infos['dms'], TAXTEST)
    assert len(tax1) == len(tax2), "Length of written taxonomy != length of read taxdump : "+str(len(tax2))+" != "+str(len(tax1))
    i = 0
    for x in range(config['test']['nbtests']):
        idx = random.randint(0, len(tax1)-1)
        t1 = tax1.get_taxon_by_idx(idx)
        taxid1 = t1.taxid
        t2 = tax2.get_taxon_by_idx(idx)
        taxid2 = t2.taxid
        assert t1 == t2, "Taxon gotten from written taxonomy with index != taxon read from taxdump : "+str(t2)+" != "+str(t1)
        t1 = tax1[taxid1]
        t2 = tax2[taxid2]
        assert t1 == t2, "Taxon gotten from written taxonomy with taxid != taxon read from taxdump : "+str(t2)+" != "+str(t1)
        i+=1
        if (i%(config['test']['nbtests']/10)) == 0 :
            print("Testing taxonomy functions......"+str(i*100/config['test']['nbtests'])+"%")
    tax1.close()
    tax2.close()
 def random_length(max_len):
    return random.randint(1, max_len)
 def random_bool(config):
    return random.choice([True, False])
 def random_bool_tuples(config):
    l=[]
    for i in range(random.randint(1, config['test']['tuplemaxlen'])) :
        l.append(random.choice([None, random_bool(config)]))
    return tuple(l)
 def random_char(config):
    return str2bytes(random.choice(string.ascii_lowercase))
 def random_char_tuples(config):
    l=[]
    for i in range(random.randint(1, config['test']['tuplemaxlen'])) :
        l.append(random.choice([None, random_char(config)]))
    return tuple(l)
 def random_float(config):
    return random.randint(0, MAX_INT) + random.random()
 def random_float_tuples(config):
    l=[]
    for i in range(random.randint(1, config['test']['tuplemaxlen'])) :
        l.append(random.choice([None, random_float(config)]))
    return tuple(l)
 def random_int(config):
    return random.randint(0, config['test']['maxlinenb'])
 def random_int_tuples(config):
    l=[]
    for i in range(random.randint(1, config['test']['tuplemaxlen'])) :
        l.append(random.choice([None, random_int(config)]))
    return tuple(l)
 def random_seq(config):
    return str2bytes(''.join(random.choice(['a','t','g','c']) for i in range(random_length(config['test']['seqmaxlen']))))
 def random_seq_tuples(config):
    l=[]
    for i in range(random.randint(1, config['test']['tuplemaxlen'])) :
        l.append(random.choice([None, random_seq(config)]))
    return tuple(l)
 def random_bytes(config):
    return random_bytes_with_max_len(config['test']['strmaxlen'])
 def random_bytes_tuples(config):
    l=[]
    for i in range(random.randint(1, config['test']['tuplemaxlen'])) :
        l.append(random.choice([None, random_bytes(config)]))
    return tuple(l)
 def random_str_with_max_len(max_len):
    return ''.join(random.choice(string.ascii_lowercase) for i in range(random_length(max_len)))
 def random_bytes_with_max_len(max_len):
    return str2bytes(random_str_with_max_len(max_len))
 RANDOM_FUNCTIONS = [random_bool, random_char, random_bytes, random_float, random_int]
 def random_comments(config):
    comments = {}
    for i in range(random_length(1000)):
        to_add = {random_bytes(config): random.choice(RANDOM_FUNCTIONS)(config)}
        if len(str(comments)) + len(str(to_add)) >= COL_COMMENTS_MAX_LEN:
            return comments
        else:
            comments.update(to_add)
    return comments
 def random_column(infos):
    return random.choice(sorted(list(infos['view'].keys())))
 def random_unique_name(infos):
    name = b""
    while name == b"" or name in infos['unique_names'] :
        name = random_bytes_with_max_len(NAME_MAX_LEN)
    infos['unique_names'].append(name)
    return name
 def random_unique_element_name(config, infos):
    name = b""
    while name == b"" or name in infos['unique_names'] :
        name = random_bytes_with_max_len(config['test']['elt_name_max_len'])
    infos['unique_names'].append(name)
    return name
 def print_test(config, sentence):
    if config['test']['verbose'] :
        print(sentence)
 def test_set_and_get(config, infos):
    print_test(config, ">>> Set and get test")
    col_name = random_column(infos)
    col = infos['view'][col_name]
    element_names = col.elements_names
    data_type = col.data_type
    if data_type == b"OBI_QUAL" :
        print_test(config, "-")
        return
    idx = random_int(config)
    value = random.choice([None, infos['random_generator'][(data_type, col.tuples)](config)])
    if col.nb_elements_per_line > 1 :
        elt = random.choice(element_names)
        col[idx][elt] = value
        assert col[idx][elt] == value, "Column: "+repr(col)+"\nSet value != gotten value "+str(value)+" != "+str(col[idx][elt])
    elif col.tuples:
        col[idx] = value
        if value is None:
            totest = None
        else:
            totest = []
            for e in value:
                if e is not None and e != '':
                    totest.append(e)
            if len(totest) == 0:
                totest = None
            else:
                totest = tuple(totest)
        assert col[idx] == totest, "Column: "+repr(col)+"\nSet value != gotten value "+str(totest)+" != "+str(col[idx])
        if totest is not None:
            for i in range(len(totest)) :
                assert col[idx][i] == totest[i], "Column: "+repr(col)+"\nSet value[i] != gotten value[i] "+str(totest[i])+" != "+str(col[idx][i])
    else:
        col[idx] = value
        assert col[idx] == value, "Column: "+repr(col)+"\nSet value != gotten value "+str(value)+" != "+str(col[idx])
    print_test(config, ">>> Set and get test OK")
 def test_add_col(config, infos):
    print_test(config, ">>> Add column test")
    #existing_col = random_bool(config)    # TODO doesn't work because of line count problem. See obiview.c line 1737
    #if existing_col and infos["view_names"] != [] :
    #    random_view = infos['dms'].open_view(random.choice(infos["view_names"]))
    #    random_column = random_view[random.choice(sorted(list(random_view.columns))]
    #    random_column_refs = random_column.refs
    #    if random_column_refs['name'] in infos['view'] :
    #        alias = random_unique_name(infos)
    #    else :
    #        alias = ''
    #    infos['view'].add_column(random_column_refs['name'], version_number=random_column_refs['version'], alias=alias, create=False)
    #    random_view.close()
    #else :
    create_random_column(config, infos)
    print_test(config, ">>> Add column test OK")
 def test_delete_col(config, infos):
    print_test(config, ">>> Delete column test")
    if len(list(infos['view'].keys())) <= 1 :
        print_test(config, "-")
        return
    col_name = random_column(infos)
    if col_name in SPECIAL_COLUMNS :
        print_test(config, "-")
        return
    infos['view'].delete_column(col_name)
    print_test(config, ">>> Delete column test OK")
 def test_col_alias(config, infos):
    print_test(config, ">>> Changing column alias test")
    col_name = random_column(infos)
    if col_name in SPECIAL_COLUMNS :
        print_test(config, "-")
        return
    infos['view'][col_name].name = random_unique_name(infos)
    print_test(config, ">>> Changing column alias test OK")
 def test_new_view(config, infos):
    print_test(config, ">>> New view test")
    random_new_view(config, infos)
    print_test(config, ">>> New view test OK")
 def random_test(config, infos):
    return random.choice(infos['tests'])(config, infos)
 def random_view_type():
    return random.choice(VIEW_TYPES)
 def random_col_type():
    return random.choice(COL_TYPES)    
 def fill_column(config, infos, col) :
    data_type = col.data_type
    element_names = col.elements_names
    if len(element_names) > 1 :
        for i in range(random_int(config)) :
            for j in range(len(element_names)) :
                col[i][element_names[j]] = random.choice([None, infos['random_generator'][(data_type, col.tuples)](config)])
    else :
        for i in range(random_int(config)) :
            r = random.choice([None, infos['random_generator'][(data_type, col.tuples)](config)])
            col[i] = r
 def create_random_column(config, infos) :
    alias = random.choice([b'', random_unique_name(infos)])
    tuples = random.choice([True, False])
    if not tuples :
        nb_elements_per_line=random.randint(1, config['test']['maxelts'])
        elements_names = []
        for i in range(nb_elements_per_line) :
            elements_names.append(random_unique_element_name(config, infos))
        elements_names = random.choice([None, elements_names])
    else :
        nb_elements_per_line = 1
        elements_names = None
    name = random_unique_name(infos)
    data_type = random_col_type()
    column = Column.new_column(infos['view'], 
                               name, 
                               data_type, 
                               nb_elements_per_line=nb_elements_per_line,
                               elements_names=elements_names,
                               tuples=tuples,
                               comments=random_comments(config),
                               alias=alias
                               )   
    if alias != b'' :
        assert infos['view'][alias] == column
    else :
        assert infos['view'][name] == column
    return column
 def fill_view(config, infos):
    for i in range(random.randint(1, config['test']['maxinicolcount'])) :
        col = create_random_column(config, infos)
        fill_column(config, infos, col)
 def random_new_view(config, infos, first=False):
    v_to_clone = None
    line_selection = None
    quality_col = False     # TODO
    if not first:
        infos['view_names'].append(infos['view'].name)
        infos['view'].close()
        v_to_clone = View.open(infos['dms'], random.choice(infos["view_names"]))
        v_type = b""
        print_test(config, "View to clone: ")
        print_test(config, repr(v_to_clone))
        create_line_selection = random_bool(config)
        if create_line_selection and v_to_clone.line_count > 0:
            print_test(config, "New view with new line selection.")
            line_selection = Line_selection(v_to_clone)
            for i in range(random.randint(1, v_to_clone.line_count)) :
                line_selection.append(random.randint(0, v_to_clone.line_count-1))
            #print_test(config, "New line selection: "+str(line_selection))
    else :
        v_type = random_view_type()
    if line_selection is not None :
        infos['view'] = line_selection.materialize(random_unique_name(infos), comments=random_comments(config))
    elif v_to_clone is not None :
        infos['view'] = v_to_clone.clone(random_unique_name(infos), comments=random_comments(config))
    else :
        if v_type == b"NUC_SEQS_VIEW" :
            infos['view'] = View_NUC_SEQS.new(infos['dms'], random_unique_name(infos), comments=random_comments(config))   # TODO quality column
        else :
            infos['view'] = View.new(infos['dms'], random_unique_name(infos), comments=random_comments(config))   # TODO quality column
    print_test(config, repr(infos['view']))
    if v_to_clone is not None :
        if line_selection is None:
            assert v_to_clone.line_count == infos['view'].line_count, "New view and cloned view don't have the same line count : "+str(v_to_clone.line_count)+" (view to clone line count) != "+str(infos['view'].line_count)+" (new view line count)"
        else :
            assert len(line_selection) == infos['view'].line_count, "New view with new line selection does not have the right line count : "+str(len(line_selection))+" (line selection length) != "+str(infos['view'].line_count)+" (new view line count)"
        v_to_clone.close()
    if first :
        fill_view(config, infos)
 def create_test_obidms(config, infos):
    infos['dms'] = DMS.new(config['obi']['defaultdms'])
 def ini_dms_and_first_view(config, infos):
    create_test_obidms(config, infos)
    random_new_view(config, infos, first=True)
    infos['view_names'] = []
 def addOptions(parser):
    # TODO put this common group somewhere else but I don't know where
    group=parser.add_argument_group('DMS and view options')
    group.add_argument('--default-dms','-d', 
                       action="store", dest="obi:defaultdms",
                       metavar='<DMS NAME>',
                       default="/tmp/test_dms",
                       type=str,
                       help="Name of the default DMS for reading and writing data. "
                            "Default: /tmp/test_dms")
    group.add_argument('--taxo','-t',     # TODO I don't understand why the option is not registered if it is not set
                       action="store", dest="obi:taxo",
                       metavar='<TAXDUMP PATH>',
                       default='',  # TODO not None because if it's None, the option is not entered in the option dictionary.
                       type=str,
                       help="Path to a taxdump to test the taxonomy.")
    group=parser.add_argument_group('obi test specific options')
    group.add_argument('--nb_tests','-n',
                       action="store", dest="test:nbtests",
                       metavar='<NB_TESTS>',
                       default=1000,
                       type=int,
                       help="Number of tests to carry out. "
                            "Default: 1000")
    group.add_argument('--seq_max_len','-s',
                       action="store", dest="test:seqmaxlen",
                       metavar='<SEQ_MAX_LEN>',
                       default=200,
                       type=int,
                       help="Maximum length of DNA sequences. "
                            "Default: 200")
    group.add_argument('--str_max_len','-r',
                       action="store", dest="test:strmaxlen",
                       metavar='<STR_MAX_LEN>',
                       default=200,
                       type=int,
                       help="Maximum length of character strings. "
                            "Default: 200")
    group.add_argument('--tuple_max_len','-u',
                       action="store", dest="test:tuplemaxlen",
                       metavar='<TUPLE_MAX_LEN>',
                       default=20,
                       type=int,
                       help="Maximum length of tuples. "
                            "Default: 20")
    group.add_argument('--max_ini_col_count','-o',
                       action="store", dest="test:maxinicolcount",
                       metavar='<MAX_INI_COL_COUNT>',
                       default=10,
                       type=int,
                       help="Maximum number of columns in the initial view. "
                            "Default: 10")
    group.add_argument('--max_line_nb','-l',
                       action="store", dest="test:maxlinenb",
                       metavar='<MAX_LINE_NB>',
                       default=10000,
                       type=int,
                       help="Maximum number of lines in a column. "
                            "Default: 10000")
    group.add_argument('--max_elts_per_line','-e',
                       action="store", dest="test:maxelts",
                       metavar='<MAX_ELTS_PER_LINE>',
                       default=20,
                       type=int,
                       help="Maximum number of elements per line in a column. "
                            "Default: 20")
    group.add_argument('--verbose','-v',
                       action="store_true", dest="test:verbose",
                       default=False,
                       help="Print the tests. "
                            "Default: Don't print the tests")
    group.add_argument('--seed','-g',
                       action="store", dest="test:seed",
                       metavar='<SEED>',
                       default=None,
                       help="Seed (use for reproducible tests). "
                            "Default: Seed is determined by Python")
 def run(config):
    if 'seed' in config['test'] :
        random.seed(config['test']['seed'])
    infos = {'dms': None, 
             'view': None, 
             'view_names': None, 
             'unique_names': [],
             'random_generator': {
                                    (b"OBI_BOOL", False): random_bool, (b"OBI_BOOL", True): random_bool_tuples, 
                                    (b"OBI_CHAR", False): random_char, (b"OBI_CHAR", True): random_char_tuples, 
                                    (b"OBI_FLOAT", False): random_float, (b"OBI_FLOAT", True): random_float_tuples, 
                                    (b"OBI_INT", False): random_int, (b"OBI_INT", True): random_int_tuples, 
                                    (b"OBI_SEQ", False): random_seq, (b"OBI_SEQ", True): random_seq_tuples,
                                    (b"OBI_STR", False): random_bytes, (b"OBI_STR", True): random_bytes_tuples
                                  },
             'tests': [test_set_and_get, test_add_col, test_delete_col, test_col_alias, test_new_view]
            }
    # TODO ???
    config['test']['elt_name_max_len'] = int((COL_COMMENTS_MAX_LEN - config['test']['maxelts']) / config['test']['maxelts'])
    print("Initializing the DMS and the first view...")
    shutil.rmtree(config['obi']['defaultdms']+'.obidms', ignore_errors=True)
    ini_dms_and_first_view(config, infos)
    print_test(config, repr(infos['view']))
    i = 0
    for t in range(config['test']['nbtests']):
        random_test(config, infos)
        print_test(config, repr(infos['view']))
        i+=1
        if (i%(config['test']['nbtests']/10)) == 0 :
            print("Testing......"+str(i*100/config['test']['nbtests'])+"%")
    #print(infos)
    if config['obi']['taxo'] != '' :
        test_taxo(config, infos)
    infos['view'].close()
    infos['dms'].close()
    shutil.rmtree(config['obi']['defaultdms']+'.obidms', ignore_errors=True)
    print("Done.")
--- a/python/obitools3/commands/uniq.cfiles
+++ b/python/obitools3/commands/uniq.cfiles
@ -1,103 +0,0 @@
 ../../../src/obi_lcs.h
 ../../../src/obi_lcs.c
 ../../../src/obierrno.h
 ../../../src/obierrno.c
 ../../../src/upperband.h
 ../../../src/upperband.c
 ../../../src/sse_banded_LCS_alignment.h
 ../../../src/sse_banded_LCS_alignment.c
 ../../../src/obiblob.h
 ../../../src/obiblob.c
 ../../../src/utils.h
 ../../../src/utils.c
 ../../../src/obidms.h
 ../../../src/obidms.c
 ../../../src/libjson/json_utils.h
 ../../../src/libjson/json_utils.c
 ../../../src/libjson/cJSON.h
 ../../../src/libjson/cJSON.c
 ../../../src/obiavl.h
 ../../../src/obiavl.c
 ../../../src/bloom.h
 ../../../src/bloom.c
 ../../../src/crc64.h
 ../../../src/crc64.c
 ../../../src/murmurhash2.h
 ../../../src/murmurhash2.c
 ../../../src/obidmscolumn.h
 ../../../src/obidmscolumn.c
 ../../../src/obitypes.h
 ../../../src/obitypes.c
 ../../../src/obidmscolumndir.h
 ../../../src/obidmscolumndir.c
 ../../../src/obiblob_indexer.h
 ../../../src/obiblob_indexer.c
 ../../../src/obiview.h
 ../../../src/obiview.c
 ../../../src/hashtable.h
 ../../../src/hashtable.c
 ../../../src/linked_list.h
 ../../../src/linked_list.c
 ../../../src/obidmscolumn_array.h
 ../../../src/obidmscolumn_array.c
 ../../../src/obidmscolumn_blob.h
 ../../../src/obidmscolumn_blob.c
 ../../../src/obidmscolumn_idx.h
 ../../../src/obidmscolumn_idx.c
 ../../../src/obidmscolumn_bool.h
 ../../../src/obidmscolumn_bool.c
 ../../../src/obidmscolumn_char.h
 ../../../src/obidmscolumn_char.c
 ../../../src/obidmscolumn_float.h
 ../../../src/obidmscolumn_float.c
 ../../../src/obidmscolumn_int.h
 ../../../src/obidmscolumn_int.c
 ../../../src/obidmscolumn_qual.h
 ../../../src/obidmscolumn_qual.c
 ../../../src/obidmscolumn_seq.h
 ../../../src/obidmscolumn_seq.c
 ../../../src/obidmscolumn_str.h
 ../../../src/obidmscolumn_str.c
 ../../../src/array_indexer.h
 ../../../src/array_indexer.c
 ../../../src/char_str_indexer.h
 ../../../src/char_str_indexer.c
 ../../../src/dna_seq_indexer.h
 ../../../src/dna_seq_indexer.c
 ../../../src/encode.c
 ../../../src/encode.h
 ../../../src/uint8_indexer.c
 ../../../src/uint8_indexer.h
 ../../../src/build_reference_db.c
 ../../../src/build_reference_db.h
 ../../../src/kmer_similarity.c
 ../../../src/kmer_similarity.h
 ../../../src/obi_clean.c
 ../../../src/obi_clean.h
 ../../../src/obi_ecopcr.c
 ../../../src/obi_ecopcr.h
 ../../../src/obi_ecotag.c
 ../../../src/obi_ecotag.h
 ../../../src/obidms_taxonomy.c
 ../../../src/obidms_taxonomy.h
 ../../../src/obilittlebigman.c
 ../../../src/obilittlebigman.h
 ../../../src/_sse.h
 ../../../src/obidebug.h
 ../../../src/libecoPCR/libapat/CODES/dft_code.h
 ../../../src/libecoPCR/libapat/CODES/dna_code.h
 ../../../src/libecoPCR/libapat/CODES/prot_code.h
 ../../../src/libecoPCR/libapat/apat_parse.c
 ../../../src/libecoPCR/libapat/apat_search.c
 ../../../src/libecoPCR/libapat/apat.h
 ../../../src/libecoPCR/libapat/Gmach.h
 ../../../src/libecoPCR/libapat/Gtypes.h
 ../../../src/libecoPCR/libapat/libstki.c
 ../../../src/libecoPCR/libapat/libstki.h
 ../../../src/libecoPCR/libthermo/nnparams.h
 ../../../src/libecoPCR/libthermo/nnparams.c
 ../../../src/libecoPCR/ecoapat.c
 ../../../src/libecoPCR/ecodna.c
 ../../../src/libecoPCR/ecoError.c
 ../../../src/libecoPCR/ecoMalloc.c
 ../../../src/libecoPCR/ecoPCR.h
--- a/python/obitools3/commands/uniq.pxd
+++ b/python/obitools3/commands/uniq.pxd
@ -1,9 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms.taxo.taxo cimport Taxonomy
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy)
 cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, list mergedKeys_list=*, Taxonomy taxonomy=*, bint mergeIds=*, list categories=*, int max_elts=*)
--- a/python/obitools3/commands/uniq.pyx
+++ b/python/obitools3/commands/uniq.pyx
@ -1,542 +0,0 @@
 #cython: language_level=3
 from obitools3.apps.progress cimport ProgressBar  # @UnresolvedImport
 from obitools3.dms import DMS
 from obitools3.dms.view.view cimport View
 from obitools3.dms.obiseq cimport Nuc_Seq_Stored
 from obitools3.dms.view import RollbackException
 from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
 from obitools3.dms.column.column cimport Column, Column_line
 from obitools3.dms.capi.obiview cimport QUALITY_COLUMN, COUNT_COLUMN, NUC_SEQUENCE_COLUMN, ID_COLUMN
 from obitools3.dms.capi.obitypes cimport OBI_INT, OBI_STR, index_t
 from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addTaxonomyOption
 from obitools3.uri.decode import open_uri
 from obitools3.apps.config import logger
 from obitools3.utils cimport tobytes
 import sys
 __title__="Group sequence records together"
 def addOptions(parser):
    addMinimalInputOption(parser)
    addTaxonomyOption(parser)
    addMinimalOutputOption(parser)
    group = parser.add_argument_group('obi uniq specific options')
    group.add_argument('--merge', '-m',
                       action="append", dest="uniq:merge",
                       metavar="<TAG NAME>",
                       default=[],
                       type=str,
                       help="Attributes to merge.")     # TODO must be a 1 elt/line column
    group.add_argument('--merge-ids', '-e',
                       action="store_true", dest="uniq:mergeids",
                       default=False,
                       help="Add the merged key with all ids of merged sequences (ONLY WORKING ON SMALL SETS FOR NOW).")  # TODO ?
    group.add_argument('--category-attribute', '-c',
                        action="append", dest="uniq:categories",
                        metavar="<Attribute Name>",
                        default=[],
                        help="Add one attribute to the list of attributes "
                             "used to group sequences before dereplication "
                             "(option can be used several times).")
 cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
    cdef int             taxid
    cdef Nuc_Seq_Stored  seq
    cdef list            m_taxids
    cdef bytes           k
    cdef object          tsp
    cdef object          tgn
    cdef object          tfa
    cdef object          sp_sn
    cdef object          gn_sn
    cdef object          fa_sn
    # Create columns
    if b"species" in o_view and o_view[b"species"].data_type_int != OBI_INT :
        o_view.delete_column(b"species")
    if b"species" not in o_view:
        Column.new_column(o_view, 
                          b"species", 
                          OBI_INT
                         )
    if b"genus" in o_view and o_view[b"genus"].data_type_int != OBI_INT :
        o_view.delete_column(b"genus")
    if b"genus" not in o_view:
        Column.new_column(o_view, 
                          b"genus", 
                          OBI_INT
                         )
    if b"family" in o_view and o_view[b"family"].data_type_int != OBI_INT :
        o_view.delete_column(b"family")
    if b"family" not in o_view:
        Column.new_column(o_view, 
                          b"family", 
                          OBI_INT
                         )
    if b"species_name" in o_view and o_view[b"species_name"].data_type_int != OBI_STR :
        o_view.delete_column(b"species_name")
    if b"species_name" not in o_view:
        Column.new_column(o_view, 
                          b"species_name", 
                          OBI_STR
                         )
    if b"genus_name" in o_view and o_view[b"genus_name"].data_type_int != OBI_STR :
        o_view.delete_column(b"genus_name")
    if b"genus_name" not in o_view:
        Column.new_column(o_view, 
                          b"genus_name", 
                          OBI_STR
                         )
    if b"family_name" in o_view and o_view[b"family_name"].data_type_int != OBI_STR :
        o_view.delete_column(b"family_name")
    if b"family_name" not in o_view:
        Column.new_column(o_view, 
                          b"family_name", 
                          OBI_STR
                         )
    if b"rank" in o_view and o_view[b"rank"].data_type_int != OBI_STR :
        o_view.delete_column(b"rank")
    if b"rank" not in o_view:
        Column.new_column(o_view, 
                          b"rank", 
                          OBI_STR
                         )
    if b"scientific_name" in o_view and o_view[b"scientific_name"].data_type_int != OBI_STR :
        o_view.delete_column(b"scientific_name")
    if b"scientific_name" not in o_view:
        Column.new_column(o_view, 
                          b"scientific_name", 
                          OBI_STR
                         )
    for seq in o_view:        
        if b"merged_taxid" in seq :
            m_taxids = []            
            m_taxids_dict = seq[b"merged_taxid"]
            for k in m_taxids_dict.keys() :
                if m_taxids_dict[k] is not None:
                    m_taxids.append(int(k))
            taxid = taxonomy.last_common_taxon(*m_taxids)
            seq[b"taxid"] = taxid
            tsp = taxonomy.get_species(taxid)
            tgn = taxonomy.get_genus(taxid)
            tfa = taxonomy.get_family(taxid)
            if tsp is not None:
                sp_sn = taxonomy.get_scientific_name(tsp)
            else:
                sp_sn = None   # TODO was '###', discuss
                tsp = None     # TODO was '-1', discuss
            if tgn is not None:
                gn_sn = taxonomy.get_scientific_name(tgn)
            else:
                gn_sn = None
                tgn = None
            if tfa is not None:
                fa_sn = taxonomy.get_scientific_name(tfa)
            else:
                fa_sn = None
                tfa = None
            seq[b"species"] = tsp
            seq[b"genus"] = tgn
            seq[b"family"] = tfa
            seq[b"species_name"] = sp_sn
            seq[b"genus_name"] = gn_sn
            seq[b"family_name"] = fa_sn
            seq[b"rank"] = taxonomy.get_rank(taxid)
            seq[b"scientific_name"] = taxonomy.get_scientific_name(taxid)
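The function above assigns each dereplicated sequence the last common taxon of the taxids it merged. The following is a minimal sketch of that idea on a toy parent map; the `parent` dict, the `lineage` helper and this standalone `last_common_taxon` are hypothetical illustrations and are not the `Taxonomy` class's actual implementation.

```python
def lineage(parent, taxid):
    """Return the path from taxid up to the root (root last).

    Assumption: the root is the only node that is its own parent.
    """
    path = [taxid]
    while parent[taxid] != taxid:
        taxid = parent[taxid]
        path.append(taxid)
    return path


def last_common_taxon(parent, *taxids):
    """Return the deepest taxon shared by the lineages of all taxids."""
    if not taxids:
        raise ValueError("at least one taxid is required")
    # Start from the first lineage (leaf first) and keep only shared ancestors.
    common = lineage(parent, taxids[0])
    for taxid in taxids[1:]:
        seen = set(lineage(parent, taxid))
        common = [t for t in common if t in seen]
    # The list is still ordered leaf to root, so the first entry is the deepest.
    return common[0]


# Toy taxonomy: 1 is the root, 10 a genus, 20 and 21 its species, 30 another branch.
parent = {1: 1, 10: 1, 20: 10, 21: 10, 30: 1}
lca_same_genus = last_common_taxon(parent, 20, 21)
lca_distant = last_common_taxon(parent, 20, 30)
```

With a single taxid the function returns that taxid itself. With no taxid it raises; the code above does not show what `taxonomy.last_common_taxon` does when `m_taxids` is empty.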
 cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, list mergedKeys_list=None, Taxonomy taxonomy=None, bint mergeIds=False, list categories=None, int max_elts=10000) :
    cdef int            i
    cdef int            k
    cdef int            k_count
    cdef int            o_idx
    cdef int            u_idx
    cdef int            i_idx
    cdef int            i_count
    cdef str            key_str
    cdef bytes          key
    cdef bytes          mkey
    cdef bytes          merged_col_name
    cdef bytes          o_id
    cdef bytes          i_id
    cdef set            mergedKeys_set
    cdef tuple          unique_id
    cdef list           catl
    cdef list           mergedKeys
    cdef list           mergedKeys_list_b
    cdef list           mergedKeys_m
    cdef list           str_merged_cols
    cdef list           merged_sequences
    cdef dict           uniques
    cdef dict           merged_infos
    cdef dict           mkey_infos
    cdef dict           merged_dict
    cdef dict           mkey_cols
    cdef Nuc_Seq_Stored i_seq
    cdef Nuc_Seq_Stored o_seq
    cdef Nuc_Seq_Stored u_seq
    cdef Column         i_col
    cdef Column         i_seq_col
    cdef Column         i_id_col
    cdef Column         i_taxid_col
    cdef Column         i_taxid_dist_col
    cdef Column         o_id_col
    cdef Column         o_taxid_dist_col
    cdef Column         o_merged_col
    cdef Column_line    i_mcol  
    cdef object         taxid_dist_dict
    cdef object         iter_view
    cdef object         mcol
    cdef object         to_merge
    uniques = {}
    mergedKeys_list_b = []
    if mergedKeys_list is not None:
        for key_str in mergedKeys_list:
            mergedKeys_list_b.append(tobytes(key_str))
        mergedKeys_set=set(mergedKeys_list_b)
    else:
        mergedKeys_set=set() 
    if taxonomy is not None:
        mergedKeys_set.add(b"taxid")
    mergedKeys = list(mergedKeys_set)
    k_count = len(mergedKeys)
    mergedKeys_m = []
    for k in range(k_count):
        mergedKeys_m.append(b"merged_" + mergedKeys[k])
    if categories is None:
        categories = []
    # Keep columns that are going to be used a lot in variables 
    i_seq_col = view[NUC_SEQUENCE_COLUMN]
    i_id_col = view[ID_COLUMN]
    if b"taxid" in view:
        i_taxid_col = view[b"taxid"]
    if b"taxid_dist" in view:
        i_taxid_dist_col = view[b"taxid_dist"]
    # First browsing
    i = 0
    o_idx = 0
    logger("info", "First browsing through the input")
    merged_infos = {}
    iter_view = iter(view)
    for i_seq in iter_view :
        pb(i)
        # This can't be done on the same line as the unique_id tuple creation because it triggers a bug
        # where Cython (version 0.25.2) does not detect the reference to the categories variable and deallocates
        # it at the beginning of the function.
        # (Only happens if categories is an optional parameter, which it is.)
        catl = []
        for x in categories :
            catl.append(i_seq[x])    
        unique_id = tuple(catl) + (i_seq_col.get_line_idx(i),)
        #unique_id = tuple(i_seq[x] for x in categories) + (i_seq_col.get_line_idx(i),)  # The line that Cython can't read properly
        if unique_id in uniques:
            uniques[unique_id].append(i)
        else:
            uniques[unique_id] = [i]
        for k in range(k_count):
            key = mergedKeys[k]
            mkey = mergedKeys_m[k]
            if key in i_seq:    # TODO what if mkey already in i_seq?  --> should update
                if mkey not in merged_infos:
                    merged_infos[mkey] = {}
                    mkey_infos = merged_infos[mkey]
                    mkey_infos['nb_elts'] = 1
                    mkey_infos['elt_names'] = [i_seq[key]]
                else:
                    mkey_infos = merged_infos[mkey]
                    if i_seq[key] not in mkey_infos['elt_names']:     # TODO make faster? but how?
                        mkey_infos['elt_names'].append(i_seq[key])
                        mkey_infos['nb_elts'] += 1
        i+=1
    # Create merged columns
    str_merged_cols = []
    mkey_cols = {}
    for k in range(k_count):
        key = mergedKeys[k]
        merged_col_name = mergedKeys_m[k]
        i_col = view[key]
        if merged_infos[merged_col_name]['nb_elts'] > max_elts:
            str_merged_cols.append(merged_col_name)
            Column.new_column(o_view,
                              merged_col_name,
                              OBI_STR,
                              to_eval=True,
                              comments=i_col.comments,
                              alias=merged_col_name     # TODO what if it already exists
                             )
        else:
            Column.new_column(o_view,
                              merged_col_name,
                              OBI_INT,
                              nb_elements_per_line=merged_infos[merged_col_name]['nb_elts'],
                              elements_names=list(merged_infos[merged_col_name]['elt_names']),
                              comments=i_col.comments,
                              alias=merged_col_name     # TODO what if it already exists
                             )
        mkey_cols[merged_col_name] = o_view[merged_col_name]
    # taxid_dist column
    if mergeIds and b"taxid" in mergedKeys:
        if len(view) > max_elts: #The number of different IDs corresponds to the number of sequences in the view
            str_merged_cols.append(b"taxid_dist")
            Column.new_column(o_view, 
                              b"taxid_dist", 
                              OBI_STR,
                              to_eval=True,
                              comments=b"obi uniq taxid dist, stored as character strings to be read as dict",
                              alias=b"taxid_dist"     # TODO what if it already exists
                             )
        else:
            Column.new_column(o_view, 
                              b"taxid_dist", 
                              OBI_INT,
                              nb_elements_per_line=len(view),
                              elements_names=[id for id in i_id_col],
                              comments=b"obi uniq taxid dist",
                              alias=b"taxid_dist"     # TODO what if it already exists
                             )
    del(merged_infos)
    # Merged ids column
    if mergeIds :
        Column.new_column(o_view,
                          b"merged",
                          OBI_STR,
                          tuples=True,
                          comments=b"obi uniq merged ids",
                          alias=b"merged"     # TODO what if it already exists
                         )
    # Keep columns that are going to be used a lot in variables 
    o_id_col = o_view[ID_COLUMN]
    if b"taxid_dist" in o_view:
        o_taxid_dist_col = o_view[b"taxid_dist"]
    if b"merged" in o_view:
        o_merged_col = o_view[b"merged"]
    print("\n")  # TODO because in the middle of progress bar. Better solution?
    logger("info", "Second browsing through the input")
    # Initialize the progress bar
    pb = ProgressBar(len(uniques), seconde=5)
    o_idx = 0
    for unique_id in uniques :
        pb(o_idx)
        merged_sequences = uniques[unique_id]
        u_idx = uniques[unique_id][0]
        u_seq = view[u_idx]
        o_view[o_idx] = u_seq
        o_seq = o_view[o_idx]
        o_id = o_seq.id
        if mergeIds:
            o_merged_col[o_idx] = [view[idx].id for idx in merged_sequences]
        o_seq[COUNT_COLUMN] = 0
        if b"taxid_dist" in u_seq and i_taxid_dist_col[u_idx] is not None:
            taxid_dist_dict = i_taxid_dist_col[u_idx]
        else:
            taxid_dist_dict = {}           
        merged_dict = {}
        for mkey in mergedKeys_m:
            merged_dict[mkey] = {}
        for i_idx in merged_sequences:
            i_id = i_id_col[i_idx]
            i_seq = view[i_idx]
            if COUNT_COLUMN not in i_seq or i_seq[COUNT_COLUMN] is None:
                i_count = 1
            else:
                i_count = i_seq[COUNT_COLUMN]
            o_seq[COUNT_COLUMN] += i_count
            for k in range(k_count):
                key = mergedKeys[k]
                mkey = mergedKeys_m[k]
                if key==b"taxid" and mergeIds:
                    if b"taxid_dist" in i_seq:
                        taxid_dist_dict.update(i_taxid_dist_col[i_idx])
                    if b"taxid" in i_seq:
                        taxid_dist_dict[i_id] = i_taxid_col[i_idx]
                # Case where the merged keys are updated but the incoming sequence has no merged keys
                if key in i_seq:
                    to_merge = i_seq[key]
                    if to_merge is not None:
                        if type(to_merge) != bytes:
                            to_merge = tobytes(str(to_merge))
                        mcol = merged_dict[mkey]
                        if to_merge not in mcol or mcol[to_merge] is None:
                            mcol[to_merge] = i_count
                        else:
                            mcol[to_merge] = mcol[to_merge] + i_count
                        o_seq[key] = None
                # Case where merged_keys already exists
                else:   # TODO is this a good else
                    if mkey in i_seq:
                        mcol = merged_dict[mkey]
                        i_mcol = i_seq[mkey]
                        if i_mcol is not None:
                            for key2 in i_mcol:
                                if key2 not in mcol or mcol[key2] is None:
                                    mcol[key2] = i_mcol[key2]
                                else:
                                    mcol[key2] = mcol[key2] + i_mcol[key2]
            # Write taxid_dist
            if mergeIds and b"taxid" in mergedKeys:
                if b"taxid_dist" in str_merged_cols:
                    o_taxid_dist_col[o_idx] = str(taxid_dist_dict)
                else:
                    o_taxid_dist_col[o_idx] = taxid_dist_dict
            # Write merged dicts
            for mkey in merged_dict: 
                if mkey in str_merged_cols:
                    mkey_cols[mkey][o_idx] = str(merged_dict[mkey])
                else:
                    mkey_cols[mkey][o_idx] = merged_dict[mkey]
                    # Sets NA values to 0  # TODO discuss, maybe keep as None and test for None instead of testing for 0 in tools
                    #for key in mkey_cols[mkey][o_idx]:
                    #    if mkey_cols[mkey][o_idx][key] is None:
                    #        mkey_cols[mkey][o_idx][key] = 0
            for key in i_seq.keys():
                # Delete information that differs between the merged sequences
                # TODO make special columns list?
                if key != COUNT_COLUMN and key != ID_COLUMN and key != NUC_SEQUENCE_COLUMN and key in o_seq and o_seq[key] != i_seq[key] :
                    o_seq[key] = None
        o_idx += 1
    # Deletes quality column if there is one because the matching between sequence and quality will be broken (quality set to NA when sequence not)
    if QUALITY_COLUMN in view:
        o_view.delete_column(QUALITY_COLUMN)
    if taxonomy is not None:
        print("\n")  # TODO because in the middle of progress bar. Better solution?
        logger("info", "Merging taxonomy classification")
        merge_taxonomy_classification(o_view, taxonomy)
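The two passes of `uniq_sequences` can be summarised with a plain-Python sketch over a list of dicts. This is an illustration under assumptions, not the DMS-backed code: the `dereplicate` function and the `"seq"` and `"count"` field names are hypothetical, and the `taxid_dist` handling, the string-encoded columns and the blanking of attributes that differ between merged sequences are left out.

```python
from collections import defaultdict


def dereplicate(records, merge_keys=(), categories=()):
    """Group identical sequences and sum their counts per merged attribute."""
    # First pass: group record indices by (category values..., sequence).
    groups = {}
    for idx, rec in enumerate(records):
        uid = tuple(rec.get(c) for c in categories) + (rec["seq"],)
        groups.setdefault(uid, []).append(idx)

    # Second pass: one output record per group, built from its first member.
    out = []
    for members in groups.values():
        first = dict(records[members[0]])
        first["count"] = 0
        merged = {k: defaultdict(int) for k in merge_keys}
        for idx in members:
            rec = records[idx]
            n = rec.get("count")
            if n is None:          # a missing count is treated as 1
                n = 1
            first["count"] += n
            for k in merge_keys:
                value = rec.get(k)
                if value is not None:
                    merged[k][value] += n
        for k in merge_keys:
            first[k] = None        # the original attribute is blanked
            first["merged_" + k] = dict(merged[k])
        out.append(first)
    return out


records = [
    {"id": "a", "seq": "ACGT", "sample": "s1", "count": 2},
    {"id": "b", "seq": "ACGT", "sample": "s2"},
    {"id": "c", "seq": "TTTT", "sample": "s1"},
]
result = dereplicate(records, merge_keys=("sample",))
```

Keeping only indices in the first pass mirrors the `uniques` dict above: the grouping stays small in memory, and the per-attribute counts are accumulated only once the groups are known.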
 def run(config):
    cdef tuple         input
    cdef tuple         output 
    cdef tuple         taxo_uri
    cdef Taxonomy      taxo
    cdef View_NUC_SEQS entries
    cdef View_NUC_SEQS o_view
    cdef ProgressBar   pb
    DMS.obi_atexit()
    logger("info","obi uniq")
    # Open the input
    input = open_uri(config['obi']['inputURI'])
    if input is None:
        raise Exception("Could not read input view")    
    if input[2] != View_NUC_SEQS:
        raise NotImplementedError('obi uniq only works on NUC_SEQS views')
    # Open the output
    output = open_uri(config['obi']['outputURI'],
                      input=False,
                      newviewtype=View_NUC_SEQS)
    if output is None:
        raise Exception("Could not create output view")
    entries = input[1]
    o_view = output[1]
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        taxo_uri = open_uri(config['obi']['taxoURI'])
        if taxo_uri is None:
            raise RollbackException("Couldn't open taxonomy, rolling back view", o_view)
        taxo = taxo_uri[1]
    else :
        taxo = None
    # Initialize the progress bar
    pb = ProgressBar(len(entries), config, seconde=5)
    try:
        uniq_sequences(entries, o_view, pb, mergedKeys_list=config['uniq']['merge'], taxonomy=taxo, mergeIds=config['uniq']['mergeids'], categories=config['uniq']['categories'], max_elts=config['obi']['maxelts'])       
    except Exception as e:
        raise RollbackException("obi uniq error, rolling back view: "+str(e), o_view)
    # Save command config in View and DMS comments
    command_line = " ".join(sys.argv[1:])
    input_dms_name=[input[0].name]
    input_view_name=[input[1].name]
    if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
        input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
        input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
    o_view.write_config(config, "uniq", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
    output[0].record_command_line(command_line)
    print("\n")
    print(repr(o_view))
    input[0].close()
    output[0].close()
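The options are declared with destinations such as `uniq:merge`, and `run` reads them back as `config['uniq']['merge']`. The sketch below shows how repeated `action="append"` flags accumulate and how a colon-separated `dest` could be folded into a nested dict. The `to_config` helper is an assumption made for illustration; the real mapping is done by the obitools3 configuration code, which is not shown here.

```python
import argparse


def build_parser():
    """Declare a subset of the uniq options with the same dest names."""
    parser = argparse.ArgumentParser(prog="uniq-sketch")
    group = parser.add_argument_group("obi uniq specific options")
    group.add_argument("--merge", "-m", action="append", dest="uniq:merge",
                       metavar="<TAG NAME>", default=[], type=str)
    group.add_argument("--merge-ids", "-e", action="store_true",
                       dest="uniq:mergeids", default=False)
    group.add_argument("--category-attribute", "-c", action="append",
                       dest="uniq:categories", metavar="<Attribute Name>",
                       default=[])
    return parser


def to_config(namespace):
    """Hypothetical helper: split 'section:key' dests into a nested dict."""
    config = {}
    for dest, value in vars(namespace).items():
        section, _, key = dest.partition(":")
        config.setdefault(section, {})[key] = value
    return config


args = build_parser().parse_args(["-m", "sample", "-m", "run", "-c", "experiment"])
config = to_config(args)
```

Because a `dest` containing a colon is not a valid Python identifier, the values are reached through `vars(namespace)` or `getattr` rather than attribute access.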
--- a/python/obitools3/dms/__init__.py
+++ b/python/obitools3/dms/__init__.py
@ -1,2 +0,0 @@
 from .dms import DMS  # @UnresolvedImport
--- a/python/obitools3/dms/capi/__init__.py
+++ b/python/obitools3/dms/capi/__init__.py
@ -4,7 +4,6 @@ OBITypes
 .. image:: ./UML/OBITypes_UML.png
 :download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
@ -1,3 +0,0 @@
 #cython: language_level=3
 cpdef buildArgumentParser(str configname, str softname)
@ -1,3 +0,0 @@
 #cython: language_level=3
 cdef object loadCommand(str name,loader)
@ -1,3 +0,0 @@
 #cython: language_level=3
 cpdef getLogger(dict config)
@ -1,4 +0,0 @@
 #cython: language_level=3
 cdef object buildAlignment(object direct, object reverse)