<!DOCTYPE html>
< html >
< head >
< title > Biodiversity metrics and metabarcoding< / title >
< meta charset = "utf-8" >
< meta http-equiv = "X-UA-Compatible" content = "chrome=1" >
< meta name = "generator" content = "pandoc" / >
< meta name = "viewport" content = "width=device-width, initial-scale=1" >
< meta name = "apple-mobile-web-app-capable" content = "yes" >
< base target = "_blank" >
< script type = "text/javascript" >
var SLIDE_CONFIG = {
// Slide settings
settings: {
title: 'Biodiversity metrics and metabarcoding',
useBuilds: true,
usePrettify: true,
enableSlideAreas: true,
enableTouch: true,
},
// Author information
presenters: [
{
name: 'Eric Coissac' ,
company: '',
gplus: '',
twitter: '',
www: '',
github: ''
},
]
};
< / script >
< link href = "index_files/ioslides-13.5.1/fonts/fonts.css" rel = "stylesheet" / >
< link href = "index_files/ioslides-13.5.1/theme/css/default.css" rel = "stylesheet" / >
< link href = "index_files/ioslides-13.5.1/theme/css/phone.css" rel = "stylesheet" / >
< script src = "index_files/ioslides-13.5.1/js/modernizr.custom.45394.js" > < / script >
< script src = "index_files/ioslides-13.5.1/js/prettify/prettify.js" > < / script >
< script src = "index_files/ioslides-13.5.1/js/prettify/lang-r.js" > < / script >
< script src = "index_files/ioslides-13.5.1/js/prettify/lang-yaml.js" > < / script >
< script src = "index_files/ioslides-13.5.1/js/hammer.js" > < / script >
< script src = "index_files/ioslides-13.5.1/js/slide-controller.js" > < / script >
< script src = "index_files/ioslides-13.5.1/js/slide-deck.js" > < / script >
< script src = "index_files/kePrint-0.0.1/kePrint.js" > < / script >
< style type = "text/css" >
b, strong {
font-weight: bold;
}
em {
font-style: italic;
}
summary {
display: list-item;
}
slides > slide {
-webkit-transition: all 0.4s ease-in-out;
-moz-transition: all 0.4s ease-in-out;
-o-transition: all 0.4s ease-in-out;
transition: all 0.4s ease-in-out;
}
.auto-fadein {
-webkit-transition: opacity 0.6s ease-in;
-webkit-transition-delay: 0.4s;
-moz-transition: opacity 0.6s ease-in 0.4s;
-o-transition: opacity 0.6s ease-in 0.4s;
transition: opacity 0.6s ease-in 0.4s;
opacity: 0;
}
/* https://github.com/ropensci/plotly/pull/524#issuecomment-468142578 */
slide:not(.current) .plotly.html-widget{
display: block;
}
< / style >
< link rel = "stylesheet" href = "slides.css" type = "text/css" / >
< / head >
< body style = "opacity: 0" >
< slides class = "layout-widescreen" >
< slide class = "title-slide segue nobackground" >
<!-- The content of this hgroup is replaced programmatically through the slide_config.json. -->
< hgroup class = "auto-fadein" >
< h1 data-config-title > <!-- populated from slide_config.json --> < / h1 >
< h2 data-config-subtitle > <!-- populated from slide_config.json --> < / h2 >
< p data-config-presenter > <!-- populated from slide_config.json --> < / p >
< p style = "margin-top: 6px; margin-left: -2px;" > 28/01/2019< / p >
< / hgroup >
< / slide >
< slide class = "segue dark nobackground level1" > < hgroup class = 'auto-fadein' > < h2 > Summary< / h2 > < / hgroup > < article id = "summary" class = "smaller " >
< ul >
< li > The MetabarSchool Package< / li >
< li > What do the read counts per PCR mean?< / li >
< li > Rarefaction vs. relative frequencies< / li >
< li > Alpha diversity metrics< / li >
< li > Beta diversity metrics< / li >
< li > Multidimensional analysis< / li >
< li > Comparison between datasets< / li >
< / ul >
< / article > < / slide > < slide class = "segue dark nobackground level1" > < hgroup class = 'auto-fadein' > < h2 > The MetabarSchool Package< / h2 > < / hgroup > < article id = "the-metabarschool-package" class = "smaller " >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Installing the package< / h2 > < / hgroup > < article id = "instaling-the-package" class = "smaller " >
< p > You need the < em > devtools< / em > package< / p >
< pre class = 'prettyprint lang-r' > install.packages("devtools", dependencies = TRUE)< / pre >
< p > Then you can install < em > MetabarSchool< / em > < / p >
< pre class = 'prettyprint lang-r' > devtools::install_git("https://git.metabarcoding.org/MetabarcodingSchool/biodiversity-metrics.git")< / pre >
< p > You will also need the < em > vegan< / em > package< / p >
< pre class = 'prettyprint lang-r' > install.packages("vegan", dependencies = TRUE)< / pre >
< / article > < / slide > < slide class = "segue dark nobackground level1" > < hgroup class = 'auto-fadein' > < h2 > The dataset< / h2 > < / hgroup > < article id = "the-dataset" class = "smaller " >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > The mock community< / h2 > < / hgroup > < article id = "the-mock-community" class = "smaller flexbox vcenter smaller" >
< p > A mock community of 16 plant species< / p >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:right;" >
< / th >
< th style = "text-align:left;" >
species
< / th >
< th style = "text-align:right;" >
taxid
< / th >
< th style = "text-align:right;" >
Relative abundance
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:right;" >
1
< / td >
< td style = "text-align:left;" >
Taxus baccata
< / td >
< td style = "text-align:right;" >
25629
< / td >
< td style = "text-align:right;" >
1/2
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
2
< / td >
< td style = "text-align:left;" >
Salvia pratensis
< / td >
< td style = "text-align:right;" >
49216
< / td >
< td style = "text-align:right;" >
1/4
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
3
< / td >
< td style = "text-align:left;" >
Populus tremula
< / td >
< td style = "text-align:right;" >
113636
< / td >
< td style = "text-align:right;" >
1/8
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
4
< / td >
< td style = "text-align:left;" >
Rumex acetosa
< / td >
< td style = "text-align:right;" >
41241
< / td >
< td style = "text-align:right;" >
1/16
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
5
< / td >
< td style = "text-align:left;" >
Carpinus betulus
< / td >
< td style = "text-align:right;" >
12990
< / td >
< td style = "text-align:right;" >
1/32
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
6
< / td >
< td style = "text-align:left;" >
Fraxinus excelsior
< / td >
< td style = "text-align:right;" >
38873
< / td >
< td style = "text-align:right;" >
1/64
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
7
< / td >
< td style = "text-align:left;" >
Picea abies
< / td >
< td style = "text-align:right;" >
3329
< / td >
< td style = "text-align:right;" >
1/128
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
8
< / td >
< td style = "text-align:left;" >
Lonicera xylosteum
< / td >
< td style = "text-align:right;" >
439142
< / td >
< td style = "text-align:right;" >
1/256
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
9
< / td >
< td style = "text-align:left;" >
Abies alba
< / td >
< td style = "text-align:right;" >
45372
< / td >
< td style = "text-align:right;" >
1/512
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
10
< / td >
< td style = "text-align:left;" >
Acer campestre
< / td >
< td style = "text-align:right;" >
66205
< / td >
< td style = "text-align:right;" >
1/1024
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
11
< / td >
< td style = "text-align:left;" >
Briza media
< / td >
< td style = "text-align:right;" >
281077
< / td >
< td style = "text-align:right;" >
1/2048
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
12
< / td >
< td style = "text-align:left;" >
Rosa canina
< / td >
< td style = "text-align:right;" >
74635
< / td >
< td style = "text-align:right;" >
1/4096
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
13
< / td >
< td style = "text-align:left;" >
Capsella bursa-pastoris
< / td >
< td style = "text-align:right;" >
3719
< / td >
< td style = "text-align:right;" >
1/8192
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
14
< / td >
< td style = "text-align:left;" >
Geranium robertianum
< / td >
< td style = "text-align:right;" >
122183
< / td >
< td style = "text-align:right;" >
1/16384
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
15
< / td >
< td style = "text-align:left;" >
Rhododendron ferrugineum
< / td >
< td style = "text-align:right;" >
49622
< / td >
< td style = "text-align:right;" >
1/32768
< / td >
< / tr >
< tr >
< td style = "text-align:right;" >
16
< / td >
< td style = "text-align:left;" >
Lotus corniculatus
< / td >
< td style = "text-align:right;" >
47247
< / td >
< td style = "text-align:right;" >
1/65536
< / td >
< / tr >
< / tbody >
< / table >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > The experiment< / h2 > < / hgroup > < article id = "the-experiment" class = "smaller flexbox vcenter" >
< ul >
< li > < p > 192 PCRs of the mock community using SPER02 trnL-P6-Loop primers< / p >
< ul >
< li > < p > 6 dilutions of the mock community: 1/1, 1/2, 1/4, 1/8, 1/16, 1/32< / p > < / li >
< li > < p > 32 repeats per dilution< / p > < / li >
< / ul > < / li >
< / ul >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Loading data< / h2 > < / hgroup > < article id = "loading-data" class = "smaller " >
< pre class = 'prettyprint lang-r' > library(MetabarSchool)
data("positive.count")
data("positive.samples")
data("positive.motus")< / pre >
< ul >
< li > < code > positive.count< / code > : a read count matrix of \(192 \; PCRs \; \times \; 24330 \; MOTUs\)< / li >
< / ul >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;-webkit-transform: rotate(-45deg); -moz-transform: rotate(-45deg); -ms-transform: rotate(-45deg); -o-transform: rotate(-45deg); transform: rotate(-45deg);" >
< / th >
< th style = "text-align:right;-webkit-transform: rotate(-45deg); -moz-transform: rotate(-45deg); -ms-transform: rotate(-45deg); -o-transform: rotate(-45deg); transform: rotate(-45deg);" >
P000001
< / th >
< th style = "text-align:center;-webkit-transform: rotate(-45deg); -moz-transform: rotate(-45deg); -ms-transform: rotate(-45deg); -o-transform: rotate(-45deg); transform: rotate(-45deg);" >
P000002
< / th >
< th style = "text-align:right;-webkit-transform: rotate(-45deg); -moz-transform: rotate(-45deg); -ms-transform: rotate(-45deg); -o-transform: rotate(-45deg); transform: rotate(-45deg);" >
P000003
< / th >
< th style = "text-align:center;-webkit-transform: rotate(-45deg); -moz-transform: rotate(-45deg); -ms-transform: rotate(-45deg); -o-transform: rotate(-45deg); transform: rotate(-45deg);" >
P000004
< / th >
< th style = "text-align:right;-webkit-transform: rotate(-45deg); -moz-transform: rotate(-45deg); -ms-transform: rotate(-45deg); -o-transform: rotate(-45deg); transform: rotate(-45deg);" >
P000005
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_a_A1
< / td >
< td style = "text-align:right;" >
1167
< / td >
< td style = "text-align:center;" >
4477
< / td >
< td style = "text-align:right;" >
779
< / td >
< td style = "text-align:center;" >
0
< / td >
< td style = "text-align:right;" >
12
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_a_B1
< / td >
< td style = "text-align:right;" >
1072
< / td >
< td style = "text-align:center;" >
5077
< / td >
< td style = "text-align:right;" >
985
< / td >
< td style = "text-align:center;" >
2
< / td >
< td style = "text-align:right;" >
8
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_b_A2
< / td >
< td style = "text-align:right;" >
919
< / td >
< td style = "text-align:center;" >
3599
< / td >
< td style = "text-align:right;" >
601
< / td >
< td style = "text-align:center;" >
0
< / td >
< td style = "text-align:right;" >
10
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_b_B2
< / td >
< td style = "text-align:right;" >
704
< / td >
< td style = "text-align:center;" >
4129
< / td >
< td style = "text-align:right;" >
835
< / td >
< td style = "text-align:center;" >
2
< / td >
< td style = "text-align:right;" >
15
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_2_a_A1
< / td >
< td style = "text-align:right;" >
1155
< / td >
< td style = "text-align:center;" >
5341
< / td >
< td style = "text-align:right;" >
1023
< / td >
< td style = "text-align:center;" >
2
< / td >
< td style = "text-align:right;" >
6
< / td >
< / tr >
< / tbody >
< / table >
< p > < br > < / p >
< pre class = 'prettyprint lang-r' > positive.count[1:5,1:5]< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Loading data< / h2 > < / hgroup > < article id = "loading-data-1" class = "smaller " >
< pre class = 'prettyprint lang-r' > library(MetabarSchool)
data("positive.count")
data("positive.samples")
data("positive.motus")< / pre >
< ul >
< li > < code > positive.samples< / code > : a < code > data.frame< / code > of 192 rows and 2 columns describing each PCR< / li >
< / ul >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
dilution
< / th >
< th style = "text-align:center;" >
repeats
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_a_A1
< / td >
< td style = "text-align:right;" >
2
< / td >
< td style = "text-align:center;" >
1.a.A1
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_a_B1
< / td >
< td style = "text-align:right;" >
2
< / td >
< td style = "text-align:center;" >
1.a.B1
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
sample.TM_POS_d16_1_b_A2
< / td >
< td style = "text-align:right;" >
2
< / td >
< td style = "text-align:center;" >
1.b.A2
< / td >
< / tr >
< / tbody >
< / table >
< p > < br > < / p >
< pre class = 'prettyprint lang-r' > head(positive.samples,n=3)< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Loading data< / h2 > < / hgroup > < article id = "loading-data-2" class = "smaller " >
< pre class = 'prettyprint lang-r' > library(MetabarSchool)
data("positive.count")
data("positive.samples")
data("positive.motus")< / pre >
< ul >
< li > < code > positive.motus< / code > : a < code > data.frame< / code > of 24330 rows and 4 columns describing each MOTU< / li >
< / ul >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
dilution
< / th >
< th style = "text-align:left;" >
species
< / th >
< th style = "text-align:right;" >
taxid
< / th >
< th style = "text-align:center;" >
true
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
P000001
< / td >
< td style = "text-align:right;" >
0.250
< / td >
< td style = "text-align:left;" >
Salvia pratensis
< / td >
< td style = "text-align:right;" >
49216
< / td >
< td style = "text-align:center;" >
TRUE
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
P000002
< / td >
< td style = "text-align:right;" >
0.125
< / td >
< td style = "text-align:left;" >
Populus tremula
< / td >
< td style = "text-align:right;" >
113636
< / td >
< td style = "text-align:center;" >
TRUE
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
P000003
< / td >
< td style = "text-align:right;" >
0.500
< / td >
< td style = "text-align:left;" >
Taxus baccata
< / td >
< td style = "text-align:right;" >
25629
< / td >
< td style = "text-align:center;" >
TRUE
< / td >
< / tr >
< / tbody >
< / table >
< p > < br > < / p >
< pre class = 'prettyprint lang-r' > head(positive.motus,n=3)< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Removing singleton sequences< / h2 > < / hgroup > < article id = "removing-singleton-sequences" class = "smaller flexbox vcenter" >
< p > Singleton sequences are observed only once over the complete dataset.< / p >
< pre class = 'prettyprint lang-r' > table(colSums(positive.count) == 1)< / pre >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:right;text-align: center;" >
FALSE
< / th >
< th style = "text-align:right;text-align: center;" >
TRUE
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:right;" >
5579
< / td >
< td style = "text-align:right;" >
18751
< / td >
< / tr >
< / tbody >
< / table >
< p > < br > < / p >
< p > We discard them because they are unanimously considered to be rubbish.< / p >
< pre class = 'prettyprint lang-r' > are.not.singleton = colSums(positive.count) > 1
positive.count = positive.count[,are.not.singleton]
positive.motus = positive.motus[are.not.singleton,]< / pre >
< ul >
< li > < code > positive.count< / code > is now a \(192 \; PCRs \; \times \; 5579 \; MOTUs\) matrix< / li >
< / ul >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Not all PCRs have the same number of reads< / h2 > < / hgroup > < article id = "not-all-the-pcr-have-the-same-number-of-reads" class = "smaller flexbox vcenter" >
< p > Despite all standardization efforts< / p >
< p > < img src = "index_files/figure-html/unnamed-chunk-18-1.png" width = "720" / > < / p >
< div class = "green" >
< p > Is it related to the amount of DNA in the extract?< / p > < / div >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > What do the read counts per PCR mean?< / h2 > < / hgroup > < article id = "what-do-the-reading-numbers-per-pcr-mean" class = "smaller smaller" >
< pre class = 'prettyprint lang-r' > par(bg=NA)
boxplot(rowSums(positive.count) ~ positive.samples$dilution,log="y")
abline(h = median(rowSums(positive.count)),lwd=2,col="red",lty=2)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-19-1.png" width = "720" / > < / p >
< div class = "red2" >
< center >
Only 7.4% of the PCR read count variation is explained by dilution
< / center > < / div >
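< p > The slide does not state how the 7.4% figure was computed. One plausible way, shown here as a sketch on simulated data, is the \(R^2\) of a linear model of the log read count against dilution treated as a factor. The names < code > reads< / code > and < code > dilution< / code > and the simulated values are assumptions made for illustration.< / p >
< pre class = 'prettyprint lang-r' > set.seed(1)
# Simulated stand-in for the real data (hypothetical values)
dilution = factor(rep(c(1, 2, 4, 8, 16, 32), each = 32))
reads    = round(exp(rnorm(192, mean = 9, sd = 0.5)))

# Fraction of the variance of log(read count) explained by dilution
fit = lm(log(reads) ~ dilution)
r2  = summary(fit)$r.squared
r2< / pre >
< p > With the real data, < code > reads< / code > would be < code > rowSums(positive.count)< / code > and < code > dilution< / code > would be < code > factor(positive.samples$dilution)< / code > .< / p >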
< / article > < / slide > < slide class = "" > < hgroup > < h2 > You must normalize your read counts< / h2 > < / hgroup > < article id = "you-must-normalize-your-read-counts" class = "smaller " >
< p > Two options:< / p >
< h3 > Rarefaction< / h3 >
< p > Randomly subsample the same number of reads for all the PCRs< / p >
< h3 > Relative frequencies< / h3 >
< p > Divide the read count of each MOTU in each sample by the total read count of the same sample< / p >
< p > \[
\text{Relative frequency}(Motu_i,Sample_j) = \frac{\text{Read count}(Motu_i,Sample_j)}{\sum_{k=1}^n\text{Read count}(Motu_k,Sample_j)}
\]< / p >
< pre class = 'prettyprint lang-r' > library(vegan)< / pre >
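< p > As a minimal sketch of the relative frequency formula, the normalization can be written in base R on a toy matrix. The matrix < code > counts< / code > and its values are hypothetical.< / p >
< pre class = 'prettyprint lang-r' > # Toy read count matrix: 2 PCRs (rows) x 3 MOTUs (columns), hypothetical values
counts = matrix(c(10, 30, 60,
                   5,  5, 10),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("pcr1", "pcr2"), c("m1", "m2", "m3")))

# Divide each row by its total read count
relfreq = sweep(counts, 1, rowSums(counts), "/")
relfreq
rowSums(relfreq)   # every PCR now sums to 1< / pre >
< p > The < em > vegan< / em > call < code > decostand(counts, method = "total")< / code > should give the same result, since it divides each row by its total by default.< / p >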
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Rarefying read count (1)< / h2 > < / hgroup > < article id = "rarefying-read-count-1" class = "smaller flexbox vcenter" >
< ul >
< li > We look for the minimum read count per PCR, then rarefy every PCR to a depth just below it (2000 reads)< / li >
< / ul >
< pre class = 'prettyprint lang-r' > min(rowSums(positive.count))< / pre >
< pre > ## [1] 2065< / pre >
< pre class = 'prettyprint lang-r' > positive.count.rarefied = rrarefy(positive.count,2000)< / pre >
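< p > Rarefying a PCR amounts to keeping a random subsample of its reads, drawn without replacement. The base R function below is a sketch of that idea for a single PCR, not the < em > vegan< / em > implementation; < code > rarefy_one< / code > and the read counts are hypothetical.< / p >
< pre class = 'prettyprint lang-r' > # Rarefy one vector of read counts to n reads, drawing without replacement
rarefy_one = function(counts, n) {
  reads = rep(seq_along(counts), times = counts)  # one entry per read, labelled by MOTU index
  kept  = sample(reads, size = n)                 # keep n reads at random
  tabulate(kept, nbins = length(counts))          # count the kept reads per MOTU
}

set.seed(42)
pcr = c(500, 300, 150, 45, 4, 1)   # hypothetical read counts for 6 MOTUs
rarefied = rarefy_one(pcr, 100)
rarefied
sum(rarefied)                      # 100 by construction< / pre >
< p > Rare MOTUs, such as the last two here, are the ones most likely to drop to zero reads after subsampling.< / p >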
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Rarefying read count (2)< / h2 > < / hgroup > < article id = "rarefying-read-count-2" class = "smaller flexbox vcenter" >
< p > < img src = "index_files/figure-html/unnamed-chunk-24-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Rarefying read count (3)< / h2 > < / hgroup > < article id = "rarefying-read-count-3" class = "smaller flexbox vcenter" >
< p > Identifying the MOTUs with a read count greater than \(0\) after rarefaction.< / p >
< pre class = 'prettyprint lang-r' > are.still.present = colSums(positive.count.rarefied)> 0
are.still.present[1:5]< / pre >
< pre > ## P000001 P000002 P000003 P000004 P000005
## TRUE TRUE TRUE TRUE TRUE< / pre >
< pre class = 'prettyprint lang-r' > table(are.still.present)< / pre >
< pre > ## are.still.present
## FALSE TRUE
## 1886 3693< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Rarefying read count (4)< / h2 > < / hgroup > < article id = "rarefying-read-count-4" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > par(bg=NA)
boxplot(colSums(positive.count) ~ are.still.present, log="y")< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-27-1.png" width = "720" / > < / p >
< p > The MOTUs removed by rarefaction had at most 13 reads in the unrarefied dataset< / p >
< p > The MOTUs kept by rarefaction had at least 2 reads in the unrarefied dataset< / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Rarefying read count (5)< / h2 > < / hgroup > < article id = "rarefying-read-count-5" class = "smaller vcenter" >
< h3 > Keep only sequences with reads after rarefaction< / h3 >
< pre class = 'prettyprint lang-r' > positive.count.rarefied = positive.count.rarefied[,are.still.present]
positive.motus.rare = positive.motus[are.still.present,]< / pre >
< center >
positive.count.rarefied is now a \(192 \; PCRs \; \times \; 3693 \; MOTUs\) matrix
< / center >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Why rarefy?< / h2 > < / hgroup > < article id = "why-rarefying" class = "smaller vcenter columns-2" >
< p > < img src = "figures/subsampling.svg" width = "200px" / > < / p >
< p > < br > < br > < br > < br > Increasing the number of reads only improves the description of the subsample of the PCR that you have sequenced.< / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Transforming read counts to relative frequencies< / h2 > < / hgroup > < article id = "transforming-read-counts-to-relative-frequencies" class = "smaller " >
< pre class = 'prettyprint lang-r' > positive.count.relfreq = decostand(positive.count,
method = "total")< / pre >
< p > No sequences will be set to zero< / p >
< pre class = 'prettyprint lang-r' > table(colSums(positive.count.relfreq) == 0)< / pre >
< pre > ##
## FALSE
## 5579< / pre >
< / article > < / slide > < slide class = "segue dark nobackground level1" > < hgroup class = 'auto-fadein' > < h2 > Measuring diversity< / h2 > < / hgroup > < article id = "measuring-diversity" class = "smaller " >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > The different types of diversity< / h2 > < / hgroup > < article id = "the-different-types-of-diversity" class = "smaller vcenter" >
< div style = "float: left; width: 40%;" >
< p > < img src = 'figures/diversity.svg' title = '' / > <!-- --> < / p > < / div >
< div style = "float: left; width: 60%;" >
< p > < br > < br > < span class = "cite" > Whittaker (2010)< / span > < br > < br > < br > < br > < / p >
< ul >
< li > < p > \(\alpha\text{-diversity}\) : Mean diversity per site (\(species/site\))< / p > < / li >
< li > < p > \(\gamma\text{-diversity}\) : Regional biodiversity (\(species/region\))< / p > < / li >
< li > < p > \(\beta\text{-diversity}\) : \(\beta = \frac{\gamma}{\alpha}\) (\(sites/region\))< / p > < / li >
< / ul > < / div >
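< p > As a small worked example of these three definitions, using species richness as the diversity measure, the sketch below computes them for a hypothetical presence/absence matrix. The matrix < code > sites< / code > and its values are invented for illustration.< / p >
< pre class = 'prettyprint lang-r' > # Hypothetical presence/absence matrix: 3 sites (rows) x 5 species (columns)
sites = matrix(c(1, 1, 0, 0, 0,
                 1, 0, 1, 0, 0,
                 0, 1, 1, 1, 0),
               nrow = 3, byrow = TRUE)

alpha = mean(rowSums(sites > 0))   # mean number of species per site
gamma = sum(colSums(sites) > 0)    # number of species in the whole region
beta  = gamma / alpha              # multiplicative beta diversity
c(alpha = alpha, gamma = gamma, beta = beta)< / pre >
< p > Here \(\alpha = 7/3\), \(\gamma = 4\) and \(\beta = 12/7 \approx 1.71\): the region holds about 1.7 times the species of an average site.< / p >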
< / article > < / slide > < slide class = "segue dark nobackground level1" > < hgroup class = 'auto-fadein' > < h2 > \(\alpha\)-diversity< / h2 > < / hgroup > < article id = "alpha-diversity" class = "smaller " >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Which is the most diverse environment?< / h2 > < / hgroup > < article id = "which-is-th-most-diverse-environment" class = "smaller flexbox vcenter" >
< p > < img src = "figures/alpha_diversity.svg" width = "400px" / > < / p >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
A
< / th >
< th style = "text-align:right;" >
B
< / th >
< th style = "text-align:right;" >
C
< / th >
< th style = "text-align:right;" >
D
< / th >
< th style = "text-align:right;" >
E
< / th >
< th style = "text-align:right;" >
F
< / th >
< th style = "text-align:right;" >
G
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
Environment.1
< / td >
< td style = "text-align:right;" >
0.25
< / td >
< td style = "text-align:right;" >
0.25
< / td >
< td style = "text-align:right;" >
0.25
< / td >
< td style = "text-align:right;" >
0.25
< / td >
< td style = "text-align:right;" >
0.00
< / td >
< td style = "text-align:right;" >
0.00
< / td >
< td style = "text-align:right;" >
0.00
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
Environment.2
< / td >
< td style = "text-align:right;" >
0.55
< / td >
< td style = "text-align:right;" >
0.07
< / td >
< td style = "text-align:right;" >
0.02
< / td >
< td style = "text-align:right;" >
0.17
< / td >
< td style = "text-align:right;" >
0.07
< / td >
< td style = "text-align:right;" >
0.07
< / td >
< td style = "text-align:right;" >
0.03
< / td >
< / tr >
< / tbody >
< / table >
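< p > The R snippets in this part of the deck use a matrix called < code > environments< / code > whose construction is not shown. A sketch of how the table above could be entered, assuming the displayed values are the exact ones:< / p >
< pre class = 'prettyprint lang-r' > # Relative frequencies of species A to G, copied from the table
environments = matrix(c(0.25, 0.25, 0.25, 0.25, 0.00, 0.00, 0.00,
                        0.55, 0.07, 0.02, 0.17, 0.07, 0.07, 0.03),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("Environment.1", "Environment.2"),
                                      LETTERS[1:7]))
environments< / pre >
< p > Note that the second row, as displayed, sums to 0.98 rather than 1.< / p >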
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Richness< / h2 > < / hgroup > < article id = "richness" class = "smaller flexbox vcenter" >
< p > The actual number of species present in your environment, whatever their abundances< / p >
< p > < img src = "figures/alpha_diversity.svg" width = "400px" / > < / p >
< pre class = 'prettyprint lang-r' > S = rowSums(environments > 0)< / pre >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
S
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
Environment.1
< / td >
< td style = "text-align:right;" >
4
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
Environment.2
< / td >
< td style = "text-align:right;" >
7
< / td >
< / tr >
< / tbody >
< / table >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Gini-Simpson’s index< / h2 > < / hgroup > < article id = "gini-simpsons-index" class = "smaller smaller" >
< div style = "float: left; width: 60%;" >
< p > Simpson’s index is the probability of drawing the same species twice when you randomly select two specimens. < br > < br > < / p > < / div >
< div style = "float: right; width: 40%;" >
< p > \[
\lambda =\sum _{i=1}^{S}p_{i}^{2}
\] < br > < / p > < / div >
< center >
< p > \(\lambda\) decreases when the complexity of your ecosystem increases.< / p >
< p > The Gini-Simpson index, defined as \(1-\lambda\), increases with diversity.< / p >
< p > < img src = "figures/alpha_diversity.svg" width = "250px" / > < / p >
< / center >
< pre class = 'prettyprint lang-r' > GS = 1 - rowSums(environments^2)< / pre >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
Gini.Simpson
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
Environment.1
< / td >
< td style = "text-align:right;" >
0.7500
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
Environment.2
< / td >
< td style = "text-align:right;" >
0.6526
< / td >
< / tr >
< / tbody >
< / table >
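< p > The probabilistic reading of \(\lambda\) can be checked by simulation. The sketch below draws pairs of specimens from Environment.1 and compares the observed fraction of identical pairs with \(\sum p_i^2\). It assumes sampling with replacement, as in a very large community; the variable names are hypothetical.< / p >
< pre class = 'prettyprint lang-r' > set.seed(1)
p = c(A = 0.25, B = 0.25, C = 0.25, D = 0.25)   # Environment.1, species actually present
n = 100000

first  = sample(names(p), n, replace = TRUE, prob = p)
second = sample(names(p), n, replace = TRUE, prob = p)

lambda_sim = mean(first == second)   # simulated probability of drawing the same species twice
lambda     = sum(p^2)                # exact Simpson index
c(simulated = lambda_sim, exact = lambda, gini_simpson = 1 - lambda)< / pre >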
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Shannon entropy< / h2 > < / hgroup > < article id = "shannon-entropy" class = "smaller smaller" >
< p > Shannon entropy is based on information theory:< / p >
< center >
\(H^{\prime }=-\sum _{i=1}^{S}p_{i}\log p_{i}\)
< / center >
< p > If \(A\) is a community in which all species are equally represented, then \[
H(A) = \log|A|
\]< / p >
< center >
< img src = "figures/alpha_diversity.svg" width = "400px" / >
< / center >
< pre class = 'prettyprint lang-r' > H = - rowSums(environments * log(environments),na.rm = TRUE)< / pre >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
Shannon.index
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
Environment.1
< / td >
< td style = "text-align:right;" >
1.386294
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
Environment.2
< / td >
< td style = "text-align:right;" >
1.371925
< / td >
< / tr >
< / tbody >
< / table >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Hill’s number< / h2 > < / hgroup > < article id = "hills-number" class = "smaller smaller" >
< div style = "float: left; width: 50%;" >
< p > As: \[
H(A) = \log|A| \;\Rightarrow\; ^1D = e^{H(A)}
\] < br > < / p > < / div >
< div style = "float: right; width: 50%;" >
< p > where \(^1D\) is the theoretical number of species in an evenly distributed community that would have the same Shannon entropy as ours.< / p > < / div >
< center >
< BR > < BR > < img src = "figures/alpha_diversity.svg" width = "400px" / >
< / center >
< pre class = 'prettyprint lang-r' > D2 = exp(- rowSums(environments * log(environments),na.rm = TRUE))< / pre >
< table class = "table" style = "margin-left: auto; margin-right: auto;" >
< thead >
< tr >
< th style = "text-align:left;" >
< / th >
< th style = "text-align:right;" >
Hill.Numbers
< / th >
< / tr >
< / thead >
< tbody >
< tr >
< td style = "text-align:left;" >
Environment.1
< / td >
< td style = "text-align:right;" >
4.000000
< / td >
< / tr >
< tr >
< td style = "text-align:left;" >
Environment.2
< / td >
< td style = "text-align:right;" >
3.942933
< / td >
< / tr >
< / tbody >
< / table >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Generalized logarithmic function< / h2 > < / hgroup > < article id = "generalized-logaritmic-function" class = "smaller smaller" >
< p > Based on the generalized entropy of < span class = "cite" > Tsallis (1994)< / span > , we can define a generalized form of the logarithm.< / p >
< p > \[
^q\log(x) = \frac{x^{(1-q)}-1}{1-q}
\]< / p >
< p > The function is not defined for \(q=1\), but when \(q \longrightarrow 1\;,\; ^q\log(x) \longrightarrow \log(x)\)< / p >
< p > \[
^q\log(x) = \left\{
\begin{align}
\log(x),& \text{if } q = 1\\
\frac{x^{(1-q)}-1}{1-q},& \text{otherwise}
\end{align}
\right.
\]< / p >
< pre class = 'prettyprint lang-r' > log_q = function(x,q=1) {
if (q==1)
log(x)
else
(x^(1-q)-1)/(1-q)
}< / pre >
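< p > The convergence towards the natural logarithm can be checked numerically. This is only a sketch: the test values in < code > x< / code > and the chosen values of \(q\) are arbitrary.< / p >
< pre class = 'prettyprint lang-r' > # Generalized logarithm, as defined on this slide
log_q = function(x, q = 1) {
  if (q == 1)
    log(x)
  else
    (x^(1-q) - 1) / (1-q)
}
x = c(0.5, 2, 10)                          # arbitrary test values
# One column per value of q, each closer to 1 than the previous one
approx = sapply(c(0.9, 0.99, 0.999), function(Q) log_q(x, Q))
# Absolute difference with log(x); log(x) is recycled down each column
err = abs(approx - log(x))
print(err)< / pre >
< p > The error in each row should shrink as \(q\) gets closer to 1.< / p >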
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Impact of \(q\) on the < code > log_q< / code > function< / h2 > < / hgroup > < article id = "impact-of-q-on-the-log_q-function" class = "smaller " >
< p > < img src = "index_files/figure-html/unnamed-chunk-48-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > And its inverse function< / h2 > < / hgroup > < article id = "and-its-inverse-function" class = "smaller flexbox vcenter" >
< p > \[
^qe^x = \left\{
\begin{align}
e^x,& \text{if } q = 1 \\
(1 + x(1-q))^{(\frac{1}{1-q})},& \text{otherwise}
\end{align}
\right.
\]< / p >
< pre class = 'prettyprint lang-r' > exp_q = function(x,q=1) {
  if (q==1)
    exp(x)
  else
    (1 + (1-q)*x)^(1/(1-q))
}< / pre >
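< p > A minimal sketch checking that < code > exp_q< / code > undoes < code > log_q< / code > for several orders \(q\). The test values are arbitrary, and both functions are repeated so that the chunk runs on its own.< / p >
< pre class = 'prettyprint lang-r' > log_q = function(x, q = 1) {
  if (q == 1) log(x) else (x^(1-q) - 1) / (1-q)
}
exp_q = function(x, q = 1) {
  if (q == 1) exp(x) else (1 + (1-q)*x)^(1/(1-q))
}
x = c(0.5, 2, 10)                          # arbitrary positive test values
# One column per order q: apply log_q, then exp_q with the same q
round_trip = sapply(c(0, 0.5, 1, 2), function(Q) exp_q(log_q(x, Q), Q))
# Every column should reproduce x up to floating point error
print(all.equal(round_trip, matrix(x, nrow = 3, ncol = 4)))< / pre >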
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Generalised Shannon entropy< / h2 > < / hgroup > < article id = "generalised-shannon-entropy" class = "smaller " >
< p > \[
^qH = \sum_{i=1}^S p_i \; ^q\log \frac{1}{p_i}
\]< / p >
< pre class = 'prettyprint lang-r' > H_q = function(x,q=1) {
  sum(x * log_q(1/x,q),na.rm = TRUE)
}< / pre >
< p > and the generalized form of the previously presented Hill’s number< / p >
< p > \[
^qD=^qe^{^qH}
\]< / p >
< pre class = 'prettyprint lang-r' > D_q = function(x,q=1) {
  exp_q(H_q(x,q),q)
}< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Biodiversity spectrum (1)< / h2 > < / hgroup > < article id = "biodiversity-spectrum-1" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > H_spectrum = function(x,q=1) {
  sapply(q,function(Q) H_q(x,Q))
}< / pre >
< pre class = 'prettyprint lang-r' > D_spectrum = function(x,q=1) {
  sapply(q,function(Q) D_q(x,Q))
}< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Biodiversity spectrum (2)< / h2 > < / hgroup > < article id = "biodiversity-spectrum-2" class = "smaller " >
< pre class = 'prettyprint lang-r' > library(MetabarSchool)
qs = seq(from=0,to=3,by=0.1)
environments.hq = apply(environments,MARGIN = 1,H_spectrum,q=qs)
environments.dq = apply(environments,MARGIN = 1,D_spectrum,q=qs)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-55-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Generalized entropy \(vs\) \(\alpha\)-diversity indices< / h2 > < / hgroup > < article id = "generalized-entropy-vs-alpha-diversity-indices" class = "smaller " >
< ul >
< li > < p > \(^0H(X) = S - 1\) : the richness minus one.< / p > < / li >
< li > < p > \(^1H(X) = H^{\prime}\) : Shannon’s entropy.< / p > < / li >
< li > < p > \(^2H(X) = 1 - \lambda\) : Gini-Simpson’s index.< / p > < / li >
< / ul >
< h3 > When computing the exponential of entropy : Hill’s number< / h3 >
< ul >
< li > < p > \(^0D(X) = S\) : The richness.< / p > < / li >
< li > < p > \(^1D(X) = e^{H^{\prime}}\) : The number of species in an even community having the same \(H^{\prime}\).< / p > < / li >
< li > < p > \(^2D(X) = 1 / \lambda\) : The number of species in an even community having the same Gini-Simpson’s index.< / p > < / li >
< / ul >
< br >
< center >
< p > \(q\) can be considered as a penalty you give to rare species< / p >
< p > < strong > when \(q=0\) all the species have the same weight< / strong > < / p >
< / center >
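< p > The identities listed on this slide can be checked numerically. The sketch below repeats the four functions defined earlier so that it runs on its own; the community < code > p< / code > is hypothetical.< / p >
< pre class = 'prettyprint lang-r' > log_q = function(x, q = 1) if (q == 1) log(x) else (x^(1-q) - 1) / (1-q)
exp_q = function(x, q = 1) if (q == 1) exp(x) else (1 + (1-q)*x)^(1/(1-q))
H_q   = function(x, q = 1) sum(x * log_q(1/x, q), na.rm = TRUE)
D_q   = function(x, q = 1) exp_q(H_q(x, q), q)

p = c(0.5, 0.3, 0.15, 0.05)   # hypothetical relative frequencies, summing to 1
S = length(p)                 # richness
lambda = sum(p^2)             # Simpson's concentration index

checks = c(H0 = isTRUE(all.equal(H_q(p, 0), S - 1)),
           H1 = isTRUE(all.equal(H_q(p, 1), -sum(p * log(p)))),
           H2 = isTRUE(all.equal(H_q(p, 2), 1 - lambda)),
           D0 = isTRUE(all.equal(D_q(p, 0), S)),
           D1 = isTRUE(all.equal(D_q(p, 1), exp(-sum(p * log(p))))),
           D2 = isTRUE(all.equal(D_q(p, 2), 1 / lambda)))
print(checks)< / pre >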
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Biodiversity spectrum of the mock community< / h2 > < / hgroup > < article id = "biodiversity-spectrum-of-the-mock-community" class = "smaller " >
< pre class = 'prettyprint lang-r' > H.mock = H_spectrum(plants.16$dilution,qs)
D.mock = D_spectrum(plants.16$dilution,qs)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-57-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Biodiversity spectrum and metabarcoding (1)< / h2 > < / hgroup > < article id = "biodiversity-spectrum-and-metabarcoding-1" class = "smaller smaller" >
< pre class = 'prettyprint lang-r' > positive.H = apply(positive.count.relfreq,
                   MARGIN = 1,
                   FUN = H_spectrum,
                   q=qs)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-59-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Biodiversity spectrum and metabarcoding (2)< / h2 > < / hgroup > < article id = "biodiversity-spectrum-and-metabarcoding-2" class = "smaller flexbox vcenter smaller" >
< p > < img src = "index_files/figure-html/unnamed-chunk-60-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Biodiversity spectrum and metabarcoding (3)< / h2 > < / hgroup > < article id = "biodiversity-spectrum-and-metabarcoding-3" class = "smaller smaller" >
< pre class = 'prettyprint lang-r' > positive.D = apply(positive.count.relfreq,
                   MARGIN = 1,
                   FUN = D_spectrum,
                   q=qs)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-62-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Impact of data cleaning on \(\alpha\)-diversity (1)< / h2 > < / hgroup > < article id = "impact-of-data-cleaning-on-alpha-diversity-1" class = "smaller " >
< p > We perform a basic cleaning:< / p >
< ul >
< li > removing singletons< / li >
< li > removing sequences that are too short or too long< / li >
< li > clustering data using < code > obiclean< / code > < / li >
< / ul >
< pre class = 'prettyprint lang-bash' > # remove singletons
obigrep -p 'count > 1' \
        positifs.uniq.annotated.fasta \
        > positifs.uniq.annotated.no.singleton.fasta

# keep sequences from 10 to 150 bp long
obigrep -l 10 -L 150 \
        positifs.uniq.annotated.no.singleton.fasta \
        > positifs.uniq.annotated.good.length.fasta

# cluster the remaining sequences with obiclean
obiclean -s merged_sample -H -C -r 0.1 \
        positifs.uniq.annotated.good.length.fasta \
        > positifs.uniq.annotated.clean.fasta< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Impact of data cleaning on \(\alpha\)-diversity (2)< / h2 > < / hgroup > < article id = "impact-of-data-cleaning-on-alpha-diversity-2" class = "smaller " >
< pre class = 'prettyprint lang-r' > data(positive.clean.count)
positive.clean.count.relfreq = decostand(positive.clean.count,
                                         method = "total")
positive.clean.H = apply(positive.clean.count.relfreq,
                         MARGIN = 1,
                         FUN = H_spectrum,
                         q=qs)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-65-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Impact of data cleaning on \(\alpha\)-diversity (3)< / h2 > < / hgroup > < article id = "impact-of-data-cleaning-on-alpha-diversity-3" class = "smaller " >
< pre class = 'prettyprint lang-r' > positive.clean.D = apply(positive.clean.count.relfreq,
                         MARGIN = 1,
                         FUN = D_spectrum,
                         q=qs)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-67-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "segue dark nobackground level1" > < hgroup class = 'auto-fadein' > < h2 > \(\beta\)-diversity< / h2 > < / hgroup > < article id = "beta-diversity" class = "smaller " >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Dissimilarity indices or non-metric distances< / h2 > < / hgroup > < article id = "dissimilarity-indices-or-non-metric-distances" class = "smaller flexbox vcenter" >
< center >
A dissimilarity index \(d(A,B)\) is a numerical measurement < br > of how far apart objects \(A\) and \(B\) are.
< / center >
< h3 > Properties< / h3 >
< p > \[
\begin{align}
d(A,B) \geqslant& 0 \\
d(A,B) =& d(B,A) \\
d(A,B) =& 0 \iff A = B \\
\end{align}
\]< / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Some dissimilarity indices< / h2 > < / hgroup > < article id = "some-dissimilarity-indices" class = "smaller " >
< h3 > Bray-Curtis< / h3 >
< p > Relies on a contingency table (quantitative data)< / p >
< p > \[
{\displaystyle BC(A,B)=1-{\frac {2\sum _{i=1}^{p}\min(N_{Ai},N_{Bi})}{\sum _{i=1}^{p}(N_{Ai}+N_{Bi})}}}, \; \text{with }p\text{ the total number of species}
\]< / p >
< h3 > Jaccard index< / h3 >
< p > Relies on presence/absence data. \(J(A,B)\) is a similarity; the corresponding dissimilarity is \(1 - J(A,B)\).< / p >
< p > \[
J(A,B) = {{|A \cap B|}\over{|A \cup B|}} = {{|A \cap B|}\over{|A| + |B| - |A \cap B|}}.
\]< / p >
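< p > Both indices can be computed by hand on a toy example. This is a minimal sketch in base R: the count vectors < code > A< / code > and < code > B< / code > are hypothetical and only illustrate the two formulas.< / p >
< pre class = 'prettyprint lang-r' > # Hypothetical counts of five species in two samples
A = c(10, 0, 5, 3, 0)
B = c( 6, 2, 0, 3, 0)
# Bray-Curtis dissimilarity computed on the counts
bc = 1 - 2 * sum(pmin(A, B)) / sum(A + B)
# Jaccard similarity computed on presence/absence
shared = sum(A > 0 & B > 0)   # species present in both samples
either = sum(A > 0 | B > 0)   # species present in at least one sample
jac = shared / either
print(c(bray_curtis = bc, jaccard = jac, jaccard_dissimilarity = 1 - jac))< / pre >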
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Metrics or distances< / h2 > < / hgroup > < article id = "metrics-or-distances" class = "smaller " >
< div style = "float: left; width: 50%;" >
< p > < img src = "figures/metric.svg" width = "400px" / > < / p > < / div >
< div style = "float: right; width: 50%;" >
< p > A metric is a dissimilarity index that also satisfies < em > subadditivity< / em > , also called the < em > triangle inequality< / em > < / p >
< p > \[
\begin{align}
d(A,B) \geqslant& 0 \\
d(A,B) =& \;d(B,A) \\
d(A,B) =& \;0 \iff A = B \\
d(A,B) \leqslant& \;d(A,C) + d(C,B)
\end{align}
\]< / p > < / div >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Some metrics< / h2 > < / hgroup > < article id = "some-metrics" class = "smaller " >
< div style = "float: left; width: 50%;" >
< p > < img src = "figures/Distance.svg" width = "400px" / > < / p > < / div >
< div style = "float: right; width: 50%;" >
< h3 > Computing< / h3 >
< p > \[
\begin{align}
d_e =& \sqrt{(x_A - x_B)^2 + (y_A - y_B)^2} \\
d_m =& |x_A - x_B| + |y_A - y_B| \\
d_c =& \max(|x_A - x_B| , |y_A - y_B|) \\
\end{align}
\]< / p > < / div >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Generalization to an n-dimensional space< / h2 > < / hgroup > < article id = "generalizable-on-a-n-dimension-space" class = "smaller smaller" >
< p > Considering 2 points \(A\) and \(B\) defined by \(n\) variables< / p >
< p > \[
\begin{align}
A :& (a_1,a_2,a_3,...,a_n) \\
B :& (b_1,b_2,b_3,...,b_n)
\end{align}
\]< / p >
< p > with \(a_i\) and \(b_i\) being respectively the value of the \(i^{th}\) variable for \(A\) and \(B\).< / p >
< p > \[
\begin{align}
d_e =& \sqrt{\sum_{i=1}^{n}(a_i - b_i)^2 } \\
d_m =& \sum_{i=1}^{n}\left| a_i - b_i \right| \\
d_c =& \max\limits_{1\leqslant i \leqslant n}\left|a_i - b_i\right| \\
\end{align}
\]< / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Just for fun… ;-)< / h2 > < / hgroup > < article id = "for-the-fun--" class = "smaller flexbox vcenter" >
< p > You can generalize those distances as a norm of order \(k\)< / p >
< p > \[
d^k = \sqrt[k]{\sum_{i=1}^n|a_i - b_i|^k}
\]< / p >
< ul >
< li > \(k=1 \Rightarrow d_m\) Manhattan distance< / li >
< li > \(k=2 \Rightarrow d_e\) Euclidean distance< / li >
< li > \(k=\infty \Rightarrow d_c\) Chebyshev distance< / li >
< / ul >
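< p > A minimal sketch of the three special cases listed above, compared with base R < code > dist()< / code > . The points < code > A< / code > and < code > B< / code > and the helper < code > d_k< / code > are hypothetical names introduced for illustration.< / p >
< pre class = 'prettyprint lang-r' > # Two hypothetical points described by three variables
A = c(1, 2, 3)
B = c(4, 0, 3)
# Distance derived from the norm of order k
d_k = function(a, b, k) sum(abs(a - b)^k)^(1/k)
pts = rbind(A, B)
# Each pair should agree: hand-written formula, then base R dist()
print(c(d_k(A, B, 1), dist(pts, method = "manhattan")))
print(c(d_k(A, B, 2), dist(pts, method = "euclidean")))
# A large k approaches the Chebyshev (maximum) distance
print(c(d_k(A, B, 50), dist(pts, method = "maximum")))< / pre >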
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Metrics and ultrametrics< / h2 > < / hgroup > < article id = "metrics-and-ultrametrics" class = "smaller " >
< div style = "float: left; width: 50%;" >
< p > < img src = "figures/ultrametric.svg" width = "400px" / > < / p > < / div >
< div style = "float: right; width: 50%;" >
< h3 > Metric< / h3 >
< p > \[
d(x,z)\leqslant d(x,y)+d(y,z)
\]< / p >
< h3 > Ultrametric< / h3 >
< p > \[
d(x,z)\leq \max(d(x,y),d(y,z))
\]< / p > < / div >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Why is it nice to use metrics?< / h2 > < / hgroup > < article id = "why-it-is-nice-to-use-metrics" class = "smaller flexbox vcenter" >
< ul >
< li > A metric induces a metric space< / li >
< li > In a metric space rotations are isometries< / li >
< li > This means that rotations are not changing distances between objects< / li >
< li > Multidimensional scaling methods (PCA, PCoA, CoA…) are rotations< / li >
< / ul >
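< p > The isometry property can be illustrated for the Euclidean distance with a small sketch: the points and the rotation angle below are arbitrary.< / p >
< pre class = 'prettyprint lang-r' > set.seed(1)
pts = matrix(rnorm(10), ncol = 2)          # five arbitrary points in the plane
theta = pi / 6                             # arbitrary rotation angle
# 2 x 2 rotation matrix
rot = matrix(c(cos(theta), sin(theta),
               -sin(theta), cos(theta)), ncol = 2)
rotated = pts %*% rot
# Pairwise Euclidean distances before and after the rotation
print(all.equal(as.vector(dist(pts)), as.vector(dist(rotated))))< / pre >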
< / article > < / slide > < slide class = "" > < hgroup > < h2 > The data set< / h2 > < / hgroup > < article id = "the-data-set" class = "smaller flexbox vcenter" >
< p > < strong > We analyzed two forest sites in French Guiana< / strong > < / p >
< ul >
< li > < p > Mana : Soil is composed of white sands.< / p > < / li >
< li > < p > Petit Plateau : Terra firme (firm land). In the Amazon, it corresponds to the area of the forest that is not flooded during high water periods. The terra firme is characterized by old and poor soils.< / p > < / li >
< / ul >
< p > < strong > At each site, two sets of sixteen samples were collected over one hectare< / strong > < / p >
< ul >
< li > < p > Sixteen samples of soil. Each consists of a mix of five 50 g cores taken from the first 10 centimeters of soil, covering half a square meter.< / p > < / li >
< li > < p > Sixteen samples of litter. Each consists of all the litter collected over the same half square meter where the soil was sampled.< / p > < / li >
< / ul >
< pre class = 'prettyprint lang-r' > data("guiana.count")
data("guiana.motus")
data("guiana.samples")< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Clean out bad PCR cycle 1< / h2 > < / hgroup > < article id = "clean-out-bad-pcr-cycle-1" class = "smaller flexbox vcenter smaller" >
< pre class = 'prettyprint lang-r' > s = tag_bad_pcr(guiana.samples$sample,guiana.count)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-72-1.png" width = "720" / > < / p >
< pre class = 'prettyprint lang-r' > guiana.count.clean = guiana.count[s$keep,]
guiana.samples.clean = guiana.samples[s$keep,]< / pre >
< pre class = 'prettyprint lang-r' > table(s$keep)< / pre >
< pre > ##
## FALSE TRUE
## 48 293< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Clean out bad PCR cycle 2< / h2 > < / hgroup > < article id = "clean-out-bad-pcr-cycle-2" class = "smaller flexbox vcenter smaller" >
< pre class = 'prettyprint lang-r' > s = tag_bad_pcr(guiana.samples.clean$sample,guiana.count.clean)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-74-1.png" width = "720" / > < / p >
< pre class = 'prettyprint lang-r' > guiana.count.clean = guiana.count.clean[s$keep,]
guiana.samples.clean = guiana.samples.clean[s$keep,]< / pre >
< pre class = 'prettyprint lang-r' > table(s$keep)< / pre >
< pre > ##
## FALSE TRUE
## 7 286< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Clean out bad PCR cycle 3< / h2 > < / hgroup > < article id = "clean-out-bad-pcr-cycle-3" class = "smaller flexbox vcenter smaller" >
< pre class = 'prettyprint lang-r' > s = tag_bad_pcr(guiana.samples.clean$sample,guiana.count.clean)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-76-1.png" width = "720" / > < / p >
< pre class = 'prettyprint lang-r' > guiana.count.clean = guiana.count.clean[s$keep,]
guiana.samples.clean = guiana.samples.clean[s$keep,]< / pre >
< pre class = 'prettyprint lang-r' > table(s$keep)< / pre >
< pre > ##
## TRUE
## 286< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Averaging good PCR replicates (1)< / h2 > < / hgroup > < article id = "averaging-good-pcr-replicates-1" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > guiana.samples.clean = cbind(guiana.samples.clean,s)
guiana.count.mean = aggregate(decostand(guiana.count.clean,method = "total"),
                              by = list(guiana.samples.clean$sample),
                              FUN=mean)
n = guiana.count.mean[,1]
guiana.count.mean = guiana.count.mean[,-1]
rownames(guiana.count.mean)=as.character(n)
guiana.count.mean = as.matrix(guiana.count.mean)
dim(guiana.count.mean)< / pre >
< pre > ## [1] 83 7884< / pre >
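< p > The averaging step can be followed on a toy table. This is only a sketch: the counts and sample names are hypothetical, and the row normalization is written in base R in place of < code > decostand(method = "total")< / code > so that the chunk has no dependency.< / p >
< pre class = 'prettyprint lang-r' > # Hypothetical read counts: three PCR replicates (rows), two MOTUs (columns)
counts = rbind(c(8, 2), c(6, 4), c(1, 9))
# Hypothetical sample of origin of each PCR replicate
sample_id = c("s1", "s1", "s2")
# Relative frequencies per PCR (each row divided by its total)
relfreq = counts / rowSums(counts)
# Average the replicates of each sample, column by column
avg = aggregate(relfreq, by = list(sample_id), FUN = mean)
# The first column holds the sample names: move it to the row names
avg.matrix = as.matrix(avg[,-1])
rownames(avg.matrix) = as.character(avg[,1])
print(avg.matrix)< / pre >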
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Averaging good PCR replicates (2)< / h2 > < / hgroup > < article id = "averaging-good-pcr-replicates-2" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > guiana.samples.mean = aggregate(guiana.samples.clean,
by = list(guiana.samples.clean$sample),
FUN=function(i) i[1])
n = guiana.samples.mean[,1]
guiana.samples.mean = guiana.samples.mean[,-1]
rownames(guiana.samples.mean)=as.character(n)
dim(guiana.samples.mean)< / pre >
< pre > ## [1] 83 17< / pre >
< h3 > Keep only samples< / h3 >
< pre class = 'prettyprint lang-r' > guiana.samples.final = guiana.samples.mean[! is.na(guiana.samples.mean$site_id),]
guiana.count.final = guiana.count.mean[! is.na(guiana.samples.mean$site_id),]< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Estimating similarity between samples< / h2 > < / hgroup > < article id = "estimating-similarity-between-samples" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > guiana.hellinger.final = decostand(guiana.count.final,method = "hellinger")
guiana.relfreq.final = decostand(guiana.count.final,method = "total")
guiana.presence.1.final = guiana.relfreq.final > 0.001
guiana.presence.10.final = guiana.relfreq.final > 0.01
guiana.presence.50.final = guiana.relfreq.final > 0.05
guiana.bc.dist = vegdist(guiana.relfreq.final,method = "bray")
guiana.euc.dist = vegdist(guiana.hellinger.final,method = "euclidean")
guiana.jac.1.dist = vegdist(guiana.presence.1.final,method = "jaccard")
guiana.jac.10.dist = vegdist(guiana.presence.10.final,method = "jaccard")
guiana.jac.50.dist = vegdist(guiana.presence.50.final,method = "jaccard")< / pre >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Euclidean distance on Hellinger transformation< / h2 > < / hgroup > < article id = "euclidean-distance-on-hellinger-transformation" class = "smaller " >
< pre class = 'prettyprint lang-r' > xy = guiana.count.final[,order(-colSums(guiana.count.final))]
xy = xy[,1:2]
xy.hellinger = decostand(xy,method = "hellinger")< / pre >
< div style = "float: left; width: 50%;" >
< p > < img src = "index_files/figure-html/unnamed-chunk-83-1.png" width = "384" / > < / p > < / div >
< div style = "float: right; width: 50%;" >
< p > < img src = "figures/euclidean_hellinger.svg" width = "400px" / > < / p > < / div >
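< p > To make the transformation explicit, here is a minimal sketch in base R, assuming that the Hellinger transformation is the square root of the relative frequencies. The count table is hypothetical.< / p >
< pre class = 'prettyprint lang-r' > # Hypothetical counts: three samples (rows), four taxa (columns)
counts = rbind(s1 = c(10, 0, 5, 5),
               s2 = c( 1, 1, 1, 1),
               s3 = c( 0, 8, 0, 2))
# Hellinger transformation: square root of the relative frequencies
hellinger = sqrt(counts / rowSums(counts))
# Every transformed sample lies on the unit sphere
print(rowSums(hellinger^2))
# So Euclidean distances between samples cannot exceed sqrt(2)
d = dist(hellinger)
print(d)< / pre >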
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Bray-Curtis distance on relative frequencies< / h2 > < / hgroup > < article id = "bray-curtis-distance-on-relative-frequencies" class = "smaller " >
< p > \[
BC_{jk}=1-{\frac {2\sum _{i=1}^{p}\min(N_{ij},N_{ik})}{\sum _{i=1}^{p}(N_{ij}+N_{ik})}}
\]< / p >
< p > \[
BC_{jk}=\frac{\sum _{i=1}^{p}(N_{ij}+N_{ik})-\sum _{i=1}^{p}2\;\min(N_{ij},N_{ik})}{\sum _{i=1}^{p}(N_{ij}+N_{ik})}
\]< / p >
< p > \[
BC_{jk}=\frac{\sum _{i=1}^{p}\left[(N_{ij} - \min(N_{ij},N_{ik})) + (N_{ik} - \min(N_{ij},N_{ik}))\right]}{\sum _{i=1}^{p}(N_{ij}+N_{ik})}
\]< / p >
< p > \[
BC_{jk}=\frac{\sum _{i=1}^{p}|N_{ij} - N_{ik}|}{\sum _{i=1}^{p}N_{ij}+\sum _{i=1}^{p}N_{ik}}
\]< / p >
< p > With relative frequencies, each sample sums to 1:< / p >
< p > \[
BC_{jk}=\frac{\sum _{i=1}^{p}|N_{ij} - N_{ik}|}{1+1}
\]< / p >
< p > \[
BC_{jk}=\frac{1}{2}\sum _{i=1}^{p}|N_{ij} - N_{ik}|
\]< / p >
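< p > The first and last lines of this derivation can be compared numerically. The sketch below uses two hypothetical samples < code > j< / code > and < code > k< / code > converted to relative frequencies.< / p >
< pre class = 'prettyprint lang-r' > # Hypothetical counts for two samples (rows) and four species (columns)
counts = rbind(j = c(10, 0, 5, 5),
               k = c( 0, 8, 0, 2))
# Relative frequencies: each row sums to 1
N = counts / rowSums(counts)
# First line of the derivation: definition based on the minimum
bc_min = 1 - 2 * sum(pmin(N["j",], N["k",])) / sum(N["j",] + N["k",])
# Last line of the derivation: half of the Manhattan distance
bc_l1 = sum(abs(N["j",] - N["k",])) / 2
print(c(bc_min, bc_l1))< / pre >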
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Principal coordinate analysis (1)< / h2 > < / hgroup > < article id = "principale-coordinate-analysis-1" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > guiana.bc.pcoa = cmdscale(guiana.bc.dist,k=3,eig = TRUE)
guiana.euc.pcoa = cmdscale(guiana.euc.dist,k=3,eig = TRUE)
guiana.jac.1.pcoa = cmdscale(guiana.jac.1.dist,k=3,eig = TRUE)
guiana.jac.10.pcoa = cmdscale(guiana.jac.10.dist,k=3,eig = TRUE)
guiana.jac.50.pcoa = cmdscale(guiana.jac.50.dist,k=3,eig = TRUE)< / pre >
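< p > To see what < code > cmdscale< / code > returns, here is a minimal sketch on hypothetical points: when the input distances are Euclidean distances between points of the plane, two axes are enough to reproduce them.< / p >
< pre class = 'prettyprint lang-r' > set.seed(42)
pts = matrix(rnorm(12), ncol = 2)    # six hypothetical points in 2 dimensions
d = dist(pts)                        # Euclidean distances between them
fit = cmdscale(d, k = 2, eig = TRUE)
# Distances between the PCoA coordinates match the input distances
print(all.equal(as.vector(d), as.vector(dist(fit$points))))
# Only the first two eigenvalues are clearly different from zero
print(round(fit$eig, 3))< / pre >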
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Principal coordinate analysis (2)< / h2 > < / hgroup > < article id = "principale-coordinate-analysis-2" class = "smaller " >
< p > < img src = "index_files/figure-html/unnamed-chunk-86-1.png" width = "720" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Principal component analysis< / h2 > < / hgroup > < article id = "principale-composante-analysis" class = "smaller flexbox vcenter" >
< pre class = 'prettyprint lang-r' > guiana.hellinger.pca = prcomp(guiana.hellinger.final,center = TRUE, scale. = FALSE)< / pre >
< p > < img src = "index_files/figure-html/unnamed-chunk-88-1.png" width = "1152" / > < / p >
<!-- -
## Computation of norms
```r
guiana.n1.dist = norm(guiana.relfreq.final,l=1)
guiana.n2.dist = norm(guiana.relfreq.final^(1/2),l=2)
guiana.n3.dist = norm(guiana.relfreq.final^(1/3),l=3)
guiana.n4.dist = norm(guiana.relfreq.final^(1/100),l=100)
```
## pCoA on norms
< img src = "index_files/figure-html/unnamed-chunk-90-1.png" width = "720" / >
--->
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Comparing diversity of the environments< / h2 > < / hgroup > < article id = "comparing-diversity-of-the-environments" class = "smaller " >
< p > < img src = "index_files/figure-html/unnamed-chunk-92-1.png" width = "864" / > < / p >
< / article > < / slide > < slide class = "" > < hgroup > < h2 > Bibliography< / h2 > < / hgroup > < article id = "bibliography" class = "smaller unnumbered" >
< div id = "refs" class = "references" >
< div id = "ref-Tsallis:94:00" >
< p > Tsallis, Constantino. 1994. “What Are the Numbers That Experiments Provide.” < em > Quim. Nova< / em > 17 (6): 468–71.< / p > < / div >
< div id = "ref-Whittaker:10:00" >
< p > Whittaker, Robert J. 2010. “Meta-Analyses and Mega-Mistakes: Calling Time on Meta-Analysis of the Species Richness-Productivity Relationship.” < em > Ecology< / em > 91 (9). Eco Soc America: 2522–33.< / p > < / div > < / div > < / article > < / slide >
< slide class = "backdrop" > < / slide >
< / slides >
<!-- dynamically load mathjax for compatibility with self - contained -->
< script >
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "index_files/mathjax-local/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
< / script >
<!-- map slide visiblity events into shiny -->
< script >
(function() {
if (window.jQuery) {
window.jQuery(document).on('slideleave', function(e) {
window.jQuery(e.target).trigger('hidden');
});
window.jQuery(document).on('slideenter', function(e) {
window.jQuery(e.target).trigger('shown');
});
}
})();
< / script >
< / body >
< / html >