Introduction

Setup

library(ggplot2)
library(ape)
library(ggtree)
library(vegan)

Data

Create a directory called Data inside of your working directory. Then downlod the zip file below and unzip it to your Data folder:

OTU_files.zip

The folder should contain these files:

OTU_table_QIIME1.txt
SampleInfo.csv

Background

These data are from a meta-analysis of plant microbiome studies using 16S metabarcodes. Each sample is a plant that was sequenced for 16S to identify the bacteria associated with it.

Let’s take a look at the data:

Sample Info

This file contains information about the samples. It is essentially an index that links a specific sample code to the species that it was taken from.

Samples<-read.csv("./Data/SampleInfo.csv")

OTU Table

OTU_table <- read.delim("./Data/OTU_table_QIIME1.txt", header=T, row.names="OTU_ID")

Note that we used the parameter header=T, which sets the first row as column names. We also use row.names=OTU_ID, which sets the column names to the values in the column called OTU_ID. What does this do? You’ll understand why we did this in a minute.

This is a simple table tab-delimited data table. Note that it is imported with the read.delim() function in R, which translates a tab character \t into columns. If we were to open it in a text file, the first four lines would look like this:

OTU_ID  SRR4267992  SRR4289215  SRR5211360  SRR5211361  SRR5439762  SRR5439772  SRR4267987  SRR4289218  SRR4734815  SRR4708713  SRR5051401  SRR5051407  SRR5051999  SRR5052001  SRR5052006  SRR5051567  SRR5052003  SRR5051415  SRR5051568  ERR632209   SRR4267988  SRR4289217  SRR2087833  SRR2087830  SRR2087835  ERR1745180  ERR1745155  ERR1745154  ERR1745153  ERR1745151  ERR1745152  ERR1745156  ERR1745178  SRR2204344  SRR2204709  SRR2087832  SRR3073557  ERR632218   SRR5439753  SRR3742182  SRR3742176  SRR3742138  SRR3073553  SRR3073555  ERR632208   SRR5439754  SRR2087831  SRR2087834  SRR3073554  SRR5439751  SRR5439752  SRR5439770  SRR5439755  SRR5439761  SRR5051400  ERR632220   SRR5185934  SRR4010871  SRR5439756  SRR5439757  SRR5439759  SRR5439763  SRR5439764  SRR5439765  SRR5439766  SRR5439767  SRR5439768  SRR5439769  SRR5439773  SRR5439774  SRR5439771  SRR4265353  SRR4289222  ERR1745165  ERR1745169  SRR3073556  SRR3073558  SRR3073559  ERR1745158  ERR1745179  SRR5439758  SRR5439760  SRR5052000  SRR5052004  ERR1745177  SRR4010872  SRR4265352  SRR4289224  ERR1745175  ERR1745176  SRR4267986  SRR4289219  ERR631092   ERR632207   ERR632221   SRR5185925  SRR5439796  SRR3747548  SRR5439793  SRR2204159  SRR5439778  SRR5439782  SRR5439776  SRR5439789  SRR5439781  SRR5439786  SRR5439798  ERR1745170  ERR1745167  SRR4706101  SRR2064974  SRR5439779  SRR5439780  SRR5439795  SRR5439794  SRR4453815  SRR3742177  SRR3742180  SRR3742197  SRR3742190  SRR3742179  SRR3742194  SRR3742184  ERR1745163  SRR2064975  SRR5185927  SRR5185938  SRR5185933  SRR5185932  SRR5185935  SRR5185926  SRR5185924  SRR5185931  SRR5185936  SRR5185929  SRR5185939  SRR5185937  SRR5185940  SRR4263477  SRR4289223  SRR5185944  SRR5185942  SRR5185946  SRR5185943  SRR5185941  ERR1745182  ERR632219   ERR1745181  ERR1745159  ERR1745157  ERR1745183  SRR4267985  SRR4289220  SRR5439785  SRR5439788  SRR5439775  SRR5439777  SRR5439783  SRR5439787  SRR5439791  SRR5439792  SRR5439797  SRR5439784  SRR5439790  ERR1745171  ERR1745168  SRR5185930  SRR4734837  SRR3742191  SRR3742193  SRR5185947  ERR1745164  ERR1745166  SRR5185945  SRR2204707  SRR5185928  ERR592858   taxonomy
4479946 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__MND1; f__; g__; s__
4386156 0.0 0.0 1.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rickettsiales; f__mitochondria; g__; s__

Parsed table

Once parsed, the data are easier to understand:

	SRR4267992	SRR4289215	SRR5211360	SRR5211361	SRR5439762	SRR5439772	SRR4267987	SRR4289218	SRR5051407	SRR5051567	ERR632209	SRR4267988	SRR4289217	SRR2087833	SRR2087830	SRR2087835	ERR1745180	ERR1745153	ERR1745152	SRR2087832	ERR632218	SRR5439753	ERR632208	SRR5439754	SRR2087831	SRR2087834	SRR3073554	SRR5439751	SRR5439752	SRR5439770	SRR5439755	SRR5439761	ERR632220	SRR5185934	SRR4010871	SRR5439756	SRR5439757	SRR5439759	SRR5439763	SRR5439764	SRR5439765	SRR5439766	SRR5439767	SRR5439768	SRR5439769	SRR5439773	SRR5439774	SRR5439771	SRR4265353	SRR4289222	ERR1745169	SRR5439758	SRR5439760	ERR1745177	SRR4010872	SRR4265352	SRR4289224	ERR1745176	SRR4267986	SRR4289219	ERR631092	ERR632207	ERR632221	SRR5439796	SRR5439793	SRR5439778	SRR5439789	SRR5439781	SRR5439798	ERR1745167	SRR2064974	SRR5439780	SRR5439795	SRR5439794	SRR4453815	ERR1745163	SRR2064975	SRR5185927	SRR5185938	SRR5185932	SRR5185931	SRR5185936	SRR5185929	SRR5185939	SRR5185937	SRR4263477	SRR4289223	SRR5185944	SRR5185942	SRR5185946	SRR5185943	SRR5185941	ERR632219	SRR4267985	SRR4289220	SRR5439785	SRR5439775	SRR5439777	SRR5439783	SRR5439791	SRR5439792	SRR5439784	SRR5439790	ERR1745168	SRR5185930	SRR4734837	SRR5185947	taxonomy
432284	31	31	0	0	3	5	5	5	0	0	22	22	22	0	1	0	0	2	1	0	291	19	1	291	1	5	0	16	8	1	4	9	227	48	414	2	1	2	6	3	24	7	1	10	6	7	11	3	30	30	0	0	6	4	450	10	10	0	3	3	259	18	224	19	147	391	9	15	9	0	25957	92	14	2	0	0	254	88	79	3	10	9	166	9	35	18	18	10	583	21	183	5	346	18	18	4	1	5	1	1	1	12	4	0	10	0	41	k__Bacteria; p__Cyanobacteria; c__Chloroplast; oStreptophyta; f; g; s
694558	0	0	91	108	0	0	0	0	0	0	0	0	0	0	1	17	1	0	0	3	0	2	0	0	9	15	0	0	2	0	1	0	0	0	0	0	0	0	0	0	0	0	0	1	1	1	0	0	0	0	1	0	0	0	0	0	0	1	0	0	0	0	0	0	1	0	0	0	0	2	0	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Bacillaceae; gBacillus; s
543472	0	0	0	0	69	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	2	0	0	1	0	0	0	0	0	0	0	0	0	4	0	0	0	0	0	2	0	0	0	0	0	0	0	0	13	8	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rhizobiales; f__Bradyrhizobiaceae; g; s
938794	0	0	22	0	0	0	7	7	1	1	0	0	0	1	0	2	0	0	0	1	0	0	0	0	0	2	0	0	0	0	0	0	3	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	1	0	12	12	0	0	1	0	0	0	0	0	0	0	0	0	0	0	4	0	0	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter; s__guillouiae

Note that this is a 16.5MB file, so only a few lines are printed here. The first column contains the OTU_ID – this is a unique ID number assigned to a unique ‘Operational Taxonomic Unit’ or OTU. The term OTU is common for microbes like bacteria and fungi where species concepts don’t work so well. Since we are inferring taxonomy from a single gene, all sequences that share the same sequence are considered part of the same OTU.

The next columns start with SRR. These are the the individual samples that were sequenced. Each number in the SRR column corresponds to the number of sequences for the OTU that were present in that sample.

The last column is called taxonomy and contains the predicted taxonomic assignment for each OTU. Let’s remove these so that we aren’t mixing numeric and character data.

OTU_data<-OTU_table[,-ncol(OTU_table)]

Let’s see how many OTUs and samples are in the dataset:

dim(OTU_data)

## [1] 21201   177

How many OTUs? How many samples?

Remove errors

Let’s see how many total sequences are in this dataset:

X<-rowSums(OTU_data)
sum(X)

## [1] 18080169

With > 18,000,000 reads (i.e. individual sequences) we are bound to get some reads that are just contamination. To reduce the dataset slightly we can remove any OTUs that don’t have more than one sequence in more than one sample:

Drop<-rowSums(OTU_data) < 2
sum(Drop)

## [1] 3956

This will allow us to drop almost 4,000 OTUs from the analysis.

OTU_red<-OTU_data[!Drop,]

Table transpose

To analyze the OTU table, we need to rearrange with the species across the top and sample sizes along the bottom. A simple way to do this is with the transpose function t(). Since it’s a big file, let’s do it only on the first 3 rows and columns to see what happens:

OTU_red[1:3,1:3]

##         SRR4267992 SRR4289215 SRR5211360
## 4479946          1          1          0
## 4386156          0          0          1
## 4479944          0          0          0

t(OTU_red[1:3,1:3])

##            4479946 4386156 4479944
## SRR4267992       1       0       0
## SRR4289215       1       0       0
## SRR5211360       0       1       0

That looks good, so let’s do it for the whole dataset:

OTUs<-as.data.frame(t(OTU_red))

Analysis

Now that we have samples as rows and OTUs as columns, how can we compare the samples?

Hint: How did we compare dragons with different traits?

(see the Dragon Phylogeny tutorial)

In an analogous fashion, we can calculate pairwise distances among our samples to create a distance matrix.

Binary Distance matrix

We can change our OTUs dataframe into binary data by changing the read counts in each square to 1 or 0. Of course, we lose information of the relative number of sequences for each OTU, so we’ll look how to analyze that data later.

Transform to 1 if # reads >=1 and 0 otherwise. We could use a nested loop through each row and each column. You CAN do IT this WAY, but it may take a long time, so you might want to skip this code, but take some time to look at it carefully and make sure you understand what it is doing (or replace nrow(OTUs) with a smaller number like 5):

OTU_bin<-OTUs
for (i in 1:nrow(OTU_bin)){
  for (j in 1:ncol(OTU_bin)){
    if(OTU_bin[i,j]>0){
      OTU_bin[i,j]<-1
    }
  }
}

Loops are not computationally efficient – especially loops nested within loops. This becomes a problem when we have a lot of rows and columns to deal with. Since we are working with a numeric data.frame (i.e. all data in rows and columns are numeric), there is a nice shortcut:

X<-OTUs[1:3,1:3]
X

##            4479946 4386156 4479944
## SRR4267992       1       0       0
## SRR4289215       1       0       0
## SRR5211360       0       1       0

X[X>0]<-100
X

##            4479946 4386156 4479944
## SRR4267992     100       0       0
## SRR4289215     100       0       0
## SRR5211360       0     100       0

This won’t work with most data frames, because data frames often contain heterogenous data (e.g. sample ID, measurements, counts, binary, factor).

Now we apply it to the full dataset to create a binary matrix:

OTU_bin<-OTUs
OTU_bin[OTU_bin>0]<-1

Did you have to wait?

Compare that compute time to a nested loop. Remember:

dim(OTUs)

## [1]   177 17245

That means the total number of cells to replace is:

nrow(OTUs)*ncol(OTUs)

## [1] 3052365

Just as we did in the Dragon Phylogeny tutorial, we can calculate the pairwise distance of this binary matrix.

OTU_bin_dist<-dist(OTU_bin,method='binary')

What will our matrix size be? How many elements?

We are comparing the sites (rows), so:

nrow(OTU_bin)

## [1] 177

nrow(OTU_bin) * nrow(OTU_bin)

## [1] 31329

Neighbour-Joining

Just as we did with our Dragon Phylogeny, we can cluster our samples by their similarity. For example, we can use the neighbor-joining method:

library(ape)
library(ggtree)
OTU_tree<-nj(OTU_bin_dist)
ggtree(OTU_tree,layout="rectangular")

Note that this is NOT A PHYLOGENY – a phylogeny shows an hypothesis about evolutionary relationships among organisms. This is a neighbour-joining tree that clusters samples based on the similarity of their microbes.

To see if species cluster together, we can colour-code and annotate our tree using our Samples data.frame object.

ggtree(OTU_tree,layout="rectangular") %<+% Samples + 
  geom_tiplab(aes(colour=Species)) + 
  theme(legend.position="right")

Bray-Curtis dissimilarity

In addition to binary distances, we can include abundance information. In our case ‘abundance’ is inferred from the number of sequence reads – the idea being that the more abundant a bacterial species is in the sample, the more likely it is for its DNA to be sequenced multiple times.

The dist function has other options for calculating distances:

?dist

A popular measure is the Bray-Curtis dissimilarity metric. The Bray-Curtis dissimilarity (BC) between two samples \(i\) and \(j\) is:

\[BC_{ij} = 1-\frac{2 \sum_{k=1}^{N} min(Sp_{i,k},Sp_{j,k})}{\sum_{k=1}^{N} Sp_{i,k}+\sum_{k=1}^{N} Sp_{j,k}}\] where \(Sp_k\) is the number of ‘reads’ for the species k. Fortunately, the vegan package does this so we don’t have to calculate this manually.

library(vegan)
OTU_dist<- vegdist(OTUs,method="bray",binary=F)
OTU_tree2<-nj(OTU_dist)
ggtree(OTU_tree2,layout="rectangular") %<+% Samples + 
  geom_tiplab(aes(colour=Species)) + 
  theme(legend.position="right")

Notice the change in tree topology. Why?

NMDS

In addition to tree-based clustering, we can use a Non-Metric Multidimensional Scaling (NMDS) to visualize similarity/differences among our samples. This method is a bit complicated, but you can see the Wikipedia page for details. The important thing to know is that it is an algorithm for plotting points so that two points that are similar end up close together on the graph. Usually, this is done in two dimensions to generate a bivariate plot, but there are some cases where you might want just one or 3 or more.

set.seed(13)
NMDSdat<-metaMDS(OTU_dist,k=2) # k = 2 dimensions

## Run 0 stress 0.2481601 
## Run 1 stress 0.2448077 
## ... New best solution
## ... Procrustes: rmse 0.06313684  max resid 0.1776276 
## Run 2 stress 0.2476016 
## Run 3 stress 0.2534164 
## Run 4 stress 0.2528905 
## Run 5 stress 0.2533469 
## Run 6 stress 0.2452108 
## ... Procrustes: rmse 0.00558734  max resid 0.071541 
## Run 7 stress 0.2447915 
## ... New best solution
## ... Procrustes: rmse 0.002684137  max resid 0.03323182 
## Run 8 stress 0.2504992 
## Run 9 stress 0.2478204 
## Run 10 stress 0.2479272 
## Run 11 stress 0.2510769 
## Run 12 stress 0.2718408 
## Run 13 stress 0.2572388 
## Run 14 stress 0.2815646 
## Run 15 stress 0.2522907 
## Run 16 stress 0.256157 
## Run 17 stress 0.2498118 
## Run 18 stress 0.256801 
## Run 19 stress 0.2447743 
## ... New best solution
## ... Procrustes: rmse 0.005080067  max resid 0.04743662 
## Run 20 stress 0.2452108 
## ... Procrustes: rmse 0.008389072  max resid 0.07155341 
## *** No convergence -- monoMDS stopping criteria:
##     19: stress ratio > sratmax
##      1: scale factor of the gradient < sfgrmin

It looks like our model did not converge after 20 runs, we can try increasing this:

set.seed(13)
NMDSdat<-metaMDS(OTU_dist,k=2,trymax = 100)

## Run 0 stress 0.2481601 
## Run 1 stress 0.2448077 
## ... New best solution
## ... Procrustes: rmse 0.06313684  max resid 0.1776276 
## Run 2 stress 0.2476016 
## Run 3 stress 0.2534164 
## Run 4 stress 0.2528905 
## Run 5 stress 0.2533469 
## Run 6 stress 0.2452108 
## ... Procrustes: rmse 0.00558734  max resid 0.071541 
## Run 7 stress 0.2447915 
## ... New best solution
## ... Procrustes: rmse 0.002684137  max resid 0.03323182 
## Run 8 stress 0.2504992 
## Run 9 stress 0.2478204 
## Run 10 stress 0.2479272 
## Run 11 stress 0.2510769 
## Run 12 stress 0.2718408 
## Run 13 stress 0.2572388 
## Run 14 stress 0.2815646 
## Run 15 stress 0.2522907 
## Run 16 stress 0.256157 
## Run 17 stress 0.2498118 
## Run 18 stress 0.256801 
## Run 19 stress 0.2447743 
## ... New best solution
## ... Procrustes: rmse 0.005080067  max resid 0.04743662 
## Run 20 stress 0.2452108 
## ... Procrustes: rmse 0.008389072  max resid 0.07155341 
## Run 21 stress 0.2485014 
## Run 22 stress 0.2458439 
## Run 23 stress 0.2526073 
## Run 24 stress 0.2514402 
## Run 25 stress 0.253503 
## Run 26 stress 0.2522898 
## Run 27 stress 0.2501302 
## Run 28 stress 0.2476545 
## Run 29 stress 0.251593 
## Run 30 stress 0.2452109 
## ... Procrustes: rmse 0.008385411  max resid 0.07155543 
## Run 31 stress 0.2594707 
## Run 32 stress 0.2797743 
## Run 33 stress 0.2542847 
## Run 34 stress 0.2480837 
## Run 35 stress 0.2514054 
## Run 36 stress 0.2511438 
## Run 37 stress 0.2505249 
## Run 38 stress 0.2582468 
## Run 39 stress 0.2580423 
## Run 40 stress 0.2570759 
## Run 41 stress 0.2596559 
## Run 42 stress 0.2917437 
## Run 43 stress 0.2514907 
## Run 44 stress 0.2512227 
## Run 45 stress 0.258097 
## Run 46 stress 0.2456734 
## Run 47 stress 0.2644926 
## Run 48 stress 0.2457242 
## Run 49 stress 0.2451999 
## ... Procrustes: rmse 0.007929362  max resid 0.0714391 
## Run 50 stress 0.2448077 
## ... Procrustes: rmse 0.005733739  max resid 0.04747065 
## Run 51 stress 0.2534336 
## Run 52 stress 0.2704354 
## Run 53 stress 0.2544404 
## Run 54 stress 0.2511885 
## Run 55 stress 0.2505814 
## Run 56 stress 0.2588626 
## Run 57 stress 0.2447924 
## ... Procrustes: rmse 0.002638983  max resid 0.0326045 
## Run 58 stress 0.2571638 
## Run 59 stress 0.2478662 
## Run 60 stress 0.260839 
## Run 61 stress 0.2518501 
## Run 62 stress 0.2505426 
## Run 63 stress 0.2514963 
## Run 64 stress 0.2566792 
## Run 65 stress 0.2511014 
## Run 66 stress 0.2540744 
## Run 67 stress 0.2466613 
## Run 68 stress 0.2511703 
## Run 69 stress 0.2551103 
## Run 70 stress 0.2476687 
## Run 71 stress 0.2517213 
## Run 72 stress 0.2452107 
## ... Procrustes: rmse 0.008385849  max resid 0.0715598 
## Run 73 stress 0.2505126 
## Run 74 stress 0.2578466 
## Run 75 stress 0.2495588 
## Run 76 stress 0.2513394 
## Run 77 stress 0.2601173 
## Run 78 stress 0.2541433 
## Run 79 stress 0.2447909 
## ... Procrustes: rmse 0.002668734  max resid 0.03263473 
## Run 80 stress 0.2526854 
## Run 81 stress 0.2503426 
## Run 82 stress 0.2554527 
## Run 83 stress 0.2505249 
## Run 84 stress 0.2477735 
## Run 85 stress 0.249047 
## Run 86 stress 0.2515544 
## Run 87 stress 0.2498213 
## Run 88 stress 0.2470329 
## Run 89 stress 0.2454711 
## Run 90 stress 0.2571176 
## Run 91 stress 0.2555098 
## Run 92 stress 0.2502964 
## Run 93 stress 0.2532716 
## Run 94 stress 0.2447905 
## ... Procrustes: rmse 0.002703492  max resid 0.03268134 
## Run 95 stress 0.2468473 
## Run 96 stress 0.2458538 
## Run 97 stress 0.2509962 
## Run 98 stress 0.2473709 
## Run 99 stress 0.2525378 
## Run 100 stress 0.252097 
## *** No convergence -- monoMDS stopping criteria:
##      2: no. of iterations >= maxit
##     88: stress ratio > sratmax
##     10: scale factor of the gradient < sfgrmin

If it still hasn’t converged after 100 iterations, we can increase trymax to try to get a better solution. Go ahead and try and then compare to the figure below.

Create data for plotting:

PDat<-data.frame(NMDS1=NMDSdat$points[,1],
                 NMDS2=NMDSdat$points[,2],
                 SampleID=row.names(OTUs))

Add species labels using the merge function:

PDat<-merge(PDat,Samples,by="SampleID",all.x=T,all.y=F)

And plot:

qplot(x=NMDS1,y=NMDS2,colour=Species,alpha=I(0.6),data=PDat)+theme_bw()

Metabarcoding Analysis