Introduction

See slides on metabarcoding

Setup

library(ggplot2)
library(ape)
library(ggtree)
library(vegan)

Data

Create a directory called Data inside of your working directory. Then downlod the zip file below and unzip it to your Data folder:

The folder should contain these files:

  • OTU_table_QIIME1.txt
  • SampleInfo.csv

Background

These data are from a meta-analysis of plant microbiome studies using 16S metabarcodes. Each sample is a plant that was sequenced for 16S to identify the bacteria associated with it.

Let’s take a look at the data:

Sample Info

This file contains information about the samples. It is essentially an index that links a specific sample code to the species that it was taken from.

Samples<-read.csv("./Data/SampleInfo.csv")

OTU Table

OTU_table <- read.delim("./Data/OTU_table_QIIME1.txt", header=T, row.names="OTU_ID")

Note that we used the parameter header=T, which sets the first row as column names. We also use row.names=OTU_ID, which sets the column names to the values in the column called OTU_ID. What does this do? You’ll understand why we did this in a minute.

This is a simple table tab-delimited data table. Note that it is imported with the read.delim() function in R, which translates a tab character \t into columns. If we were to open it in a text file, the first four lines would look like this:

OTU_ID  SRR4267992  SRR4289215  SRR5211360  SRR5211361  SRR5439762  SRR5439772  SRR4267987  SRR4289218  SRR4734815  SRR4708713  SRR5051401  SRR5051407  SRR5051999  SRR5052001  SRR5052006  SRR5051567  SRR5052003  SRR5051415  SRR5051568  ERR632209   SRR4267988  SRR4289217  SRR2087833  SRR2087830  SRR2087835  ERR1745180  ERR1745155  ERR1745154  ERR1745153  ERR1745151  ERR1745152  ERR1745156  ERR1745178  SRR2204344  SRR2204709  SRR2087832  SRR3073557  ERR632218   SRR5439753  SRR3742182  SRR3742176  SRR3742138  SRR3073553  SRR3073555  ERR632208   SRR5439754  SRR2087831  SRR2087834  SRR3073554  SRR5439751  SRR5439752  SRR5439770  SRR5439755  SRR5439761  SRR5051400  ERR632220   SRR5185934  SRR4010871  SRR5439756  SRR5439757  SRR5439759  SRR5439763  SRR5439764  SRR5439765  SRR5439766  SRR5439767  SRR5439768  SRR5439769  SRR5439773  SRR5439774  SRR5439771  SRR4265353  SRR4289222  ERR1745165  ERR1745169  SRR3073556  SRR3073558  SRR3073559  ERR1745158  ERR1745179  SRR5439758  SRR5439760  SRR5052000  SRR5052004  ERR1745177  SRR4010872  SRR4265352  SRR4289224  ERR1745175  ERR1745176  SRR4267986  SRR4289219  ERR631092   ERR632207   ERR632221   SRR5185925  SRR5439796  SRR3747548  SRR5439793  SRR2204159  SRR5439778  SRR5439782  SRR5439776  SRR5439789  SRR5439781  SRR5439786  SRR5439798  ERR1745170  ERR1745167  SRR4706101  SRR2064974  SRR5439779  SRR5439780  SRR5439795  SRR5439794  SRR4453815  SRR3742177  SRR3742180  SRR3742197  SRR3742190  SRR3742179  SRR3742194  SRR3742184  ERR1745163  SRR2064975  SRR5185927  SRR5185938  SRR5185933  SRR5185932  SRR5185935  SRR5185926  SRR5185924  SRR5185931  SRR5185936  SRR5185929  SRR5185939  SRR5185937  SRR5185940  SRR4263477  SRR4289223  SRR5185944  SRR5185942  SRR5185946  SRR5185943  SRR5185941  ERR1745182  ERR632219   ERR1745181  ERR1745159  ERR1745157  ERR1745183  SRR4267985  SRR4289220  SRR5439785  SRR5439788  SRR5439775  SRR5439777  SRR5439783  SRR5439787  SRR5439791  SRR5439792  SRR5439797  SRR5439784  SRR5439790  ERR1745171  ERR1745168  SRR5185930  SRR4734837  SRR3742191  SRR3742193  SRR5185947  ERR1745164  ERR1745166  SRR5185945  SRR2204707  SRR5185928  ERR592858   taxonomy
4479946 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__MND1; f__; g__; s__
4386156 0.0 0.0 1.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rickettsiales; f__mitochondria; g__; s__

Parsed table

Once parsed, the data are easier to understand:

SRR4267992 SRR4289215 SRR5211360 SRR5211361 SRR5439762 SRR5439772 SRR4267987 SRR4289218 SRR4734815 SRR4708713 SRR5051401 SRR5051407 SRR5051999 SRR5052001 SRR5052006 SRR5051567 SRR5052003 SRR5051415 SRR5051568 ERR632209 SRR4267988 SRR4289217 SRR2087833 SRR2087830 SRR2087835 ERR1745180 ERR1745155 ERR1745154 ERR1745153 ERR1745151 ERR1745152 ERR1745156 ERR1745178 SRR2204344 SRR2204709 SRR2087832 SRR3073557 ERR632218 SRR5439753 SRR3742182 SRR3742176 SRR3742138 SRR3073553 SRR3073555 ERR632208 SRR5439754 SRR2087831 SRR2087834 SRR3073554 SRR5439751 SRR5439752 SRR5439770 SRR5439755 SRR5439761 SRR5051400 ERR632220 SRR5185934 SRR4010871 SRR5439756 SRR5439757 SRR5439759 SRR5439763 SRR5439764 SRR5439765 SRR5439766 SRR5439767 SRR5439768 SRR5439769 SRR5439773 SRR5439774 SRR5439771 SRR4265353 SRR4289222 ERR1745165 ERR1745169 SRR3073556 SRR3073558 SRR3073559 ERR1745158 ERR1745179 SRR5439758 SRR5439760 SRR5052000 SRR5052004 ERR1745177 SRR4010872 SRR4265352 SRR4289224 ERR1745175 ERR1745176 SRR4267986 SRR4289219 ERR631092 ERR632207 ERR632221 SRR5185925 SRR5439796 SRR3747548 SRR5439793 SRR2204159 SRR5439778 SRR5439782 SRR5439776 SRR5439789 SRR5439781 SRR5439786 SRR5439798 ERR1745170 ERR1745167 SRR4706101 SRR2064974 SRR5439779 SRR5439780 SRR5439795 SRR5439794 SRR4453815 SRR3742177 SRR3742180 SRR3742197 SRR3742190 SRR3742179 SRR3742194 SRR3742184 ERR1745163 SRR2064975 SRR5185927 SRR5185938 SRR5185933 SRR5185932 SRR5185935 SRR5185926 SRR5185924 SRR5185931 SRR5185936 SRR5185929 SRR5185939 SRR5185937 SRR5185940 SRR4263477 SRR4289223 SRR5185944 SRR5185942 SRR5185946 SRR5185943 SRR5185941 ERR1745182 ERR632219 ERR1745181 ERR1745159 ERR1745157 ERR1745183 SRR4267985 SRR4289220 SRR5439785 SRR5439788 SRR5439775 SRR5439777 SRR5439783 SRR5439787 SRR5439791 SRR5439792 SRR5439797 SRR5439784 SRR5439790 ERR1745171 ERR1745168 SRR5185930 SRR4734837 SRR3742191 SRR3742193 SRR5185947 ERR1745164 ERR1745166 SRR5185945 SRR2204707 SRR5185928 ERR592858 taxonomy
432284 31 31 0 0 3 5 5 5 0 0 0 0 0 0 0 0 0 0 0 22 22 22 0 1 0 0 0 0 2 0 1 0 0 0 0 0 0 291 19 0 0 0 0 0 1 291 1 5 0 16 8 1 4 9 0 227 48 414 2 1 2 6 3 24 7 1 10 6 7 11 3 30 30 0 0 0 0 0 0 0 0 6 0 0 4 450 10 10 0 0 3 3 259 18 224 0 19 0 147 0 391 0 0 9 15 0 9 0 0 0 25957 0 92 14 2 0 0 0 0 0 0 0 0 0 254 88 79 0 3 0 0 0 10 9 166 9 35 0 18 18 10 583 21 183 5 0 346 0 0 0 0 18 18 4 0 1 5 1 0 1 1 0 12 4 0 0 10 0 0 0 41 0 0 0 0 0 0 k__Bacteria; p__Cyanobacteria; c__Chloroplast; oStreptophyta; f; g; s
694558 0 0 91 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 17 1 0 0 0 0 0 0 0 0 0 3 0 0 2 0 0 0 0 0 0 0 9 15 0 0 2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Bacillaceae; gBacillus; s
543472 0 0 0 0 69 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rhizobiales; f__Bradyrhizobiaceae; g; s
938794 0 0 22 0 0 0 7 7 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 12 12 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter; s__guillouiae

Note that this is a 16.5MB file, so only a few lines are printed here. The first column contains the OTU_ID – this is a unique ID number assigned to a unique ‘Operational Taxonomic Unit’ or OTU. The term OTU is common for microbes like bacteria and fungi where species concepts don’t work so well. Since we are inferring taxonomy from a single gene, all sequences that share the same sequence are considered part of the same OTU.

The next columns start with SRR. These are the the individual samples that were sequenced. Each number in the SRR column corresponds to the number of sequences for the OTU that were present in that sample.

The last column is called taxonomy and contains the predicted taxonomic assignment for each OTU. Let’s remove these so that we aren’t mixing numeric and character data.

OTU_data<-OTU_table[,-ncol(OTU_table)]

Let’s see how many OTUs and samples are in the dataset:

dim(OTU_data)
## [1] 21201   177

How many OTUs? How many samples?

Remove errors

Let’s see how many total sequences are in this dataset:

X<-rowSums(OTU_data)
sum(X)
## [1] 18080169

With > 18,000,000 reads (i.e. individual sequences) we are bound to get some reads that are just contamination. To reduce the dataset slightly we can remove any OTUs that don’t have more than one sequence in more than one sample:

Drop<-rowSums(OTU_data) < 2
sum(Drop)
## [1] 3956

This will allow us to drop almost 4,000 OTUs from the analysis.

OTU_red<-OTU_data[!Drop,]

Table transpose

To analyze the OTU table, we need to rearrange with the species across the top and sample sizes along the bottom. A simple way to do this is with the transpose function t(). Since it’s a big file, let’s do it only on the first 3 rows and columns to see what happens:

OTU_red[1:3,1:3]
##         SRR4267992 SRR4289215 SRR5211360
## 4479946          1          1          0
## 4386156          0          0          1
## 4479944          0          0          0
t(OTU_red[1:3,1:3])
##            4479946 4386156 4479944
## SRR4267992       1       0       0
## SRR4289215       1       0       0
## SRR5211360       0       1       0

That looks good, so let’s do it for the whole dataset:

OTUs<-as.data.frame(t(OTU_red))

Analysis

Now that we have samples as rows and OTUs as columns, how can we compare the samples?

Hint: How did we compare dragons with different traits?

(see the Dragon Phylogeny tutorial)

In an analogous fashion, we can calculate pairwise distances among our samples to create a distance matrix.

Binary Distance matrix

We can change our OTUs dataframe into binary data by changing the read counts in each square to 1 or 0. Of course, we lose information of the relative number of sequences for each OTU, so we’ll look how to analyze that data later.

Transform to 1 if # reads >=1 and 0 otherwise. We could use a nested loop through each row and each column. You CAN do IT this WAY, but it may take a long time, so you might want to skip this code, but take some time to look at it carefully and make sure you understand what it is doing (or replace nrow(OTUs) with a smaller number like 5):

OTU_bin<-OTUs
for (i in 1:nrow(OTU_bin)){
  for (j in 1:ncol(OTU_bin)){
    if(OTU_bin[i,j]>0){
      OTU_bin[i,j]<-1
    }
  }
}

Loops are not computationally efficient – especially loops nested within loops. This becomes a problem when we have a lot of rows and columns to deal with. Since we are working with a numeric data.frame (i.e. all data in rows and columns are numeric), there is a nice shortcut:

X<-OTUs[1:3,1:3]
X
##            4479946 4386156 4479944
## SRR4267992       1       0       0
## SRR4289215       1       0       0
## SRR5211360       0       1       0
X[X>0]<-100
X
##            4479946 4386156 4479944
## SRR4267992     100       0       0
## SRR4289215     100       0       0
## SRR5211360       0     100       0

This won’t work with most data frames, because data frames often contain heterogenous data (e.g. sample ID, measurements, counts, binary, factor).

Now we apply it to the full dataset to create a binary matrix:

OTU_bin<-OTUs
OTU_bin[OTU_bin>0]<-1

Did you have to wait?

Compare that compute time to a nested loop. Remember:

dim(OTUs)
## [1]   177 17245

That means the total number of cells to replace is:

nrow(OTUs)*ncol(OTUs)
## [1] 3052365

Just as we did in the Dragon Phylogeny tutorial, we can calculate the pairwise distance of this binary matrix.

OTU_bin_dist<-dist(OTU_bin,method='binary')

What will our matrix size be? How many elements?

We are comparing the sites (rows), so:

nrow(OTU_bin)
## [1] 177
nrow(OTU_bin) * nrow(OTU_bin)
## [1] 31329

Neighbour-Joining

Just as we did with our Dragon Phylogeny, we can cluster our samples by their similarity. For example, we can use the neighbor-joining method:

library(ape)
library(ggtree)
OTU_tree<-nj(OTU_bin_dist)
ggtree(OTU_tree,layout="rectangular")

Note that this is NOT A PHYLOGENY – a phylogeny shows an hypothesis about evolutionary relationships among organisms. This is a neighbour-joining tree that clusters samples based on the similarity of their microbes.

To see if species cluster together, we can colour-code and annotate our tree using our Samples data.frame object.

ggtree(OTU_tree,layout="rectangular") %<+% Samples + 
  geom_tiplab(aes(colour=Species)) + 
  theme(legend.position="right")

Bray-Curtis dissimilarity

In addition to binary distances, we can include abundance information. In our case ‘abundance’ is inferred from the number of sequence reads – the idea being that the more abundant a bacterial species is in the sample, the more likely it is for its DNA to be sequenced multiple times.

The dist function has other options for calculating distances:

?dist

A popular measure is the Bray-Curtis dissimilarity metric. The Bray-Curtis dissimilarity (BC) between two samples \(i\) and \(j\) is:

\[BC_{ij} = 1-\frac{2 \sum_{k=1}^{N} min(Sp_{i,k},Sp_{j,k})}{\sum_{k=1}^{N} Sp_{i,k}+\sum_{k=1}^{N} Sp_{j,k}}\] where \(Sp_k\) is the number of ‘reads’ for the species k. Fortunately, the vegan package does this so we don’t have to calculate this manually.

library(vegan)
OTU_dist<- vegdist(OTUs,method="bray",binary=F)
OTU_tree2<-nj(OTU_dist)
ggtree(OTU_tree2,layout="rectangular") %<+% Samples + 
  geom_tiplab(aes(colour=Species)) + 
  theme(legend.position="right")

Notice the change in tree topology. Why?

NMDS

In addition to tree-based clustering, we can use a Non-Metric Multidimensional Scaling (NMDS) to visualize similarity/differences among our samples. This method is a bit complicated, but you can see the Wikipedia page for details. The important thing to know is that it is an algorithm for plotting points so that two points that are similar end up close together on the graph. Usually, this is done in two dimensions to generate a bivariate plot, but there are some cases where you might want just one or 3 or more.

set.seed(13)
NMDSdat<-metaMDS(OTU_dist,k=2) # k = 2 dimensions
## Run 0 stress 0.2481601 
## Run 1 stress 0.2448077 
## ... New best solution
## ... Procrustes: rmse 0.06313684  max resid 0.1776276 
## Run 2 stress 0.2476016 
## Run 3 stress 0.2534164 
## Run 4 stress 0.2528905 
## Run 5 stress 0.2533469 
## Run 6 stress 0.2452108 
## ... Procrustes: rmse 0.00558734  max resid 0.071541 
## Run 7 stress 0.2447915 
## ... New best solution
## ... Procrustes: rmse 0.002684137  max resid 0.03323182 
## Run 8 stress 0.2504992 
## Run 9 stress 0.2478204 
## Run 10 stress 0.2479272 
## Run 11 stress 0.2510769 
## Run 12 stress 0.2718408 
## Run 13 stress 0.2572388 
## Run 14 stress 0.2815646 
## Run 15 stress 0.2522907 
## Run 16 stress 0.256157 
## Run 17 stress 0.2498118 
## Run 18 stress 0.256801 
## Run 19 stress 0.2447743 
## ... New best solution
## ... Procrustes: rmse 0.005080067  max resid 0.04743662 
## Run 20 stress 0.2452108 
## ... Procrustes: rmse 0.008389072  max resid 0.07155341 
## *** No convergence -- monoMDS stopping criteria:
##     19: stress ratio > sratmax
##      1: scale factor of the gradient < sfgrmin

It looks like our model did not converge after 20 runs, we can try increasing this:

set.seed(13)
NMDSdat<-metaMDS(OTU_dist,k=2,trymax = 100)
## Run 0 stress 0.2481601 
## Run 1 stress 0.2448077 
## ... New best solution
## ... Procrustes: rmse 0.06313684  max resid 0.1776276 
## Run 2 stress 0.2476016 
## Run 3 stress 0.2534164 
## Run 4 stress 0.2528905 
## Run 5 stress 0.2533469 
## Run 6 stress 0.2452108 
## ... Procrustes: rmse 0.00558734  max resid 0.071541 
## Run 7 stress 0.2447915 
## ... New best solution
## ... Procrustes: rmse 0.002684137  max resid 0.03323182 
## Run 8 stress 0.2504992 
## Run 9 stress 0.2478204 
## Run 10 stress 0.2479272 
## Run 11 stress 0.2510769 
## Run 12 stress 0.2718408 
## Run 13 stress 0.2572388 
## Run 14 stress 0.2815646 
## Run 15 stress 0.2522907 
## Run 16 stress 0.256157 
## Run 17 stress 0.2498118 
## Run 18 stress 0.256801 
## Run 19 stress 0.2447743 
## ... New best solution
## ... Procrustes: rmse 0.005080067  max resid 0.04743662 
## Run 20 stress 0.2452108 
## ... Procrustes: rmse 0.008389072  max resid 0.07155341 
## Run 21 stress 0.2485014 
## Run 22 stress 0.2458439 
## Run 23 stress 0.2526073 
## Run 24 stress 0.2514402 
## Run 25 stress 0.253503 
## Run 26 stress 0.2522898 
## Run 27 stress 0.2501302 
## Run 28 stress 0.2476545 
## Run 29 stress 0.251593 
## Run 30 stress 0.2452109 
## ... Procrustes: rmse 0.008385411  max resid 0.07155543 
## Run 31 stress 0.2594707 
## Run 32 stress 0.2797743 
## Run 33 stress 0.2542847 
## Run 34 stress 0.2480837 
## Run 35 stress 0.2514054 
## Run 36 stress 0.2511438 
## Run 37 stress 0.2505249 
## Run 38 stress 0.2582468 
## Run 39 stress 0.2580423 
## Run 40 stress 0.2570759 
## Run 41 stress 0.2596559 
## Run 42 stress 0.2917437 
## Run 43 stress 0.2514907 
## Run 44 stress 0.2512227 
## Run 45 stress 0.258097 
## Run 46 stress 0.2456734 
## Run 47 stress 0.2644926 
## Run 48 stress 0.2457242 
## Run 49 stress 0.2451999 
## ... Procrustes: rmse 0.007929362  max resid 0.0714391 
## Run 50 stress 0.2448077 
## ... Procrustes: rmse 0.005733739  max resid 0.04747065 
## Run 51 stress 0.2534336 
## Run 52 stress 0.2704354 
## Run 53 stress 0.2544404 
## Run 54 stress 0.2511885 
## Run 55 stress 0.2505814 
## Run 56 stress 0.2588626 
## Run 57 stress 0.2447924 
## ... Procrustes: rmse 0.002638983  max resid 0.0326045 
## Run 58 stress 0.2571638 
## Run 59 stress 0.2478662 
## Run 60 stress 0.260839 
## Run 61 stress 0.2518501 
## Run 62 stress 0.2505426 
## Run 63 stress 0.2514963 
## Run 64 stress 0.2566792 
## Run 65 stress 0.2511014 
## Run 66 stress 0.2540744 
## Run 67 stress 0.2466613 
## Run 68 stress 0.2511703 
## Run 69 stress 0.2551103 
## Run 70 stress 0.2476687 
## Run 71 stress 0.2517213 
## Run 72 stress 0.2452107 
## ... Procrustes: rmse 0.008385849  max resid 0.0715598 
## Run 73 stress 0.2505126 
## Run 74 stress 0.2578466 
## Run 75 stress 0.2495588 
## Run 76 stress 0.2513394 
## Run 77 stress 0.2601173 
## Run 78 stress 0.2541433 
## Run 79 stress 0.2447909 
## ... Procrustes: rmse 0.002668734  max resid 0.03263473 
## Run 80 stress 0.2526854 
## Run 81 stress 0.2503426 
## Run 82 stress 0.2554527 
## Run 83 stress 0.2505249 
## Run 84 stress 0.2477735 
## Run 85 stress 0.249047 
## Run 86 stress 0.2515544 
## Run 87 stress 0.2498213 
## Run 88 stress 0.2470329 
## Run 89 stress 0.2454711 
## Run 90 stress 0.2571176 
## Run 91 stress 0.2555098 
## Run 92 stress 0.2502964 
## Run 93 stress 0.2532716 
## Run 94 stress 0.2447905 
## ... Procrustes: rmse 0.002703492  max resid 0.03268134 
## Run 95 stress 0.2468473 
## Run 96 stress 0.2458538 
## Run 97 stress 0.2509962 
## Run 98 stress 0.2473709 
## Run 99 stress 0.2525378 
## Run 100 stress 0.252097 
## *** No convergence -- monoMDS stopping criteria:
##      2: no. of iterations >= maxit
##     88: stress ratio > sratmax
##     10: scale factor of the gradient < sfgrmin

If it still hasn’t converged after 100 iterations, we can increase trymax to try to get a better solution. Go ahead and try and then compare to the figure below.

Create data for plotting:

PDat<-data.frame(NMDS1=NMDSdat$points[,1],
                 NMDS2=NMDSdat$points[,2],
                 SampleID=row.names(OTUs))

Add species labels using the merge function:

PDat<-merge(PDat,Samples,by="SampleID",all.x=T,all.y=F)

And plot:

qplot(x=NMDS1,y=NMDS2,colour=Species,alpha=I(0.6),data=PDat)+theme_bw()