
Given a set of distances (dis-similarities) between objects, is it possible to recreate a dimensional representation of those objects?

Model: The distance between objects x and y is the square root of the sum of their squared differences on each of k dimensions: $d_{xy} = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$
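As a quick check of the model, the formula can be evaluated by hand and compared with R's built-in dist() function. The two points below are made up purely for illustration.

```r
# Two hypothetical objects located in k = 2 dimensions
x <- c(0, 0)
y <- c(3, 4)

# Distance from the model: square root of the sum of squared differences
d_manual <- sqrt(sum((x - y)^2))

# The same distance from dist(), whose default method is Euclidean
d_builtin <- as.numeric(dist(rbind(x, y)))

d_manual     # 5
d_builtin    # 5
```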

Data: a matrix of distances

Find the dimensional values (coordinates) of the objects in k = 1, 2, … dimensions that best reproduce the original data.
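One way to find such coordinates is classical (metric) scaling, which is the approach taken by the cmdscale function in base R. The following is only a minimal sketch of the idea, not the function's actual source: double-center the squared distances and take the leading eigenvectors. The points in pts and the names D, J, B, and coords are hypothetical and chosen for illustration.

```r
# Hypothetical configuration: 5 objects in 2 dimensions
pts <- matrix(c(0, 0,
                3, 0,
                3, 4,
                0, 4,
                1, 2), ncol = 2, byrow = TRUE)
D <- as.matrix(dist(pts))            # the "data": a matrix of distances

n <- nrow(D)
J <- diag(n) - matrix(1 / n, n, n)   # centering matrix
B <- -0.5 * J %*% D^2 %*% J          # double-centered squared distances

e <- eigen(B, symmetric = TRUE)      # eigenvalues are returned in decreasing order
k <- 2                               # number of dimensions to keep
coords <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]), k)

# Largest discrepancy between the original and the reproduced distances
max(abs(as.matrix(dist(coords)) - D))
```

Because these distances really did come from two dimensions, the reproduced distances match the originals up to rounding error. The recovered coordinates themselves agree with pts only up to shifts, rotations, and reflections.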

Example: Consider the distances between eleven American cities. Can we represent these cities in a two dimensional space?

library(psych)
library(psychTools)
data(cities)
cities
##      ATL  BOS  ORD  DCA  DEN  LAX  MIA  JFK  SEA  SFO  MSY
## ATL    0  934  585  542 1209 1942  605  751 2181 2139  424
## BOS  934    0  853  392 1769 2601 1252  183 2492 2700 1356
## ORD  585  853    0  598  918 1748 1187  720 1736 1857  830
## DCA  542  392  598    0 1493 2305  922  209 2328 2442  964
## DEN 1209 1769  918 1493    0  836 1723 1636 1023  951 1079
## LAX 1942 2601 1748 2305  836    0 2345 2461  957  341 1679
## MIA  605 1252 1187  922 1723 2345    0 1092 2733 2594  669
## JFK  751  183  720  209 1636 2461 1092    0 2412 2577 1173
## SEA 2181 2492 1736 2328 1023  957 2733 2412    0  681 2101
## SFO 2139 2700 1857 2442  951  341 2594 2577  681    0 1925
## MSY  424 1356  830  964 1079 1679  669 1173 2101 1925    0

Printing cities shows the original distance matrix (just to make sure we put it in correctly). The commands below then give the x, y coordinates for each city, followed by a graph of the solution.

city.location <- cmdscale(cities, k=2)    #ask for a 2 dimensional solution
round(city.location,0)        #print the locations to the screen
##      [,1] [,2]
## ATL  -571  248
## BOS -1061 -548
## ORD  -264 -251
## DCA  -861 -211
## DEN   616   10
## LAX  1370  376
## MIA  -959  708
## JFK  -970 -389
## SEA  1438 -607
## SFO  1563   88
## MSY  -301  577

This solution can be represented graphically:

plot(city.location,type="n", xlab="Dimension 1", ylab="Dimension 2",main ="cmdscale(cities)")    #put up a graphics window
text(city.location,labels=names(cities))     #put the cities into the map

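Note that the solution is not quite what we expected: it gives us an upside-down, Australian orientation of American cities, with east on the left and north at the bottom. Because the distances between points do not change when a dimension is reversed, the orientation of the solution is arbitrary. By reversing the signs in city.location, we get the more conventional representation: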

city.location <- -city.location
plot(city.location,type="n", xlab="Dimension 1", ylab="Dimension 2",main ="cmdscale(cities)")    #put up a graphics window
text(city.location,labels=names(cities))     #put the cities into the map
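Reversing signs is legitimate because the solution is defined only by the distances between points. A small check with made-up coordinates (loc_demo is a hypothetical name, not part of the cities data) illustrates this:

```r
# Hypothetical coordinates for four objects in two dimensions
loc_demo <- matrix(c(-2,  1,
                      0,  3,
                      1, -1,
                      4,  0), ncol = 2, byrow = TRUE)

flipped_both <- -loc_demo                     # reverse both dimensions (a 180 degree rotation)
flipped_one  <- loc_demo %*% diag(c(1, -1))   # reverse only dimension 2 (a mirror image)

# The inter-point distances are identical in all three configurations
all.equal(as.vector(dist(loc_demo)), as.vector(dist(flipped_both)))   # TRUE
all.equal(as.vector(dist(loc_demo)), as.vector(dist(flipped_one)))    # TRUE
```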

A useful feature of R is that most commands have an extensive help file. Asking for help(cmdscale) shows that R includes a distance matrix (eurodist) for 21 European cities. The following commands (taken from the help file) produce a nice two dimensional solution. (Note that since the direction of each dimension is arbitrary, the second dimension needs to be flipped to produce the conventional map of Europe.)

loc <- cmdscale(eurodist, k = 2)
x <- loc[,1]
y <- -loc[,2]
plot(x, y, type="n", xlab="", ylab="", main="cmdscale(eurodist)")
text(x, y, colnames(as.matrix(eurodist)), cex=0.8)
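To judge how well a solution in k dimensions reproduces the original data, one option is to ask cmdscale for its eigenvalue output (eig = TRUE) and look at the GOF component described in help(cmdscale). A second, informal check is to correlate the original distances with those implied by the fitted coordinates. The sketch below uses the built-in eurodist data; the names fit1, fit2, fit3, gof, and r_fit are arbitrary.

```r
# Fit solutions in 1, 2, and 3 dimensions, keeping the eigenvalue information
fit1 <- cmdscale(eurodist, k = 1, eig = TRUE)
fit2 <- cmdscale(eurodist, k = 2, eig = TRUE)
fit3 <- cmdscale(eurodist, k = 3, eig = TRUE)

# First goodness-of-fit index for each solution (closer to 1 is better)
gof <- c(k1 = fit1$GOF[1], k2 = fit2$GOF[1], k3 = fit3$GOF[1])
round(gof, 2)

# Correlation between the original distances and the distances
# reproduced from the two dimensional coordinates
r_fit <- cor(as.vector(eurodist), as.vector(dist(fit2$points)))
round(r_fit, 3)
```

If adding a dimension barely improves the fit, the lower dimensional solution is usually the more useful description.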

For gene expression matrices, use the limma::plotMDS function; see http://web.mit.edu/~r/current/arch/i386_linux26/lib/R/library/limma/html/plotMDS.html