diversity indexes
Michael D. Noah
mnoah at gol.com
Mon Mar 11 08:23:24 EST 1996
At 10:40 AM 3/11/96 gmt+0200, you wrote:
>
>
>Dear all,
>I'm doing a research work on comparing coral reef fishes in two
>areas, using a video camera.
>What I need to know, urgently, is how to compare, statistically,
>diversity indexes to see if they are significant.
>Are there any scientific papers on similar studies?
>I would appreciate it if you would help me. Thanks!
>Sorry for any duplication!
>
>--
>amorim
I'm sure the list membership will correct me quickly if I'm wrong (and I'll
start off by saying that I'm by no means a statistician), but I researched
similarity and diversity indices in depth during my graduate program in the
early 1980s. I've since held the view that statistical inferences between
indices are usually meaningless. A diversity index (or any index, for that
matter, e.g., a habitat suitability index) attempts to represent in a single
value the mass of "information" that exists in large, often multidimensional
(species, space and time) data sets, and it loses much of that information
in the process.
An example: two or more diversity indices can be identical in their
respective values, but the underlying data upon which they are each based
can be entirely different. I won't go into the mathematics here, but
suffice it to say that you can generate the exact same diversity index
(e.g., Shannon and Weaver's H') from any number of data sets, each differing
from the others in the number of species, the number of individuals within
each species, and even the composition of the community encountered. When
the indices are identical, a statistical test on the index values will find
no significant difference, yet the underlying data sets can be widely
divergent.
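The point about identical index values can be checked numerically. The sketch below is only an illustration with made-up communities (the proportions, the function names, and the bisection search are my assumptions, not anything from the original study): it finds a three-species community whose Shannon H' matches that of a two-species community.

```python
import math

def shannon(proportions):
    """Shannon index H' = -sum(p * ln p), skipping absent species (p = 0)."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

# Hypothetical community 1: two equally abundant species, so H' = ln 2.
community_1 = [0.5, 0.5]
target = shannon(community_1)

def three_species(p):
    """Hypothetical community 2: one dominant species (share p), two equally rare."""
    return [p, (1 - p) / 2, (1 - p) / 2]

# For this family H' decreases from ln 3 (p = 1/3) to 0 (p = 1),
# so a bisection search can locate the p that gives H' = ln 2.
lo, hi = 1 / 3, 1.0
for _ in range(100):
    mid = (lo + hi) / 2
    if shannon(three_species(mid)) > target:
        lo = mid  # still too diverse: increase dominance
    else:
        hi = mid  # not diverse enough: decrease dominance

community_2 = three_species((lo + hi) / 2)

print("community 1:", community_1, "H' =", shannon(community_1))
print("community 2:", community_2, "H' =", shannon(community_2))
```

Both communities report the same H', although one has two species and the other three, with very different evenness. Any test that sees only the index values cannot tell them apart.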
I also recall reading once that species diversity indices in particular are
often highly correlated with each other, due simply to the formulas that are
used to calculate them. I apologize, I don't recall off-hand the paper that
I read that described this "phenomenon (?)," but I'd be happy to look it up
in my collection if you're interested.
Combine this problem with the failure of most data commonly encountered in
ecological field studies to even closely satisfy the assumptions of
classical statistical methods and probability theory (independent and normal
error distributions, homogeneity of variances, additivity of effects, etc.),
plus the usual complications of missing data and mixed data types (binary,
rank, quantitative), and I think one would be hard pressed to stand on any
statistically "significant differences" between two or more indices.
I said "usually" above; there may be limited instances where indices can be
used as predictor variables, IF considerable thought is given a priori to
the underlying hypotheses and the potential relationships that may exist
between the variables. Assume, for example, that you have an impacted area
and a control area, and you want to develop an index that "describes" that
impact. After sampling both areas and entering the log-transformed
abundances of the species encountered into a discriminant analysis, the
linear additive discriminant function of those log-transformed abundances
would represent (by definition) the best predictor of that impact, and the
most efficient test of the null hypothesis: "no impact."
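To make the idea of a discriminant-function index concrete, here is a minimal sketch of Fisher's two-group linear discriminant in plain Python. Everything specific in it is an assumption for illustration: the counts are invented, only two species are used so that the 2x2 matrix inverse can be written out by hand, and log(x + 1) is just one common choice of log transform. A real analysis would use a statistics package and many more species and samples.

```python
import math

# Invented counts for illustration only: (species A, species B) per sample.
control = [(12, 30), (15, 25), (9, 41), (14, 28), (11, 35)]
impacted = [(3, 60), (5, 48), (2, 75), (6, 52), (4, 66)]

def log_transform(samples):
    """log(x + 1) transform of every count (the +1 is an assumed convention)."""
    return [[math.log(count + 1) for count in sample] for sample in samples]

def mean_vector(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def scatter(rows, mean):
    """2x2 sums of squares and cross-products about the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

x_ctl, x_imp = log_transform(control), log_transform(impacted)
m_ctl, m_imp = mean_vector(x_ctl), mean_vector(x_imp)

# Pooled within-group covariance matrix.
s_ctl, s_imp = scatter(x_ctl, m_ctl), scatter(x_imp, m_imp)
dof = len(x_ctl) + len(x_imp) - 2
pooled = [[(s_ctl[i][j] + s_imp[i][j]) / dof for j in range(2)] for i in range(2)]

# Discriminant weights w = pooled^-1 (m_imp - m_ctl), via the 2x2 inverse.
det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
diff = [m_imp[0] - m_ctl[0], m_imp[1] - m_ctl[1]]
weights = [
    (pooled[1][1] * diff[0] - pooled[0][1] * diff[1]) / det,
    (pooled[0][0] * diff[1] - pooled[1][0] * diff[0]) / det,
]

def score(row):
    """The 'index': a weighted sum of log-transformed abundances."""
    return weights[0] * row[0] + weights[1] * row[1]

scores_ctl = [score(r) for r in x_ctl]
scores_imp = [score(r) for r in x_imp]
print("weights:", weights)
print("control scores: ", scores_ctl)
print("impacted scores:", scores_imp)
```

Unlike a diversity index, each sample's score here keeps track of which species changed and in which direction, because every species carries its own weight.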
The definition of impacted and control areas could be incorporated into the
analysis by first performing a cluster analysis to group the samples into
faunally homogeneous assemblages. If a two-group solution could somehow be
interpreted as representing the impacted and control groups of samples that
are spatially contiguous, discriminant analysis could be used to define an
index of the faunal differences between those two groups. If, however, the
two groups were *derived* by cluster analysis, then no significance test
would be appropriate since the two groups were created from the outset so as
to maximize the differences on the discriminating variables. However, if
the groups were defined a priori, then tests of the null hypothesis of no
difference in species composition between impacted and control areas may be
appropriate.
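The circularity of testing groups that were derived by clustering can be seen in a small simulation. This is a hedged sketch, not a recipe: the data are random draws from a single population, a basic one-dimensional two-means split stands in for a full cluster analysis, and Welch's t statistic stands in for whatever test one might be tempted to run. All names and parameters are my own.

```python
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for two samples (no p-value is computed)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

def two_means_1d(values, iterations=50):
    """Split values into two clusters with a basic 1-D k-means (k = 2)."""
    low_center, high_center = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        low_center, high_center = statistics.mean(low), statistics.mean(high)
    return low, high

rng = random.Random(0)
# One homogeneous population: there is no real "impact" anywhere in these data.
values = [rng.gauss(0.0, 1.0) for _ in range(40)]

# Groups derived from the data themselves by clustering.
low, high = two_means_1d(values)
t_clustered = welch_t(high, low)

# Groups fixed before looking at the data (first 20 vs. last 20 samples).
t_a_priori = welch_t(values[:20], values[20:])

print(f"t, groups derived by clustering: {t_clustered:.2f}")
print(f"t, groups fixed a priori:        {t_a_priori:.2f}")
```

The clustered groups produce a large t statistic even though every value came from the same distribution, simply because the groups were built to be as different as possible. Only the a priori split gives a statistic that can be interpreted in the usual way.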
My suggestion: first, know what question you are asking, and then second,
try to use a statistical method that retains more of the underlying
biological information.
As for papers that you might want to consider, try R. Pikul, 1974,
"Development of environmental indices," in Statistical and Mathematical
Aspects of Pollution Problems, J.W. Pratt (Ed.), Marcel Dekker, New York.
Hope this helps
Michael Noah
U.S. Navy, COMFLEACT Yokosuka
Environmental Department
PSC 473 Box 1 Code 1000
FPO AP 96349-1100
243-7311 / 011-81-311-743-7311
FAX 243-9027 / 011-81-311-743-9027
Michael D. Noah mnoah at gol.com
"Mother, mother ocean, I have heard your call" J. Buffett
University of Nebraska Cornhuskers
National Champions 1970-71, 1994-95