efficiency (partial) balance

R A Bailey r.a.bailey at qmul.ac.uk
Tue Nov 4 11:44:33 GMT 2003


This message concerns efficiency factors, but an immediate conclusion
is that we should say nothing about statistical properties of
unequally replicated designs in the forthcoming release.

For i=1, 2, let L_i be the information matrix of design i and suppose
that the variance per plot when this design is used is \sigma_i^2.
Then the variance of the estimator of the contrast x'\tau is
x'L_i^{-}x \sigma_i^2.  Here L^{-} denotes any generalized inverse of
L, though, since L is symmetric,  I almost always use its canonical
generalized inverse, which some people call Moore-Penrose.  

Thus, as far as x'\tau is concerned, the relative efficiency of 
design 1 with respect to design 2 is
\frac{x'L_2^{-}x \sigma_2^2}{x'L_1^{-}x \sigma_1^2}.
(I think that efficiencies in statistics are always relative, like
this.)  Because we don't know the value of either of the \sigma^2
before we've done the experiment, we define the efficiency factor (for
design 1 relative to design 2) to be
\frac{x'L_2^{-}x}{x'L_1^{-}x}.

[I think that there should be no dissent so far.]
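
A minimal numerical sketch of the efficiency factor defined above,
assuming NumPy.  The function name and the example matrices are
illustrative assumptions, not part of any existing package.  The
example takes design 1 to be 3 treatments in 3 blocks of size 2 (every
pair once) and design 2 to be the unblocked design with replication 2,
with its information matrix taken after removing the overall mean.

```python
import numpy as np

def efficiency_factor(x, L1, L2):
    """Efficiency factor of design 1 relative to design 2 for contrast x.

    L1, L2 are the information matrices; x should be a contrast
    (entries summing to zero) so that x'tau is estimable.
    """
    x = np.asarray(x, dtype=float)
    # np.linalg.pinv is the Moore-Penrose (canonical) generalized inverse.
    q1 = x @ np.linalg.pinv(L1) @ x   # x'L_1^{-}x
    q2 = x @ np.linalg.pinv(L2) @ x   # x'L_2^{-}x
    return q2 / q1

# Assumed example: blocks {1,2}, {1,3}, {2,3}, so r = 2, k = 2.
I3, J3 = np.eye(3), np.ones((3, 3))
L1 = 1.5 * I3 - 0.5 * J3            # blocked design
L2 = 2.0 * I3 - (2.0 / 3.0) * J3    # unblocked, same replications
factor = efficiency_factor([1, -1, 0], L1, L2)
print(factor)
```

For this example the factor is 3/4 for every contrast, which agrees
with the usual value v*lambda/(r*k) for a balanced incomplete-block
design with v = 3, lambda = 1, r = 2, k = 2.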

Now, it is common to take design 2 to be a complete block design in the
equally replicated case, but this doesn't extend to unequal
replication so it is better to take design 2 to be the unblocked design
with the same replications.  Now, here comes the common mistake.
People generally say that L_2 = diag(r) in this case, where r is the
vector of replications.  I think that they shouldn't.  Look at any
anova table for a CRD almost anywhere in the world and you will see a
line saying ``total = n-1'', where n is the number of plots.  This is
because the data is being silently projected onto the orthogonal
complement of the all-1 vector on plots before anything else
happens.  Look at the last Section
of Chapter 2 in
http://www.maths.qmw.ac.uk/~rab/DOEbook/
to see this made explicit.  So we ought to put L_2 = X'(I-J/n)X, where
X is the design matrix.  Then L_2 = diag(r) - rr'/n.  
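
The identity follows because X'X = diag(r) and X'1 = r, so that
X'JX = rr'.  A small numerical check, assuming NumPy and an invented
unequally replicated layout (the helper name and the plot labels are
hypothetical):

```python
import numpy as np

def unblocked_information_matrix(treatments, v):
    """Return X'(I - J/n)X for an unblocked design.

    treatments: treatment label (0..v-1) applied to each plot.
    """
    treatments = np.asarray(treatments)
    n = treatments.size
    X = np.zeros((n, v))
    X[np.arange(n), treatments] = 1.0          # plots-by-treatments design matrix
    centering = np.eye(n) - np.ones((n, n)) / n  # I - J/n
    return X.T @ centering @ X

# Assumed example: 6 plots with replications r = (3, 2, 1).
labels = [0, 0, 0, 1, 1, 2]
L2 = unblocked_information_matrix(labels, 3)
r = np.bincount(labels, minlength=3).astype(float)
closed_form = np.diag(r) - np.outer(r, r) / r.sum()
print(np.allclose(L2, closed_form))
```

Note that this L_2 sends the all-1 treatments vector to zero, whereas
diag(r) does not.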

So the definition of efficiency balance ought to be that
\frac{x'M^{-}x}{x'L^{-}x} is constant for contrasts x, where L is the
information matrix of the design we are considering and
M = diag(r) - rr'/n.  Now, M is positive semidefinite so it has a
canonical square root N, so I think the definition should be that 
N^{-}LN^{-} is scalar on the image of N (and zero on the all-1s
treatments vector, which N sends to zero).  
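
One way to test this proposed definition numerically is sketched
below, assuming NumPy.  The function name, the tolerance and the
example are illustrative assumptions.  N is taken to be the symmetric
positive semidefinite square root of M, obtained from an
eigendecomposition, and N^{-} is its Moore-Penrose inverse.

```python
import numpy as np

def efficiency_balance_constant(L, r, tol=1e-9):
    """Return c if N^{-} L N^{-} equals c times the projector onto
    the image of N, where N*N = M = diag(r) - rr'/n; else None."""
    L = np.asarray(L, dtype=float)
    r = np.asarray(r, dtype=float)
    M = np.diag(r) - np.outer(r, r) / r.sum()
    w, U = np.linalg.eigh(M)            # M = U diag(w) U'
    keep = w > tol                      # nonzero eigenvalues of M
    inv_sqrt = np.zeros_like(w)
    inv_sqrt[keep] = 1.0 / np.sqrt(w[keep])
    N_pinv = (U * inv_sqrt) @ U.T       # Moore-Penrose inverse of N
    P = (U * keep.astype(float)) @ U.T  # projector onto the image of N
    A = N_pinv @ L @ N_pinv
    c = np.trace(A) / np.trace(P)
    return c if np.allclose(A, c * P, rtol=0.0, atol=1e-8) else None

# Assumed example: 3 treatments in blocks {1,2}, {1,3}, {2,3}.
L = 1.5 * np.eye(3) - 0.5 * np.ones((3, 3))
c = efficiency_balance_constant(L, [2, 2, 2])
print(c)
```

In this equally replicated example the constant is 3/4.  For a design
that fails the test, such as the two blocks {1,2}, {2,3} with
replications (1, 2, 1), the function returns None.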

We can get away with being sloppy for efficiency balance.  To be fair,
Emlyn Williams, who defined the concept, almost always has a phrase
like `let y be the mean-corrected data', so the sloppy definition
will be fine in practice.  But if we want to extend it to some idea of
efficiency partial balance then I think we need to start with the
matrix N^{-}LN^{-}.

RAB 
-- 
R. A. Bailey
       
Snail:  School of Mathematical Sciences    Tel:  (+44) 20 7882 5517
        Queen Mary, University of London
        Mile End Road                      Email: r.a.bailey at qmul.ac.uk
        London E1 4NS
        U.K.



