## The distribution of rho…

There was a post here about obtaining non-standard p-values for testing the correlation coefficient. The R library SuppDists deals with this problem efficiently.

```
library(SuppDists)
# Density of the sample correlation coefficient for N = 23 and several values of rho
plot(function(x) dPearson(x, N=23, rho=0.7), -1, 1, ylim=c(0,10), ylab="density")
plot(function(x) dPearson(x, N=23, rho=0), -1, 1, add=TRUE, col="steelblue")
plot(function(x) dPearson(x, N=23, rho=-.2), -1, 1, add=TRUE, col="green")
plot(function(x) dPearson(x, N=23, rho=.9), -1, 1, add=TRUE, col="red")
grid()
# Legend colours are listed in the same order as the labels
legend("topleft", col=c("black","steelblue","green","red"), lty=1,
       legend=c("rho=0.7","rho=0","rho=-.2","rho=.9"))
```

This is what the resulting plot looks like.

Now, let’s construct a table of critical values for a few significance levels, some conventional and some arbitrary.

```
q=c(.025,.05,.075,.1,.15,.2)
xtabs(qPearson(p=q, N=23, rho = 0, lower.tail = FALSE, log.p = FALSE) ~ q )
# q
# 0.025 0.05 0.075 0.1 0.15 0.2
# 0.4130710 0.3514298 0.3099236 0.2773518 0.2258566 0.1842217
```
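For the special case rho = 0, the table can be cross-checked in base R, using the standard result that t = r·sqrt((N−2)/(1−r²)) follows a Student t distribution with N−2 degrees of freedom. The sketch below inverts that relation. The helper name `r_crit` is made up for illustration and is not part of SuppDists, and the results should be expected to agree with the values above closely rather than digit for digit.

```r
# Upper-tail critical values of r under rho = 0, via the t distribution.
# r_crit is an illustrative helper, not a SuppDists function.
r_crit <- function(alpha, N) {
  t <- qt(alpha, df = N - 2, lower.tail = FALSE)  # upper-tail t quantile
  t / sqrt(N - 2 + t^2)                           # invert t = r*sqrt((N-2)/(1-r^2))
}

q <- c(.025, .05, .075, .1, .15, .2)
crit <- r_crit(q, N = 23)
round(crit, 4)
```

This only covers rho = 0; for other values of rho the t relation no longer holds and something like qPearson is needed.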

We can calculate p-values as usual too…

```
1-pPearson(.41307,N=23,rho=0)
# [1] 0.0250003
```
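When rho = 0, a similar p-value can be obtained in base R by transforming the observed r to t = r·sqrt((N−2)/(1−r²)) and using the t distribution with N−2 degrees of freedom. This is a minimal sketch; `p_upper` is a hypothetical helper name, and the result is only expected to be close to the pPearson value, not identical.

```r
# Upper-tail p-value for an observed r under rho = 0 (illustrative helper)
p_upper <- function(r, N) {
  t <- r * sqrt((N - 2) / (1 - r^2))
  pt(t, df = N - 2, lower.tail = FALSE)
}

p <- p_upper(.41307, N = 23)
p        # one-sided p-value
2 * p    # two-sided p-value, by symmetry of the null distribution
```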

## Show me the mean(ing)…

Testing a bunch of samples for the largest population mean isn’t that common, yet a simple test is at hand. Under the self-explanatory title “*The rank sum maximum test for the largest K population means*”, the test relies on the sum of ranks of each sample within the combined sample of size *Kn*, where *n* is the common size of the *K* samples.

For illustration purposes the following data are used. They consist of 6 samples of 5 observations.

```
data
#  [1]  4.17143986  1.31264787  0.12109036  0.63031601  1.56705511  0.58817076
#  [7]  1.98011001  1.63226118 -0.03869368  1.80964611  4.80878278  0.67015153
# [13]  2.07602321  1.52952749  1.68483297  2.00147364  9.30173048  0.58331012
# [19]  2.49537140  1.31229842  1.40193543  0.11906268  4.76253012  1.26550467
# [25]  0.69497074 -0.27612056  5.05751484  1.96589383  2.58427547 -0.36979229
```

Next we construct a convenient data frame holding each observation, its sample label and its rank in the combined sample

```
data.mat=expand.grid(x=rep(NA,5),sample=c("1","2","3","4","5","6"))
data.mat$x=data
data.mat$Rank=rank(data.mat$x)
```

and we compute the rank sum of each sample

```
R=rep(NA,6)
for (i in 1:6)
{
  R[i]=sum(subset(data.mat,data.mat$sample==i)$Rank)
}
rank(R)
# [1] 3 2 5 6 1 4
```
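The same rank sums can be obtained without the loop. The sketch below is self-contained: it assumes the 30 observations are stored sample by sample (five per sample, as above), and the names `x`, `grp` and `rank_sums` are chosen here for illustration.

```r
# The 30 observations, in the order shown above (6 samples of 5)
x <- c(4.17143986, 1.31264787, 0.12109036, 0.63031601, 1.56705511,
       0.58817076, 1.98011001, 1.63226118, -0.03869368, 1.80964611,
       4.80878278, 0.67015153, 2.07602321, 1.52952749, 1.68483297,
       2.00147364, 9.30173048, 0.58331012, 2.49537140, 1.31229842,
       1.40193543, 0.11906268, 4.76253012, 1.26550467, 0.69497074,
       -0.27612056, 5.05751484, 1.96589383, 2.58427547, -0.36979229)

grp <- rep(1:6, each = 5)                # sample label of each observation
rank_sums <- tapply(rank(x), grp, sum)   # rank sum per sample
rank_sums
which.max(rank_sums)                     # sample with the largest rank sum
```

A quick sanity check is that the rank sums add up to 30·31/2 = 465, the sum of the ranks 1 to 30.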

The 4th sample has the largest rank sum, so we would test whether it has the largest population mean. First we need critical values.

```
## Critical values 115/119/127/134 for 10%, 5%, 1% and 0.1%
R[rank(R)==length(R)] > 119
# [1] FALSE
```
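The critical values come from the reference's table. As a rough sanity check, they can be approximated by simulation, assuming that under the null hypothesis every assignment of the ranks 1 to 30 to the six samples is equally likely. This is a Monte Carlo sketch, not the method used to build the table; the seed, the number of replications and the variable names are arbitrary choices made here.

```r
set.seed(1)
K <- 6   # number of samples
n <- 5   # common sample size

# Simulate the largest rank sum under random assignment of ranks to samples
max_rank_sum <- replicate(20000, {
  ranks <- matrix(sample(K * n), nrow = n)  # each column is one sample
  max(colSums(ranks))
})

# Approximate upper 10%, 5%, 1% and 0.1% points of the null distribution
quantile(max_rank_sum, c(0.90, 0.95, 0.99, 0.999))
```

The simulated quantiles should land in the neighbourhood of the tabulated values, with some Monte Carlo noise, especially in the far tail.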

So, since the largest rank sum (94) does not exceed the 5% critical value of 119, we cannot conclude that the 4th sample has the largest population mean.

**Look it up…** Gopal K. Kanji, *100 Statistical Tests*, Sage Publications.