Posts Tagged ‘hypothesis’

The distribution of rho…

There was a post here about obtaining non-standard p-values for testing the correlation coefficient. The R library `SuppDists` deals with this problem efficiently.

```
library(SuppDists)

# Density of the sample correlation for N = 23 and several true values of rho
plot(function(x) dPearson(x, N=23, rho=0.7), -1, 1, ylim=c(0,10), ylab="density")
plot(function(x) dPearson(x, N=23, rho=0), -1, 1, add=TRUE, col="steelblue")
plot(function(x) dPearson(x, N=23, rho=-.2), -1, 1, add=TRUE, col="green")
plot(function(x) dPearson(x, N=23, rho=.9), -1, 1, add=TRUE, col="red"); grid()

# Legend colours are listed in the same order as the labels
legend("topleft", col=c("black","steelblue","green","red"), lty=1,
       legend=c("rho=0.7","rho=0","rho=-.2","rho=.9"))
```

The result is a plot of the four densities on a common axis.

Now, let’s construct a table of critical values for a few significance levels, some conventional and some arbitrary.

```
q=c(.025,.05,.075,.1,.15,.2)
xtabs(qPearson(p=q, N=23, rho = 0, lower.tail = FALSE, log.p = FALSE) ~ q )
# q
#     0.025      0.05     0.075       0.1      0.15       0.2
# 0.4130710 0.3514298 0.3099236 0.2773518 0.2258566 0.1842217
```

We can calculate p-values as usual too…

```
1-pPearson(.41307,N=23,rho=0)
# [1] 0.0250003
```
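
As a cross-check that does not need `SuppDists`, here is a minimal base-R sketch. It assumes bivariate normal data and a true correlation of zero, in which case t = r·sqrt((N−2)/(1−r²)) follows Student's t distribution with N−2 degrees of freedom. It covers only the rho = 0 case, and the variable names (`r_crit`, `r_obs`, `p_val`) are illustrative, not part of any package.

```r
N  <- 23                                # sample size used above
q  <- c(.025, .05, .075, .1, .15, .2)   # upper-tail probabilities
df <- N - 2                             # degrees of freedom of the t statistic

# Critical values of r: invert t = r * sqrt(df / (1 - r^2)) for r
t_crit <- qt(q, df = df, lower.tail = FALSE)
r_crit <- t_crit / sqrt(t_crit^2 + df)

# Upper-tail p-value for an observed correlation
r_obs <- 0.41307
t_obs <- r_obs * sqrt(df / (1 - r_obs^2))
p_val <- pt(t_obs, df = df, lower.tail = FALSE)

print(round(r_crit, 4))
print(p_val)
```

The numbers should land close to the `qPearson` and `pPearson` results, though small differences in the later decimals are possible because the two routes are computed differently.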
Categories: statistics

Show me the mean(ing)…

Testing a bunch of samples for the largest population mean isn’t that common, yet a simple test is at hand. Under the self-explanatory title “The rank sum maximum test for the largest K population means”, the test relies on computing, for each sample, the sum of its ranks within the combined sample of size $nK$, where $n$ is the common size of the $K$ samples.
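
In symbols, as a sketch of the setup just described (the notation $r_{ij}$ and $R_j$ is introduced here for illustration and is not taken from the book): let $r_{ij}$ be the rank of the $i$-th observation of sample $j$ within the combined sample. Then

$$R_j = \sum_{i=1}^{n} r_{ij}, \quad j = 1, \dots, K, \qquad \sum_{j=1}^{K} R_j = \frac{nK(nK+1)}{2},$$

and the quantity compared against the tabulated critical value is $\max_j R_j$. The second identity holds whether or not there are ties (with average ranks), so it makes a handy sanity check on the computed rank sums.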

For illustration purposes the following data are used. They consist of 6 samples of 5 observations each, stored consecutively (observations 1 to 5 form sample 1, and so on).

```
> data
 [1]  4.17143986  1.31264787  0.12109036  0.63031601  1.56705511  0.58817076
 [7]  1.98011001  1.63226118 -0.03869368  1.80964611  4.80878278  0.67015153
[13]  2.07602321  1.52952749  1.68483297  2.00147364  9.30173048  0.58331012
[19]  2.49537140  1.31229842  1.40193543  0.11906268  4.76253012  1.26550467
[25]  0.69497074 -0.27612056  5.05751484  1.96589383  2.58427547 -0.36979229
```

Next we construct a convenient data frame

```
data.mat=expand.grid(x=rep(NA,5),sample=c("1","2","3","4","5","6"))
data.mat$x=data
data.mat$Rank=rank(data.mat$x)
```

and we compute the rank sum of each sample

```
R=rep(NA,6)
for (i in 1:6)
{
# sum of the ranks that belong to sample i
R[i]=sum(subset(data.mat,data.mat$sample==i)$Rank)
}
```

```
> rank(R)
[1] 3 2 5 6 1 4
```

So we would test whether the 4th sample has the largest population mean. First we need critical values.

`## Critical values 115/119/127/134 for 10%, 5%, 1% and 0.1%`
`> R[rank(R)==length(R)]>119`
`[1] FALSE`

The largest rank sum does not exceed the 5% critical value of 119, so we cannot conclude that the 4th sample has the largest population mean.
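
The steps above can be condensed into a few lines. The following is only a sketch: the names `x`, `grp`, `rank_sum`, `best` and `crit_5pct` are made up for illustration, and the critical value is simply the 5% figure quoted above, not something computed here.

```r
# Data from the post: 6 samples of 5 observations, stored consecutively
x <- c(4.17143986, 1.31264787, 0.12109036, 0.63031601, 1.56705511,
       0.58817076, 1.98011001, 1.63226118, -0.03869368, 1.80964611,
       4.80878278, 0.67015153, 2.07602321, 1.52952749, 1.68483297,
       2.00147364, 9.30173048, 0.58331012, 2.49537140, 1.31229842,
       1.40193543, 0.11906268, 4.76253012, 1.26550467, 0.69497074,
       -0.27612056, 5.05751484, 1.96589383, 2.58427547, -0.36979229)

grp <- factor(rep(1:6, each = 5))        # sample label of each observation

# Rank within the combined sample, then sum the ranks per sample
rank_sum <- as.vector(tapply(rank(x), grp, sum))

best      <- which.max(rank_sum)         # sample with the largest rank sum
crit_5pct <- 119                         # 5% critical value quoted above
exceeds   <- rank_sum[best] > crit_5pct  # TRUE would favour "largest mean"

print(rank_sum)
print(best)
print(exceeds)
```

Using `tapply` avoids the explicit loop and the helper data frame, at the cost of hiding the individual ranks, which the data frame version keeps around for inspection.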

Look it up… Gopal K. Kanji, 100 Statistical Tests, Sage Publications.

Categories: statistics