I've recently read an advertisement (in French) with the following mention: the efficiency of the product (a mascara for eyelashes) was proved by the "third quartile method" ("methode du troisieme quartile" in French). Can someone explain what this third quartile method is, or give any reference? Thanks.

*********************************
Stephane CHAMPELY
IUT STID
Ile du Saulcy
57045 METZ cedex 1
tel: 87 31 51 62 fax: 87 31 51 55
E-mail: champely@iut.univ-metz.fr
*********************************
>Received: 5/12/96 10:03 am
>From: G Asha, asha@CAS.IISC.ERNET.IN
>My friend has collected data which has the following info.
>
>Dependent variable: Achievement in Biology
>Ind variables: self-conf, home adjustment, health adjustment etc. -- 11 in total.
>
>She needs to do multiple (step-wise) regression analysis. I am familiar with multiple regression, but do not know how to go about step-wise regression. Can someone please advise me about this? Even a ref book or some simple stat package will do.

No she doesn't. She must now look at the data and ask what shape (if any) the relationship between the predictor and the achievement score is. Then she might ask if the relationship is genuinely cause-and-effect or (partly) a product of the relationship between the predictor variable and another variable. She needs to build an intelligent model, and that's something computers don't do for you. Otherwise we could all go home right now.

One bit of advice: do the analysis separately for males and females. Several people, including myself, have found that achievement is more strongly related to student characteristics in females - that is, that women perform more in line with their apparent abilities and motivations.

Ronan M Conroy
Lecturer in Biostatistics
Royal College of Surgeons
Dublin 2, Ireland
voice +353 1 402 2431 fax +353 1 402 2329

'Do not try to be a genius in every bar' [Brendan Behan? - No, Faure!]
You may want to take a look at this site: http://www.usps.gov/websites/depart/inspect/chainlet.htm It is the site of the US Postal Service. Yes, these chain letters and other pyramid schemes ARE illegal, and can result in some pretty heavy fines and/or imprisonment. And still they post their garbage here. But one day... there WILL BE a knock on the door.

webmaster@marcap.com
http://www.marcap.com
Please recommend a source for a novice to learn about modeling dose levels and exploring the effect of different ingestion patterns. I have been exploring difference equations but my math is weak. I hope that this is an appropriate question for this list. Thanks.

Fancher E. Wolfe, Professor
Mathematics and Statistics
Metropolitan State University
730 Hennepin Ave. Minneapolis, MN 55403-1897
fwolfe@msus1.msus.edu 612-341-7256
In article <32a6fb03.22439144@news.southeast.net>, Matt Beckwith wrote:
>tgee@superior.carleton.ca (Travis Gee) wrote:
>>beckwith@pop.southeast.net (Matt Beckwith) writes:
>>>(1) Alpha is the probability that you have rejected that which is true.
>>Not exactly. In classic hypothesis testing, alpha is the probability that you will falsely reject the "null hypothesis" when it is true.
>You said: Given that the null hypothesis is true, alpha is the probability that my test design will (inappropriately) reject it anyway.
>I said: Given that my experiment has rejected the null hypothesis, alpha is the probability that it is actually true.
>Are these not logically equivalent?

Very definitely not. And no matter how often this is pointed out, this mistake is made, and I believe it is subconsciously made even by those who know better.

Suppose that the null hypothesis is that the coin has probability .5 of coming up heads (an "honest" coin). We toss it 100 times, and produce a test which will reject with probability .05 if the coin is truly honest. Now suppose the probability that the coin comes up heads is close, say .4999. The probability that the hypothesis will be rejected is not much different. Why should someone think that the probability is .5000 rather than .4999? The test essentially does not distinguish between them.

In general, the point null hypothesis is never true. Is that coin EXACTLY honest? I suggest you think about the question you really want to ask. The problem is not trivial then, but "alpha" is also not the answer.

--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
hrubin@stat.purdue.edu Phone: (317)494-6054 FAX: (317)494-0558
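Rubin's coin example is easy to check by simulation. The sketch below (illustrative modern Python, with an arbitrary two-sided cutoff; not part of the original exchange) estimates the rejection rate for an exactly honest coin and for one with P(heads) = .4999:

```python
import random

random.seed(1)

def rejection_rate(p, trials=10000):
    # Toss a coin with P(heads) = p 100 times per trial; reject the
    # "honest coin" null when the head count is 10 or more away from
    # 50 (a two-sided cutoff with alpha in the neighborhood of .05).
    rejections = 0
    for _ in range(trials):
        heads = sum(random.random() < p for _ in range(100))
        if abs(heads - 50) >= 10:
            rejections += 1
    return rejections / trials

alpha = rejection_rate(0.5)      # rejection rate when the null is exactly true
near = rejection_rate(0.4999)    # rejection rate when it is false by a hair
print(alpha, near)
```

The two rates come out essentially equal, which is Rubin's point: the test cannot tell .5000 from .4999, so a rejection says little about whether the point null is exactly true.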
In article <1996Dec4.155947.1@rddvax.decnet.lockheed.com>, the poster wrote:
>Can anyone provide information or suggest references about the accuracy and dependability of using PSEUDO-random sequences to simulate rare events (10^-3 to 10^-7). The occurrence of a rare event at time n depends in a complicated way on the values of the preceding 3 to 30 pseudo-random numbers.

Even if it was 3, I would be suspicious. With 30, even more so. Also, if you are using acceptance-rejection procedures, even very long-term effects can enter. There are generally better ways to do it, but this is an art, not a science.

--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
hrubin@stat.purdue.edu Phone: (317)494-6054 FAX: (317)494-0558
Sean Lahman wrote:
> 4 - Conclusions
>
> If I had to sum it up in one sentence, I would say that the means have remained fairly consistent, while the variance has gradually dropped. This would lead me to believe there is no evidence of a "segregation-inflation" effect. I'm guessing that the decrease in variance indicates a general increase in the level of play.

The problem, as I see it, is that you are looking for an effect in a data set that may or may not be present, but you have not tried (or were not able) to remove other effects happening simultaneously or nearly simultaneously in the population you are studying. Perhaps if you removed the effects of increasing usage of relief pitchers, night baseball, better training methods, coast-to-coast travel, etc., there might be an easily recognizable "segregation-inflation" effect. Maybe not. I'm unconvinced that your data support the conclusion you come to.

+---------------------------------+------------------------------+
| Paige Miller, Eastman Kodak Co. | "Let's play some basketball" |
| PaigeM@kodak.com                |  Michael Jordan in Space Jam |
+---------------------------------+------------------------------+
| The opinions expressed herein do not necessarily reflect the   |
| views of the Eastman Kodak Company.                            |
+----------------------------------------------------------------+
Sean Lahman (lahmans@vivanet.com) wrote:
<< deleted: comments and first questions about analysing batting data from major league baseball >>

: 5 - Questions
: a) Am I misinterpreting the data?
: b) Does this analysis adequately address the original problem, or is it a much too simplistic approach?

Yes, I would say, simplistic and misinterpreting...

They lowered the pitching mound in about 1955, or maybe 1965, in order to reduce the pitcher's advantages, because batters were doing too poorly. The first generations of players had different rules for what constituted the "batting average". In the history of Major League Baseball, "sacrifices" or walks (or maybe both) have been treated differently than the way they are treated today.

Maybe you should consider the *physical size* of players as demonstration that major league ball is different than it was. Today, a player who is not more than 6 feet tall is a rather short one. Pitchers who were 6'5" used to be rare, instead of the median.

Rich Ulrich, wpilib+@pitt.edu
User wrote:
>
> Hi, I am trying to learn principal component analysis by myself. At the moment I am a bit stuck on getting the pc-scores using scaled eigenvectors. Could someone give me some pointers on how to get the z-scores based on eigenvectors scaled by the square-root of the eigenvalues?
>
> I understand that A=UL^0.5
> (A=scaled eigenvectors, U=eigenvectors, L=eigenvalues)
>
> and to get the z-scores, the equation is
>
> Zi=UiT[X-Xbar]
>
> T=transpose, Xbar=mean vector of the sample
>
> The operation is to multiply the deviation matrix by the eigenvectors (I think). I am using matlab for the matrix manipulation, but can't seem to get the right answer. First, UiT rearranges the eigenvectors in rows and the variables in columns; for a data matrix of n specimens by p variables, the transposed matrix of eigenvectors is m by p, which is different in dimension to the deviation matrix, which is n by p, and the matrices cannot be multiplied. If I use [X-Xbar]*U, the resultant matrix is n by m, yet the values are not correct. I think I am missing some vital steps; could someone give me a couple of pointers on how to get the z-scores for principal component analysis using matrix operations?

First, you are very close. The problem is that in the formula Zi=UiT[X-Xbar] as you have written above, it should be noted that [x-xbar] is a px1 vector deviating a single observation vector x from the mean vector xbar. Note that this is different from the way we normally think of a data matrix X, in which rows are normally observations and columns are normally variables. In the formulation that you are attempting to reproduce, x must be a column vector containing a single data point. (Naturally, you could make X a matrix if you wanted, but then X-xbar isn't a completely legal use of notation, as X is a matrix and xbar is a vector.) See Jackson (1991) for a more complete description.

Reference: Jackson, J. E. (1991), "A User's Guide to Principal Components", John Wiley and Sons, New York. See formula 1.4.1 on page 11.

Furthermore, let me add the following sequence of random numbers so that my news server doesn't give me the error "more included lines than new": 4 2 3.14159 -81

+---------------------------------+------------------------------+
| Paige Miller, Eastman Kodak Co. | "Let's play some basketball" |
| PaigeM@kodak.com                |  Michael Jordan in Space Jam |
+---------------------------------+------------------------------+
| The opinions expressed herein do not necessarily reflect the   |
| views of the Eastman Kodak Company.                            |
+----------------------------------------------------------------+
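For what it's worth, the score computation discussed above can be sketched in a few lines of matrix code. This is an illustration with made-up data (NumPy rather than the poster's MATLAB), but the operation [X-Xbar]*U on a rows-as-observations data matrix is the same one being debated:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))        # n=50 specimens, p=3 variables (made up)

xbar = X.mean(axis=0)               # mean vector, length p
S = np.cov(X, rowvar=False)         # p x p sample covariance matrix
eigvals, U = np.linalg.eigh(S)      # columns of U are unit eigenvectors

# Jackson's z_i = u_i' (x - xbar) for one observation; with rows as
# observations this is a single matrix product, giving an n x p
# matrix whose columns are the principal component scores.
Z = (X - xbar) @ U

# Check: scores on different components are uncorrelated, and each
# score's variance equals its eigenvalue (cov(Z) = U' S U = diag(L)).
print(np.allclose(np.cov(Z, rowvar=False), np.diag(eigvals)))

# Dividing by the square roots of the eigenvalues (the "scaled
# eigenvector" version the poster mentions) standardizes the scores
# to unit variance.
Z_std = Z / np.sqrt(eigvals)
```

If the values still look wrong after this, a common culprit is mixing eigenvectors of the correlation matrix with an uncentered or unstandardized data matrix.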
Richard F Ulrich wrote:
> I wonder what THAT means. The example seems to show COUNTERS, which are incremented at different times by the different raters. That would make sense of the word "sequential", but in that case, the rest of the explanation (mentioning kappa, etc.) makes no sense that I can see.

Rich,

Perhaps your criticisms could be more constructive?

> Is there a procedure which can at the same time say the two methods agree very little, and also suggest that method B performs relatively well in that it tends to assign a higher value to sentences than method A and therefore it could be used as a predictor of the values that method A would assign?

Tony,

The Wilcoxon Signed Rank test may be of use to you. It is a nonparametric test for paired data. While it cannot be used for model building and prediction, it can be used to test if method A = method B.

Karen Gilbert

** I'm having trouble with my news server. My apologies if this is a duplicate post!
I have been asked by a colleague a question regarding the scaling of marks, i.e. which is preferable - a sliding scale, proportional scaling, or some better method?

For example, if a set of marks has an average of 55% and it is required to bring the average down to, say, 50%, then one method would be to subtract 5% from all marks. Another would be to multiply them by 50/55.

I'm not sure if this is the right forum for this question, and I apologise for this in advance. My personal opinion is that the choice of method is as much a "political" issue as a statistical one. However I would appreciate some comments from experts. It may appear a trifling matter but a lot of time is spent in committees discussing such transformations on marks.

David Sedgwick
scdas@twp.ac.nz
We have just started the beta testing of a new software package for sequential experimental design and optimization. The methods used are the simplex algorithm and fuzzy set handling of multiple criteria. The MultiSimplex software is an Excel 5 and 7 add-on running on all Windows platforms. We are now inviting those interested to apply for participation in the beta test. More information can be found on our web page, http://ourworld.compuserve.com:80/homepages/Bergstrom_Oberg, or by sending me an email.

Regards,
Tomas Oberg
Bergstrom & Oberg
email: tomas.oberg@bergstrom-oberg.se
fax: +46 455 27922
The RadEFX(sm) Radiation Health Effects Research Resource at Baylor College of Medicine has recently completed development of a WWW data base to compile information on all radiation health effects studies. In particular, we are especially interested in Chernobyl-related projects and other projects involving radiation epidemiology. Investigators can submit their information at URL http://radefx.bcm.tmc.edu.

Leif Peterson
It is unclear from your query just what the problem is. I've done similar kinds of things, and it basically involved slaving over the manuals, learning how to use functions and output datasets from the descriptive PROCs. If you have run into specific insoluble problems, describe them, and I or others would probably be of more use to you.

Chris
>Several different procedures have been proposed to test for normality. Graphical assessment may be useful, especially in conjunction with other methods. In general, goodness-of-fit methods (e.g., chi-square or Kolmogorov-Smirnov) do not perform well (in that they have rather low power). The power of the Shapiro-Wilk test is quite good but the procedure is cumbersome.

With good statistical packages it is just a mouse-click away...

_______________________________________________________________________
Hans-Peter Piepho
Institut f. Nutzpflanzenkunde, Universitaet Kassel
Steinstrasse 19, 37213 Witzenhausen, Germany
WWW: http://www.wiz.uni-kassel.de/fts/
Mail: piepho@wiz.uni-kassel.de
Fax: +49 5542 98 1230  Phone: +49 5542 98 1248
In article <9612031234.AA14617@fserv.wiz.uni-kassel.de>, Hans-Peter Piepho wrote:

>>>In article <9612020859.AA19921@fserv.wiz.uni-kassel.de>, Hans-Peter Piepho wrote:
>>>>>In article, Mike wrote:
>>>>>>Andrew Kukla started a thread:
>
>>> ................
>
>>>>If you assume that X and Y are bivariate normal, and you want to fit a line to the x,y scatter plot to describe the relationship (without need to predict either y from x or x from y), a principal component analysis (PCA) seems quite appropriate. Of course you would do the PCA on the variance-covariance matrix (unstandardized data), not on the correlation matrix (normalized data).
>
>>>I agree that one should not do it on the correlation matrix, for which the principal component line is the 45 degree line (positive correlation) or the -45 degree line (negative correlation). But the principal component line depends on scaling, and apart from the direction coming from the sign of the correlation, nothing specific can be said.
>
>>If x and y are bivariate normal, the following can be said: We can draw a confidence region centered at the mean of x and y. This will be an ellipse. The first principal component coincides with the "long" axis of this ellipse. (See Johnson and Wichern, Applied Multivariate Statistical Analysis.)
>
>There is no problem with this. But now change the scale of one of the variables without changing that of the other. The long axis of the ellipse changes.
>
>If there is non-zero covariance, the new long axis can come from any line in the same quadrant, measured from the means, as the old. So the set of all principal component lines for different scalings conveys no information other than the sign of the covariance.

But why should one want to change the scale? Usually, we have chosen to measure x and y on a certain scale, and we want to describe the relationship on that scale, not any other scale.

_______________________________________________________________________
Hans-Peter Piepho
Institut f. Nutzpflanzenkunde, Universitaet Kassel
Steinstrasse 19, 37213 Witzenhausen, Germany
WWW: http://www.wiz.uni-kassel.de/fts/
Mail: piepho@wiz.uni-kassel.de
Fax: +49 5542 98 1230  Phone: +49 5542 98 1248
Hans-Peter Piepho wrote:
>
> >Several different procedures have been proposed to test for normality. Graphical assessment may be useful, especially in conjunction with other methods.
>
> With good statistical packages it is just a mouse-click away...

I am enthusiastic about using transformations like log(X+k) and repeated use of normal probability plotting, as described in Afifi AA, Clark V. Computer-Aided Multivariate Analysis. (2nd ed.) New York: Van Nostrand Reinhold Co, 1990:505. In multiple linear regressions I always get very solid results, judging from the residual analyses that are part of the Statistica software.
Hello,

I'd appreciate some help on the following question. Suppose one measures a variable in control and then in an experimental condition. One way to test whether the variable is different in these two conditions is to perform a paired t-test (in which for each experiment, the control and experimental values are subtracted, and the mean of these differences is compared with their variance). This approach works very well if the data is additive: that is, if one expects the same absolute magnitude of change in the variable each time one does the experiment.

However, suppose one expects that the experimental effect is multiplicative, not additive. For example, one might expect that a particular concentration of an antibiotic will kill off around half the total number of bacterial cells in a dish, no matter how many cells the dish contains to start with -- 10, 100, 1000, or a billion. If the number of cells in the control condition varies quite a bit, the variance in the differences between control and experimental conditions will be enormous, and the paired t-test is of little use.

There are several ways to take care of this problem; I suspect that not all of them are correct. One is to divide each experimental value by the control value instead of to subtract them. Another is to subtract the logarithm of the experimental value from the log of the control value, and then do the test on the resulting values. (Of course the difference between two logs is the same as the log of one value divided by the other [i.e., log(E)-log(C) = log(E/C)].)

I have been told that the latter method (taking the log transform) is the correct one, but I'm not sure about this, and I'd like an explanation for why it is acceptable to use log values. If anyone could provide one, or at least point me towards a textbook or other reference that discusses this in detail, I would really appreciate it.

Thanks a lot,
Saleem Nicola

"Freedom is like oxygen. One may not even notice it when you have it, but only appreciate it when you don't have it." --Wu'er Kaixi
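A small simulation makes the point of the question concrete (made-up numbers, illustrative modern Python): with a multiplicative effect, raw differences are dominated by the largest dishes, while log differences cluster tightly around log(0.5), which is why a paired t-test behaves sensibly on the log scale.

```python
import math
import random

random.seed(2)

# Made-up paired data with a multiplicative effect: each experimental
# dish ends up with roughly half its control count, whatever the
# control's size (10 to 10000 cells).
controls = [random.choice([10, 100, 1000, 10000]) for _ in range(30)]
experimentals = [c * random.uniform(0.4, 0.6) for c in controls]

# On the raw scale the differences are dominated by the big dishes;
# on the log scale the effect is a near-constant shift of log(0.5).
raw_diffs = [e - c for c, e in zip(controls, experimentals)]
log_diffs = [math.log(e) - math.log(c) for c, e in zip(controls, experimentals)]

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

# Spread relative to the mean effect: huge for the raw differences,
# modest for the log differences.
ratio_raw = abs(sd(raw_diffs) / mean(raw_diffs))
ratio_log = abs(sd(log_diffs) / mean(log_diffs))
print(ratio_raw, ratio_log)
```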
many MANY years ago ... when i was working in canada ... i heard about this little thing called the MAGIC MEDIAN MODIFIER ... seems as though there was some dictum that all averages of marks (grades) in schools had to be 65% ... so, if a teacher had a class that had a final average higher or lower ... out whipped the MMM ... no relation to the 3M company ... and here is about how it worked.

there was a grid ... and ... a small moveable ruler. one aligned YOUR average with a certain position on the grid ... and then would read off all the adjusted grades for students ... such that the net result was a new class average of 65%. i had a short article once that described/justified this MMM ... and i will try to find it again. anyway ... i found the entire "notion" to be rather stupid ... it was like saying that all classes were equal ... which we clearly knew was NOT the case.

as for adjusting a given set of marks ... keep in mind that adding/subtracting a constant will only shift the average up or down but have NO impact on the spread of scores ... but, if you change each mark by some constant PERCENTAGE of itself (like 5%) ... then the average will change by that percentage AND the spread will change too ...

if there is any additional interest in this ... i will try to dig up the paper mentioned above ... and expand a bit on it.

At 12:42 PM 12/6/96 +1300, you wrote:
>I have been asked by a colleague a question regarding the scaling of marks, i.e. which is preferable - a sliding scale, proportional scaling, or some better method?
>For example if a set of marks has an average of 55% and it is required to bring the average down to say, 50%, then one method would be to subtract 5% from all marks. Another would be to multiply them by 50/55.
>I'm not sure if this is the right forum for this question, and I apologise for this in advance. My personal opinion is that the choice of method is as much a "political" issue as a statistical one. However I would appreciate some comments from experts. It may appear a trifling matter but a lot of time is spent in committees discussing such transformations on marks.
>David Sedgwick
>scdas@twp.ac.nz
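The arithmetic behind the two proposals is easy to see with a toy set of marks (hypothetical numbers, illustrative Python): both adjustments hit the target average, but only the multiplicative one changes the spread.

```python
import statistics

marks = [35, 45, 55, 65, 75]           # hypothetical marks, average 55

shifted = [m - 5 for m in marks]       # subtract 5 from every mark
scaled = [m * 50 / 55 for m in marks]  # multiply every mark by 50/55

# Both bring the average to 50, but only the multiplicative version
# changes the spread (the standard deviation shrinks by 50/55).
print(statistics.mean(shifted), statistics.stdev(shifted))
print(statistics.mean(scaled), statistics.stdev(scaled))
```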
Can someone recommend reference books/articles and software for performing receiver operating characteristic (ROC) analysis? Thanks!

Janice
Hi,

I have a question regarding the construction of a statistical test for the following problem. I have K populations, and I have a sample, S(i), of size n(i) from the i_th population, i=1,2,3,...,K. Each element in S(i) can be categorized binomially, i.e., it can be assigned a value of 1 or 0, and the probability that it is assigned 1 is given by P(i). P(i) is unknown, and all I really can measure is the frequency of 1s in my sample, and this I denote as f(i).

I now have an item, Q, that has the value 1, and I want to know to which population Q belongs. The way I thought of doing this - probably long-winded but it seems to make sense to me - is to maximise the following probability:

   L(i) = P( Q=1 | Q is from population i)

Obviously, for each population, L(i) = P(i), but P(i) is unknown, and although I can estimate P(i) from f(i), some samples are going to have small values of n(i) (<5), so that f(i) is going to be an unreliable estimate of P(i). Therefore, it occurred to me that I could calculate a weighted estimate of P(i) as follows:

   a = integral from 0 to 1 of P(i) * C(n(i),f(i)) * P(i)^f(i) * (1-P(i))^(n(i)-f(i)) dP(i)

   b = integral from 0 to 1 of C(n(i),f(i)) * P(i)^f(i) * (1-P(i))^(n(i)-f(i)) dP(i)

   P(i) = a/b

This is simply the parameter P(i) multiplied by the binomial probability of getting f(i) elements equal to 1 in a sample of size n(i), all integrated with respect to P(i) from 0 to 1 - and then divided by the binomial probability of getting f(i), integrated wrt P(i) from 0 to 1. This estimate of P(i) converges to f(i) as the sample size gets larger. However, doing it this way, I don't think I am penalising populations that are represented by small samples.

So - finally, I can ask my question. Since I've effectively calculated a likelihood for group membership, I want to test whether the ML group is statistically better than other groups. How should I do this? Should I use a likelihood ratio test between the ML group and the group with the next highest likelihood? Or should I use a likelihood ratio test (LRT) between the ML group and the sum of the likelihoods of all other groups not including the ML group? Should I use a LRT at all?

I apologise for this long-winded post, and I appreciate any assistance. Thanks.

Allen Rodrigo
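Incidentally, the two integrals in the post have a closed form: with a uniform prior on P(i), the ratio a/b works out to (f(i)+1)/(n(i)+2), the posterior mean (Laplace's rule of succession). A quick numeric check by quadrature (illustrative Python, hypothetical f and n values):

```python
from math import comb

def posterior_mean_numeric(f, n, steps=100000):
    # Midpoint-rule quadrature of the two integrals in the post.
    c = comb(n, f)
    a = b = 0.0
    for k in range(steps):
        p = (k + 0.5) / steps
        w = c * p ** f * (1.0 - p) ** (n - f)
        a += p * w   # integrand of a: P(i) times the binomial term
        b += w       # integrand of b: the binomial term alone
    return a / b

# The ratio matches the closed form (f+1)/(n+2) for any f, n.
for f, n in [(1, 3), (2, 4), (9, 10)]:
    print(posterior_mean_numeric(f, n), (f + 1) / (n + 2))
```

This also shows how small samples are handled: with n(i) small, the estimate is pulled toward 1/2, which is exactly the shrinkage the poster was after.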
On Fri, 29 Nov 1996 18:38:39 -0300, Julio Cesar Voltolini wrote:
>
>Dear friends,
>
>I am a biologist and we are collecting mammals at different strata of the Brazilian Rainforests.
>
>I would like to do some statistical tests but I need to know if my data have a normal distribution. I am starting to use ESTATISTICA and SYSTAT and I would like to do the tests in these packages. May I test the normality in EXCEL too?

As far as I know, Statistica does not perform normality tests in versions prior to 5.0, only probability plots. You can do it with Systat though. In the NPAR module, go to Kolmogorov-Smirnov; you will have the choice of many tests. No Shapiro-Wilk nor Anderson-Darling though. In Statistica 5.0, you have the choice between Kolmogorov-Smirnov, Lilliefors and Shapiro-Wilk under the menu Frequency Tables.

R
Is there a test for H0: Pearson-Rho=1?

I found tests for Rho=0 and Rho=Rho0 with Rho0<1. Can't seem to find one for testing Rho=1. Any suggestion?

R
In article <586qf6$9ah@usenet.srv.cis.pitt.edu>, wpilib+@pitt.edu (Richard F Ulrich) wrote:
> -- You have me baffled already. Spot the cells? I do not know how log-linear analyses may do that, that Pearsonian analyses cannot do. Do you have some particular computer program in mind, which has a very nice implementation? I have seen studentized residuals, for looking at contributions of cells, from ordinary Pearson tables.

I am perhaps going to baffle you a third time, sorry for that. Once a log-linear model has been selected, you can assess the contribution of an effect to a cell. For this you divide the parameter estimate by its standard deviation and compare this ratio with the critical z.

For the cluster question, thanks for your comments. I think this is a question of personal preferences. But I'll keep yours in mind.

> Hope this helps.
> Rich Ulrich, biostatistician wpilib+@pitt.edu
> Western Psychiatric Inst. and Clinic, Univ. of Pittsburgh

Yes, it helps!

--
F. Bellour
PhD Student, U.C.L. Belgium
E-mail: bellour@upso.ucl.ac.be
Phone office: 00-32-10-478640
nakhob@mat.ulaval.ca (Renaud Langis) wrote:
>Is there a test for H0:Pearson-Rho=1?
>
>I found tests for Rho=0 and Rho=Rho0 with Rho0<1. Can't seem to find one for testing Rho=1.
>
>Any suggestion?
>
>R

Generate k bivariate datasets (with the same sample size as your observed dataset) under the null hypothesis of a perfect positive linear relationship, compute the Pearson correlation for each dataset, and compare those simulated Pearson correlations to the observed Pearson correlation. All the simulated correlations will be one, so your (directional) P-value will be essentially zero (unless your observed Pearson correlation itself is also 1). Or in other words, I think this isn't a very interesting null hypothesis (are there any?) because it will always be rejected unless you have a perfect positive linear relationship in your data.

With kind regards,
Pat.
_____________________________________________________________________________
Patrick Onghena                    patrick.onghena@ped.kuleuven.ac.be
Katholieke Universiteit Leuven, Department of Educational Sciences
Vesaliusstraat 2, B-3000 Leuven (Belgium)
Tel1: +32 16 32.59.54  Tel2: +32 16 32.62.01  Fax: +32 16 32.59.34
http://www.kuleuven.ac.be/facdep/psy/eng/onderz/methped.htm
_____________________________________________________________________________
Antony wrote:
>Missings are qualitatively different from other data so coding them as a numeric value does not appeal. Our plots (as you can see by linking from our webpage: http://www1.math.uni-augsburg.de/ to MANET) treat missings quite differently so that you are always aware they are missings and not actual values. More specifically there are technical problems: 999 is a numeric value and could arise naturally. If it is used in plots it can seriously distort the scale, and what it does to statistics if one is not careful does not bear thinking about.

I meant '999' as an impossible value. I would not use such a code for (e.g.) a distance-between-two-cities variable (unless I was having a particularly bad day :->) but I might use (e.g.) '9' for a binary categorical variable coded with 1/2. I will have a look at your page.

>Good software is like that, it helps you to get results more quickly and more easily, although some people always prefer to walk.

Proper (i.e. versatile) missing value handling is essential to any data-analysis package worthy of the name. I often read posts on this group by people using packages such as Excel for data analysis and despair.

--
Mark Myatt
Does anybody have any software (e.g. SAS macros or the like) for propensity score matching? Specifically I am interested in Mahalanobis metric matching within propensity score calipers.

Alan Zaslavsky
zaslavsk@hcp.med.harvard.edu
HERE IS A TEST OF THE NULL OF RHO BEING 1 ... TAKE A SAMPLE AND SEE WHAT THE CORRELATION IS ... IF IT IS ANYTHING OTHER THAN 1 ... REJECT THE NULL.

At 09:31 AM 12/6/96 GMT, you wrote:
>nakhob@mat.ulaval.ca (Renaud Langis) wrote:
>>Is there a test for H0:Pearson-Rho=1?
>>
>>I found tests for Rho=0 and Rho=Rho0 with Rho0<1. Can't seem to find one for testing Rho=1.
>>
>>Any suggestion?
>>
>>R
>
>Generate k bivariate datasets (with the same sample size as your observed dataset) under the null hypothesis of a perfect positive linear relationship, compute the Pearson correlation for each dataset, and compare those simulated Pearson correlations to the observed Pearson correlation. All the simulated correlations will be one, so your (directional) P-value will be essentially zero (unless your observed Pearson correlation itself is also 1). Or in other words, I think this isn't a very interesting null hypothesis (are there any?) because it will always be rejected unless you have a perfect positive linear relationship in your data.
>
>With kind regards,
>Pat.

===========================
Dennis Roberts, Professor EdPsy !!! GO NITTANY LIONS !!!
208 Cedar, Penn State, University Park, PA 16802 AC 814-863-2401
WEB (personal) http://www2.ed.psu.edu/espse/staff/droberts/drober~1.htm
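The degenerate null distribution Onghena describes is easy to see in a few lines (illustrative Python, with an arbitrary made-up slope and intercept): every dataset generated under the null of a perfect positive linear relationship has r = 1, up to rounding.

```python
import random

random.seed(3)

def pearson(xs, ys):
    # Sample Pearson correlation from first principles.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Simulate datasets under the null of a perfect positive linear
# relationship: r is always 1, so the null distribution is a point
# mass at 1 and any observed r < 1 rejects.
sims = []
for _ in range(5):
    xs = [random.uniform(0, 10) for _ in range(25)]
    ys = [2.0 * x + 3.0 for x in xs]
    sims.append(pearson(xs, ys))
print(sims)
```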
>As far as i know, Statistica does not perform normality tests in versions prior to 5.0. Only probability plot. You can do it with Systat though. In the NPAR module, go to Kolmogorov-Smirnov, you will have the choice of many tests. No Shapiro-Wilk nor Anderson-Darling though.
>In statistica 5.0, you have the choice between Kolmogorov-Smirnov, Lilliefors and Shapiro-Wilk under the menu Frequency Tables.

As I remember, pre-5.0 Statistica has the K-S test.

---
Timothy A. Dierauf, PE
Solar Energy Applications Laboratory
Department of Mechanical Engineering
Colorado State University
----------------------------Original message---------------------------- Dear list owner: This announcement may be appropriate for subscribers to your list. Please post if you think the subject matter would be of interest. Thank you for your time. Deborah Clark Center for Distance Learning Central Michigan University Mt. Pleasant, MI (517) 774-7143 --------------------------------------------------------------------------- Central Michigan University is going beyond the books with 10 undergraduate courses being offered on the Internet, beginning in January 1997. Statistics 382, Elementary Statistical Analysis will be offered on the World Wide Web. Registration for this and other CMU web courses begins Dec. 9, 1996. The 12-week term begins Jan. 6, 1997. Tuition for CMU web courses is $95.90 per credit hour. There is a one-time CMU admission fee of $50. If you have already been admitted to take CMU courses, then you are not required to pay the admission fee. Statistics involves collecting and organizing data, describing it and using it to infer information about the subject being studied. This course will examine topics such as different types of data, probability, variables, distribution, hypothesis testing, and correlation. Come along for an educational lift into cyberspace and the world of statistics with Dr. Ken W. Smith, professor of mathematics at Central Michigan University and director of institutional research at CMU. CMU's College of Extended Learning has always been at the forefront in offering quality off-campus degree programs. For 25 years, CMU has expanded that quality throughout the state of Michigan, the United States, Canada and Mexico. Now, CMU has taken that quality into cyberspace through its Center for Distance Learning with learning package courses on the World Wide Web. Each student will receive a syllabus, lectures and other material for the course via the World Wide Web. 
CMU's virtual learning center will feature multiple levels of communication for students enrolled in the course. Interactive chat sessions will be scheduled between the students and the instructor in each course. A message center that operates as a forum for student e-mail has been established for students to communicate among themselves about informal topics and about course topics such as assignments, projects and upcoming examinations. Students will also have access to e-mail addresses for all members of the class and the instructor. CMU is also offering the following courses on the World Wide Web, beginning in January:

Accounting 201 (Principles of Accounting) 3 credit hours
Astronomy 111 (Astronomy) 3 credit hours
Astronomy 112 (Introduction to Astronomical Observations) 1 credit hour
Business Information Systems 106 (Spreadsheet Concepts) 1 credit hour
English 323 (Fantasy and Science Fiction) 3 credit hours
Health Promotion and Rehabilitation 523 (AIDS Education) 1 credit hour
Health Promotion and Rehabilitation 529 (Alcohol Education Workshop) 1 credit hour
Health Promotion and Rehabilitation 530 (Drug Abuse Workshop) 1 credit hour
Religion 334 (Death and Dying) 3 credit hours

Technical recommendations for taking these classes are:

SYSTEM: Multimedia PC 486 or Pentium, Macintosh 68040, or Power PC
SOFTWARE: Netscape 2.0 or Microsoft Internet Explorer, Adobe Acrobat Reader 2.0, Netscape Mail or Internet compatible mail system

Those interested in taking the course may look at a preview of some of the courses at the following web site: http://www.cel.cmich.edu/dlonline.htm To register, or for more information about the courses, please send e-mail to john.mcmahon@cmich.edu or call 1-800-688-4268.
Richard F Ulrich wrote: > Clay Helberg (chelberg@spss.com) wrote: > : NO! From a statistical standpoint, the null can make any specific > : prediction about the relationship you want. It is true that social > : scientists *usually* specify "no difference" or "no effect", but there > : is no reason in the world for it to be that way. In fact I have argued > : elsewhere that this preponderance of performing tests against such > : straw-man null hypotheses (which are often clearly false a priori) holds > : back social science from fulfilling its potential for scientific > : discovery. > > : You can specify a null hypothesis which states "the difference between > : group 1 and group 2 is exactly 5 Zuleks", or "the regression slope of > : Foo regressed on Bar is less than or equal to 3." There is no need to > : use (and no excuse for using) the default "no effect" null hypothesis > : when you have something more specific in mind. > > Maybe I am confused, too, but I tend to agree with Chauncey, that, > "null is nil." For an Odds ratio, for instance, *ordinarily* > 'nil' is OR= 1.0; but the statistical test, if you want to write > out the terms, is > absolute value of (Group1-Group2) minus 1 equal 0 > or ' ... minus 5 Zuleks equal 0'. This is tautological--you can always rearrange an equation so that there is a zero on one side. The point I was objecting to was the automatic assumption that it refers to "no difference" or "no effect" (not, as in your example, where a specific difference is given, but the equation is rearranged so the hypothesis reads "the observed difference minus the hypothesized difference equals zero"). > There is a 'nil' in there somewhere, or you have a funny idea > of a null hypothesis. Usually, there is a very rational/logical > reason for what constitutes the null=nil though I do imagine > the lax case as being, arbitrarily, 'some value previously > observed', which is what people look at on process-control charts. 
Unfortunately, all too often the default null of "no difference" is used because it is convenient (it is generally what you get from computer-generated output), or because the theory under investigation is so vague as to preclude reasonable point predictions. > I do NOT see a string of hypotheses, of which H-sub-zero is simply > the lowest number. Well, in Hays (Statistics), he lists the symbol for the null hypothesis as H-sub-zero, and the symbol for the alternative as H-sub-one. This usage is also given in Hogg & Craig (Introduction to Mathematical Statistics) and Vogt (Dictionary of Statistics & Methodology). In fact, here is a relevant quote from Hays (4th ed., p 249): Incidentally, there is an impression in some quarters that the term "null hypothesis" refers to the fact that in experimental work the parameter value specified in Ho is very often zero. Thus, in many experiments the hypothetical situation "no experimental effect" is represented by a statement that some mean or difference between means is exactly zero. However, as we have seen, the tested hypothesis can specify any of the possible values for one or more parameters, and this use of the word *null* is only incidental. It is far better for the student to think of the null hypothesis Ho as simply designating that hypothesis actually being tested, the one which, if true, determines the sampling distribution referred to in the test. I couldn't have said it better myself.... --Clay -- Clay Helberg | Internet: helberg@execpc.com Publications Dept. | WWW: http://www.execpc.com/~helberg/ SPSS, Inc. | Speaking only for myself....
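A concrete illustration of Clay's point, for readers who want one: nothing stops you from testing a null other than "no effect". A sketch in Python with invented data (the 5-"Zulek" figure is just his made-up example):

```python
# Testing a non-nil null hypothesis: H0 says the mean difference
# equals 5 (the imaginary "Zuleks"), not 0.  Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
diffs = rng.normal(loc=5.4, scale=2.0, size=40)  # hypothetical paired differences

# ttest_1samp accepts any hypothesized mean via popmean
t, p = stats.ttest_1samp(diffs, popmean=5.0)
print(f"t = {t:.2f}, p = {p:.3f}")  # a small p would reject "difference is exactly 5"
```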
Dave Sedgwick (SCDAS@TWP.AC.NZ) wrote: : I have been asked by a colleague a question regarding the scaling of : marks, i.e. which is preferable - a sliding scale, or proportional : scaling, . . . . . . or some better method? : For example if a set of marks has an average of 55% and it is : required to bring the average down to say, 50%, then one method would : be to subtract 5% from all marks. << rest, deleted... >> "IT IS REQUIRED" is an awfully bland statement of a mandate. What are these scores used for later on? If you want the median to be at 50%, then transform the scores into centiles - when you start with ranks and number of cases, there are several variations on the formula, but just choose one. If you want to say that you have a standardized score with a mean of 50 and a standard deviation of 10, then you can generate T-scores - either by starting with ranks and assuming a normal distribution, or by using the simple computations on the mean and standard deviation. : Another would be to multiply them by 50/55. : I'm not sure if this is the right forum for this question, and I : apologise for this in advance. My personal opinion is that the choice : of method is as much a "political" issue as a statistical one. : However I would appreciate some comments from experts. It may appear : a trifling matter but a lot of time is spent in committees discussing : such transformations on marks. If your only examples are 50 vs 55, and there are not a lot of scores at 100, then there is little difference between the two options that you describe. Since you do not mention what they are for, there is very little intrinsic to recommend either. Rich Ulrich, biostatistician wpilib+@pitt.edu Western Psychiatric Inst. and Clinic Univ. of Pittsburgh
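The adjustments under discussion are simple enough to sketch. A hypothetical set of marks with mean 55, showing the subtract-a-constant and multiply-by-50/55 options alongside the T-score transformation Rich mentions (all numbers are illustrative, not from the thread):

```python
import numpy as np

marks = np.array([30, 45, 52, 55, 60, 66, 77], dtype=float)  # mean is 55

# Option 1: subtract a constant -- mean shifts to 50, spread unchanged
shifted = marks - (marks.mean() - 50)

# Option 2: multiply by 50/55 -- mean becomes 50, but the spread shrinks
# too; a mark of 100 would map to about 90.9 here versus 95 under option 1,
# which is why the two options differ most near the top of the scale.
scaled = marks * (50 / marks.mean())

# T-scores: standardized to mean 50, standard deviation 10
t_scores = 50 + 10 * (marks - marks.mean()) / marks.std(ddof=1)

print(shifted.mean(), scaled.mean(), t_scores.std(ddof=1))
```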
In article, ebohlman@netcom.com (Eric Bohlman) wrote: > > Oops, you've conflated two separate incidents. The Literary Digest poll > was for the 1936 election, and predicted that Landon would defeat > Roosevelt. The vote in that election split along economic lines, with > wealthier people favoring Landon and poorer people favoring Roosevelt. > In 1936, telephone subscribers tended to be wealthier than the general > population, and thus the sampling procedure oversampled Landon voters and > undersampled Roosevelt voters (the same biased sampling method came up > with correct predictions for the 1928 and 1932 elections, because the > vote in those elections didn't have such a strong economic split). > > The incorrect prediction in 1948 (Dewey vs. Truman) wasn't due to invalid > methodology; political and economic events that occurred in between the > poll and the election caused a lot of voters to change their minds. Yes, I had the wrong election (which shows why one should not stay up late and make posts), but the sampling bias question is another matter. I am remembering an article that appeared in the American Statistician circa 1980 with a title something like: "The Making of a Statistical Myth: The 1936 Literary Digest Poll" Unfortunately, I lost that issue in a fire, but I remember the author's conclusion. The sampling procedure did not oversample Landon voters to the extent that it would explain the large prediction error. The author asserts that the problem was response bias and not sampling bias. His reasoning was as follows. By 1936 those people who were opposed to Roosevelt tended to hold those opinions very strongly and would thus be more likely to mail back their questionnaire to express their feelings. It is common for mailed responses to run less than 10%, and in such cases the response bias will significantly distort the results. 
The author presented an analysis to back up his conjecture, including why telephone sampling did not significantly oversample Landon voters. He also gives a history of how the myth got started and why it is so enduring. Paul Johnson in his book "Modern Times" (Harper and Row 1983) discusses the depth of feeling against Roosevelt in the 1930's.
Sometimes, on sloppy snow day Fridays, one waxes philosophical ... Earlier today, I received the following in my readerlist. Some of the post has been omitted ... ------------------ Central Michigan University is going beyond the books with 10 undergraduate courses being offered on the Internet, beginning in January 1997. Statistics 382, Elementary Statistical Analysis will be offered on the World Wide Web. Registration for this and other CMU web courses begins Dec. 9, 1996. The 12-week term begins Jan. 6, 1997. Tuition for CMU web courses is $95.90 per credit hour. There is a one-time CMU admission fee of $50. If you have already been admitted to take CMU courses, then you are not required to pay the admission fee. ---------------- Why should this bother me? (NOTE: My post here is not meant AT ALL to be a criticism of the course above ... nor suggest that CMU does not have the right to do this) Well, here at Penn State for example, there is a push (like I am sure there is at other places) to become a member of the "World University" ... ie, offer things that people all over the world can take advantage of. So far ... so good. But, what does this really mean? The bottom line for doing something like the above ... is to seek more sources of revenue generation and, if we can offer something that will be bought by people in Oregon, and Texas, and the UK ... that means bucks for us. Notice that while much of this material might be accessed at will, if you want CREDIT for it ... you have to pay. So, why is this bad ... and how is this related to Walmart? We all know what happens when Walmart comes to town ... smaller businesses suffer ... some going out of business ... and while that might be good for consumers in general if prices fall ... it hurts people/jobs and the like. Now ... if one place offers an introductory stat course via the web ... and starts to generate some revenue for it ... then you know that it will not be long before place B, and C, and .... etc. 
will put their own course on the web since, the fact that place A does it means the POTENTIAL of taking some revenue away from you! And ... we must be competitive! Thus, it is only a matter of time before place B and C bring out their versions of the course and, instead of charging $95.90 per credit hour, offer the course for $90 ... or then D gets into the act and only charges $87.50 for each credit hour. Then, since some of the smaller institutions that are living on the edge ... that have to have that $95.90 per credit hour to survive ... find that the way that they can recoup their investment and keep their course attractive ... is to make their course EASIER to complete ... less difficult/challenging for those who might register. Get the picture? The big players will wipe out the smaller ones ... and the greed continues. The Walmart schools will knock out the Central Michigans ... either by taking a loss financially or ... making their courses more "user friendly". I see this coming ... and it will be here sooner than you think. Any thoughts? =========================== Dennis Roberts, Professor EdPsy !!! GO NITTANY LIONS !!! 208 Cedar, Penn State, University Park, PA 16802 AC 814-863-2401 WEB (personal) http://www2.ed.psu.edu/espse/staff/droberts/drober~1.htm
All the discussion about CIs is fascinating me ... why? I am not sure. Look at the following. I generated a random sample with n=100 ... from a population with fixed mu and sigma values, and built the 95 and 68 percent CIs ... for the population mean.

MTB > tint c1

Confidence Intervals

Variable    N      Mean    StDev  SE Mean        95.0 % CI
C1        100    102.19    14.90     1.49  (  99.23, 105.14)

MTB > tint 68 c1

Confidence Intervals

Variable    N      Mean    StDev  SE Mean        68.0 % CI
C1        100    102.19    14.90     1.49  ( 100.70, 103.68)

MTB >

--------------------- My first question is ... what EXACTLY can we state verbally ... that is accurate ... about the SPECIFIC interval of 99.23 to 105.14 ... or the SPECIFIC interval of 100.7 to 103.68? My second question is ... what EXACTLY can we state verbally ... that is accurate ... about the TWO intervals together ... ie, what can we correctly and accurately say when comparing the interval of 99.23 to 105.14 ... with the interval of 100.7 to 103.68? (NOTE: please don't say that ... the first is wider than the second ... I DO know that!) More to come ... =========================== Dennis Roberts, Professor EdPsy !!! GO NITTANY LIONS !!! 208 Cedar, Penn State, University Park, PA 16802 AC 814-863-2401 WEB (personal) http://www2.ed.psu.edu/espse/staff/droberts/drober~1.htm
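For anyone who wants to replicate the two intervals outside Minitab, here is a sketch with SciPy; the data are freshly simulated from mu=100, sigma=15, so the endpoints will not match the c1 output above exactly:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=100, scale=15, size=100)

def t_interval(data, confidence):
    """t-based confidence interval for the population mean."""
    n = len(data)
    se = data.std(ddof=1) / np.sqrt(n)
    return stats.t.interval(confidence, n - 1, loc=data.mean(), scale=se)

print("95% CI:", t_interval(x, 0.95))
print("68% CI:", t_interval(x, 0.68))  # narrower, by construction
```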
Donn C. Young wrote: > I'd suggest taking a look at this book - Gould is a baseball freak > and includes much information on why batting averages have fluctuated > over the years - and why baseball is the only sport where these > phenomena can be studied. > Thanks for reminding me about that. I saw Gould on the Charlie Rose show about a month ago and meant to grab the book, but I forgot about it. Thanks. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Sean Lahman - lahmans@vivanet.com Sean Lahman's Baseball Archive http://www.vivanet.com/~lahmans/baseball.html
In most standard books on statistics you can read how to compare two normal distributions, i.e. how to test whether the means of the two distributions are equal. If you have a small sample taken without replacement, the relevant distribution will be the hypergeometric distribution. If you furthermore have two samples of this kind and you want to compare their means, the computations become a bit difficult. Does anybody know how to construct a test for such a situation, or maybe know where I can read about it? Any help is appreciated. (Sorry for my poor English.) Klaus, Denmark
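Not an answer from the standard books Klaus mentions, but one distribution-free way to compare the means of two small samples drawn without replacement is an exact permutation test: enumerate every relabelling of the pooled values and see how often the mean difference is at least as extreme as the observed one. A sketch with invented data:

```python
import itertools
import numpy as np

a = np.array([12, 15, 11, 14], dtype=float)  # hypothetical sample 1
b = np.array([16, 18, 13, 17], dtype=float)  # hypothetical sample 2
observed = a.mean() - b.mean()

pooled = np.concatenate([a, b])
n_a = len(a)
extreme = 0
total = 0
# Enumerate all C(8, 4) = 70 ways to split the pooled values into two groups
for idx in itertools.combinations(range(len(pooled)), n_a):
    mask = np.zeros(len(pooled), dtype=bool)
    mask[list(idx)] = True
    diff = pooled[mask].mean() - pooled[~mask].mean()
    if abs(diff) >= abs(observed) - 1e-12:  # tolerance guards float ties
        extreme += 1
    total += 1

print(f"two-sided exact p-value: {extreme / total:.3f}")
```

With samples this small the full enumeration is cheap; for larger samples one would sample random relabellings instead of enumerating all of them.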