

Newsgroup sci.stat.math 11963

Directory

Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!! -- From: Alan Burlison
Subject: Bayesian hypothesis testing confusion -- From: Robert Dodier
Subject: Four-Way ANOVA Example Needed! -- From: mazhar@news.uwf.edu (Mehran Azhar)
Subject: Re: Occam's razor & WDB2T [was Decidability question] -- From: kenneth paul collins
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!! -- From: evan@bigbird.telly.org (Evan Leibovitch)
Subject: Model Specification -- From: Daniel Parker
Subject: Re: Parameters and Maximum Likelihood -- From: maj@waikato.ac.nz (Murray Jorgensen)
Subject: Re: Bayesian hypothesis testing confusion -- From: "Robert E Sawyer"

Articles

Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!!
From: Alan Burlison
Date: Sat, 30 Nov 1996 18:48:05 +0000
Evan Leibovitch wrote:
> You are absolutely right about the strengths and weaknesses. What you
> should get depends not on any particular absolute, but rather which
> strengths match your specific needs. We sell both UnixWare and the
> Caldera distribution of Linux, and still have very few instances where
> both are appropriate. Here are some observations; I hope they don't
> further confuse you.
[snip]
Good summary!  I think this is a very even-handed description of the
strengths and weaknesses of the two systems.
> and all Linux distributions come out-of-the-box running
> the Samba server to let them do SMB services on Windows95/NT/WfW nets.
> SCO's VisionFS is an additional cost add-on (though Samba should work
> fine on UnixWare too).
Samba does work fine on UnixWare.
-- 
Alan Burlison
alanburlison@unn.unisys.com
My opinions may be incorrect, but they are my own.
Return to Top
Subject: Bayesian hypothesis testing confusion
From: Robert Dodier
Date: Sat, 30 Nov 1996 14:43:46 -0700
Hello all,
I'm trying to test hypotheses of the form ``Parameters a and b are
both nearly zero,'' ``Parameter a is nearly zero and b is substantially
more than zero,'' ``Parameter a is substantially more than zero and
parameter b is nearly zero.''
I think I am doing the right thing: I figured out the joint 
distribution of my test statistics a' and b' given certain values of a
and b, then integrated these distributions over suitable ranges (e.g.
nearly zero means within a small interval about zero) to eliminate a 
and b. Now I have the distribution a' and b' given the
hypothesis. There are about 10 hypotheses.
Of course, these conditional distributions tell the likelihood of
observed values of the test statistics, and I complete the scheme
by introducing a prior over hypotheses and computing posterior
probabilities.
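The scheme described above can be sketched numerically. Everything below (a unit-variance normal likelihood for the test statistics, two toy hypothesis regions, equal priors over hypotheses) is an invented stand-in for illustration, not the poster's actual model:

```python
import math

# Assumed likelihood of test statistics (a', b') given parameters (a, b):
# independent unit-variance normals centered at (a, b).
def lik(ap, bp, a, b):
    return math.exp(-0.5 * ((ap - a) ** 2 + (bp - b) ** 2)) / (2 * math.pi)

# Integrate the likelihood over a rectangular region of (a, b) by midpoint rule,
# eliminating a and b as the post describes.
def p_data_given_H(ap, bp, a_range, b_range, n=50):
    da = (a_range[1] - a_range[0]) / n
    db = (b_range[1] - b_range[0]) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            a = a_range[0] + (i + 0.5) * da
            b = b_range[0] + (j + 0.5) * db
            total += lik(ap, bp, a, b) * da * db
    return total

# Two toy hypotheses: "a and b both nearly zero" vs "a substantial, b nearly zero".
hyps = {
    "h_00": ((-0.1, 0.1), (-0.1, 0.1)),
    "h_a0": ((0.5, 3.0), (-0.1, 0.1)),
}
prior = {"h_00": 0.5, "h_a0": 0.5}   # equal prior over hypotheses

ap, bp = 1.0, 0.0                    # observed test statistics
post = {h: p_data_given_H(ap, bp, ar, br) * prior[h] for h, (ar, br) in hyps.items()}
z = sum(post.values())
post = {h: p / z for h, p in post.items()}
# h_a0 wins: its broad region collects more total likelihood mass than the
# narrow h_00, which is exactly the peak-versus-mass effect asked about below.
```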
Are you with me so far? Here is my problem: under one hypothesis h1,
p(a',b'|a,b,h1) is strongly peaked, so one particular value has
high probability and all others are very improbable. Under another
hypothesis h2, there is a weaker peak, but the total mass is much greater.
This hypothesis almost always has the highest posterior probability.
Is there something counterintuitive happening here? It's clear 
enough that it's the integration over a and b that leads to favoring
h2 over h1. I suppose I could introduce hundreds of hypotheses -- 
``a and b are in this tiny region'' -- which would lead me to choose
the tiny region around the highest peak, but somehow this seems
like cheating.
It makes me uneasy that the hypothesis that contains the best-fitting
parameters doesn't get chosen. Should I just get used to it?
Thanks in advance for any comments you may have. I apologize to any
real Bayesians who are out there -- I am just an amateur.
Regards,
Robert Dodier
--
``Ainda nos faz lembrar os belos tempos'' -- on the prow of a fishing
boat.
Return to Top
Subject: Four-Way ANOVA Example Needed!
From: mazhar@news.uwf.edu (Mehran Azhar)
Date: 1 Dec 1996 04:00:24 GMT
Hi all,
   I need an example of a four-way analysis of variance (ANOVA). Most
textbooks get into one-way anova in detail and talk about multi-factor anova
in terms of two- and three-way analyses. I know how to use the SAS software
in order to find the F-tests for the data. All I need is raw data of an
example (or problem, survey, project, etc.) of a four-way ANOVA. Any
information on this matter is greatly appreciated. I'll be looking forward
to your responses. Thanks in advance.
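Since the request is only for raw data in a four-way layout, one option is to simulate it. The sketch below builds a balanced 2x2x2x2 factorial with three replicates per cell; the factor names, effect sizes, and error model are all invented for illustration, and the resulting rows could be written out and fed to SAS:

```python
import itertools
import random

random.seed(1)

# Hypothetical 2x2x2x2 design: four two-level factors A, B, C, D,
# three replicates per cell, additive main effects plus N(0,1) error.
levels = [0, 1]
effects = {"A": 2.0, "B": -1.5, "C": 0.5, "D": 1.0}  # invented true effects
reps = 3

rows = []  # each row: (A, B, C, D, response)
for a, b, c, d in itertools.product(levels, levels, levels, levels):
    for _ in range(reps):
        y = (10.0 + a * effects["A"] + b * effects["B"]
             + c * effects["C"] + d * effects["D"] + random.gauss(0, 1))
        rows.append((a, b, c, d, y))
# 16 cells x 3 replicates = 48 observations
```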
Sincerely,
Mehran Azhar
mazhar@students.uwf.edu
Return to Top
Subject: Re: Occam's razor & WDB2T [was Decidability question]
From: kenneth paul collins
Date: Sat, 30 Nov 1996 22:35:32 -0500
Ilias Kastanas wrote:
>         "Analogue" measures "continuous" quantities, and so yields appro-
>    ximate answers (which of course may be fully adequate for various prac-
>    tical problems). 
I must "move on", but I'll tell you, Ilias, "continuous" ("analogue") is no 
more correlated to "approximate" than is "digital". What's correlated to 
"approximate" is a practitioner's willingness to do the work inherent in 
ferreting out "exactness". 
>    "Digital" (discrete), with exact answers, is different
>    (and classical computability theory applies).
Here's where we differ. "Discrete" Math is "exact" in terms of its own 
definitions, and =only= in those terms. And when one looks at such 
"discreteness" through a different "lens", one sees that the "discrete" rules 
are subsets of the "analogue" rules.
I really do wish that I could stay on in this discussion, but the tyranny of 
the stomach has won yet another battle, and I must turn to other things.
Farewell, All (sadly, until some months have passed). ken collins
_____________________________________________________
People hate because they fear, and they fear because
they do not understand, and they do not understand 
because hating is less work than understanding.
Return to Top
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!!
From: evan@bigbird.telly.org (Evan Leibovitch)
Date: Sun, 1 Dec 1996 02:01:50 GMT
In article , Raymond N Shwake  wrote:
>Bryan Austin  writes:
>>I am in the market for a UNIX operating system. I have narrowed the
>>search down to three prospects: SCO UNIX 2.1, Solaris x86 UNIX, and
>>Linux. My question is, which of the three is the best choice, and more
>>importantly, Why? I will be using the operating system for business and
>>personal use.  
>I find it interesting that, while your posting was directed to c.u.u.m
>(among other groups), you don't list UnixWare as an alternative. 
I had sort-of assumed, given the newsgroup and the release number, that
the "SCO UNIX" in Bryan's question *did* refer to UnixWare.
I may be in the minority, but when I think of "SCO Unix" I first
think of UnixWare.
I think they also make an OS called "OpenSurfer" or something like that.
But where is the future? (Hint: The Gemini SDK requires UnixWare.)
-- 
  Evan Leibovitch, Sound Software Ltd, located in beautiful Brampton, Ontario
 Supporting PC-based Unix since 1985 / Caldera & SCO authorized / 905-452-0504
 Unix is user-friendly - it's just a bit more choosy about who its friends are
Return to Top
Subject: Model Specification
From: Daniel Parker
Date: Sat, 30 Nov 1996 22:24:10 -0500
Perhaps someone can help me:
I am a beginner level student of statistics and econometrics. I am
looking for information on how to go about specifying model forms (to be
used for economic work). The basic texts that I have only suggest using
economic insight, the adjusted r-squared, tests for multicollinearity,
and Wald tests of joint and individual significance. Obviously, if you
are using econometric analysis (i.e. empirical analysis) to test the
validity of an economic theory you have already proposed, then modelling
is somewhat easier in that you must test the specific model you have
already put forth. If, on the other hand, instead of testing whether a
specific model is supported empirically, you are attempting to determine
what if any relationship may exist among a number of factors, you may
not know beforehand exactly what to look for. Of course you don't want
to turn it into a "fishing" expedition, but it seems that it is not
always clear ahead of time exactly what the form should be. I know this
is quite a simple level question and I apologize if it is inappropriate
in this forum. However, additional techniques or methods or suggestions
are greatly appreciated. Also, I am wondering what the consensus is on
using adjusted r-squared and or Wald tests of significance. My texts are
ambiguous about the value of these tests.
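On the adjusted r-squared question: the statistic penalizes extra regressors, so by this criterion a regressor is worth keeping only if it raises r-squared enough to beat the penalty. A small worked example (the r-squared values are made up):

```python
def adjusted_r2(r2, n, k):
    # Adjusted R^2 for n observations and k regressors (excluding the intercept).
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

base = adjusted_r2(0.800, 30, 5)     # five-regressor model
bigger = adjusted_r2(0.802, 30, 6)   # one more regressor, tiny R^2 gain
# The penalty outweighs the gain: the six-regressor model ends up with the
# lower adjusted R^2, so this criterion would reject the extra regressor.
```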
dsp11
Return to Top
Subject: Re: Parameters and Maximum Likelihood
From: maj@waikato.ac.nz (Murray Jorgensen)
Date: Sun, 01 Dec 96 22:07:41 GMT
Nonlinear models were very much in my mind when I gave the linear model 
example. In the case of nonlinear models the simple geometry of finite 
dimensional vector spaces is replaced by the much more complicated geometry of
curved manifolds. But I still see the choice of a particular parameterization 
as means of describing the statistical model, rather than as giving the model 
itself. To say that there are "serious estimation and inference problems" in 
nonlinear models is similar to saying that there are serious navigation 
problems near the poles of the earth. There will be if you continue to use 
latitude and longitude as your coordinate system, but if you "reparameterize" 
and switch to a local grid life will be much easier.
I certainly second the recommendation of James Malley that interested 
statisticians should consult Seber and Wild (1989). 
With regard to reconciling my view with that of Herman Rubin I could note that 
it is usually possible to start from one particular function of a model as a 
first parameter and add other functions until a full parameterization is 
obtained. 
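Jorgensen's point, that the parameterization describes the model rather than being the model, can be sketched with the simplest possible case: the MLE transforms exactly under a change of parameterization, so the fitted model does not depend on the coordinates chosen. The exponential example below is this editor's illustration, with simulated data:

```python
import random

# Exponential model, parameterized either by rate (lambda) or by mean
# (tau = 1/lambda). The data are simulated, not from any real study.
random.seed(0)
x = [random.expovariate(2.0) for _ in range(1000)]  # true rate = 2, mean = 0.5

lam_hat = len(x) / sum(x)   # MLE under the "rate" parameterization
tau_hat = sum(x) / len(x)   # MLE under the "mean" parameterization

# The two fits describe the same model: tau_hat = 1 / lam_hat (up to rounding).
```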
Murray Jorgensen
James D. Malley wrote:
>Murray Jorgensen wrote:
>>To see the role of parameters in statistical models clearly it is useful to
>>consider a special case: a linear model
>>   y = X\beta + e
>(................)
>>The geometry of more general models is more complicated, but a similar
>>arbitrariness underlies the choice of parameter vector.
>>
>>Models are a means of describing the assumptions made about the probability
>>distributions of the random variable involved, the particular equation and
>>parameterization used to introduce the model should be seen as
>>representative of an equivalence class of such representations.
>
>The reasoning above is secure for linear models but comes completely apart
>for nonlinear ones. Quite innocent appearing nonlinear models often pose
>serious estimation and inference problems when so-called "parameter
>curvature" is present. Then, the parameterization is critical and far from
>arbitrary. So profound are these problems, and so widely present in
>real-world data, it has been argued that linear models are a fool's
>paradise. This is an extreme position, of course, but any excursion into
>nonlinear models will be usefully sobering.
>
>Details can be found, for example, in the very large, very good text by
>Seber and Wild (1989). Some beautiful diagrams and graphs can also be found
>there for those interested in the ties between geometry and inference.
>
>Jim
>
Return to Top
Subject: Re: Bayesian hypothesis testing confusion
From: "Robert E Sawyer"
Date: 1 Dec 1996 06:17:37 GMT
Robert Dodier  wrote in article <32A0AA92.41C67EA6@colorado.edu>...
| ---------------
| I figured out the joint 
| distribution of my test statistics a' and b' given certain values of a
| and b, then integrated these distributions over suitable ranges (e.g.
| nearly zero means within a small interval about zero) to eliminate a 
| and b. Now I have the distribution a' and b' given the
| hypothesis. 
|
Hmm...a source of trouble: The integral described is not a conditional 
density function given the hypothesis, nor is it a likelihood function.
Let a=parameter vector (vector of your a,b)
and x=data vector (or vector of your test statistics a',b') 
Then the conditional density of x given H is
p(x|H) = integral{a in H}[p(x|a) p(a)] / pr(H)
Claiming, as you do, that 
p(x|H) = integral{a in H}[p(x|a)],
implies that p(a)=pr(H) for a in H.
But this would give 
pr(H)=integral{a in H}[p(a)]=integral{a in H}[pr(H)]=[pr(H)]^2
implying pr(H)=1 -- which is not likely to be your prior knowledge!
The posterior odds-ratio is
pr(H1|x) / pr(H2|x) = ( p(x|H1) / p(x|H2) )  ( pr(H1) / pr(H2) )
giving 
[Eqn1:]
pr(H1|x) / pr(H2|x)
= integral{a in H1}[p(x|a) p(a)] / integral{a in H2}[p(x|a) p(a)]
If you have a "flat, non-informative prior" p(a),
then this reduces to 
[Eqn2:]
pr(H1|x) / pr(H2|x)
= integral{a in H1}[p(x|a)] / integral{a in H2}[p(x|a)]
N.B. Neither Eqn1 nor Eqn2 explicitly involves probabilities of the
hypotheses, although such probabilities are implicit in both of them.
In particular, this shows that integral{a in H}[p(x|a)] is 
*not* a likelihood function.
| 
| Of course, these conditional distributions tell the likelihood of
| observed values of the test statistics, and I complete the scheme
| by introducing a prior over hypotheses and computing posterior
| probabilities.
|
This would be inconsistent with the assumptions needed to use the forms
integral{a in H}[p(x|a)].  To introduce a "prior over the hypotheses"
is to say that p(a) *isn't* non-informative, contrary to what was shown
above to be the requirement for using these forms. (I assume that 
"introducing" doesn't mean "calculating from a non-informative p(a)".)
If you have a non-informative p(a), Eqn2 is appropriate with no further
introduction of pr(H), and uses the forms you describe. 
If you have an informative p(a), then the forms you describe are not
appropriate, and Eqn1 would apply.
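The relationship between Eqn1 and Eqn2 can be checked numerically on a discrete parameter grid. The grid, prior, likelihood, and hypothesis regions below are hypothetical stand-ins chosen only for illustration:

```python
import math

grid = [i / 10 for i in range(-20, 21)]       # candidate parameter values a
p_a = {a: 1.0 / len(grid) for a in grid}      # flat prior p(a)

def p_x_given_a(x, a):
    # assumed standard-normal likelihood of the data x given parameter a
    return math.exp(-0.5 * (x - a) ** 2) / math.sqrt(2 * math.pi)

def pr(H):
    # pr(H) = total prior mass on the region H
    return sum(p_a[a] for a in H)

def p_x_given_H(x, H):
    # Sawyer's form: p(x|H) = sum_{a in H} p(x|a) p(a) / pr(H)
    return sum(p_x_given_a(x, a) * p_a[a] for a in H) / pr(H)

H1 = [a for a in grid if abs(a) <= 0.1]       # "a nearly zero"
H2 = [a for a in grid if a > 0.1]             # "a substantially positive"

x = 0.8
# Posterior odds via Eqn1, with the prior over hypotheses induced by p(a)...
odds_eqn1 = (p_x_given_H(x, H1) * pr(H1)) / (p_x_given_H(x, H2) * pr(H2))
# ...and via Eqn2, which drops p(a) entirely. With a flat p(a) they agree.
odds_eqn2 = sum(p_x_given_a(x, a) for a in H1) / sum(p_x_given_a(x, a) for a in H2)
```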
Robert E Sawyer 
soen@pacbell.net
Return to Top
