I'm having the statistical equivalent of writer's block. Can someone PLEASE remind me what the probability density function is for a simple sine wave? It's that U-shaped thingy -- you know what I mean. Thanks!

--- Dick
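(For completeness: the U-shaped density being recalled here is the arcsine law. If X = sin(theta) with theta uniform on (0, 2*pi), then f(x) = 1/(pi*sqrt(1 - x^2)) on (-1, 1). A quick Monte Carlo sanity check; the sample size and the test interval below are arbitrary choices, not from the thread.)

```python
import math
import random

def arcsine_pdf(x):
    """Density of X = sin(theta), theta ~ Uniform(0, 2*pi): 1/(pi*sqrt(1-x^2))."""
    return 1.0 / (math.pi * math.sqrt(1.0 - x * x))

# Compare the empirical fraction of samples in (-0.5, 0.5) against the
# exact probability from the CDF F(x) = 1/2 + asin(x)/pi.
random.seed(0)
n = 200_000
samples = [math.sin(random.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]
empirical = sum(-0.5 < s < 0.5 for s in samples) / n
exact = (math.asin(0.5) - math.asin(-0.5)) / math.pi  # = 1/3
print(round(empirical, 3), round(exact, 3))
```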
Most commercial Design of Experiments packages can deal with this problem. ECHIP handles this kind of constraint, for example.

-- Aaron J. Owens, Research Fellow
Modeling and Simulation, Engineering Research Laboratory
DuPont Company, Wilmington, DE 19880-0320
Internet: owens@prism.es.dupont.com
HWINC wrote:
>
> Dear Sci.Stat.Math Readers
>
> I need help to find the number of samples needed to measure
> the mean of a fluctuating quantity (velocity of turbulent fluid):
>
> given:
>
> accuracy = e = 0.001
> confidence: sigma of 99% = 2.33
> RMS value = 0.01
>
> number of samples = (2.33 x RMS / e)^2
>
> Is this correct? Your responses and comments are appreciated.
>
> thank you,
> Tony Falcone
>
> Please send reply to: falcon@cooper.edu

Tony,

You need one minor alteration to your formula, in order to account for the additional degrees of freedom necessary to reduce Type II errors below whatever risk tolerance threshold you have. The formula I think you want to use is as follows:

number of samples = (2.33 + U)^2 x (RMS / e)^2

where the added "U" factor is the standard normal deviate associated with the risk you are willing to take of rejecting a point as an "outlier" which is in fact within your 0.01 resolution requirement. You have only accounted for the Type-I risk of certifying a "bad" point as "good" (per your stated quality standards).

--- Dick
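Both formulas above are easy to evaluate numerically. A small sketch; the U = 1.645 value below is an assumed 5% Type-II risk chosen for illustration, not something stated in the thread:

```python
import math

def n_samples(z_alpha, rms, e, u=0.0):
    """Samples needed for the stated accuracy: n = ((z_alpha + u) * rms / e)^2,
    rounded up to the next whole sample."""
    return math.ceil(((z_alpha + u) * rms / e) ** 2)

# Tony's numbers: 99% confidence (z = 2.33), RMS = 0.01, accuracy e = 0.001.
print(n_samples(2.33, 0.01, 0.001))            # Type-I only: 543
# With Dick's extra term, using an assumed U = 1.645 (5% Type-II risk):
print(n_samples(2.33, 0.01, 0.001, u=1.645))   # 1581
```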
On 22 Nov 1996 aacbrown@aol.com wrote:

> Marks Nester writes:
>
> > The data tell you something whether or not you reject
> > the null hypothesis, e.g. the data may tell you that the
> > two groups are quite similar in their attitudes.
>
> If your null hypothesis is that two groups are identical, and you fail to
> reject it, there are two possibilities.

I agree. (Forgetting about validity of statistical assumptions and appropriateness of the test.)

> The groups may actually be identical, which is generally implausible,
> or you may not have enough data to distinguish them. Only the
> second explanation allows you to ask for more money.

Personally, if I owned the purse strings I wouldn't want to fund any project whose sole aim was to determine if two groups have EXACTLY identical attitudes.

********************************************************
All readers who control budgets take note and save their money!
********************************************************

I also do not know how one can justify asking for more money on the basis of a hypothesis test alone. As far as I can see, one would have to take a close look at the observed means, variances etc. Even though these may have been used in the original hypothesis test, suddenly one is considering point/interval estimates etc., and this, I believe, is what one should have done in the first place instead of bothering with the hypothesis test.

--------------------------------------------------------

All of the above is very interesting, but it has not escaped my attention that my original question has not been answered. I said:

It seems quite implausible that two different groups would share exactly the same attitudes. So why do a significance test? Who cares whether or not the two groups differ significantly?

You said inter alia:

It is often the case that the null hypothesis is implausible, but it is still a standard technique of classical inference.
I said inter alia:

Then perhaps classical inference is at fault. If the most plausible position is that the groups differ, then why not make that your null hypothesis? Ask a statistician to design for you a test which has this as the "null" hypothesis, and tell him/her that, on the basis of the test, you may be prepared to reject this new null hypothesis and concede that the groups really have IDENTICAL attitudes.

The "inter alias" above relate to the mechanics of hypothesis testing but do not explain why, in the vast majority of cases, no sensible person would accept the null hypothesis, yet the null hypothesis is still tested. Why not bypass the null hypothesis, proceed directly to point/interval estimates, then make your pronouncements: there are big differences between groups, or there are small differences between groups, or we can't really tell yet because of large variation, or whatever.

Kind regards
Marks R. Nester
If someone can help me with this problem, I would really appreciate it.

Y = ( 19x - 12 ) / (5x^2 - 15x)

I need you to solve for x. For example, if Y = 3x then x = Y/3.

Thank you again,
amukhtar@mail.bcpl.lib.md.us
Bill Simpson wrote:
> Several people are mentioning the use of robust statistical methods. This
> is essentially what I said.
>
> For example, using "robust regression" amounts to an assumption that the
> errors have a double exponential (Laplace) distribution. (Since robust
> regression minimizes the absolute errors, which is the way to get the MLE
> for a model with Laplace distributed errors.)

There are, in fact, many better robust regression methods than least absolute residuals. Some are maximum likelihood for some error model, but many are not.

-- Paul Velleman
In market research it is not uncommon to assume that some types of ordinal data can be treated as interval. It seems to me that the question is whether the groups (male, female) differ with respect to the response. There are two approaches I would recommend. One is a chi-squared test (if you are using a contingency table, you may need to decompose the chi-squared if the predictor (independent) variable has more than 2 levels). A second approach is the non-parametric Mann-Whitney test statistic (though there are some issues that arise when ranks have multiple ties and data sets are small). I am unfamiliar with SAS, but SPSS allows you to do all these tests under the various menu commands.

In <56ioti$frb@www.umsmed.edu> Warren writes:
>
>Nick,
>
>Unless you have reason to believe the categories are equally spaced, I
>would worry a little about numbering just as you said. And the general
>test of association may not tell you what you need to know... how do the
>proportions relate to each other on an ordinal scale?
>
>You have a couple of options, I would think.
>
>In PROC FREQ in SAS, you can select scores different from 1,2,3,4 (CMH),
>but you don't seem to have any strong belief in how much distance there
>is between categories.
>
>You could use ridits, which assign scores that take into account the
>number in each category... but I think we are back to numbering the
>categories if I remember the ridit procedure... it does some kind of
>midranking as I recall.
>
>You could use a straight ranking procedure...
>like Kruskal-Wallis-Mann-Whitney.
>
>Another technique for analyzing these data (your example fits the classic
>mold very well) is something called a "proportional odds model." It
>would be the best choice if the assumptions are met. SAS will do this
>type of model using PROC LOGISTIC and test the "PO" assumption.
>If the proportional odds assumption isn't reasonable, you can do a
>"generalized logits" model using PROC CATMOD... you could compare the
>"not at all" group to all the rest. Agresti would be a good place to look
>for references, but Agresti isn't a good "elementary" text... I haven't
>seen his new book, but it might be a little more "elementary". All of
>these would require a little reading on your part concerning the
>assumptions.
>
>n.w.nelson@education.leeds.ac.uk (nick nelson) wrote:
>>Say I have responses on an attitude scale,
>>
>>e.g. Do you like this? lots / some / a little / not at all
>>
>>and two groups, e.g. men and women. What is the best way to
>>establish whether the two groups differ significantly?
>>
>>One approach I have seen involves numbering the responses 4,3,2,1
>>and working with the means, but this seems dubious due to the
>>non-interval nature of the scale.
>>
>>Alternatively you could cast the responses in a 4x2 table and do
>>a chi2 on it, but this ignores the order information altogether.
>>
>>Is there a middle path?
>>
>>Nick.
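The rank-based comparison mentioned above (Mann-Whitney, with ties counted as half) can be sketched in a few lines. The attitude responses below are made up for illustration, and this sketch computes only the U statistic against its null expectation, not the tie-corrected p-value that the posters warn about for small samples:

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U: count of pairs (a, b) with a > b, ties counted as 1/2."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical ordinal responses coded 1..4 ("not at all" .. "lots"):
men = [4, 3, 3, 2, 4, 1]
women = [2, 1, 2, 3, 1, 2]
u = mann_whitney_u(men, women)
n1, n2 = len(men), len(women)
# Under the null of no group difference, E[U] = n1*n2/2.
print(u, n1 * n2 / 2)  # → 27.5 18.0
```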
Several people are mentioning the use of robust statistical methods. This is essentially what I said.

For example, using "robust regression" amounts to an assumption that the errors have a double exponential (Laplace) distribution. (Since robust regression minimizes the absolute errors, which is the way to get the MLE for a model with Laplace distributed errors.)

Maybe another distribution for the errors is more plausible.

Bill Simpson
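The Laplace/L1 connection Bill describes can be checked in the simplest case: for a pure location model, minimizing the sum of absolute errors yields the sample median, which is why least-absolute-residual fits resist outliers. A small sketch with invented data:

```python
def sum_abs_dev(data, m):
    """L1 loss: the objective minimized by least-absolute-residual fitting,
    and (up to a constant) the negative Laplace log-likelihood at location m."""
    return sum(abs(x - m) for x in data)

def median(data):
    s = sorted(data)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # one gross outlier
med = median(data)
mean = sum(data) / len(data)
# The median (3.0) beats the mean (22.0) under the L1 loss,
# illustrating the robustness to the outlier.
print(med, sum_abs_dev(data, med), sum_abs_dev(data, mean))  # → 3.0 101.0 156.0
```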
hi... does anyone have any idea how to compute the centroid of 5 points in a 4-dimensional space?

For 2 points, it would be given by the mid-point of the 2 points. For 3 points, it would be given by the centroid of the triangle with the 3 points as vertices. How about 5 points?

Thanks.
CK
C K Ang (engp6074@leonis.nus.sg) wrote:

> anyone has any idea how to compute the centroid of 5 points in a
> 4 dimensional space ?
> For 2 points, it would be given by the mid-point of the 2 points.

You mean the arithmetic mean of the two vectors.

> For 3 points, it would be given by the centroid of the triangle with
> the 3 points as vertex.

You mean the arithmetic mean of the three vectors?!

> How about 5 points ?

An exercise for the reader.... :-)

-- Dr Darren Wilkinson
On 22 Nov 1996, Randall C. Poe wrote:

> I'm reading a couple of statistics papers that have to do with nonparametric
> estimation in a setting with correlated measurement errors. I find the model
> used for the errors to be very strange, and I wonder if somebody could help
> me understand it intuitively.
>
> The errors e_i are taken to be samples from a stochastic process
> Z(t) with autocovariance function g(t). The i-th error is
> Z(a*x_i), where a is a scaling parameter. Large a corresponds
> to e_i widely spaced in the stochastic process (covariance
> g(a*(x_i - x_j))) and therefore not very highly correlated.
>
> Here's what I find peculiar: the parameter a also -> oo, at a
> rate either faster than n (a/n -> oo), slower than n (a/n -> 0),
> or the same rate as n (a/n -> constant). The different rates
> supposedly model different correlation regimes (termed
> asymptotically independent, long-range correlation, and
> short-range correlation).

The idea of a sequence of models with n and a increasing is not intended to be taken at face value as a description of an actual experiment. In any real experiment a and n are both particular finite numbers, not sequences. The authors are trying to answer the question "What are the properties of this model for particular finite a and n if a is not small compared to n?" As exact analysis of this problem is presumably infeasible, they simplify the problem by saying "Let n be very large and a be much smaller than / the same size as / much larger than n". In standard mathematical theory this requires the analysis of a sequence of models which have a and n increasing to infinity at the specified rates. The idea is that quantities proportional to, say, a/n can be ignored in the first case but not in the second or third case. Suppose your estimator has a bias which is roughly a/n.
The usual theory for ARMA models assumes that terms like this are small enough to be ignored, but if a and n are of comparable size in your data this may be untrue.

A simpler example to understand comes from linear regression. Suppose we have n observations on p variables. We all know that if n is large and p is small the least squares estimators are good, but that if p is nearly as big as n the estimators may be very poor. If you wanted to have some idea of how big p could be compared to n, you could study sequences of models in which, for example, p was proportional to n, or to the square root of n, or some such. Suppose you got consistent estimators when n was of order p^2. If you know from experience that in a certain setting with n=100 and p=10 you get good estimation, you would be able to say that you should get similarly good estimation with n=400 and p=20 in the same sort of data.

thomas lumley
UW biostatistics
Does anyone have the FORTRAN (or C) code for best-subsets regression by leaps and bounds (Furnival & Wilson)? I have only been able to trace one of the authors, who doesn't have email.

thomas lumley
UW biostatistics
Suppose X ~ N(mu, sigma^2). Then what is the formula for

E[X | X >= X^*] : the expected value of X given that X is greater than or equal to some fixed number X^*?

Could anyone give me a reference? Thanks in advance.

Tatsuo Ochiai
tochiai@students.wisc.edu
engp6074@leonis.nus.sg (C K Ang) in <576gt4$jho@nuscc.nus.sg>:

> anyone has any idea how to compute the centroid
> of 5 points in a 4 dimensional space ?

I probably have misunderstood your question. But I think the answer is just to take the average of the vectors position-by-position. That is, the first number in your centroid vector is the average of the first numbers of your five given vectors, and so on for each of the four positions.

Aaron C. Brown
New York, NY
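Aaron's position-by-position average is a one-liner in most languages. A sketch; the five 4-dimensional points below are made up for illustration:

```python
def centroid(points):
    """Coordinate-wise arithmetic mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

# Five made-up points in 4-dimensional space:
pts = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
print(centroid(pts))  # → [0.4, 0.4, 0.4, 0.4]
```

With two points this reduces to the midpoint, and with three to the centroid of the triangle, matching the cases in the original question.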
In <01bbd8e8$2b3ab2a0$1c8e13cf@abdulhamidMukhtar>:

> Y = ( 19x - 12 ) / (5x^2 - 15x)   I need you to solve for x...

When giving a problem like this, it helps to have some context. Otherwise it looks as if it might be a homework problem. I'm guessing it isn't, so: clearing the denominator gives the quadratic 5Yx^2 - (15Y + 19)x + 12 = 0, whence

x = [(15Y + 19) +/- (225Y^2 + 330Y + 361)^0.5] / (10Y)

Aaron C. Brown
New York, NY
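The two roots of the quadratic 5Yx^2 - (15Y + 19)x + 12 = 0 can be verified by substituting them back into the original equation. A sketch; the test value Y = 2 is an arbitrary choice:

```python
import math

def solve_for_x(y):
    """Roots of Y = (19x - 12)/(5x^2 - 15x), i.e. of 5Yx^2 - (15Y+19)x + 12 = 0.
    The discriminant 225y^2 + 330y + 361 is positive for all real y."""
    root = math.sqrt(225 * y * y + 330 * y + 361)
    return ((15 * y + 19 + root) / (10 * y),
            (15 * y + 19 - root) / (10 * y))

def f(x):
    return (19 * x - 12) / (5 * x * x - 15 * x)

y = 2.0
for x in solve_for_x(y):
    print(f(x))  # each value should numerically equal y
```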
Marks Nester writes:

> Why not bypass the null hypothesis, proceed directly
> to point/interval estimates

There are many statisticians who agree with you. In particular, many Bayesians lean toward this point of view. However, I think the solid majority of statisticians find classical hypothesis testing, which usually includes implausible null hypotheses, to be useful.

It's really more a question of the reporting of results than of statistical technique. I can say that "a 95% confidence interval for x is 1 to 2" or "I reject the null hypothesis that x=0 at the 5% significance level." The first statement gives more information; the second is often more useful for communicating results. But if you don't like hypothesis testing, don't do it.

Aaron C. Brown
New York, NY
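The duality Aaron mentions (a 95% interval of 1 to 2 versus rejecting x = 0 at the 5% level) can be made concrete: a two-sided z-test rejects exactly when the null value falls outside the confidence interval. A sketch with invented numbers chosen to reproduce his interval:

```python
def ci_and_test(xbar, se, null_value=0.0, z=1.96):
    """95% confidence interval and the equivalent two-sided z-test decision:
    reject the null iff null_value lies outside the interval."""
    lo, hi = xbar - z * se, xbar + z * se
    reject = not (lo <= null_value <= hi)
    return (lo, hi), reject

# Invented estimate and standard error giving roughly the interval (1, 2):
(lo, hi), reject = ci_and_test(xbar=1.5, se=0.255)
print(round(lo, 2), round(hi, 2), reject)  # → 1.0 2.0 True
```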
Hello,

The following distribution is called the Generalized Gaussian and is widely used in image processing:

\begin{equation}
f_{X}(x)=\left[\frac{\nu \eta (\nu ,\sigma )}{2\Gamma (1 / \nu)}\right] \exp (-[\eta (\nu ,\sigma ) |x|]^\nu ),
\end{equation}

where

\begin{equation}
\eta = \eta(\nu ,\sigma ) = {\sigma}^{-1} \left[\frac{\Gamma (3/\nu )}{\Gamma (1/\nu)}\right]^{1/2}.
\end{equation}

(sorry for this latex code)

I need to generate random numbers from this distribution and evaluate the first three moments of the distribution (conditioned on the variable being in some interval) on the computer. Could anybody help me with this? I will appreciate your advice and/or reference.

Thank you,

Igor Kozintsev, Graduate student
Electrical and Computer Engineering, Image Formation & Processing Group
2259 Beckman Institute, University of Illinois, Urbana, IL 61801
igor@ifp.uiuc.edu | http://www.ifp.uiuc.edu
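One standard way to sample this density (a sketch, not necessarily what image-processing codes use): if W ~ Gamma(shape 1/nu, scale 1), then |X| = W^(1/nu) / eta(nu, sigma) has the required magnitude distribution, and a random sign is attached. With the eta defined above, the variance works out to sigma^2, which gives a quick Monte Carlo check; nu = 1.5 and sigma = 2 below are arbitrary test values:

```python
import math
import random

def gg_eta(nu, sigma):
    """Scale factor eta(nu, sigma) from the second equation above."""
    return (1.0 / sigma) * math.sqrt(math.gamma(3.0 / nu) / math.gamma(1.0 / nu))

def gg_sample(nu, sigma):
    """Generalized Gaussian draw: (eta*|X|)^nu ~ Exp-tilted Gamma(1/nu, 1),
    so |X| = W**(1/nu) / eta with W ~ Gamma(1/nu, 1), sign chosen at random."""
    w = random.gammavariate(1.0 / nu, 1.0)
    x = w ** (1.0 / nu) / gg_eta(nu, sigma)
    return x if random.random() < 0.5 else -x

random.seed(1)
nu, sigma = 1.5, 2.0
xs = [gg_sample(nu, sigma) for _ in range(100_000)]
mean = sum(xs) / len(xs)
var = sum(x * x for x in xs) / len(xs)
print(round(mean, 2), round(var, 2))  # mean near 0, variance near sigma^2 = 4
```

The conditional moments on an interval [a, b] could then be estimated by averaging x, x^2, x^3 over the samples that land in [a, b].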
Bob Massey wrote:

> [...] real computers can only compute a finite number
> of cycling (rational) outputs even if run forever!

Yeah. But such does not constitute a limit upon that which is computable (via machine, or otherwise).

A machine that is designed so that it can "divide & conquer" can render such infinities irrelevant. Basically, such a machine transforms all problems into Geometry, and, instead of "algorithms", uses cross-correlation among continua to arrive at solutions. In such an approach, an infinity can be represented on a continuum... the "end points" are superfluous because the continuum represents "all there is of a particular 'quality'". Then calculation can "orient" within "all there is" within the "infinity" in question by simply "sliding" along the continuum. "Infinities" are easily dealt with in this way. Real solutions with respect to such are possible in short computational periods.

There is one "caveat", however. "Solutions" that are produced are only as accurate as are the correlations between the continua and the "infinities" that they represent. This caveat is dealt with by imbuing such a computational system with "intelligence" with respect to such correlations. That is, within every problem-solving activity that occurs within such a system, there exists a parallel continuum-correlation-with-infinity convergence dynamic. And when one looks, one finds that such constitutes the essence of that which is referred to as "intelligence". What's actually happening is that the system strives continuously to climb the gradient that is what's described by 2nd Thermo (WDB2T).

Given a means to assess the continuum-infinity correlations, accuracy can be increased to any desired degree, and reasonable accuracy can be converged upon within relatively short time frames. And it's easy to imbue such systems with the capacity to "recognize" advantageous "tool" inventions, and to produce outputs that will tell of such tools.
That is, to imbue the system with the ability to invent the tools it needs to increase the correlations... the machine would recognize the usefulness of lenses, for instance, because encounters with lenses would result in the useful increase of correlations... this'd lead to the invention of magnifying glasses, microscopes and telescopes. And so forth. Our nervous systems work in precisely this fashion.

I've explored computer math using such a Geometrical approach, and know of no problems that it doesn't handle in straight-forward fashions. Such systems incorporate no "tree"-like structures. They "just" do their continua-"sliding" cross-correlation stuff. Coincidences among continua in a converged "state" form the closest thing that this sort of system has to a "tree"-like structure. But such is not very "tree"-like, because new continua can be added continually, and each new continuum makes possible formerly non-existent "tree"-branching analogues. This means that the "tree"-like structure is capable of growing everywhere. This renders the conceptualization of "branching" nonsensical; hence "trees", and their finite "time" & "space" (physical memory), are rendered nonsensical, too. (Note, of course, some physical memory is required, and the "resolving power" of a "continuum machine" will increase in some proportion to physical memory. But even small "continuum machines" can solve seemingly-"intractable" problems.)

[I apologize if this msg seems a bit "cryptic". I'm exhausted, and close to accepting that no one will ever be able to come and play with me. So why bother to "translate"? What's here is here because I'm obliged to make it available.]

ken collins
Tatsuo asked:

> Suppose X ~ N(mu, sigma^2). Then, what is the formula for
> E[X | X >= X^*] : Expected value of X given X is greater than or
> equal to some fixed number X^*

You can find the expected value by evaluating the integral of x*p(x) from -mu/sigma^2 to oo and multiplying the result by sigma^2. p(x) is the pdf of N(0,1) and is equal to exp(-x*x/2)*(2*pi)^-0.5.

BTW, the indefinite integral of x*p(x) in this case evaluates to -p(x).

The book that I found most thorough in explaining expected values and introductory statistics in general was "Statistical Theory and Methodology in Science and Engineering" by K.A. Brownlee, 1960. In my opinion this book beats any modern intro statistics book hands-down. Page 38 is on expected values.

I had asked a similar question in this newsgroup and it took me a good week to justify the answer I got, but I finally did. Thanks again to Aaron Brown who gave me the initial help.

Dimitri
Ilias Kastanas wrote:
>
> In article <329393FD.10CE@orci.com>, Bob Massey wrote:
> >kenneth paul collins wrote:
> >>
> >> ..... . All such attempts can be disproven by presenting the
> >> system with something that "breaks" the syntax. (This is also my main
> >> objection to Goedel's "Incompleteness".)
>
> Side note: I didn't follow this exchange, but "breaking the syntax"
> is irrelevant to Goedel Incompleteness.

By "syntax" I was referring to the "rules" of the "proof". I stand on what I posted.

ken collins

[snipped falsely-attributed stuff and discussion directed to others]
kenneth paul collins (KPCollins@postoffice.worldnet.att.net) wrote:

: A machine that is designed so that it can "divide & conquer"
: can render such infinities irrelevant. Basically, such a
: machine transforms all problems into Geometry, and, instead
: of "algorithms", use cross-correlation among continuua to
: arrive at solutions.

[tons snipped]

Sounds very fascinating, ken. Can you point me to a good reference on this stuff? Perhaps a journal paper...

--jono

--
"The human mind is a dangerous plaything, boys. When it's used for evil, watch out! But when it's used for good, then things are much nicer." -- The Tick
I made a mistake in my previous post. When I typed sigma^2 I meant sigma (the standard deviation). So the corrected post should read like this:

>You can find the expected value by evaluating the integral of x*p(x) from
>-mu/sigma to oo and multiplying the result by sigma.
>
>p(x) is the pdf of N(0,1) and is equal to exp(-x*x/2)*(2*pi)^-0.5
>
>BTW, the indefinite integral of x*p(x) in this case evaluates to -p(x)
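For the record, the closed form the original question asks for is the inverse Mills ratio result: E[X | X >= x*] = mu + sigma * phi(a) / (1 - Phi(a)), where a = (x* - mu)/sigma and phi, Phi are the standard normal pdf and cdf. A numerical sanity check (a sketch; the step count, integration cutoff, and test values mu = 1, sigma = 2, x* = 0.5 are arbitrary choices):

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def trunc_mean(mu, sigma, xstar):
    """E[X | X >= xstar] for X ~ N(mu, sigma^2): mu + sigma*phi(a)/(1 - Phi(a))."""
    a = (xstar - mu) / sigma
    return mu + sigma * phi(a) / (1.0 - Phi(a))

# Cross-check by midpoint-rule integration of x*f(x)/P(X >= xstar)
# over [xstar, mu + 10*sigma] (the tail beyond that is negligible):
mu, sigma, xstar = 1.0, 2.0, 0.5
steps, hi = 200_000, mu + 10.0 * sigma
h = (hi - xstar) / steps
num = den = 0.0
for i in range(steps):
    x = xstar + (i + 0.5) * h
    fx = phi((x - mu) / sigma) / sigma
    num += x * fx * h
    den += fx * h
print(num / den, trunc_mean(mu, sigma, xstar))  # the two should agree closely
```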