Newsgroup sci.stat.math 11653

Articles

Subject: Re: CpK for industrial SPC. What is it?
From: Paige Miller
Date: Thu, 07 Nov 1996 07:59:03 -0800

cwbern@aol.com wrote:
> 
> The Cpk is a measure of how well your process covers the tolerance.
> 
> Cpk = Min (CPL,CPU)
> 
> CPU = (USL - Avg.) / (3 X s.d)
> CPL = (Avg. -LSL) / (3 X s.d)
> 
> A Cpk equal to 1 would mean that your process is exactly covering the 3
> sigma tolerance.  The higher the Cpk, the more capable your process of
> conforming to specifications.
> Since this method is highly dependent on the assumption of a normal
> distribution, you should perform a normality test first.
This is purely a descriptive statistic. A normal distribution is not required
for these numbers to be used descriptively. If you feel that you want to use
these quantities to make probabilistic statements, then the normal distribution
is required.
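
A minimal sketch of the quoted formulas, in Python. The function name `cpk`, the sample data and the spec limits are illustrative assumptions, not from the thread; the sample mean and the n-1 sample standard deviation stand in for "Avg." and "s.d".

```python
import statistics

def cpk(data, lsl, usl):
    """Return (CPL, CPU, Cpk) from the sample mean and sample s.d."""
    avg = statistics.mean(data)
    sd = statistics.stdev(data)          # n-1 denominator
    cpu = (usl - avg) / (3 * sd)         # upper capability index
    cpl = (avg - lsl) / (3 * sd)         # lower capability index
    return cpl, cpu, min(cpl, cpu)

# Hypothetical measurements and specification limits, for illustration only.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
cpl, cpu, index = cpk(sample, lsl=9.4, usl=10.6)
print(f"CPL={cpl:.3f}  CPU={cpu:.3f}  Cpk={index:.3f}")
```

Used this way the number is only a description of the sample; nothing in the computation checks or relies on normality.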
-- 
+---------------------------------+------------------------------------+
| Paige Miller, Eastman Kodak Co. | "I hate a country without a        |
| PaigeM@kodak.com                |    derrick" -- Mark Twain          |
+---------------------------------+------------------------------------+
| The opinions expressed herein do not necessarily reflect the         |
| views of the Eastman Kodak Company.                                  |
+----------------------------------------------------------------------+

Return to Top

Subject: Re: CpK for industrial SPC. What is it?
From: "Michael J. Anderson"
Date: 11 Nov 1996 12:06:49 GMT

Unfortunately, Cpk in the semiconductor industry is used as an absolute
capability measurement with no estimation of confidence.  In other words,
'spec'-minded people are thrilled to hear that your Cpk has gone from 1.1
to 1.2, even though both are well within your confidence interval!

Paige Miller wrote in article <32820747.143D@kodak.com>...
> cwbern@aol.com wrote:
> > 
> > The Cpk is a measure of how well your process covers the tolerance.
> > 
> > Cpk = Min (CPL,CPU)
> > 
> > CPU = (USL - Avg.) / (3 X s.d)
> > CPL = (Avg. -LSL) / (3 X s.d)
> > 
> > A Cpk equal to 1 would mean that your process is exactly covering the 3
> > sigma tolerance.  The higher the Cpk, the more capable your process of
> > conforming to specifications.
> > Since this method is highly dependent on the assumption of a normal
> > distribution, you should perform a normality test first.
> 
> This is purely a descriptive statistic. A normal distribution is not
> required for these numbers to be used descriptively. If you feel that you
> want to use these quantities to make probabilistic statements, then the
> normal distribution is required.
> 
> -- 
> +---------------------------------+------------------------------------+
> | Paige Miller, Eastman Kodak Co. | "I hate a country without a        |
> | PaigeM@kodak.com                |    derrick" -- Mark Twain          |
> +---------------------------------+------------------------------------+
> | The opinions expressed herein do not necessarily reflect the         |
> | views of the Eastman Kodak Company.                                  |
> +----------------------------------------------------------------------+
>

Return to Top

Subject: summer and academic year visits
From: Mark Schervish
Date: Mon, 11 Nov 1996 08:43:09 -0500

***********************************************************************
*  CARNEGIE MELLON ANNOUNCES NEW PROGRAM TO SUPPORT VISITS OF RECENT  *
*     PH.D. RECIPIENTS DURING EITHER SUMMER OR ACADEMIC YEAR          *
***********************************************************************
The Department of Statistics at Carnegie Mellon University, with
partial support from the National Science Foundation, has established
an Institute for Statistics and its Applications.
The purpose of the Institute is to foster the development of
statistical methodology through vigorous cross-disciplinary
collaborations, and to train pre-doctoral and post-doctoral
statisticians in cross-disciplinary research and teaching.  All
members of the Department are affiliated with the Institute.  Recently
the Department appointed four post-doctoral fellows in the Institute.
Currently the research collaborations in the Institute include the
following subject matter areas: cognitive psychology; functional
magnetic resonance imaging; genetics; psychiatric statistics;
statistical physics; criminology; governmental statistics and public
policy; environmental statistics; and finance.
The Institute seeks applications for visiting positions from
statisticians with recent Ph.D. degrees who are interested in
cross-disciplinary research and teaching.  Research support is also
available to selected recent Ph.D. recipients who wish to visit the
Institute during the summer.  The length of all appointments is
negotiable.  Please send vita, relevant transcripts, research papers, and
three letters of recommendation to
                 Chair, Faculty Search Committee
                 Department of Statistics
                 Carnegie Mellon University
                 Pittsburgh, PA  15213
Women and minorities are especially encouraged to apply. Carnegie Mellon
University is an affirmative action/equal opportunity employer.

Return to Top

Subject: Re: CpK for industrial SPC. What is it?
From: Paige Miller
Date: Mon, 11 Nov 1996 09:26:08 -0500

Michael J. Anderson wrote:
> 
> Unfortunately, Cpk in the semiconductor industry is used as an absolute
> capability measurement with no estimation of confidence.  In other words,
> 'spec'-minded people are thrilled to hear that your Cpk has gone from 1.1
> to 1.2, even though both are well within your confidence interval!
Clearly, another abuse of a statistic. In fact, I would claim that Cpk is abused more 
than it is used properly. I have tried to get my clients to stop comparing one month 
to the previous month without confidence intervals, and replace it by a chart of the 
last n months (where n is hopefully 10 or more) to show that the trend is in the 
right direction. After all, the goal is to show continuous improvement, not one-time 
improvement.
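
To illustrate why a move from 1.1 to 1.2 may mean little, here is a rough sketch of an approximate confidence interval for Cpk. It uses a commonly used large-sample normal-theory approximation for the standard error, sqrt(1/(9n) + Cpk^2/(2(n-1))), which assumes normally distributed data. The function name and the sample size n = 30 are assumptions for illustration, not figures from the thread.

```python
import math
from statistics import NormalDist

def cpk_interval(cpk_hat, n, level=0.95):
    """Approximate two-sided confidence interval for Cpk (normal data assumed)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = math.sqrt(1.0 / (9.0 * n) + cpk_hat ** 2 / (2.0 * (n - 1)))
    return cpk_hat - z * se, cpk_hat + z * se

# Hypothetical case: an estimated Cpk of 1.1 from 30 measurements.
low, high = cpk_interval(1.1, n=30)
print(f"approx. 95% interval: ({low:.2f}, {high:.2f})")
```

Under these assumptions the interval around 1.1 comfortably contains 1.2, which is the point being made about month-to-month comparisons.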
-- 
+---------------------------------+------------------------------------+
| Paige Miller, Eastman Kodak Co. | "I hate a country without a        |
| PaigeM@kodak.com                |    derrick" -- Mark Twain          |
+---------------------------------+------------------------------------+
| The opinions expressed herein do not necessarily reflect the         |
| views of the Eastman Kodak Company.                                  |
+----------------------------------------------------------------------+

Return to Top

Subject: Re: Probability Question
From: wpilib+@pitt.edu (Richard F Ulrich)
Date: 11 Nov 1996 17:21:13 GMT

Richard Bucklew (buckbeme@flash.net) wrote:
: Can you help me solve this question?
: In the Persian Gulf War, the United States used Patriot missiles as a
: way to defend against the Iraqi SCUD missile. The manufacturer claims a
: Patriot missile fired has a 60% chance of destroying its target. Now
: suppose a SCUD missile has been detected and four Patriot missiles are
: fired at the SCUD. Find the probability that the SCUD is destroyed.
: Please show your work. The choices are:
If you believe one should apply the  "60% chance" claimed by the
manufacturer, then the chance for 4 missiles is probably 60%, too.
There is no evidence that firing a  *second*  missile at one detected 
SCUD has any chance at all; that was a trick question, I suppose.
Actually, the evidence has been argued over, as to whether the Patriots
did *any* good in the Gulf War, adding negative effects to the one 
or two real interceptions.  The worst damage from a "SCUD"  came from
the one that apparently was hit by a Patriot simultaneously as it hit
an apartment building.  Oops!
  -- Anybody care to submit any more militaristic examples?
Rich Ulrich, biostatistician              wpilib+@pitt.edu
Western Psychiatric Inst. and Clinic   Univ. of Pittsburgh

Return to Top

Subject: Re: Help with vector spaces, please.
From: Gaines Harry T
Date: 11 Nov 1996 14:55:26 GMT

For another, it doesn't include the zero vector.
Harry

Return to Top

Subject: Re: What is the difference between chaotic and random?
From: Robert Dodier
Date: Mon, 11 Nov 1996 11:59:08 -0700

Hello all,
It's a beautiful blue day here in Boulder, hope it's the same wherever
you are.
Troy Shinbrot wrote:
> 
> In article <19961108025700.VAA15372@ladder01.news.aol.com>,
> jksnyder@aol.com wrote:
> 
> > Please excuse the elementary nature of the question, I am just learning
> > about chaotic systems.  Is there a difference between chaotic systems and
> > random systems?  If so, could the difference be measured/quantified by
> > plotting the series on normal probability paper?
> >
> > I generated a chaotic series of 1,000 numbers, between zero and one,
> > using the logistic equation and similarly generated a series of 1,000
> > numbers, between zero and one, with a random number generator.  After
> > ordering both series, they both plotted as straight lines on normal
> > probability paper.  Therefore, by this test, chaotic and random number
> > series appear to have a normal distribution.
This seems very strange; the invariant measure for ax(1-x) is far from
normal. For a=4, the invariant measure is continuously differentiable
and thus there is a density function, which looks like a U. For a < 4,
the invariant measure has a lot of spikes (I don't recall if the number
of spikes is finite, countable, or uncountable), so there is no density.
Also, when you say a random number between 0 and 1, I believe you must 
mean uniformly distributed; again, this is anything but normal. If you 
make a histogram of the 1000 numbers, what shape do you get?
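
A quick way to try the histogram suggestion is sketched below in Python. The starting value 0.3, the series length and the ten-bin layout are arbitrary assumptions; the map is the logistic equation x -> a x (1 - x) with a = 4.

```python
def logistic_orbit(n, x0=0.3, a=4.0):
    """Iterate x -> a*x*(1-x) and return the n values after x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = a * x * (1.0 - x)
        xs.append(x)
    return xs

def histogram(xs, bins=10):
    """Count values of xs (assumed to lie in [0, 1]) in equal-width bins."""
    counts = [0] * bins
    for x in xs:
        k = min(int(x * bins), bins - 1)   # put x == 1.0 in the last bin
        counts[k] += 1
    return counts

counts = histogram(logistic_orbit(10000))
for k, c in enumerate(counts):
    print(f"[{k / 10:.1f}, {(k + 1) / 10:.1f})  {'#' * (c // 50)}")
```

The text bars should pile up near 0 and 1 and thin out in the middle, i.e. the U shape described above rather than a bell curve.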
> > Are there other measures, such as analysis of variance, that could
> > distinguish between random and chaotic series?
I'll try to argue that this is, for practical purposes, not a meaningful
question. First, let's review what Mr. Shinbrot wrote. By the way, this is
indeed a great question, which touches on a fundamental issue.
> A great question.  First a practical answer for this particular problem.
> If X(n) denotes the n-th value of your time series, if you plot X(n) vs.
> X(n-1), you will get a mess for the random data, because the n-th value of
> the time series does not depend on the n-1'st value.  For the logistic
> data, you will get a parabola (obviously).
> 
> Thus because of the deterministic nature of chaos, one value depends on
> its history, while random data does not.  Vis a vis statistical tests, if
> you randomize the ORDER of the logistic data, you will have two data sets,
> one logistic and a second randomized-order logistic, both of which are
> guaranteed to have EXACTLY the same mean, variance, skew, kurtosis, or
> anything else you would care to measure.  It is only the determinism of
> the data sets that differ.
I don't think dependence on the past is a suitable way to distinguish
random from chaotic processes. I'm sure we'll all agree that a Markov 
process is a random process, yet such a process may have a very strong
dependence on past states.
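
The two checks described in the quoted text can be sketched in Python as follows (starting value, series length and shuffle seed are arbitrary assumptions). The sketch only shows that the logistic series lies on the parabola while a shuffled copy with identical values does not; it does not settle whether dependence on the past is the right criterion.

```python
import random
import statistics

def logistic_orbit(n, x0=0.3, a=4.0):
    """Iterate x -> a*x*(1-x) and return the n values after x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = a * x * (1.0 - x)
        xs.append(x)
    return xs

def max_parabola_residual(xs, a=4.0):
    """Largest |X(n) - a*X(n-1)*(1 - X(n-1))| over the series."""
    return max(abs(xs[i] - a * xs[i - 1] * (1.0 - xs[i - 1]))
               for i in range(1, len(xs)))

orbit = logistic_orbit(1000)
shuffled = orbit[:]
random.Random(0).shuffle(shuffled)       # same values, randomized order

print("ordered  residual:", max_parabola_residual(orbit))
print("shuffled residual:", max_parabola_residual(shuffled))
print("means:", statistics.mean(orbit), statistics.mean(shuffled))
```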
> The second answer is more pedagogical: the logistic data have dimension at
> most 1.  That is, they all lie precisely on the parabola I mentioned, and
> one variable is all that is needed to define the state and thus determine
> the future state of the system.  The random data are (ideally) infinite
> dimensional: an ideal random number generator would require an infinite
> number of variables to define the future state.  Practically we all know
> that this isn't quite true, but that is the strict answer.
There is nothing in the textbook definition of a random variable that
requires that it be generated by an infinite-dimensional process. In order
for all the definitions about expected value, distribution function,
density, etc. to work, all that is required is that the generating
process have a unique invariant measure, so that time averages over
process values equal weighted averages taken over the state space (with
the invariant measure doing the weighting). That is, a random variable
need not be generated by an infinite-dimensional process; the process need
only be ergodic -- this is a much weaker condition.
Incidentally, for ergodic processes the frequentist and measure-theoretic
definitions of probability coincide. I don't think the followers of these
two schools really differ on any practical point.
So I've pointed out that random processes can be low-dimensional, but
I could make the argument a little more convincing by coming up with
an example of a deterministic system which has an everyday distribution
as its invariant measure. So far I can't think of a low-dimensional system
which has an invariant measure which is approximately normal, say.
Can anyone name such a system?
Thanks again to Messrs. Snyder and Shinbrot for bringing up this topic.
I apologize for my lack of knowledge of probability, ergodic theory, 
and nonlinear systems.
Regards,
Robert Dodier

Return to Top

Subject: Re: Multiple Regression
From: wpilib+@pitt.edu (Richard F Ulrich)
Date: 11 Nov 1996 21:20:57 GMT

tschmitz@hpu.edu wrote:
: "Could someone please explain how to understand this in English?
: Particularly the t-stats and f-stat. I know what they are, but
: I am confused by the output."
 << some deleted ... >>
: SUMMARY OUTPUT            X3 = JS
:
: Regression Statistics
: Multiple R    0.996378
: R Square      0.992769
Reasons to be confused by the printout:
   I never saw a p-level printed out before, where the program showed
the 24 leading zeros...  most programs would just leave it rounded off
at  "0.0000"  though it might be nice to see someone say  "< 0.00005".
   The program gives "Lower 95% ... "  then, redundantly,
"Lower 95.000%..."  with the same numbers under it.  No good reason.
   The usual examples of problems have things that are BARELY significant,
instead of examples with the prediction being ALMOST PERFECT, as in
this example.  The best predictor, though, does not account, all by  
itself, for the total R-squared;  that is, the t-test on the "partial
regression coefficient"  is not so enormous as it might be.  Thus, I
conclude, there must be some correlation between the predictors in
the equation.
The two lesser predictors still are "significant"  by the arbitrary
5% standard, but that does not tell you anything about how the
predictors affect each other;  which they must do. 
So, what more do you want to know?
Rich Ulrich, biostatistician              wpilib+@pitt.edu
Western Psychiatric Inst. and Clinic   Univ. of Pittsburgh

Return to Top

Subject: Re: What is the difference between chaotic and random?
From: Misha Sushchik
Date: Mon, 11 Nov 1996 01:50:06 -0800

Robert Dodier wrote:
> 
> I don't think dependence on the past is a suitable way to distinguish
> random from chaotic processes. I'm sure we'll all agree that a Markov
> process is a random process, yet such a process may have a very strong
> dependence on past states.
The dependence on the previous states is different in the two cases. For a
chaotic system the current state is *uniquely* determined by the previous
state(s).
> 
> So I've pointed out that random processes can be low-dimensional, but
> I could make the argument a little more convincing by coming up with
> some examples of deterministic system which has an everyday distribution
> as its invariant measure. So far I can't think of a low-dimensional
> system
> which has an invariant measure which is approximately normal, say.
> Can anyone name such a system?
> 
What if you take a double shift map (which has a uniform invariant density)
and apply a nonlinear transform to its variable to make the distribution
whatever you like it to be? This will not prove, however, that the process
that you get at the end is random. At least not in the sense in which I
understand a random process. The thought that a random sequence can be
produced by a low-dimensional dynamical system casts terror on me. Every
time it comes to me in my dreams I wake up in a cold sweat.
No. Some methods of statistics may apply to both random and chaotic
processes, but it does not mean that random processes can be observed in
low-dimensional dynamical systems. And certainly methods and theorems
developed for delta-correlated random processes will not work for chaotic
processes.
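
The transform idea can be tried numerically; the following Python sketch is one way, under stated assumptions. A shift (doubling) map collapses to zero in binary floating point, so the sketch substitutes the logistic map with a = 4: its variable is made uniform by u = (2/pi) asin(sqrt(x)) and then pushed through the inverse normal CDF. Starting value, length and the clipping constant are arbitrary choices. As the post says, getting a normal-looking distribution this way does not show that the sequence is random.

```python
import math
from statistics import NormalDist, mean, pstdev

def logistic_orbit(n, x0=0.3):
    """Iterate x -> 4*x*(1-x) and return the n values after x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        xs.append(x)
    return xs

def to_uniform(x):
    """Map the a = 4 logistic variable to one with a uniform invariant density."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))

def to_normal(x, eps=1e-12):
    """Apply the inverse standard normal CDF; clip to keep the argument in (0, 1)."""
    u = min(max(to_uniform(x), eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)

ys = [to_normal(x) for x in logistic_orbit(20000)]
print(f"mean={mean(ys):.3f}  s.d.={pstdev(ys):.3f}")
```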
Misha Sushchik

Return to Top

Subject: Re: [Q] Using pseudoinverse in Bayes discriminant function?
From: Greg Heath
Date: Mon, 11 Nov 1996 14:53:56 -0500

On Fri, 8 Nov 1996, Dukki Chung wrote:
> Hi. Recently, I had to use a Bayes classifier for a pattern classification
> problem. The Bayes discriminant function is:
> 	di(x) = - [ ln|Ci| + (x-mu)^t Ci^-1 (x-mu)]
        di(x) = - [ ln|Ci| + (x-mui)^t Ci^+ (x-mui) -2 lnPi]
> The problem was, the covariance matrix Ci was near singular, so the
> inverse could not be calculated. So, I used pseudoinverse instead of real
> inverse. What I'm wondering is whether this is a valid, justifiable 
> mathematical or statistical approach.
Yes. I've always used the pseudoinverse. The ill-conditioning of the 
covariance matrix results in near-zero eigenvalues corresponding to 
directions in space for which the distribution has nearly a constant 
value (i.e., nearly a zero variance).
> I would appreciate any comments, suggestions, references, or any 
> pointers.
Check the eigendirections associated with the near-zero eigenvalues.   
Classes with near constant values in those directions might be able to be 
classified quite easily based on that fact alone.
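
A rough sketch of the discriminant with a pseudoinverse, in Python with NumPy. The function name, the tolerance and the example numbers are assumptions for illustration. One further assumption is labeled in the code: because ln|Ci| is undefined for a singular Ci, the sketch sums the logs of only the eigenvalues it keeps (a "pseudo-determinant"), which is a choice made here and is not stated in the posts.

```python
import numpy as np

def discriminant(x, mu, cov, prior, tol=1e-10):
    """d_i(x) = -[ln|C_i| + (x-mu_i)^t C_i^+ (x-mu_i) - 2 ln P_i]."""
    eigvals, eigvecs = np.linalg.eigh(np.asarray(cov, dtype=float))
    keep = eigvals > tol * eigvals.max()          # drop near-zero eigenvalues
    # ASSUMPTION: log pseudo-determinant over the retained eigenvalues only.
    log_det = np.sum(np.log(eigvals[keep]))
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    proj = eigvecs[:, keep].T @ diff              # coordinates in retained directions
    maha = np.sum(proj ** 2 / eigvals[keep])      # equals diff^t C^+ diff
    return -(log_det + maha - 2.0 * np.log(prior))

# Hypothetical singular covariance: all variance lies along the direction (1, 1).
cov = np.array([[1.0, 1.0], [1.0, 1.0]])
x, mu = np.array([1.0, 1.0]), np.array([0.0, 0.0])
d_example = discriminant(x, mu, cov, prior=0.5)
maha_pinv = (x - mu) @ np.linalg.pinv(cov) @ (x - mu)
print(d_example, maha_pinv)
```

The discarded eigenvectors `eigvecs[:, ~keep]` are the near-constant directions mentioned above, so they can be inspected separately.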
Hope this helps.
Gregory E. Heath     heath@ll.mit.edu      The views expressed here are
M.I.T. Lincoln Lab   (617) 981-2815        not necessarily shared by 
Lexington, MA        (617) 981-0908(FAX)   M.I.T./LL or its sponsors
02173-9185, USA

Return to Top

Subject: Re: What is the difference between chaotic and random?
From: pecora@zoltar.nrl.navy.mil (Lou Pecora)
Date: Mon, 11 Nov 1996 17:40:48 -0400

In article <3287777C.73D8@colorado.edu>, dodier@colorado.edu wrote:
> 
> So I've pointed out that random processes can be low-dimensional, but 
> I could make the argument a little more convincing by coming up with 
> some examples of deterministic system which has an everyday distribution
> as its invariant measure. So far I can't think of a low-dimensional
> system
> which has an invariant measure which is approximately normal, say. 
> Can anyone name such a system?
Hello, Robert,
I'm not sure you and Troy are using the word dimension the same, but I was
under the impression that any random process could not be embedded in a
finite dimensional phase space (a la time delay embeddings).  I think
that's what Troy meant (Troy, is that right?).  Is there a random process
that can be embedded in a finite-dimensional phase space?  Just off the
top of my head, I would guess a finite-dimensional embedding would imply a
deterministic law.  Can one prove that?  Good question.  My hunch is
"yes," but I can't do it right now (but then I have two kids playing
SuperNintendo in the background. :-) ).
Opinions/theorems, anyone?
Lou Pecora
code 6343
Naval Research Lab
Washington  DC  20375
USA
 == My views are not those of the U.S. Navy. ==
------------------------------------------------------------
  Check out the 4th Experimental Chaos Conference Home Page:
  http://natasha.umsl.edu/Exp_Chaos4/
------------------------------------------------------------

Return to Top

Subject: help?!: "serial correlation" of random sequences
From: "geeman@best.com"
Date: Mon, 11 Nov 1996 17:09:50 +0000

I'll post this again - perhaps someone can shed some light ???
I've been running some statistical tests on various RNG methods including
bits taken from irrational functions, hashing, and supposedly geiger-counter bits
from outer space.
But when I run serial correlation tests (see below for
the relevant parts) -as per Knuth- they tend to be biased towards
negative.  
The test procedure is as follows:
initialize min_correlation and max_correlation
loop on number of trials {
 generate a random sequence of the desired length
 calculate serial_correlation (as is shown below)
 // take mins and max of this result,previous results
 min_correlation  = min(serial_correlation,min_correlation)
 max_correlation  = max(serial_correlation,max_correlation)
 display etc.
}
What I see when I perform this test is that, very consistently,
min_correlation goes further below zero than max_correlation goes above it.
I would have expected them to be symmetric on average.  And yet this bias
is quite consistent.
Code I am using is below; note two methods of calculating the final val.
I have checked on two different CPUs and 2 different compilers.  What
gives? Can anyone shed some light?
=====================================
/*
        ENT  --  Entropy calculation and analysis of putative
                 random sequences.
        Designed and implemented by John "Random" Walker in May 1985.
        Multiple analyses of random sequences added in December 1985.
*/
void entcalc(unsigned char *digest,int size,void *)
{
        int i, opt, mp, sccfirst, sccdef;
        unsigned int c;
         double      scc, sccun, sccu0, scclast, scct1, scct2, scct3,
        /* Initialise for calculations */
        totalc = 0;
        sccfirst = sccdef = TRUE;  
	/* Mark first time for serialcorrelation */
        scct1 = scct2 = scct3 = 0.0; 
	/* Clear serial correlation terms */
        for (i = 0; i < [...] > maxcc) maxcc = scc;
        }
        if ( (++iterationcount % 100) == 0) {
                cout << "iter [" << iterationcount << "]: mincorr=" <<
mincc << "\t maxcorr=" << maxcc << "\n";
        }
}
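
For comparison, here is a small self-contained Python sketch of the circular serial correlation coefficient in the form given by Knuth, C = (n*sum(x_i*x_{i+1}) - (sum x)^2) / (n*sum(x^2) - (sum x)^2), with x_n followed by x_0. It is not a reconstruction of the ENT code above; the function names, seed, sequence length and trial count are assumptions. For independent draws the expected value of this statistic is -1/(n-1) rather than 0, so a small negative bias is expected and shrinks as the sequence gets longer.

```python
import random

def serial_correlation(xs):
    """Circular lag-1 serial correlation coefficient of the sequence xs."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    cross = sum(xs[i] * xs[(i + 1) % n] for i in range(n))   # wraps around
    denom = n * s2 - s1 * s1
    if denom == 0:
        raise ValueError("all values are equal; coefficient undefined")
    return (n * cross - s1 * s1) / denom

def mean_serial_correlation(n, trials, seed=12345):
    """Average the coefficient over many random sequences of length n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += serial_correlation([rng.random() for _ in range(n)])
    return total / trials

n = 10
print("average:", mean_serial_correlation(n, 20000), "vs", -1.0 / (n - 1))
```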

Return to Top

Subject: Bounds on variances
From: ahmed shabbir
Date: Mon, 11 Nov 1996 19:58:53 -0600

Hi,
How can I estimate (by sampling) upper and lower bounds on the variance 
of a random variable with unknown distribution?
Thanks in advance,
SHABBIR
______________________________________________________________________________
  SHABBIR AHMED                        | 140 MEB, MC-244, 1206 W. Green St.,
  Operations Research Lab.             | Urbana, IL 61801.
  Dept. of Mech. & Ind. Eng.           | Ph: (217) 367-5073 (home), 333-0699 (off)
  Univ. of Illinois @ Urbana-Champaign |
______________________________________________________________________________

Return to Top

Subject: need advice on statistical method for cluster analysis
From: markwa@halcyon.com (Mark Walsen)
Date: 12 Nov 1996 01:08:19 GMT

I'm hoping that someone can suggest the appropriate
statistical tool for a problem I have.  
I'm developing music software that "listens" to music
(MIDI) and transcribes it into music notation.  One of
the problems is determining what the meter of the song
is, eg, 2:4, 3:4, 4:4, 6:8, etc. The meter "variable"
should be thought of as a category A, B, C, etc. instead
of as a quantity, even though the meter is expressed as
two numbers.
I examine about 10 scalar variables that summarize the 
character of the song, such as average duration of notes 
in the song.  These 10 variables form a 10-dimensional space 
in which I have plotted points for each of the songs that I have 
analyzed. There tends to be clustering of points in the 
10-dimensional space for songs with the same meter.  I think 
of the clusters as "shapes" in the 10-dimensional space. 
If you give me a new song, I'll measure its 10 variables related
to meter, and locate the point for this song in the 10-dimensional 
space. If the point falls clearly within one of the "shapes",
then I'm confident in predicting its meter as being the
same as the other songs already in that shape.  If the
point falls between shapes, or if the point falls in
an area where shapes overlap, then I have to do some
guess work.  I need an appropriate statistical tool (math)
for the guess work.  Some kind of cluster analysis.
If you can suggest what kind of statistical analysis I
should apply to this problem, please send a mail to 
markwa@notation.com, and also post it in the newsgroup only if
you think others might find it interesting.
Thanking in advance -- Mark Walsen

Return to Top

Subject: Re: Probability Question
From: george_cummings@email.mot.com
Date: Mon, 11 Nov 1996 14:52:58 GMT

radford@cs.toronto.edu (Radford Neal) wrote:
>>jjboeldt@in.net wrote:
>> Can you help me solve this question?
>>
>> In the Persian Gulf War, the United States used Patriot missiles as a way 
>> to defend against the Iraqi SCUD missile. The manufacturer claims a Patriot
>> missile fired has a 60% chance of destroying its target. Now suppose a SCUD
>> missile has been detected and four Patriot missiles are fired at the SCUD.
>> Find the probability that the SCUD is destroyed. Please show your work. The
>> choices are:
>>
>> A) .9744
>> B) .9375
>> C) .1296
>> D) .8704
part a :
let A = probability of a single missile hitting the target
         A = 0.6
let B = probability of a single missile missing the target
         B = 1 - A  =  0.4
the probability of all 4 missiles missing (assuming independent shots) is :
        BxBxBxB  = B^4
the probability of a hit is :
        1 - prob miss
or 
        P(hit)  =  1 - B^4 
                =  1 - (0.4)^4
                =  .9744
part b :
how many missiles must be fired to get P(hit) = 0.99
same formula :
P(hit)  =   1 - P(miss)
        =   1 - ( 1 - 0.6 ) ^N
so :
0.99    =    1  -  ( 0.4 )^N
1 - 0.99  =  ( 0.4 ) ^N
log ( 0.01 )  =  N log ( 0.4 )
then :
N       =     log(0.01) / log(0.4)
        =    5.0259
a purist would say that you need more than 5, so the answer is 6.
for N = 5, P(hit)  =  0.9898
for N = 6, P(hit)  =  0.9959
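
The arithmetic above can be checked with a few lines of Python. The function names are made up for this sketch, and it assumes, as the calculation does, that each missile hits independently with probability 0.6 (an assumption other posts in this thread dispute).

```python
import math

def p_destroyed(n, p_single=0.6):
    """P(at least one hit) for n independent shots."""
    return 1.0 - (1.0 - p_single) ** n

def missiles_needed(target, p_single=0.6):
    """Smallest whole number of shots with P(hit) >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))

print(f"P(hit) with 4 missiles: {p_destroyed(4):.4f}")
print("missiles needed for 0.99:", missiles_needed(0.99))
```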
george c

Return to Top

Subject: Splus question
From: Carlos Lopez
Date: Tue, 12 Nov 1996 03:16:37 GMT

I am having trouble getting Splus to give a 95% prediction interval, and a 95%
expectation interval after having fit a linear model. Are there built in
functions that will do these calculations for me, or do I have to write my
own. The manuals are no help on this point. It seems odd to me that Splus
wouldn't have this kind of function already built in, as it is a fairly common
thing to do. Any help will be much appreciated! Reply by posting or e-mail me.
Carlos Lopez

Return to Top

Subject: Splus help.
From: Carlos Lopez
Date: Tue, 12 Nov 1996 03:18:03 GMT

I was wondering if there is a command for getting Splus to give a 95%
confidence interval for E(Y|X0) and a 95% prediction interval for a new data
point after a linear model has been fit.
The predict() command doesn't seem to give the results that I'm looking for.
If there isn't a built in command I think I can write a small function to do
both of these.
Any help will be much appreciated.
Carlos Lopez

Return to Top

Subject: Re: Confounding variables in regression
From: auld@qed.econ.queensu.ca
Date: 11 Nov 1996 12:00:59 -0600

Dan Kehler  <005769k@ace.acadiau.ca> wrote:
>method 1:  
>
>model 1 = Y ~ X1
>model 2 = residuals(model 1) ~ X2 , test for significance of X2
>
>Method 2:  
>
>model 1 = Y ~ X1 + X2, test for significance of X2.
In the second method, the estimate of b2 is
	\hat b2 = (X2'M1 X2)^{-1}X2'M1 Y,
where M1=(I - X1(X1'X1)^{-1}X1').  In the first, it is
	\tilde b2 = (X2'X2)^{-1}X2'M1 Y
You would get numerically identical results if you altered the first
method to regress the residuals from the first step on the residuals of a
regression of X2 on X1 rather than on X2.  The proposed method 1 will give
incorrect results if X1 and X2 are not orthogonal.  See Davidson and
MacKinnon, _Estimation and Inference in Econometrics_ (Oxford 1993)
chapters 1 and 3.
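
A small numerical sketch of the point, in plain Python. The data are made up, each of X1 and X2 is a single column, and no intercept is included, so that the code mirrors the matrix formulas above; all names are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residuals(y, x):
    """Residuals from regressing y on the single column x (this is M1 applied to y)."""
    slope = dot(x, y) / dot(x, x)
    return [yi - slope * xi for xi, yi in zip(x, y)]

# Hypothetical data in which X1 and X2 are correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [3.0, 4.0, 8.0, 9.0, 14.0]

# Method 2: coefficient on X2 from the joint regression Y ~ X1 + X2.
det = dot(x1, x1) * dot(x2, x2) - dot(x1, x2) ** 2
b2_joint = (dot(x1, x1) * dot(x2, y) - dot(x1, x2) * dot(x1, y)) / det

ey = residuals(y, x1)     # M1 Y
e2 = residuals(x2, x1)    # M1 X2

# Method 1 as proposed: residuals of Y regressed on X2 itself.
b2_method1 = dot(x2, ey) / dot(x2, x2)
# Altered method 1: residuals of Y regressed on residuals of X2.
b2_altered = dot(e2, ey) / dot(e2, e2)

print(b2_joint, b2_altered, b2_method1)
```

With these numbers the altered two-step estimate agrees with the joint regression, while the proposed method 1 gives a much smaller coefficient.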
-- 
Chris Auld                               Department of Economics
Internet: auld@qed.econ.queensu.ca       Queen's University
Office:   (613)545-6000 x4398            Kingston, ON   K7L 3N6

Return to Top

Subject: Re: Probability Question
From: Anatoli Michalski
Date: Tue, 12 Nov 1996 12:10:25 -0800

Radford Neal wrote:
> 
> 
> The correct answer is:
> 
>   E) Cannot be determined from the available information
> 
> ----------------------------------------------------------------------------
> Radford M. Neal                                       radford@cs.utoronto.ca
> ----------------------------------------------------------------------------
I think that is too pessimistic a point of view.
Let q be the probability of not destroying the target in bad weather.
The probability Pr of not destroying the target in any weather using n
missiles has bounds (0.4)^n<=Pr<=0.4(q)^(n-1). If q=1, then there is no
benefit in using n>1. The question is: what do we know about q?
Let us suppose that three missiles are not enough for an acceptable
probability of destroying the target. This means that (0.4)^3>0.4(q)^3 and
q<0.542884.
Among answers A)-D), only A) and B) can be reached for such q.
Answer A) corresponds to q=0.4 and a "weatherproof" missile. The number of
missiles needed to guarantee 99% success is the smallest integer n such that
(0.4)^n<=0.01, which is 6.
Answer B) corresponds to q=0.538609. The number of missiles needed to
guarantee 99% success is the smallest integer n such that
0.4(0.538609)^n<=0.01, which again is 6.
--------------------------
With compliments,
A.Michalski                                
Geomedizinische Forschungsstelle
Heidelberger Akademie der Wissenschaften

Return to Top

Subject: CpK for industrial SPC. open limits?
From: Howard Atkins
Date: Tue, 12 Nov 1996 13:08:24 -0800

I am very interested in the subject of Cpk.
We are requested by our customers - we are a custom plastics injection
company - to provide proof of capability.
We are requested to show cpk greater than 1.66 and sometimes 
Sometimes we have to show this for performance tests that have just a
minimum or maximum value.
We have tried to explain that this is not a reasonable demand because of
the one-sidedness of the tolerances.
Can anyone help me to convincingly persuade our customers that this is
an unreasonable demand?
-- 
Atkins Family, Kibbutz Revivim, D.N. Halutza, 85515, ISRAEL
howarda@ramat-negev.org.il.
http://www.ramat-negev.org.il/~howarda/

Return to Top

Downloaded by WWW Programs
Byron Palmer
